Category Archives: Research

Questioner or answerer?

Yesterday on StackOverflow, I came across one of those users who keep asking questions but don’t really seem to understand many of the responses. Looking at his profile, it turned out he had asked over a hundred questions but contributed fewer than ten answers. I won’t speculate about his ability to actually answer any SO questions (although his understanding of others’ answers to his own questions, except when he was able to copy-paste someone’s source code, didn’t seem to be that great either), but it did get me thinking about what a ‘common’ ratio of questions to answers would be for other SO users (personally, I’m at 1/85 right now). Of course, that triggered my data-analysis and graphing gene…


StackOverflow user diversity

I’ve been wondering what the diversity of knowledge of StackOverflow users would be like. It seemed like an interesting research idea to see how many people have responded only to questions in a very narrow field, and how many others have broader knowledge and can contribute useful answers in more diverse fields. Apparently, there is even supposed to be a badge for that (the Generalist badge), but it hasn’t been implemented yet.

This should be easy to do using tags. First, some sort of clustering is applied according to how often each pair of tags shows up on the same question (a user who knows both ASP and ASP.net shouldn’t be considered a ‘diverse’ person, so this overlap should be factored out first). Next, we can count the number of different clusters in which a user has contributed a good answer.
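As a rough illustration of the idea, here is a minimal sketch in Python. It is only one possible reading of "some sort of clustering": tags are merged whenever their Jaccard co-occurrence exceeds a threshold, using a simple union-find. The function names (`cluster_tags`, `diversity`), the threshold of 0.5, and the toy questions are all assumptions made for this example, not real StackOverflow data.

```python
from collections import Counter
from itertools import combinations


def cluster_tags(questions, min_jaccard=0.5):
    """Map each tag to a cluster id, merging tags that often co-occur.

    `questions` is a list of tag lists, one per question. Two tags are
    merged when their Jaccard co-occurrence reaches `min_jaccard`
    (an arbitrary threshold chosen for this sketch).
    """
    tag_count = Counter()
    pair_count = Counter()
    for tags in questions:
        unique = sorted(set(tags))
        tag_count.update(unique)
        pair_count.update(combinations(unique, 2))

    # Union-find over tags: each tag starts in its own cluster.
    parent = {tag: tag for tag in tag_count}

    def find(tag):
        while parent[tag] != tag:
            parent[tag] = parent[parent[tag]]  # path halving
            tag = parent[tag]
        return tag

    for (a, b), both in pair_count.items():
        either = tag_count[a] + tag_count[b] - both
        if both / either >= min_jaccard:
            parent[find(a)] = find(b)

    return {tag: find(tag) for tag in tag_count}


def diversity(answered, tag_to_cluster):
    """Count the distinct clusters touched by a user's good answers.

    `answered` is a list of tag lists, one per well-answered question.
    """
    return len({
        tag_to_cluster[tag]
        for tags in answered
        for tag in tags
        if tag in tag_to_cluster
    })


# Toy data, invented purely for illustration.
questions = [
    ["asp", "asp.net"], ["asp", "asp.net"], ["asp.net"],
    ["python"], ["python", "numpy"], ["numpy", "python"],
    ["sql"],
]
clusters = cluster_tags(questions)
narrow_user = [["asp"], ["asp.net"]]
broad_user = [["asp"], ["python"], ["sql"]]
print(diversity(narrow_user, clusters), diversity(broad_user, clusters))
```

With this toy data, ASP and ASP.net end up in one cluster, so a user who answers only those two tags scores lower than one who also answers Python and SQL questions. A real analysis would need a definition of a ‘good’ answer and probably a more careful clustering method.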


TOP500 list by interconnect

While attending a Hot Interconnects talk on supercomputing, I got the following idea. The TOP500 site provides graphs of the number of systems and total performance per interconnect family, which give an approximate measure of the popularity of the different interconnects. But how do they affect the performance of an individual system? Presumably, a high-performance interconnect should result in higher efficiency than a commodity one. But by how much? And which systems use which type of interconnect?


Practical Compressor Test

Welcome to the ‘Practical Compressor Test’. Unlike some other compressor comparison sites, I won’t be looking for the compressor that squeezes out the last bit of compression. Instead, I’ll try to find the most practical compressor out there. This means compression and decompression times are taken into account, so PAQAR and the like, which can achieve very good compression at the expense of insanely long run times (several hours on this benchmark!), are not considered.

Instead, I’ll be focusing on very well known, established compressors that are easily obtained (I only use precompiled packages and won’t build from source) and have reasonable run times. Also, I won’t try every combination of compression options but will limit the test to one general option (-1 to -9 for gzip and bzip2, -m1 to -m5 for RAR, …).
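To give an idea of what such a measurement looks like, here is a small sketch. Note that it is a stand-in, not the setup used for this test: it calls Python’s built-in `gzip` and `bz2` modules instead of the precompiled command-line tools, and the `benchmark` function and the sample input are invented for this example. It sweeps levels 1 to 9 and records compressed size plus compression and decompression time.

```python
import bz2
import gzip
import time


def benchmark(data, levels=range(1, 10)):
    """Time compression and decompression of `data` at each level.

    Returns a list of dicts with the codec name, level, compressed
    size in bytes, and both timings in seconds.
    """
    codecs = {
        "gzip": (gzip.compress, gzip.decompress),
        "bzip2": (bz2.compress, bz2.decompress),
    }
    results = []
    for name, (compress, decompress) in codecs.items():
        for level in levels:
            start = time.perf_counter()
            packed = compress(data, level)  # second argument is the level
            middle = time.perf_counter()
            restored = decompress(packed)
            end = time.perf_counter()
            if restored != data:
                raise ValueError(f"{name} -{level} did not round-trip")
            results.append({
                "codec": name,
                "level": level,
                "size": len(packed),
                "compress_s": middle - start,
                "decompress_s": end - middle,
            })
    return results


# Hypothetical sample input; a real test would use a fixed corpus of files.
sample = b"practical compressor test " * 2000
for row in benchmark(sample):
    print(f"{row['codec']:5s} -{row['level']}  {row['size']:6d} bytes  "
          f"{row['compress_s']:.4f}s  {row['decompress_s']:.4f}s")
```

Timings from a single run on a small input are noisy, so a serious comparison would repeat each measurement and use much larger files.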
