Yesterday on StackOverflow, I came across one of those users that kept asking questions, but didn’t really seem to understand much of the responses. Looking at his profile, it turned out he had asked over a hundred questions, but contributed less than ten answers. I won’t be tempted to start about his capabilities of actually answering any SO questions (although his understanding of other’s answers to his own questions, except when he was able to copy-paste someone’s source code, also didn’t seem to be that great), but it did get me thinking about what a ‘common’ ratio of questions versus answers would be for other SO users (personally, I’m at 1/85 right now). Of course, that triggered my data-analysis and graphing gene…
I’ve been wondering what the diversity of knowledge of StackOverflow users would be like. It seemed like an interesting research idea to see how many people have responded only to questions in a very narrow field, and how many others have broader knowledge and can contribute useful answers in more diverse fields. Apparently, there is even supposed to be a badge for that (the Generalist badge), but it didn’t get implemented yet.
It’s easy to do this using tags: some sort of clustering should be applied according to how often each pair of tags shows up at the same question (a user that knows both ASP and ASP.net shouldn’t be considered a ‘diverse’ person, so this should be factored out first), next we can count in how many different clusters that this user has contributed a good answer.