User:Rockpocket/Ref desk stats

This page documents occasional statistical surveys of Wikipedia:Reference desk, its contributors and its impact on the encyclopaedia. Unless otherwise stated, the work here was carried out by Rockpocket (talk · contribs). Please feel free to make changes to this page if you spot a mistake; however, I would prefer that large-scale changes in content be discussed on the talk page first. If you would like the raw data to do further analysis, drop me a note on my talk page.

Query/response analysis, October 2007

Figure 1. The rate of questions asked is linear with respect to those left unanswered.

Inspired by a request on the desk itself, I decided to determine the number of questions asked at the Reference Desk and the proportion that were answered, as a function of time and subject. To do this I sampled all questions asked in a 4 week (28 day) period between October 1 and October 28, 2007. Questions that were deemed inappropriate for the board and previously removed (either because they were examples of trolling or requests for medical/legal advice) were discounted, as were questions that were duplicated across boards (these were counted once, in the most suitable category).
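
Out of interest, the exclusion logic can be expressed compactly. The following Python sketch is purely illustrative; the record fields (id, removed, best_desk) are hypothetical stand-ins for the hand-collected survey data, not any real dataset:

    from collections import Counter

    def tally(questions):
        """Count surveyed questions per desk, applying the exclusions above."""
        counts = Counter()
        seen = set()
        for q in questions:
            if q["removed"]:             # trolling or medical/legal advice: discounted
                continue
            if q["id"] in seen:          # cross-posted questions are counted once...
                continue
            seen.add(q["id"])
            counts[q["best_desk"]] += 1  # ...under the most suitable desk
        return counts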

Results

In total, 1860 questions were asked during the 4 week study period. 75 of these questions remained unanswered (4%), providing a response rate of 96%. This breaks down as follows:

  • 465 questions asked per week, 18.8 not answered
  • 66.4 questions asked per day, 2.7 not answered
  • 2.8 questions asked per hour, 0.1 not answered
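
These rates follow directly from the raw totals; a short worked check in Python (all figures are taken from the survey itself):

    total, unanswered, days = 1860, 75, 28

    print(f"response rate: {1 - unanswered / total:.0%}")          # 96%
    for label, divisor in [("week", days / 7), ("day", days), ("hour", days * 24)]:
        print(f"{total / divisor:6.1f} asked per {label}, "
              f"{unanswered / divisor:4.1f} not answered")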

This assumes the rate of answering questions is consistent with the rate at which they are asked. To verify this I plotted (Figure 1) the cumulative number of questions asked (x axis) against the cumulative number of questions that were not answered (y axis). Indeed, the relationship between the two does not deviate significantly from linear. These data demonstrate that unanswered questions are distributed evenly across the 28 day sampling period, and that - beyond the first seven days after posting - additional time does not appear to increase the probability of a question being answered. This is an artifact of the archiving/transclusion system: questions are only "active" on the desk for seven days before archiving, so if a question is not answered a week after posting, it is unlikely it will ever be answered.
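
For anyone wishing to reproduce the Figure 1 construction, a minimal sketch follows. The daily counts here are random placeholders rather than the survey data, since only the aggregate totals are published on this page:

    import numpy as np

    rng = np.random.default_rng(0)
    asked_per_day = rng.poisson(66.4, 28)       # placeholder daily counts
    unanswered_per_day = rng.poisson(2.7, 28)   # placeholder daily counts

    cum_asked = np.cumsum(asked_per_day)
    cum_unanswered = np.cumsum(unanswered_per_day)

    # If unanswered questions are spread evenly across the period, the fit is
    # linear and the slope approximates the overall non-response rate (~0.04).
    slope, intercept = np.polyfit(cum_asked, cum_unanswered, 1)
    print(f"slope = {slope:.3f} (overall rate = {75 / 1860:.3f})")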

Figure 2. Questions asked by day of the week

Another possibility is that the query/response profile may vary by day of the week. Since the Ref Desk is a truly global site, the collective work week of the querents and responders is ill defined. For the purposes of this study, all questions were attributed to days according to Zulu time; thus individual questions could be attributed incorrectly by as much as -8 hours (contributors from the Pacific coast of the Americas) to +12 hours (New Zealanders). Figure 2 shows the mean number of questions asked on each weekday. The data is further subdivided to show the distribution of questions between subjects. The standard error is also shown (n=4). While there is a trend towards fewer questions being asked at weekends, the difference is not statistically significant.
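
The Figure 2 summary statistics can be computed as follows; again the daily counts are placeholders for the real 28-day tally (note that October 1, 2007, the first day of the sample, was a Monday):

    import numpy as np

    rng = np.random.default_rng(0)
    daily_counts = rng.poisson(66.4, 28)           # placeholder; the survey used real tallies
    weeks = daily_counts.reshape(4, 7)             # 4 weeks x 7 days, starting on a Monday

    mean = weeks.mean(axis=0)
    sem = weeks.std(axis=0, ddof=1) / np.sqrt(4)   # standard error of the mean, n = 4

    for day, m, s in zip(["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"], mean, sem):
        print(f"{day}: {m:5.1f} +/- {s:4.1f}")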

Subjects

Historically, a single reference desk accommodated questions on any and every subject. In August 2005 this was split into 4 different desks, followed by others in November 2005, July 2006 and December 2006 as the service became more popular. Seven desks are currently available, covering Computing, Science, Mathematics, Humanities, Language, Entertainment and one for everything else.

Figure 3a shows the distribution of questions asked within these subjects. Humanities was the most popular subject with 421 questions (23%), closely followed by Science (414, 22%). Miscellaneous (346, 19%) and Computing (296, 16%) were the next most popular, with Language (142, 8%), Entertainment (122, 6%) and Maths (119, 6%) significantly less popular. As Figure 2 demonstrates, this distribution is relatively consistent day to day, with Computing questions showing the most relative variability (everyone's computer works on a Tuesday, it would appear).
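
These shares can be recomputed from the totals; the sketch below prints them to one decimal place (the prose rounds to the nearest whole percent):

    counts = {"Humanities": 421, "Science": 414, "Miscellaneous": 346,
              "Computing": 296, "Language": 142, "Entertainment": 122, "Maths": 119}
    total = sum(counts.values())                   # 1860

    for desk, n in sorted(counts.items(), key=lambda kv: -kv[1]):
        print(f"{desk:13s} {n:4d}  {n / total:5.1%}")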

As a measure of how proficient Ref Desk volunteers are at providing answers for questions from each subject, I plotted unanswered questions in the same manner (Figure 3b). If each type of question were answered equally well, the plots would be segmented in roughly the same proportions for each subject. This is generally the case, with the most obvious exceptions being Science, Language and Entertainment questions. Science and Language questions tend to be answered more frequently than average (less than 2.5% remained unanswered, compared to 4% overall). Most striking was the difference in the Entertainment questions. Of the 122 Entertainment questions asked, 15 remained unanswered (12.3%). Thus an Entertainment question is three times more likely to remain unanswered than a randomly selected question, and six times more likely than a question pertaining to Language. Possible reasons for this will be discussed in greater detail below. It is important to note, however, that while the relative proportion of unanswered questions is unusually high for Entertainment, the total number of unanswered Entertainment questions (15) is lower than the number of unanswered Miscellaneous questions (17), and similar to Humanities (14) and Computing (13). This can be observed in Figure 3c, which shows the mean number of questions asked by subject each day, subdivided into the number answered (in colour) and unanswered (in black).
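
A quick check of the "three times more likely" claim from the published counts:

    overall = 75 / 1860        # 4.0% of all questions went unanswered
    entertainment = 15 / 122   # 12.3% of Entertainment questions

    print(f"overall: {overall:.1%}, entertainment: {entertainment:.1%}, "
          f"ratio: {entertainment / overall:.1f}x")   # ~3x more likely unanswered
    # The "six times" comparison with Language implies a Language non-response
    # rate of roughly 2%, consistent with the "less than 2.5%" figure above.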

These data also demonstrate that the subjects segregate into three statistically distinct groups based on popularity. Humanities and Science are not significantly different in terms of questions asked, but do differ from Miscellaneous and Computing (Student's t-test, P<0.05). Similarly, Miscellaneous and Computing do not differ from each other, but do differ significantly from Maths, Language and Entertainment (Student's t-test, P<0.01). These three are not significantly different from each other.
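
A sketch of one such comparison, assuming the tests were run on per-day question counts; the arrays here are placeholders generated to match the published totals, so the exact t and p values will differ from the survey's:

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)
    humanities = rng.poisson(421 / 28, 28)      # placeholder per-day counts
    miscellaneous = rng.poisson(346 / 28, 28)   # placeholder per-day counts

    t, p = stats.ttest_ind(humanities, miscellaneous)
    print(f"t = {t:.2f}, p = {p:.4f}")          # the survey reports P < 0.05 for this pair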

Discussion

While the quantity of questions answered can be assessed, the quality of those answers is beyond the scope of these analyses. That would require a survey of those who asked the questions - to assess whether the querent considered the answers helpful - and a third party analysis of whether the question was, in fact, answered correctly. Since that would require research into over 1000 disparate subjects, it is unlikely ever to be attempted on a sample of this size. I considered a question answered if there was what I judged to be a good faith attempt to help the querent. This amounted to a response that provided a link, information or opinion related to the question. It did not include requests for clarification of the original question, asides, additional questions that did not address the first, notices that the question was inappropriate, or bad faith responses.

What is possible is a qualitative analysis of the questions that remained unanswered. These appear to divide into three classes. Some questions appear to be answerable, in that - from a non-expert perspective - they have an answer that is not outwith the normal scope of what is provided on the desk.

The reason these questions are not answered may simply be that the individual(s) with expertise or experience in the area were not available or were disinclined to answer. In a minority of cases it may be that potential respondents consider these questions to be homework (consider, for example, the computing request about the "three types of advertising sugarmama has"). It is currently against Ref desk policy to answer homework questions, though usually responders will note that and provide hints or direction rather than ignore the question (these I have counted as good faith answers for the sake of this analysis). Another reason for unanswered questions of this class may simply be misplacement. For example, the request for the LA Gear ticker may have been answered had it been placed on the Miscellaneous desk rather than the Humanities desk. Often helpful respondents will recommend the question be asked on the appropriate desk, or copy it there themselves. Another reason may be the phrasing of the question. Consider:

Has anyone ever setup a camera and an Anemometer and recorded the movement of the Sailing stones and the wind speed?

Due to its phrasing, it is very difficult to answer that question in the negative, as it requires complete knowledge of what has not happened, rather than a single example of knowledge about what has happened. Only an expert could answer "no" and be confident of being correct. A rephrasing of the question, for example

Is anyone aware of a camera and an Anemometer being used to record the movement of the Sailing stones and the wind speed?

would have been more likely to draw a helpful, if incomplete, response. Nonetheless, it is with this class of unanswered question that the Ref desk could most easily improve its record of response. It is also the largest class: some questions are difficult to classify accurately, but this class accounts for as much as half of those left unanswered on the Ref desks.

A second class of unanswered question is those that require extremely specialist knowledge beyond what is likely available from Ref Desk volunteers. While these questions can be answered, to do so would probably require access to information not available online or through generally accessible books or other reference sources. Occasionally such questions are answered when, by coincidence, an expert is available to respond. However, short of recruiting more "volunteers" to ensure there is a sufficient pool of expertise, it is difficult to see how to ensure this class of question is answered more often. This is the second largest class, accounting for around one third of the unanswered questions.

A final class of unanswered question comprises those that are essentially unanswerable, either because the question lacks context or coherence, or because it requests information that is realistically impossible to provide. Often these generated requests for clarification, and they will occasionally draw an answer based on what the responder thinks is being asked for. This class is the smallest, accounting for around 15% of the unanswered questions. Interestingly, these appear to be enriched on the Entertainment desk, especially, and the Miscellaneous desk, perhaps reflecting the trivial nature of some of the questions.

Conclusion

This study of Ref Desk efficiency demonstrated an impressive 96% response rate over 1860 questions in 28 days. An analysis of those questions not answered suggests this could increase to around 98% if the current repertoire of Ref desk responders made a concerted effort to ensure questions were not left unanswered. The additional 2% would be more challenging to address. The two subjects that show the weakest response rates are Entertainment, by some margin, followed by Miscellaneous. However, these also attract the most unanswerable questions; if these are discounted, there is little difference in answering rate by subject. Thus it can be concluded that the Wikipedia Reference Desk functions remarkably well in terms of directing querents to information, though the quality of that information cannot be easily evaluated. In addition, during the sample period, six encyclopaedia articles were created or significantly improved as a direct result of questions being asked and answers provided (this brings the total to 68 articles created or significantly improved in the 10 months since records have been kept).

See also