User talk:SirFozzie/Investigation

From Wikipedia, the free encyclopedia

Wow

I am surprised by the effort and time you have put into this. Good job, and thank you. нмŵוτнτ 01:00, 8 February 2008 (UTC)

Timing analysis

Just listing out the times of edits isn't helpful enough.

There are two things I'd like to see. One, a more detailed statistical distribution analysis of the time of day of edits for both accounts (graphed, preferably); and two, going back further, whether there is any period with back-to-back edits by one account and then the other.

All the times you cite seem to be offset by significant amounts. While that could reflect an individual physically changing locations to use another account, it's not suggestive per se.

Another analysis which would help would be a timeline with edit times/dates listed per account, side by side.

Georgewilliamherbert (talk) 04:01, 8 February 2008 (UTC)

Sorry, GWH, missed this: I think I can do that with regard to editing in general (it's a huge shmuckin Excel file that I'm working from). Also, CoolHandLuke has added some distribution of edits as well. SirFozzie (talk) 05:25, 8 February 2008 (UTC)
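
The two analyses requested in this thread (a time-of-day distribution per account and a side-by-side timeline) can be sketched in a few lines. This is only an illustrative sketch, not the method actually used in the investigation: the function names and the assumption that edit times are already available as UTC `datetime` objects are hypothetical.

```python
from collections import Counter
from datetime import datetime


def hourly_histogram(timestamps):
    """Return a 24-element list: the number of edits in each hour of the day."""
    counts = Counter(ts.hour for ts in timestamps)
    return [counts.get(hour, 0) for hour in range(24)]


def merged_timeline(edits_a, edits_b):
    """Merge both accounts' edit times into one chronological list of (time, label)."""
    labelled = [(ts, "A") for ts in edits_a] + [(ts, "B") for ts in edits_b]
    return sorted(labelled)


def print_side_by_side(timeline):
    """Print one row per edit: account A in the left column, account B in the right."""
    print(f"{'Account A':<18}{'Account B':<18}")
    for ts, label in timeline:
        stamp = ts.strftime("%Y-%m-%d %H:%M")
        left = stamp if label == "A" else ""
        right = stamp if label == "B" else ""
        print(f"{left:<18}{right:<18}")
```

Calling `print_side_by_side(merged_timeline(edits_a, edits_b))` makes back-to-back edits by one account and then the other show up as rows that switch columns, and the two histograms can be graphed with any plotting tool.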

Disclaimer

A conclusion has been presented beneath my evidence. At this point I'm actually neutral. DurovaCharge! 08:41, 11 February 2008 (UTC)

Lack of control

All this evidence is meaningless without a control. In other words, the same analysis needs to be run for a random pair of editors with similar interests and timezone to establish a baseline of what it should look like.

To put it another way, we have a bunch of behaviour which looks suspicious, but we have no idea how likely that behaviour is to occur by chance. Simply put, given two independent editors in the same timezone, what is the probability of an ABAB edit?
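
One rough way to put a number on that question is a Monte Carlo simulation. The sketch below rests on loudly stated assumptions that are not from this page: each editor's edits fall uniformly at random within a shared session, "ABAB" means four consecutive edits in the merged timeline that alternate between the accounts (in either order), and the whole run must fit within 30 minutes. Real editing is burstier than uniform, so the result is only a crude baseline, and all names are hypothetical.

```python
import random


def has_abab(times_a, times_b, max_span_minutes=30.0):
    """True if four consecutive merged edits alternate accounts within the span."""
    merged = sorted([(t, "A") for t in times_a] + [(t, "B") for t in times_b])
    for i in range(len(merged) - 3):
        window = merged[i:i + 4]
        labels = "".join(label for _, label in window)
        span = window[3][0] - window[0][0]
        if labels in ("ABAB", "BABA") and span <= max_span_minutes:
            return True
    return False


def estimate_abab_probability(n_a, n_b, session_minutes, trials=10_000, seed=0):
    """Estimate P(ABAB) for two independent editors under the uniform assumption."""
    rng = random.Random(seed)  # fixed seed so the estimate is reproducible
    hits = 0
    for _ in range(trials):
        a = [rng.uniform(0, session_minutes) for _ in range(n_a)]
        b = [rng.uniform(0, session_minutes) for _ in range(n_b)]
        hits += has_abab(a, b)
    return hits / trials
```

A more faithful control would replace the uniform draws with the actual edit times of a random pair of editors with similar interests and timezone, as suggested above.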

As an aside, the fact that the data contains ABA edits, but the page does not mention this, seems to show a degree of bias.

If the users are indeed sock puppets, then the fact that they have posted to each other's user pages at all shows a remarkable degree of dedication to sock puppetry.

Personally, if I had the tools handy and time to burn, I would do a word-frequency analysis on their respective talk page edits. There's quite a lot of literature on this subject. It's also much harder to change your writing style than it is to change your proxy settings. For example, I used "the fact that" twice in the last two paragraphs. --61.214.155.14 (talk) 05:41, 17 March 2008 (UTC)
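
A very basic form of the word-frequency comparison described above could look like the following. It is a minimal sketch under assumptions: it treats each account's talk page text as a bag of lowercase words and compares the two with cosine similarity, which is one common choice and not necessarily the technique used in any study cited on this page. The function names are hypothetical.

```python
import math
import re
from collections import Counter


def word_counts(text):
    """Count lowercase word occurrences (letters and apostrophes only)."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine_similarity(counts_a, counts_b):
    """Cosine of the angle between two word-count vectors (0.0 if either is empty)."""
    shared = set(counts_a) & set(counts_b)
    dot = sum(counts_a[word] * counts_b[word] for word in shared)
    norm_a = math.sqrt(sum(c * c for c in counts_a.values()))
    norm_b = math.sqrt(sum(c * c for c in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)
```

As with the timing evidence, a similarity score means little on its own: it would need to be compared against scores for pairs of editors known to be unrelated.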

Most of the evidence is on other pages. This was the most preliminary work.
There are only five examples of editing A1...B1...BX...A2 where the distance from A1 to A2 is less than 30 minutes. As it happens, I did do some comparisons. See User:SirFozzie/Investigation/Sandbox#Section 13: Significance of so few overlaps by Cool Hand Luke. See also User:Alanyst/Edit collision research#Unresolved questions. As for word frequency analysis, User:Alanyst did an amazing study here: User:Alanyst/Vector space research. Cool Hand Luke 06:15, 17 March 2008 (UTC)
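
For readers who want to reproduce a count of this kind, the sketch below finds A1...B1...BX...A2 patterns. It assumes, hypothetically, that A1 and A2 are consecutive edits by one account, that at least one edit by the other account falls strictly between them, and that "less than 30 minutes" is measured from A1 to A2. It is not the procedure used to obtain the figure of five.

```python
from bisect import bisect_left, bisect_right
from datetime import datetime, timedelta


def find_sandwiches(outer, inner, max_gap=timedelta(minutes=30)):
    """Find consecutive `outer` edits under max_gap apart with `inner` edits between.

    Returns a list of (first_outer, inner_edits_between, second_outer) tuples.
    """
    outer = sorted(outer)
    inner = sorted(inner)
    found = []
    for first, second in zip(outer, outer[1:]):
        if second - first >= max_gap:
            continue  # the two outer edits are too far apart to count
        lo = bisect_right(inner, first)   # first inner edit strictly after `first`
        hi = bisect_left(inner, second)   # first inner edit at or after `second`
        if hi > lo:
            found.append((first, inner[lo:hi], second))
    return found
```

Swapping the arguments, as in `find_sandwiches(edits_b, edits_a)`, looks for the mirrored B...A...B pattern.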