Wikipedia talk:Version System sketch

  • I'm curious to see whose proposal you shall bravely and unselfishly defend. I personally like the status quo just fine. Fernando Rizo T/C 02:34, 2 August 2005 (UTC)
    • Oh, yeah, well, the status quo's perfect, absolutely, for a much smaller site. I don't want to take away the deliberation, btw, but... Oh, you'll see. Besides, I'm just the stuntman: these aren't my ideas; I'm only here to take the fall for them. Geogre 14:33, 2 August 2005 (UTC)
    • I read all of that just to get a cliff-hanger ending. Just when my dog was starting to scratch out his first characters in Agamemnon's script. Fernando Rizo T/C 18:50, 2 August 2005 (UTC)
    • Hey! My dog didn't learn to communicate with the Mycenaeans in a day, so don't expect miracles! Enough is now in place to start dismissing it as useless. But it's going to get even more useless shortly, so stay tuned. Geogre 19:34, 2 August 2005 (UTC)

My concern

Geogre, "your" proposed system is elegant and would certainly address many of the major concerns about VfD quite capably. My problem with it is best illustrated with that most cherished element of rhetoric, the hypothetical situation. I intend, like Kurosawa's Rashōmon, to present you with two parallel yet slightly different situations. (that was just too much damn work - FR)

Okay, look. Right now, any given stranger has some control over the content of an article. If someone makes a bad-faith edit, I can almost instantaneously recognize this and revert it. If you give a stranger the ability to make a bad-faith subjective judgement of an article, I can do nothing. If a kid wants to wander around through the Wikipedia voting '5' on every article or voting '1' on every stub, how do we stop that? Granted, one kid is not a problem, but a whole Counter-Strike clan of kids is. Or the GNAA. A clan's worth of kids is easy to discern on VfD: figure out who signed up when and who's got too few edits. With the new system, this is much more difficult. Fernando Rizo T/C 20:15, 2 August 2005 (UTC)

I absolutely agree. That's why quorum has to be in place. Even then, it's the weakest part, IMO. Geogre 21:51, 2 August 2005 (UTC)
Relative merit: As much as army votes are a problem with this system, they are a much more serious threat with "category-based systems" and others that invite edit-warring. This system's quorum is not a panacea, but it's at least a defense. Also, this system does not abolish VfD or CSD, does not expand nor restrict admin powers. It addresses scale, however, and democratizes VfD. Geogre 16:16, 5 August 2005 (UTC)

Some comments

  1. This looks like Kuro5hin's comment rating system.
  2. The ratings look reversed (lower is good and higher is bad is counterintuitive).

--cesarb 01:33, 6 August 2005 (UTC)

Don't like it at all

  • Seems to require more effort than what we have now, not less.
  • Invites vandalism. I think we can be confident that George W. Bush would reach the rating required for speedy deletion in no time.
  • As mentioned above the lower is better system is the wrong way round.

Osomec 21:22, 7 August 2005 (UTC)

Great idea, but...

What a wonderful idea for Wikipedia - an evaluation system (and what glorious prose to describe it - credit goes to Geogre). However, this proposal is going to need some serious reworking if it is ever to work.

  • Speedy deletion articles can't last a month! Suppose an anon makes an article, gives it a handful of votes, and leaves it there. There's a good chance that it will be spotted by someone who knows what they're doing, but maybe a few "5" votes will go through. Speedy deletion is a critical piece of Wikipedia. Speedy deletion will have to exist with this new system, or else we'll quickly get over two million articles, many of them without merit.
  • Too few admins. There aren't enough admins to police Wikipedia (.14%??). VfD works because a group of (mostly) non-biased deletionists, with no previous knowledge, do some research into the topic and discuss conclusions. In this new system, admins will now become watchdogs, not servants of the VfD forum, which will take a lot of strain to try to keep the status quo.
  • Not my article? No featured status for you! Bias abounds. I can see many articles that are very well done being shot down because of trolls/POV pushers. I know the current system keeps them down too, but this might make it worse.
  • Hmm... yesterday it was a 2, but today's a 3! Articles change in quality quickly. Suppose for a few months an article hovers around 4, but it will take a great deal of time to get it up after a major revision because the older votes will still be in the vote count - a 2 article might still have a 4 rating. Plus, you can't vote on each and every edit - that just couldn't function. Also, vandalism could drive scores down (as people only see that revision), even when most vandalism is taken down fairly quickly.
  • Bringing more power into the hands of the inexperienced, taking from the experienced. The vast majority of users are those with little experience with Wikipedia. Therefore, most of these votes will be coming from those who haven't worked on - say - VfD, and don't really know the policy. They could read the policy, of course, but 1) it's hard to find, and 2) it's as big as War and Peace. We are taking the power of judging articles away from the users who know how Wikipedia works and what should be included here.

We're talking about a lot of system strain. VfD isn't the best solution (why else would so many people be trying to fix it?). However, I think that making it more effective might be the best solution in the near future.

One thing I think is important in making such a system work is accountability. The edit history is the only real reason why this project has worked so well. I think if such a rating system were put into place, two things would be important:

  • Being able to see the rating each user gave to each article at what time for what reason.
  • Giving more power to more experienced users (measured in edits, though that's such a poor way of measuring experience). It seems to be the best defence we have, even though abuses of power will occur (and do occur now).

Hey, if this system could work, I'd love it. I'm eager to see how everyone else reacts to the idea. -JJLeahy 03:03, 8 August 2005 (UTC)

I said I wouldn't argue for it, and I won't. I will point out, though, that this particular system leaves speedy delete in place. It also allows for admin-based overwriting of score tallies. Given that Score 1's will be rare, I know that I could cruise that system in no time and see newly created/promoted 1.0's that, in fact, have been fished up. Anyway, nothing of the current system is removed by this system. It leaves VfD in place, CSD in place, and leaves admins with the ability to do all that they presently do.
That said, you are correct, of course, that votes need to be history-enabled, as I foresee consistent (consistent!) mis-voting (unquestionably mistaken) being usable as evidence in a Request for Comment on users or Arbitration. The idea is that anyone gaming the system would at least be prosecutable for doing so. The decision as to whether something is consistent or misvoting or consistent misvoting would be up to ArbCom folks.
Anyway, I'm just the stunt man for this idea, and I tried to punch it a few times to see where its weak spots are, so that's why I wrote it up. Also, it takes off the load of scaling up VfD without abolishing VfD in favor of the demotic "category system."
Also, to people above saying this sounds like X or Y's idea and that it's reversed etc.
First, it's an idea that goes way, way back in Wikiyears. What's mine is the detailing.
Second, the "reversed scoring" is actually to be understood as 1.0, 0.8, 0.5, 0.2, and 0.0. I.e. what's behind the "reversed" scoring is that what's really happening is determining Wikipedia 1.0 -- the print version of Wikipedia that continues to be rumored, the one that would be for sale or something. So Featured Articles would be rated at 1.0 meaning "they make it in Wikipedia 1.0." That's why the golf-like priority, but it's really not that important which way the ordinals point to me.
Anyway, I'll go back to circling from orbit now. Geogre 17:20, 8 August 2005 (UTC)

Too easily gamed

As a developer I can say this would be pathetically easy to game by the use of bots. The gains to be made from gaming, and the losses to be made from not counter-gaming, are so high that I predict that, if we ever adopted this, we'd be spending lots of time in fire-fights just dealing with the sheer volume of bot work on Wikipedia. I myself would be tempted to write a watch bot which would spot deletion candidates in my list of important articles and carefully feed in positive votes at a rate just fast enough to save them without being detected.

There really is nothing wrong with VfD as it stands. It brings human judgement to bear on a problem that requires it, and it encourages and rewards lightning-fast research and cleanup. In fact, it's the most efficient cleanup forum we have. And it's relatively hard to game. And the best thing about VfD is that it's self-limiting. As it fills up, people moan about it not working any more and stop listing stuff for deletion. Perfect. --Tony SidawayTalk 17:53, 8 August 2005 (UTC)

Unless the system also included some measure of editors' edit contributions (further increasing complexity) it wouldn't work, for being too easily gamed. And if it did include something like that, edit contributions would themselves be subject to gaming to increase scores - and that's probably an insoluble problem (or else far worse than the disease). Rd232 23:14, 8 August 2005 (UTC)

I truly wish folks would note what I said in about 4 places: This does not eliminate VfD. Seriously. I'm not kidding. It doesn't. It doesn't eliminate CSD, VfD, or anything else. All it does is create a new method for automatic nomination. That's all. Automatic nomination. The reason is that it catches up with the scaling issue. Further, it allows folks to have a quality assessment short of VfD and the others. The main advantage is that it solves the issue of "scaling," which will kill VfD and is killing VfD, and it creates multiple "versions" of the encyclopedia. I don't mean to be snippy, but let me say it again: This does not eliminate VfD. Instead it scales VfD. As for gaming, it cannot be gamed more easily than the present system and, in fact, is much less easy to game. Note the requirement of Quorum. (Honestly, I addressed this!) Why do I not do more policy? Because no one reads the policy proposals, when all they want to do is say the first objection that comes to mind. Finally, with all due respect, the Version System came from developers and, according to Tim anyway, is more easily implemented in 1.5 than it was before. Geogre 02:35, 9 August 2005 (UTC)

VfD scales itself. Eventually people tire of listing stuff and stop. --Tony SidawayTalk 02:37, 9 August 2005 (UTC)

And that's the goal, is it? To stop things from being deleted, even when they violate policy? Geogre 15:54, 9 August 2005 (UTC)
I think the point being made may have been that articles were listed at a manageable rate, not 400 at a time. In that way the existing system is better than the proposed. Mark Lewis 23:30, 10 August 2005 (UTC)
Not sure what Geogre is saying above. The self-regulation of VfD by the reduction in the number of nominations as it reaches saturation, if it happens, won't be a violation of policy. And I'm sure that it will happen. --Tony SidawayTalk 19:45, 16 August 2005 (UTC)

Hmm, nice try...

Compared to some of the efforts, that's a wild, no holds barred compliment if you hadn't noticed! One thing, if we use the scoring system you propose, each day anywhere between 0 and 1000 articles may get proposed for deletion or featured-ising - particularly new articles may be unpredictable in their scores, due to the low base of votes from editors. In order to reduce the load and variability of those automatically put onto VfD or FAC, why not simply have the highest ranked or lowest ranked 4 articles (for example) proposed? Then, there is a guaranteed, manageable number of articles each day/week. Of course, the disadvantage of this is that otherwise good or poor articles may be overlooked since they are simply not good or poor enough for that particular day or week.
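
The capped-nomination suggestion above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, the dictionary of per-article average scores, and the default cap of 4 are hypothetical, and the sketch takes the proposal's convention that lower averages are better (1 is featured-quality, 5 is deletion-range).

```python
def pick_nominations(average_scores, cap=4):
    """Pick at most `cap` articles for each deliberation page.

    average_scores: dict mapping article title -> average score
    (hypothetical input; lower is better, as in the proposal).
    Returns (fac_candidates, vfd_candidates).
    """
    # Sort titles from best (lowest average) to worst (highest average).
    ranked = sorted(average_scores.items(), key=lambda item: item[1])
    # Best-rated articles go forward as featured-article candidates.
    fac_candidates = [title for title, _ in ranked[:cap]]
    # Worst-rated articles go forward to VfD, worst first.
    vfd_candidates = [title for title, _ in ranked[-cap:][::-1]]
    return fac_candidates, vfd_candidates


scores = {"A": 1.2, "B": 4.8, "C": 3.0, "D": 4.1, "E": 1.9}
fac, vfd = pick_nominations(scores, cap=2)
```

With fewer than twice `cap` articles the two lists can overlap, so a real version would also need score thresholds; it also shows the drawback noted above, since an article ranked just outside the cap is never listed however poor it is.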

Note: interlining responses to prevent text blobbing:
The most important safeguard in the proposal is the Quorum section. No article will be automatically promoted to a deliberation page until after a month has passed and at least 10 scores have been registered. It's important that this be "and" and not "or." This prevents a mass flood of 1.0's and 5.0's. Also, some people who dislike article deletion will, I am sure, spend their time looking for things that carry 4 scores and try to fix them. I'm sure that people like Kappa and Tony would much rather put in the hard work to make the article no longer fit a deletion guideline than just try to break the deletion process. Hopefully, in the full month that would be required, people would be improving the 4.0 score articles and lowering their aggregate value, so we would have a far more effective method of Wikipedia:Cleanup than we presently do, because we'd spot not only new articles that are deletion candidates (before they're listed), but we'd see when old articles had been savaged and POV'd to the point that their average score was now falling to VfD-range. Geogre 16:09, 9 August 2005 (UTC)
Now, some people see the real goal to make VfD cease to delete. To them, that's when it "works." It "works" when almost everything is kept. The automatic promotion to deliberation removes, I think, a great deal of the intimidation that is now thrown at VfD voters. Also, it allows for the standards of speedy delete and VfD to develop dynamically by actual consensus, and not by someone's declaration that anything short of 85% explicit "delete and nothing else" is a decision. Geogre 16:09, 9 August 2005 (UTC)
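
For readers who think in code, the quorum rule described above (a month has passed AND at least 10 scores are registered) might look like the following Python sketch. The names are hypothetical, 30 days is an assumed stand-in for "a month", and the point from which the month is counted (here `clock_started_at`) is an assumption, since the discussion does not pin it down.

```python
from datetime import datetime, timedelta

# Assumed constants: the discussion says "a month" and "at least 10 scores".
MIN_AGE = timedelta(days=30)
MIN_SCORES = 10


def meets_quorum(clock_started_at, scores, now):
    """Return True only when BOTH quorum conditions hold.

    clock_started_at: when evaluation of the article began (assumption).
    scores: list of scores registered so far.
    now: the current time, passed in so the check is easy to test.
    """
    old_enough = (now - clock_started_at) >= MIN_AGE
    enough_scores = len(scores) >= MIN_SCORES
    # "and", not "or": a burst of votes on a new article is not enough,
    # and neither is an old article with only a handful of votes.
    return old_enough and enough_scores


now = datetime(2005, 9, 15)
flooded_new_article = meets_quorum(datetime(2005, 9, 1), [5] * 50, now)
```

Here `flooded_new_article` is False even with fifty votes, which is the flood protection the "and" is meant to provide.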

Two more things: if an article's ranking changes before a decision is made on VfD or FAC, will it automatically be removed from the page, even if voting/discussion has already begun? Secondly, will editors still be able to propose articles for either of the two pages regardless? If not, why not? That is instruction creep and limits the power of your average Joe Editor. If yes, what is the point of the whole new system when it can be bypassed? (Note: I'm not trying to be inflammatory, negative or depressing, simply trying to see the argument from both extremes and to prompt constructive ideas/reactions!) Mark Lewis 11:58, 9 August 2005 (UTC)

Once nominated to VfD, it is nominated to VfD, so a rewrite change while on VfD wouldn't remove it from there. Regular editors would be unaffected in their ability to nominate pages for VfD or CSD. This is designed to be 1) an additional mechanism that addresses issues of scale and avoids Tony's solution of people giving up and no longer listing articles 2) a way to develop "versions" of Wikipedia that will free us from a lot of content debates 3) a way to provide evidence of actual community opinion aside from whoever can stand to wade through 150+ VfD listings 4) a way to allow for some dynamism in deletion guidelines.
Yes, it can be bypassed. The point is not to replace VfD, but rather to address the issues of scaling. If, in the future, it were adapted so that there were a score that is like 6 ("If it doesn't get attention, the Star Chamber needs to make it go away") or where there were actual execution based on score, that would be for the future. I.e. the Version System sort of needs to be in place, on trial, and addressing the one issue it can before any of us should trust it with deletion/FA. Further, the system doesn't remove the abilities admins have now: that of speedy deletion based upon their own community-sanctioned judgments, the ability to "close" a VfD with 8 delete votes, 4 BJAODN votes, and 1 merge vote as "keep and do nothing to it," etc. It won't get rid, I'm afraid, of the good and bad parts of administrative powers, and the actual answer for those will still be with RfC, RfAr, etc.
Finally, sorry if I'm being snippy, but there is a rather thick subtext that you're certainly not part of, and I apologize for the tone. It's why I kept swearing not to get involved in the talk page, the debates, etc. I figure I can offer the proposal, but I'm too impassioned to be neutral in the debates. Geogre 16:09, 9 August 2005 (UTC)

I do see all sorts of issues, but maybe it's like Wikipedia itself - it doesn't work in theory but it sort of does in practice. How difficult would it be to implement the system and try it on a limited subset of articles? Rd232 22:14, 9 August 2005 (UTC)

Good idea: try it on articles created (only new articles) for the month of September or October. Since the quorum feature requires a month for nomination, the score/tag would conceivably work only for a month, but decisions about whether it works or not would not be made until a month after the end of the trial. (I.e. we have Sept. articles carry the score system, but those won't get promoted/demoted by score until Oct., so we ought not vote on whether it works or not until the beginning of November. These months are hypothetical.) Also, if we really want a trial, we could introduce it on one of the smaller wikis (none of the living language ones, but the Latin or Romulan or whatever ones). Geogre 15:03, 10 August 2005 (UTC)
Okay, thanks Geogre, and I appreciate that you've been reasonable with me and that I am not, and certainly don't want to be, part of this 'thick subtext' that goes back ages (you mentioned above). I realise you are being something of a sacrificial lamb by simply representing the idea! One last thing, if I may (seems to be the only issue raised that hasn't been completely addressed). Previous votes on the quality of the article may be totally irrelevant to the article as it currently stands - the article may have vastly improved or been ruined since then. However, this older vote (accurate as it may have been) has equal value in the scoring system as one which is bang up to date, having been made for the current revision. Even though more recent votes will eventually bring that score into line, it will take time and until then it could be wrong!
The obvious solution to me is that older votes have gradually less weight compared to newer ones. I say 'newer' and 'older' indicating that weight falls as time passes. This could be wrong. Perhaps, votes carry less weight as more edits separate that vote and the present. Note that this does not discriminate between the experience/standing of the editor in any way. Thoughts?

By the way, there is a trial (such as it is) featured in one of the Signpost articles this week on a test wiki, though I feel it is unwieldy as it stands. Mark Lewis 15:20, 10 August 2005 (UTC)
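
The weighting idea raised above (votes count for less as more edits separate them from the present) can be illustrated with a short Python sketch. Everything here is an assumption made for illustration: the exponential half-life form, the default of 20 edits, and the function and parameter names are not part of the proposal.

```python
def decayed_average(votes, current_revision, half_life_edits=20):
    """Average the scores, giving older votes gradually less weight.

    votes: list of (score, revision_number_when_cast) pairs.
    current_revision: revision number of the article as it stands now.
    half_life_edits: number of intervening edits after which a vote
        counts half as much (an assumed tuning knob).
    Returns None when there are no votes.
    """
    weighted_total = 0.0
    weight_sum = 0.0
    for score, revision in votes:
        edits_since = max(0, current_revision - revision)
        weight = 0.5 ** (edits_since / half_life_edits)
        weighted_total += weight * score
        weight_sum += weight
    if weight_sum == 0:
        return None
    return weighted_total / weight_sum


# An old "4" cast at revision 0 and a fresh "2" cast at revision 20.
current = decayed_average([(4, 0), (2, 20)], current_revision=20)
```

A plain average of those two votes is 3.0, while the decayed average sits nearer the fresh vote. Replacing `edits_since` with days elapsed would give the time-based variant; neither variant looks at who cast the vote.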


Well, it's one of the hard parts. What I had written into the proposal to deal with that was that a complete rewrite or a major revision would reset the score to 3.0 (but never to 2 or 4 or whatever). The reason that it's hard is that I can't at present figure out a good threshold for the reset. Is that something that we'd allow only admins to trigger? Is it something that any editor/voter could request on the talk page, so at least there is a permanent record that "this article had a reset request after the revision of August 9, 2005 (diff at [[diff]])"? Either of those would work, for me. However, I absolutely think that a rewrite ought to reset all the votes back to 3.0 and restart the clock on evaluations leading to automatic nomination, but I felt that either admin-only resets or request-resets would be problematic. What I don't think can be possible is "any non-minor edit (one not marked with the tick box of "This is a minor edit" in the save field) resets." Geogre 19:58, 10 August 2005 (UTC)
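
As a rough illustration of the reset described above, here is a Python sketch in which a rewrite clears the votes, returns the displayed score to 3.0, restarts the evaluation clock, and leaves a permanent record. The class, its fields, and the shape of the log entry are all hypothetical, and the sketch deliberately leaves open the unresolved question of who may trigger a reset.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Optional

NEUTRAL_SCORE = 3.0  # the reset value named in the discussion


@dataclass
class ArticleRating:
    votes: list = field(default_factory=list)
    clock_started_at: Optional[datetime] = None
    reset_log: list = field(default_factory=list)

    def average(self):
        # With no votes on record the article shows the neutral score.
        if not self.votes:
            return NEUTRAL_SCORE
        return sum(self.votes) / len(self.votes)

    def reset(self, requested_by, diff_ref, now):
        # Keep a permanent record of who asked and which diff prompted it.
        self.reset_log.append((now, requested_by, diff_ref))
        # Discard the old votes and restart the clock toward quorum.
        self.votes.clear()
        self.clock_started_at = now


rating = ArticleRating(votes=[5, 5, 4], clock_started_at=datetime(2005, 8, 1))
rating.reset("ExampleUser", "diff-placeholder", datetime(2005, 8, 9))
```

After the call, `rating.average()` is 3.0 and the quorum month is counted from the reset date rather than from the original one.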

Okay, thanks, have found the relevant point on "The fix (is not in)", though a little more on the specifics is going to be needed to convince people. For example, we will need guidelines (as will admins if they are to be responsible) as to what exactly constitutes a 'major rewrite', and the process must require sufficient reasoning with each request to prevent new editors/vandals from putting in hundreds of random requests.

Secondly, (working my way down the article slowly!) I don't see why a talk page must be created when creating the article. Unless there are technical barriers, I was visualising a simple rating system inside the left column of the page, or in a new column on the right hand side. This makes it easier for readers to see the ranking of the article, and simpler to vote. To clear up how to vote, there could be a voting help or rating help link to the relevant page (as exists already for editing). Two more advantages that I can think of: first, server load could be reduced as far fewer talk pages will have to be created - just one central page which can be altered easily if and when the voting system changes. Secondly, talk pages which already exist will not have to be erased or moved, also reducing the difficulty of application.

Lastly, can one IP only vote once, or once per edit, or as many times as he/she wishes? Ta, Mark Lewis 23:27, 10 August 2005 (UTC)

A good starting point

Automated scoring would be a huge help in turning Wikipedia into the reputable instrument I would like it to be. I'm entranced by the idea of a Professional Wikipedia. But I'm unclear as to how it will reduce the load on VfD. If, as you say, VfD will not go away, that means that in addition to the articles being placed on VfD by editors, there will now be those automatically placed on the agenda.

I support the idea of quorum. I have felt for some time that there needs to be a suitable combination of votes and time before consensus can be clearly said to have been met. While it may not be immune to Tony's votebot, it is also true that the intersection between those who are fervent inclusionists and those who can program a bot is quite small.

Thanks for your hard work, Geogre!

Denni 03:08, 2005 August 11 (UTC)

Thanks, Denni! You're right: it wouldn't cut down on VfD load at all. In fact, it would increase it. Really, though, Kim (and others) keep saying that VfD doesn't scale. Well, it doesn't. Ok. But, of course, if it does scale, it gets huge. Now, some of these people want a "blanking" solution (horrible idea, IMO), others want a kind of modding solution (otherwise known, to me anyway, as "Everything2"). To me, neither of these does a blinkin' thing to reduce VfD load, either. They abolish the VfD page so that people don't argue with each other any more, but I fail to see how that's any improvement whatever. Increasing blind action seems like a bad thing. As for whether or not this version system can someday lessen VfD load, it can, but only if there were scores that executed administrative decisions and did replace VfD. To me, that's possible, but I would want to see the system in place, working, and relatively bullet proof before I'd ever vote for it. Folks should understand that we're never going to do anything if we let the perfect be the enemy of the good, and, in fact, things getting worse and worse is precisely what some people want, as it brings the community ever-closer to wanting to scrap deletion altogether (which is the stated and implicit aim of several now). To me, the issue has never been deletion or inclusion, but standards. If we will not have standards for admission (and we don't), and if standards for inclusion inevitably mobilize armies to oppose them, then the least we can do is have a standard that applies externally to the thing to be a caveat lector. Geogre 16:09, 11 August 2005 (UTC)

Have rating system with a different deletion system

I like the idea of rating Wikipedia articles--if the rating was based on reliability of an article and not tied in with possible deletion or featured article status. To tie a rating in with deletion or FA status will, as others have said on this page, invite gaming. For an example of what can happen in this regard, check out sites like Zoetrope, where benefits (such as recognition and possible publication) can result from a high rating. This system has created large numbers of cliques that help bump up the ratings of their friends' stories and screenplays.

I still believe Wikipedia should have a rating system. The catch, though, is that there should be no benefit to an article or editor of a rating except that it helps readers determine the reliability of an article.--Alabamaboy 14:55, 13 August 2005 (UTC)

Well, this really isn't supposed to be a ratings or modding system. At least my vision of it is that it won't be a reward or punishment sort of thing at all. So long as all you get is a nomination to deletion or bronzing, and nothing else, we cut down, somewhat, on the motivation to game. The armies would only be marching if they were united in killing something (or preventing the killing of something). This is already taking place with the schools articles, here, but the version system widens out the voices heard from to be everyone and anyone, and not just the people who read the page on meta that tells them to go vote "keep."
However, the bigger thing, the prevention of the gaming, is in the ability of administrators to overwrite the votes and the ability of exaggerating mis-voting being used as evidence in an otherwise lodged RfC, anyone's RFA, etc. In other words, the "I'm Hitler and I vote 5 on all Jew articles and 1 on all Nazi articles" person is going to get fwapped, first by any admin who can just override the score (and singlehandedly undo a whole army of sock puppets' careful work), and such a consistent voting record would be evidence used in determining blocking (if the person did something other).
Additionally, other mechanisms should be added, if anyone can think of them. For example, would it be possible to have a new penalty? Could we have people be non-jurors (i.e. they can read and write, but not vote)?
Personally, though, I think that we do better, rather than worse, when the only things at stake are nominations to FAC and VfD, and lessen, rather than increase, the motivations for the armies. That's just me, though. Geogre 14:11, 14 August 2005 (UTC)
What if the system to determine Featured Articles was a combination of public votes, admin votes, and public scoring, while the deletion was based on scoring alone? This would probably cut down on gaming since the main incentive for gaming, IMHO, is to try and get certain articles to FA status. I am also wary of a system of moving articles toward FA status b/c this might be a step backward in the quality of articles. As it is now, the people who vote on FA status take into account references and copyright status of images, which are just two of the important FA qualifications that most editors and readers who scored an article would probably overlook.--Alabamaboy 14:37, 14 August 2005 (UTC)

I would agree that a rating system would be a good idea. But despite your suggestions about discriminating admins, I am absolutely certain that people will tweak their score for an article bearing in mind the risk of it being deleted, or their desire for it to be. That is pretty obvious: if the result of the score is to nominate for deletion, then that is how they will choose their own rating. Maybe software or admins can catch extreme votes, but if you just bump up or down 1 or 0.5 on your scoring system, then it is just going to get lost in the normal variation between voters and articles. It would inevitably become gaming. It could only fail to do so if there is no penalty to an article you favour despite your criticising it. Then you might give an honest vote that although you think an article needs to exist, its current contents are rubbish. Sandpiper 21:43, 20 August 2005 (UTC)

Needs a summary

66.167.137.37 11:03, 15 August 2005 (UTC): While it's nice to be warned:

If you have no patience for words, this is not for you. I'm sorry, but I like words. Words have always been my second best friends.

It would be nicer and quite useful to have a short summary of the proposal at the beginning of the proposal. The proposal is too long and too stylized for at least some of us to want to read it, even if we are interested in the topic.

Agreed. It would be nice to have either an intro or a Version System Sketch Lite to put forward the arguments as succinctly as possible. Mark Lewis 15:34, 15 August 2005 (UTC)

Well, I'd be delighted with a condensed version, but I proposed this as a reminder and a sui generis piece, and not part of the unified page. I don't mean to be recalcitrant, but one of the problems with proposals is that they end up like PowerPoint slides with 4 bullets. I agree that the writing is stylized. That was intended to make it less onerous to deal with all the words. In a sense, I suppose that I'm saying that I have made it as short as I can, and I surely don't mind anyone else making a briefer version. I own neither the idea nor the words, after all -- same as all of us -- but I would urge anyone undertaking it to realize that quite a few safeguards in the proposal really are vital to it. This isn't a modding system, and it isn't a VfD replacement, and I had to go to some length to explain why it isn't. Geogre 19:26, 16 August 2005 (UTC)