Wikipedia talk:Nofollow

From Wikipedia, the free encyclopedia

Disputed survey statements

I'm troubled that the original very POV wording of the survey [1] may have affected at least one vote: "Why exactly are we doing this? It's stated right there that this action won't deter spammers". According to Wikipedia:Dispute_resolution#Conduct_a_survey, "The survey should be carefully designed to present all sides of the dispute fairly".

In the face of a sophisticated bot attack, it is disputed that:

  • ... linkspam can be quickly removed. The bot made 67 spam edits in 10 hours, and a spam URL remained in the article (PHP) for hours despite reversion, because an early reversion was not fully done and later reversions just reverted to it.
  • ... spam blacklists are effective. The bot simply switched to a new URL domain with each new attack, making filtering based on previously seen domains ineffective. User:Silsor finally took the drastic step of blacklisting entire subdomain-hosting services (6x.to and uni.cc) despite the side-effects this might have on legitimate websites hosted there.
  • ... anon IP blocking is effective. The bot simply switched to a new anon IP with each new attack, making blocking based on previously seen IPs ineffective. These were likely not open proxies but hacked zombie machines. Some of the anon IP addresses were in ranges that are problematic to block for any length of time, such as AOL addresses.
  • ... spammers will be undeterred by "nofollow". Unsophisticated spammers who add links manually might not be aware of "nofollow", but the operator of a sophisticated bot that constantly switches domains and IPs would probably be knowledgeable enough to be deterred by "nofollow" if he had known that we had implemented it only a few weeks earlier.

-- Curps 07:37, 12 Feb 2005 (UTC)

    • Interesting, but that's a technical mishap. Such things will always occur occasionally. If rollbacks work properly, however, spam can be removed effectively.
      • It's not a technical mishap... the rate of spam edits was so high that manual reverting became error-prone. Manual defense can't cope with a fully automated attack.
        • Maybe there needs to be a light level defense bot. We already watch the anon edits in IRC, any reason we can't develop an anti-bot and dispatch the greyhat individuals after the bot writers ... they give an IP every time, let's start visiting ISPs IRL and insisting they stick to their 'Terms of Service'. (and this is about the 'exception' not the rule.)
    • The blacklist is effective against a person or persons pushing a particular site. If, as in the case of the bot that's been plaguing us, it seems to just be spamming for the sake of spamming, the blacklist will do little because it was not designed to deal with such a scenario.
      • I'm not sure what you're saying here, you appear to admit that the blacklist is ineffective. The bot is not spamming for the sake of spamming, it's pushing a particular site or sites for financial gain... it's just using several dozen different domains and subdomains to do so.
    • The bot we're discussing is quite sophisticated; most spam, however, comes from single- or few-address users. Temporary blocks are just that, temporary; if a legitimate user is affected by one, all he or she has to do is email an administrator (a step explicitly suggested in the message shown to blocked users).
    • Perhaps. Of course, getting one's site into Wikipedia still means increased traffic and attention. Also, the bot that our discussion keeps coming back to is an exception, not the norm for link spammers.

--Slowking Man 07:53, Feb 12, 2005 (UTC)

Such bots will not be the exception in the future. As Wikipedia continues its phenomenal growth, the financial incentives for spamming it become greater. -- Curps 08:06, 12 Feb 2005 (UTC)
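The incomplete-reversion problem described above (a later revert restores an earlier revision that still contains a spam URL) can be illustrated with a minimal sketch. It assumes the page history is available as a list of revision texts, oldest first, and that the spam domains are already known; the function name, the sample history, and the `.example` domains are all hypothetical and not part of MediaWiki.

```python
from typing import Optional, Sequence


def last_clean_revision(revisions: Sequence[str],
                        spam_markers: Sequence[str]) -> Optional[int]:
    """Return the index of the newest revision containing no spam marker.

    `revisions` is ordered oldest to newest. Returns None when every
    revision contains at least one marker.
    """
    # Walk backwards from the newest revision so the first hit is the
    # most recent clean text, not merely "the revision before the last edit".
    for index in range(len(revisions) - 1, -1, -1):
        text = revisions[index]
        if not any(marker in text for marker in spam_markers):
            return index
    return None


# Hypothetical history: the revert at index 2 removed only one of two spam
# links, so later reverts that target index 2 keep restoring spam.
history = [
    "PHP is a scripting language.",
    "PHP is a scripting language. http://a.example http://b.example",
    "PHP is a scripting language. http://a.example",
    "PHP is a scripting language. http://a.example http://c.example",
]
markers = ["a.example", "b.example", "c.example"]

clean_index = last_clean_revision(history, markers)
print(clean_index)
```

Checking the revert target against the known spam URLs, rather than trusting the previous revert, is the point of the sketch. It does nothing about a bot that rotates to domains not yet in the marker list.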

Yes, no longer use nofollow

  • I won't repeat the reasons I already stated in the Village Pump and the MetaWiki, but my vote is a STRONG YES. — Stevie is the man! Talk | Contrib 18:29, 5 Feb 2005 (UTC)
  • I think it should be removed, for now. While I agree Wikipedia shouldn't care one way or another about things like PageRank, using "nofollow" probably isn't going to deter link spammers; since they'll likely keep spamming regardless, and we'll have to keep reverting them regardless, why not help out good external pages by giving them a PageRank boost? If Wikipedia becomes known among link spammers as a haven which doesn't use "nofollow", or if link spamming software starts detecting "nofollow" sites and leaving them alone, then we should start using "nofollow". And switching in the future won't be hard, since it's just a single line in the config files (right?). -- Khym Chanur 01:40, Feb 12, 2005 (UTC)

No, continue using nofollow

  • Wikipedia is a big target, and provides a big payoff for spammers when they manage to get a link hosted. According to MeatballWiki [2], Wikipedia is the single largest wiki out there (or if you count the languages as separate wikis, it's the largest three wikis and eight of the top ten). With nofollow active, linkspammers should quickly get the idea that wikis aren't a good place to target. I get really tired of having to track down incidents like last night, where a spammer created a half-dozen pages with nothing but external links. --Carnildo 05:38, 9 Feb 2005 (UTC)

Consider the Russian spambot that has been hammering PHP, Cybercash, DBpp, DBM, CCVS, FrontBase, and also Consultative Group on Indonesia, Wikipedia talk:Friends of Wikipedia/Other wikis, Thatware and their talk pages. He's got dozens if not hundreds of domains, and is probably using them as throwaways if they get on spam blocklists and registering new ones on a continual basis... as long as each domain earns more than the $6 or so it costs to register them in bulk, he can afford to keep doing that. He uses dozens if not hundreds of anonymous IPs, coming up with dozens of new ones on a daily basis — these are not open proxies but likely a herd of hacked zombie machines... a few of them came from an AOL address, for instance. The bot blindly hammers the same page a dozen times in a row even if the last edit was its own, so the history page can grow by dozens of versions every day. It's hard to see what to do about this, except permanent vprotection (really unfortunate for a major topic like PHP).

If you want a picture of the future, this is it. These are industrial-strength methods, a pure steamroller numbers game. Spamfilter blacklists, IP blocking, timely reverting... none of the traditional methods can stop this. I'm not sure what can.

Removing "nofollow" will only create an enormous new incentive for more such spambots to hammer Wikipedia continuously. This is dangerously naive.

-- Curps 00:56, 10 Feb 2005 (UTC)

You have a good point, but I highly doubt that having nofollow active has deterred or will deter spammers. People such as our new Russian friends who run vandalbots aren't carefully researching the software beforehand; they see that the site can be edited and attack. So, the attribute is ineffective in reducing spam, while it is very effective in penalizing the Google rankings of honest sites. --Slowking Man 09:05, Feb 11, 2005 (UTC)

The sole motivation for adding an external link to a Wikipedia article should be to improve that article. If you give people extra incentives to add external links for other reasons, such as a Google ranking boost, then the level of spam will only rise as a result. Not to mention edit wars, where getting your site ranked in the top 10 means knocking off one of the sites that's already there.

If you check out the history of the 2004 Indian Ocean earthquake article, there was an enormous amount of activity in the external links section, with various anons overwriting and deleting the edits of previous anons, or merely reordering to put their external link in the first position of the external links list. It's bad enough that they do this just to get traffic... if there's a Google ranking boost incentive in it for them as well, then it'll get completely out of control. Not just spambots but zillions of little individual manual edits all over the place.

-- Curps 09:24, 11 Feb 2005 (UTC)

There's already been a Google ranking boost for a long time, and we don't seem to have done too poorly. Edit wars and such are a natural consequence of a Wiki's open nature, and, like I stated regarding spambots, the vast majority of those out to promote a site or sites either don't know about nofollow or don't care. Wikipedia is an excellent resource for, among other things, links, since they're peer-reviewed. I don't see why we should deny our work to search engines, if it can improve search results. --Slowking Man 09:37, Feb 11, 2005 (UTC)
I'm in complete agreement with Slowking Man. However, I acknowledge the potential future that Curps describes and I very much appreciate his protection of the PHP article. But for now, let's let the great resource that is Wikipedia participate in the process of aiding search engines and the ranking of quality web sites. — Stevie is the man! Talk | Contrib 18:36, 13 Feb 2005 (UTC)
Quality links? Where? I just went through about half of my watchlist, pruning external links. On the articles with significant numbers of external links, I removed about a third of them for being largely irrelevant to the subject, or not adding anything to the article. (For example, do we really need links to three "primer on GPS" sites?) --Carnildo 20:54, 13 Feb 2005 (UTC)
I think you just proved my point. Wikipedia editors continuously ensure that the links stay informative and useful. — Stevie is the man! Talk | Contrib 23:37, 13 Feb 2005 (UTC)
I think it proves the opposite: as far as I can tell, I'm the only person ever to trim the "External Links" sections in these articles. Normally, the occasional egregiously bad link or goatse troll gets removed, but that's it. --Carnildo 01:07, 14 Feb 2005 (UTC)
I do it all the time. And I've also not seen many articles where this is a major concern. — Stevie is the man! Talk | Contrib 04:01, 14 Feb 2005 (UTC)

"nofollow" removes a large part of the motivation for link spam though. Spammers do know which sites do and don't use it. As soon as they learn that Wikipedia is using it, they'll realise that spamming us won't help them, so they'll give up. And they will give up - spammers just won't go to the time and trouble of operating bots when they know they'll have no effect. Dan100 13:27, Feb 21, 2005 (UTC)

Clearer poll options

I think using "Keep" and "Remove" for the poll options will help prevent errors in voting. When I initially saw the page, I thought "Yes" was for putting nofollow in, not removing it. -- Cyrius| 19:39, 5 Feb 2005 (UTC)

Ah, thanks. I had that idea this morning, but wasn't able to access the Internet. --Slowking Man 05:05, Feb 6, 2005 (UTC)

I agree that "yes" and "no" are not the right choice names here. I'm not sure what all the options are, but the following look to me like appropriate names for the possible options:

  • "All links should be nofollow"
  • "Nofollow should be the default, with individual exceptions"
  • "Nofollow should be allowed, as an exception"
  • "Nofollow should not be used"

Even this is a little tricky, because "nofollow" is, itself, a negative. We should also have one paragraph at the head of each voting section explaining what the policy would mean. -- Jmabel | Talk 20:42, Feb 5, 2005 (UTC)

Methinks the MediaWiki technology would currently allow only the first and fourth choices to be implemented. We're not talking about all the wiki projects here, just the English Wikipedia, so even if the technology could handle it, I think we're talking about a setting that applies to the whole of the English Wikipedia. — Stevie is the man! Talk | Contrib 23:52, 5 Feb 2005 (UTC)
Correct, currently, all links on a Wiki either use nofollow or they don't. There's been discussion on the foundation-l mailing list and m:Nofollow of incorporating the use of the attribute into the proposed peer review system--ideas such as having nofollow added to links for a period of time after they're added to an article or allowing admins to control the application of nofollow--but that's all still in the future. --Slowking Man 05:05, Feb 6, 2005 (UTC)
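One of the ideas mentioned above, adding nofollow to a link only for a period of time after it is added to an article, could look roughly like the following sketch. Nothing here reflects how MediaWiki works: the function name, the seven-day quarantine period, and the assumption that the software knows when each link was added are all hypothetical.

```python
from datetime import datetime, timedelta
from html import escape

# Hypothetical quarantine period; the discussion does not propose a value.
QUARANTINE = timedelta(days=7)


def render_external_link(url: str, label: str, added_at: datetime,
                         now: datetime,
                         quarantine: timedelta = QUARANTINE) -> str:
    """Render an external link, marking it nofollow while it is still new."""
    # Links younger than the quarantine period get rel="nofollow";
    # links that have survived review for longer are rendered normally.
    rel = ' rel="nofollow"' if now - added_at < quarantine else ""
    return f'<a href="{escape(url, quote=True)}"{rel}>{escape(label)}</a>'


now = datetime(2005, 2, 6)
new_link = render_external_link("http://example.org/", "Example",
                                datetime(2005, 2, 5), now)
old_link = render_external_link("http://example.org/", "Example",
                                datetime(2005, 1, 1), now)
print(new_link)
print(old_link)
```

Under this assumption a spam link that is reverted within the quarantine period never passes any ranking benefit, while links that survive editorial review eventually do.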

Nonsense

This says that voting is open and will remain open until February 19, 2004, but that was nearly a year ago. This is ridiculous! Georgia guy 01:14, 12 Feb 2005 (UTC)

The only thing ridiculous is that you're making a big noise about a simple typo. -- Cyrius| 02:22, 12 Feb 2005 (UTC)

Extremely brief survey period

The survey opened today and remains open only for seven days. This is obviously much too brief a period of time in which to give a chance for all users having an opinion on a matter of policy to notice the poll, consider the merits, and make their votes. I urge that a longer discussion and voting period be considered. There is no pressing need to justify a single week, and it is far more important that an adequate section of views be sampled on this policy issue which will affect all external links and have effects beyond Wikipedia itself. The vote on Three revert rule enforcement lasted two weeks, and its effect was relatively minor. --Tony Sidaway|Talk 12:56, 12 Feb 2005 (UTC)

  • I agree with Tony that the voting period should be extended at least one more week, to give more people time to think and cast a vote. kaal 02:52, 13 Feb 2005 (UTC)
  • Sounds fine to me. Anyone object? --Slowking Man 06:46, Feb 14, 2005 (UTC)
  • Yes, I agree 1 week is far too short. I suggest an extension. --Sweets 22:40, 19 Feb 2005 (UTC)
    • I extended the voting deadline yesterday, since there have been no objections. --Slowking Man 00:28, Feb 20, 2005 (UTC)

In light of the outage, the extension was a very good idea. – Quadell (talk) (sleuth) 00:45, Feb 23, 2005 (UTC)

Reasons for removing?

I haven't heard many reasons to remove the "rel=nofollow". I've heard reasons to keep it (it helps prevent linkspam), and rebuttals (no it doesn't), and re-rebuttals (oh yes it does), but what harm is the attribute doing? If it might prevent linkspam, even a little bit, it's a good thing, right?

I've heard the argument that it's not standard HTML, and I think that's been pretty well rebutted. I've heard the argument that no one uses it, and I think that's been successfully rebutted as well. And I've heard the argument that we should "reward" good sites that we link to by increasing their page rank, and I'm not convinced by that. So what other arguments are there for removing it? (The argument "there's no reason to keep it" doesn't count.) – Quadell (talk) (sleuth) 00:50, Feb 23, 2005 (UTC)

I'm convinced by the "page rank argument" because using nofollow essentially withdraws Wikipedia from the web, in that all the "search information" it provides will be totally lost. And I think that would suck. — Stevie is the man! Talk | Contrib 01:38, 27 Feb 2005 (UTC)

A people divided

Although I'm heartened that the vote came out the way I desired, I'm also concerned at how divided the community is on this matter. Pure democracy can lead to bitterness and even revolt — we don't want half our editors to fork the project and start Wikipedia Nofollow. I think we should give very careful attention to a good compromise that a larger proportion of people can agree on. Deco 08:08, 1 Mar 2005 (UTC)

I agree, which is why I haven't tried to push anything. I don't have a solution, really, other than coding one or some of the software solutions that have been tossed around. I don't know PHP or SQL, so that's out of the question for at least me. There was no discussion in #mediawiki (the development channel) when I announced that the vote had concluded, although it was a bit late (many of the developers are European or Australian). In #wikipedia, the only comment came from Fennec, who said that it might have to be a Board decision. I wouldn't be against asking Jimbo via e-mail to give his opinion on the results, if others think that would be a good idea. --Slowking Man 20:54, Mar 5, 2005 (UTC)
I think this is a very contentious issue, obviously. But to me, the question is: what is the status quo. In this case, I believe the status quo was "remove". The feature was added only one month ago, and without any user input into the issue. I would very much support asking the Board and Jimbo for input in this case, since it goes beyond the issue at hand to the broader issue of how policy in such cases should be decided. -- Decumanus 21:18, 2005 Mar 5 (UTC)

nofollow for older revisions?

I oppose nofollow for the main revisions of a page, but I think having it for historical revisions will mean that any spamming that ends up in the page history as cruft would be ignored, e.g. on [3]. Any good links should be preserved in the filtered article. Thoughts? Dunc| 22:43, 17 Mar 2005 (UTC)

I'm fairly sure older revisions aren't indexed by Google or anyone else, so it isn't needed. --Carnildo 23:29, 17 Mar 2005 (UTC)