TrustRank

TrustRank is a link analysis technique described in the paper "Combating Web Spam with TrustRank" by Zoltán Gyöngyi and Hector Garcia-Molina of Stanford University and Jan Pedersen of Yahoo!. The technique semi-automatically separates useful webpages from spam.

Many web spam pages are created only with the intention of misleading search engines. These pages, chiefly created for commercial reasons, use various techniques to achieve higher-than-deserved rankings on the search engines' result pages. While human experts can easily identify spam, a manual review of the Internet is impractical.

One popular method for improving rankings is to increase the perceived importance of a document through complex linking schemes. Google's PageRank and other search ranking algorithms have been subjected to such manipulation.

TrustRank seeks to combat spam by filtering the web based upon reliability. The method calls for selecting a small set of seed pages to be evaluated by an expert. Once the reputable seed pages are manually identified, a crawl extending outward from the seed set seeks out similarly reliable and trustworthy pages. The trust assigned to a page diminishes as its link distance from the seed set increases.
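The propagation of trust outward from the seed set can be illustrated with a minimal sketch that treats TrustRank as a PageRank-style iteration whose restart distribution is concentrated on the seed pages. The function name trust_rank, the damping value of 0.85, the iteration count, and the toy link graph are assumptions chosen for illustration and are not taken from the paper.

```python
def trust_rank(graph, seeds, damping=0.85, iterations=50):
    """Propagate trust from hand-vetted seed pages along outgoing links.

    graph maps each page to the list of pages it links to.
    Illustrative sketch only; parameter values are assumptions.
    """
    pages = set(graph) | {t for targets in graph.values() for t in targets}
    # Trust is injected only at the seed pages, split evenly among them.
    seed_weight = {p: (1.0 / len(seeds) if p in seeds else 0.0) for p in pages}
    trust = dict(seed_weight)
    for _ in range(iterations):
        # Restart term: a fixed fraction of trust always returns to the seeds.
        nxt = {p: (1.0 - damping) * seed_weight[p] for p in pages}
        for page, targets in graph.items():
            if not targets:
                continue  # trust held by pages without outlinks is dropped
            # A page splits its damped trust evenly among its outlinks.
            share = damping * trust[page] / len(targets)
            for target in targets:
                nxt[target] += share
        trust = nxt
    return trust


# Hypothetical toy web: a chain leading away from one seed page,
# plus two spam pages that link only to each other.
toy_graph = {
    "seed": ["a"],
    "a": ["b"],
    "b": ["c"],
    "c": [],
    "spam1": ["spam2"],
    "spam2": ["spam1"],
}
scores = trust_rank(toy_graph, seeds={"seed"})
for page in sorted(scores, key=scores.get, reverse=True):
    print(page, round(scores[page], 4))
```

In this toy graph the scores fall with each hop away from the seed, and the two spam pages receive no trust at all because no path leads to them from the seed set.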

The same logic can be applied in reverse, an approach called Anti-Trust Rank: the closer a site is to known spam resources, the more likely it is to be spam as well.[1]
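One way to picture the reverse direction is to start from a set of known spam pages and push a distrust score backwards along links, so that pages linking to spam (directly or through intermediaries) accumulate distrust. The following is a minimal sketch under that assumption; the function name anti_trust_rank, the damping value, and the toy graph are hypothetical and are not drawn from the cited paper.

```python
def anti_trust_rank(graph, spam_seeds, damping=0.85, iterations=50):
    """Propagate distrust from known spam pages against link direction.

    graph maps each page to the list of pages it links to.
    Illustrative sketch only; parameter values are assumptions.
    """
    pages = set(graph) | {t for targets in graph.values() for t in targets}
    # Reverse every link so that distrust flows from a page to its linkers.
    linkers = {p: [] for p in pages}
    for page, targets in graph.items():
        for target in targets:
            linkers[target].append(page)
    # Distrust is injected only at the known spam pages.
    seed_weight = {
        p: (1.0 / len(spam_seeds) if p in spam_seeds else 0.0) for p in pages
    }
    distrust = dict(seed_weight)
    for _ in range(iterations):
        nxt = {p: (1.0 - damping) * seed_weight[p] for p in pages}
        for page, sources in linkers.items():
            if not sources:
                continue  # nobody links here, so nothing to pass back
            share = damping * distrust[page] / len(sources)
            for source in sources:
                nxt[source] += share
        distrust = nxt
    return distrust


# Hypothetical toy web: "booster" links to "farm", which links to a
# known spam page; "news" and "blog" are unrelated to the spam page.
toy_graph = {
    "booster": ["farm"],
    "farm": ["spam"],
    "spam": [],
    "news": ["blog"],
    "blog": [],
}
scores = anti_trust_rank(toy_graph, spam_seeds={"spam"})
for page in sorted(scores, key=scores.get, reverse=True):
    print(page, round(scores[page], 4))
```

Here the page that links directly to spam scores higher than the page two hops away, while the unrelated pages receive no distrust.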

The researchers who proposed the TrustRank methodology have continued to refine their work by evaluating related topics, such as measuring spam mass.

References

  1. Krishnan, Vijay; Raj, Rashmi. "Web Spam Detection with Anti-Trust Rank" (PDF). Stanford University. Retrieved 11 January 2015.