HITS algorithm
From Wikipedia, the free encyclopedia
Hypertext Induced Topic Selection (HITS) is a link analysis algorithm that rates Web pages for their authority and hub values. Authority value estimates the value of the content of the page; hub value estimates the value of its links to other pages. These values can be used to rank Web search results.
HITS was developed by Jon Kleinberg.
HITS assigns two values to each page: an authority value and a hub value. The two are defined in terms of one another in a mutual recursion. A page's authority value is the sum of the scaled hub values of the pages that point to it. A page's hub value is the sum of the scaled authority values of the pages it points to. Some implementations also consider the relevance of the linked pages.
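The mutual recursion above can be sketched as a simple iteration. This is a minimal illustration, not a definitive implementation: the function names `hits` and `normalize`, the dictionary-based graph format, the fixed iteration count, and the choice of Euclidean normalization as the "scaling" step are all assumptions made for the example.

```python
from math import sqrt


def normalize(scores):
    """Scale scores to unit Euclidean length (an assumed choice of scaling)."""
    norm = sqrt(sum(v * v for v in scores.values()))
    if norm == 0:
        return dict(scores)
    return {page: v / norm for page, v in scores.items()}


def hits(links, iterations=50):
    """Return (hub, authority) score dicts.

    links maps each page to the pages it links to (hypothetical format).
    """
    # Collect every page that appears as a link source or a link target.
    pages = set(links)
    for targets in links.values():
        pages.update(targets)

    hub = {page: 1.0 for page in pages}
    auth = {page: 1.0 for page in pages}

    for _ in range(iterations):
        # Authority update: sum the hub values of the pages linking in.
        new_auth = {page: 0.0 for page in pages}
        for src, targets in links.items():
            for dst in set(targets):
                new_auth[dst] += hub[src]

        # Hub update: sum the authority values of the pages linked to.
        new_hub = {
            page: sum(new_auth[dst] for dst in set(links.get(page, ())))
            for page in pages
        }

        # Scale both score vectors so the values stay bounded.
        auth = normalize(new_auth)
        hub = normalize(new_hub)

    return hub, auth


if __name__ == "__main__":
    example = {"a": ["c"], "b": ["c"], "c": []}
    hub_scores, auth_scores = hits(example)
    print(hub_scores, auth_scores)
```

In this small example, page "c" is linked to by both other pages and receives all of the authority weight, while "a" and "b" share the hub weight equally.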
HITS, like Page and Brin's PageRank, is an iterative algorithm based purely on the link structure of documents on the Web. However, it differs in several major ways:
- It is executed at query time rather than at indexing time, so it incurs the performance cost of query-time processing.
- It is not commonly used by search engines. (Though some sources claim a similar algorithm is used by Ask.com.)
- It computes two scores per document (hub and authority) as opposed to a single score.
- It is processed on a small subset of ‘relevant’ documents, not on all documents as is the case with PageRank.
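The restriction to a small subset of documents can be sketched as follows. This is an illustration under stated assumptions: the function name `query_subgraph`, the dictionary-based graph format, and the one-step expansion of an initial set of query matches to the pages they link to and the pages that link to them are choices made for this example, and how the initial set is chosen is left to the caller.

```python
def query_subgraph(links, root_set):
    """Return the link graph restricted to a query-specific subset.

    links maps each page to the pages it links to (hypothetical format).
    root_set holds the pages assumed to match the query.
    """
    root = set(root_set)
    base = set(root)

    # Assumed expansion: add pages one link away from the root set,
    # in either direction.
    for src, targets in links.items():
        for dst in targets:
            if src in root:
                base.add(dst)
            if dst in root:
                base.add(src)

    # Keep only the links whose two endpoints are both in the subset.
    return {
        page: [dst for dst in links.get(page, ()) if dst in base]
        for page in base
    }


if __name__ == "__main__":
    web = {"a": ["b"], "b": ["c"], "c": ["d"], "d": [], "e": ["a"]}
    print(query_subgraph(web, {"b"}))
```

Hub and authority scores would then be computed on the returned subgraph only, which is what keeps the query-time computation small.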
References
- J. Kleinberg. Authoritative sources in a hyperlinked environment. In Proc. Ninth Ann. ACM-SIAM Symp. Discrete Algorithms, pages 668-677, ACM Press, New York, 1998.