Talk:HITS algorithm

"It is executed at query time, and not at indexing time, with the associated hit on performance that accompanies query-time processing"

The algorithm can also be carried out in a transient manner like Google. Is this a difference at all? (unsigned comment by User:59.95.4.160 2007-05-03T08:24:09)

Perhaps the article isn't clear. PageRank is a query-independent calculation over the entire crawl, so it can be performed in batch mode at indexing time. The ranking of results for a particular query is then a function of the page's PageRank (which is independent of the query) and various query-dependent measures such as TF-IDF. HITS is performed after a set of pages has been selected for the query using TF-IDF or whatever, and works on the link structure within that set, calculating "authority" and "hub" scores relative to the query; something that is an authority for baseball is unlikely to be an authority for fettuccine. You could of course run HITS on the whole crawl, or PageRank on a subset, but that is not how they are designed to be used. --Macrakis 13:14, 3 May 2007 (UTC)
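
To make the query-time point concrete, here is a minimal sketch of the idea described above. It is only an illustration, not code from the article. The names `restrict_to_subset`, `hits`, `crawl` and `selected` are hypothetical, the toy link graph is made up, and the L2 normalization and fixed iteration count are assumptions chosen for simplicity. It also skips the step of expanding the selected set with neighbouring pages and simply takes the selected pages as given.

```python
import math

def restrict_to_subset(links, selected):
    """Keep only the links whose source and target are both in the selected set."""
    return {p: [t for t in links.get(p, ()) if t in selected] for p in selected}

def hits(links, iterations=50):
    """Run hub/authority updates on `links` (page -> list of pages it links to)."""
    nodes = set(links)
    for targets in links.values():
        nodes.update(targets)
    hub = {n: 1.0 for n in nodes}
    auth = {n: 1.0 for n in nodes}
    for _ in range(iterations):
        # Authority score: sum of the hub scores of the pages linking in.
        new_auth = {n: 0.0 for n in nodes}
        for src, targets in links.items():
            for dst in targets:
                new_auth[dst] += hub[src]
        norm = math.sqrt(sum(v * v for v in new_auth.values())) or 1.0
        auth = {n: v / norm for n, v in new_auth.items()}
        # Hub score: sum of the authority scores of the pages linked to.
        new_hub = {n: sum(auth[t] for t in links.get(n, ())) for n in nodes}
        norm = math.sqrt(sum(v * v for v in new_hub.values())) or 1.0
        hub = {n: v / norm for n, v in new_hub.items()}
    return hub, auth

# Toy "crawl" (made up): the whole link graph known at indexing time.
crawl = {"a": ["b", "c"], "b": ["c"], "c": [], "d": ["c", "e"], "e": []}

# Pages selected for one query, e.g. by a text-matching step (assumed here).
selected = {"a", "b", "c"}

# HITS runs only on the link structure within the selected set.
subgraph = restrict_to_subset(crawl, selected)
hub, auth = hits(subgraph)
best_authority = max(auth, key=auth.get)
best_hub = max(hub, key=hub.get)
```

Since `selected` changes with every query, the scores have to be recomputed per query, whereas a PageRank-style score over `crawl` could be computed once in batch and reused.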