Grub (search engine)
From Wikipedia, the free encyclopedia
Grub was a search engine based on distributed computing, later acquired by LookSmart. According to grub.looksmart.com, the Grub crawling project is no longer operational.
Users could download the Grub client software and let it run during their computer's idle time. The client crawled URLs and sent the results back to the main Grub server in highly compressed form. The collective cache could then be searched on the Grub website. By asking thousands of clients to each cache a small portion of the web, Grub was able to build a large cache quickly.
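The client-side flow described above can be sketched roughly as follows. This is a hypothetical illustration only, not Grub's actual code: the function names, the JSON payload format, and the use of zlib are all assumptions, since Grub's real protocol was proprietary.

```python
# Hypothetical sketch of a distributed-crawler client: crawl a small
# assigned slice of URLs, then upload the results highly compressed.
import json
import zlib

def crawl(url):
    # Placeholder for fetching and summarizing a page; a real client
    # would download and process the page during idle time.
    return {"url": url, "title": f"page at {url}"}

def client_batch(assigned_urls):
    """Crawl a small assigned portion of the web and compress the
    results for upload to the central server."""
    results = [crawl(u) for u in assigned_urls]
    payload = json.dumps(results).encode("utf-8")
    return zlib.compress(payload, level=9)

# The server side would decompress and merge each client's contribution.
batch = client_batch(["http://example.org/a", "http://example.org/b"])
merged = json.loads(zlib.decompress(batch))
print(len(merged))  # 2 crawled records recovered
```

The point of the design is bandwidth: many small, compressed contributions from idle machines add up to a large central cache without any single participant doing much work.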
Though many believed in Grub's distributed-computing approach, the search engine had its share of opponents. Critics argued that a large cache is not what makes a good search engine; rather, it is the ability to deliver accurate, relevant results to users. Loyal users of Google said they valued that engine's targeted results and would not switch to Grub unless its search technology were superior to Google's.

Quite a few webmasters opposed Grub because it appeared to ignore sites' robots.txt files, which can bar robots from crawling certain areas. Because Grub, as its developers acknowledged, also cached robots.txt itself, changes to the file might not be detected promptly. Webmasters countered that Grub failed to honor even long-standing robots.txt files that blocked access to all crawlers. According to Wikipedia's own webmasters, the /w/ directory, which stores the scripts for page editing and is blocked to robots by robots.txt, was cached by Grub but by no other search engine. Wikipedia's webmasters also complained that Grub's distributed architecture overloaded their servers by keeping a large number of TCP connections open; the effect was essentially the same as a typical distributed denial-of-service attack.
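For reference, the robots.txt check that a well-behaved crawler performs before fetching a page can be shown with Python's standard-library parser. The rules below are a minimal sketch resembling the Wikipedia case described above (a /w/ directory disallowed for all user agents); the "GrubClient" user-agent string is an assumption for illustration.

```python
# Minimal sketch: how a compliant crawler consults robots.txt before
# fetching, using Python's stdlib robots.txt parser.
from urllib.robotparser import RobotFileParser

# Example rules resembling the Wikipedia case: /w/ (editing scripts)
# is disallowed for every crawler.
robots_txt = """\
User-agent: *
Disallow: /w/
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

print(parser.can_fetch("GrubClient", "/w/index.php"))    # False: blocked
print(parser.can_fetch("GrubClient", "/wiki/Main_Page")) # True: allowed
```

A crawler that caches this file and re-checks it only rarely will keep applying stale rules, which is exactly the delay-in-detecting-changes complaint raised against Grub.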
External links
Boitho | GPU | Grub (discontinued) | Majestic-12 | YaCy