User:Xiaogiabot
This is the user page of the bot maintained by User:Xiaogia.
I am doing a school assignment in which I need to configure the Heritrix web crawler to retrieve pages from Wikipedia. I read Wikipedia:Bots, which says that I need to get approval on Wikipedia:Bots talk. I am not sure whether I also need approval for a web crawler engine.
Information about the bot:
- The bot is automatic. I configured its starting URL to point to a page on Wikipedia.
- It should run from Jan 25 - Mar 31 2005.
- The crawler is Heritrix, from the Internet Archive's Heritrix homepage. It is a Java program.
- Purpose:
- I need to crawl a topic to retrieve its pages. The purpose is to preserve the topic for future use.
- I notice that every page has a history showing its past versions. However, my purpose is web archiving: to archive one topic and show a prototype of how this can be done. I need Wikipedia to allow me to crawl so that I can complete my assignment. Please.
The user page for my bot is User:Xiaogiabot. User:Xiaogia 08:49, 25 January 2006 (UTC)