eTBLAST

From Wikipedia, the free encyclopedia


eTBLAST is a text similarity search engine currently offering access to the MEDLINE database, the National Institutes of Health (NIH) CRISP database, the Institute of Physics (IOP) database, and the NASA technical reports database. Additional text-based databases are continually being added. The eTBLAST server compares a user's natural-text query to target databases using a hybrid search algorithm: a low-sensitivity, weighted, keyword-based first pass followed by a novel sentence-alignment-based second pass. eTBLAST is a free web-based service of The Innovation Laboratory at the University of Texas Southwestern Medical School.
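
The two-pass idea can be illustrated with a short Python sketch. It is not eTBLAST's actual implementation, whose weighting scheme and alignment method are not described here. The sketch assumes an IDF-style keyword weight for the first pass and substitutes the standard library's difflib.SequenceMatcher for the sentence-alignment step. All function names (keyword_pass, alignment_score, hybrid_search) and the top_k parameter are hypothetical.

```python
import math
import re
from collections import Counter
from difflib import SequenceMatcher


def tokenize(text):
    """Lowercase a text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def sentence_tokens(text):
    """Naively split on ., ! and ? and tokenize each non-empty sentence."""
    sentences = (tokenize(s) for s in re.split(r"[.!?]+", text))
    return [tokens for tokens in sentences if tokens]


def keyword_pass(query, docs, top_k=2):
    """First pass (assumed scheme): rank documents by the summed IDF
    weight of the keywords they share with the query."""
    doc_tokens = [set(tokenize(d)) for d in docs]
    n = len(docs)
    doc_freq = Counter(tok for toks in doc_tokens for tok in toks)
    # Rarer words weigh more; smoothing keeps every weight positive.
    idf = {t: math.log((n + 1) / (c + 1)) + 1.0 for t, c in doc_freq.items()}
    query_tokens = set(tokenize(query))
    scored = [
        (sum(idf[t] for t in query_tokens & toks), i)
        for i, toks in enumerate(doc_tokens)
    ]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    # Keep only the best top_k documents that share at least one keyword.
    return [i for score, i in scored[:top_k] if score > 0]


def alignment_score(query, doc):
    """Second pass (stand-in technique): for each query sentence, take the
    best word-level SequenceMatcher ratio against any document sentence,
    then average over the query sentences."""
    q_sents = sentence_tokens(query)
    d_sents = sentence_tokens(doc)
    if not q_sents or not d_sents:
        return 0.0
    best = [
        max(SequenceMatcher(None, q, d).ratio() for d in d_sents)
        for q in q_sents
    ]
    return sum(best) / len(best)


def hybrid_search(query, docs, top_k=2):
    """Run the cheap keyword pass, then re-rank survivors by alignment."""
    candidates = keyword_pass(query, docs, top_k)
    results = [(i, alignment_score(query, docs[i])) for i in candidates]
    results.sort(key=lambda pair: (-pair[1], pair[0]))
    return results
```

Calling hybrid_search(query, docs) returns (document index, alignment score) pairs, best match first. The design point the sketch tries to capture is that the inexpensive keyword pass narrows a large collection to a few candidates, so the costlier sentence comparison runs only on those.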

eTBLAST, as a text similarity engine, made possible a large study of duplicate publications and potential plagiarism in the biomedical literature. Thousands of randomly sampled MEDLINE abstracts were submitted to eTBLAST, and those with the highest similarity were studied and entered into an online database. The study is ongoing, and the database matures as entries are manually inspected and classified. The work revealed several trends, including an increasing rate of duplication in the biomedical literature, as reported in the journals Bioinformatics and Nature.

Interface

Because eTBLAST is a text-similarity engine rather than a simple keyword-based search tool, it is claimed that the user need not identify and manipulate query keywords and Boolean operators, as is required with other search engines.

eTBLAST aims to help the user rapidly find references, evaluate novelty, find experts and journals in a given topical area, and track the popularity of the topic as defined by the user’s query.

A typical query of 100 words takes 1 to 2 minutes to return results after comparison against MEDLINE, which as of January 1, 2007 contained over 16 million records.
