Book scanning
From Wikipedia, the free encyclopedia
Book scanning is the process of converting physical books into electronic ones. To turn a physical book into a digital one, its pages must first be scanned into images, and then optical character recognition (OCR) or some other method must be applied to convert those images into text. This process is the basis of projects like Project Gutenberg, Google Books, and the Open Content Alliance.
One of the main challenges is the sheer volume of books that must be scanned, expected to be in the tens of millions. All of these must be scanned and then made searchable online for the public to use as a universal library. Currently, many books are scanned by low-cost labor in India or China. Other methods include using robots to flip the pages, or cutting off the book's spine and feeding the loose pages through an automatic scanner. The downside of cutting the spine is that the book may be destroyed. Once a book is scanned, its text is either entered manually or extracted via OCR. This text conversion is a major cost of book scanning projects.
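The workflow described above (scan the pages, convert the images to text, make the text searchable) can be illustrated with a minimal Python sketch. It is not the method of any particular project. The OCR engine is modeled as an arbitrary function passed in by the caller, and the names `digitize_book`, `build_index`, `search`, and `fake_ocr` are hypothetical, chosen only for this illustration. The stand-in `fake_ocr` simply decodes bytes and performs no real character recognition.

```python
import re
from collections import defaultdict
from typing import Callable, Dict, List, Set, Tuple

# Assumption: an OCR engine is modeled as any function that turns the raw
# bytes of one page image into text. No real OCR library is used here.
OcrFunc = Callable[[bytes], str]
Index = Dict[str, Set[Tuple[str, int]]]


def digitize_book(page_images: List[bytes], ocr: OcrFunc) -> List[str]:
    """Apply the OCR step to every scanned page image, in page order."""
    return [ocr(image) for image in page_images]


def build_index(books: Dict[str, List[str]]) -> Index:
    """Map each lowercase word to the (title, page number) pairs containing it."""
    index: Index = defaultdict(set)
    for title, pages in books.items():
        for page_number, text in enumerate(pages, start=1):
            for word in re.findall(r"[a-z0-9]+", text.lower()):
                index[word].add((title, page_number))
    return index


def search(index: Index, word: str) -> List[Tuple[str, int]]:
    """Return the sorted (title, page number) hits for a single word."""
    return sorted(index.get(word.lower(), set()))


def fake_ocr(image: bytes) -> str:
    """Hypothetical stand-in for OCR: the 'image' bytes already hold the text."""
    return image.decode("utf-8")


if __name__ == "__main__":
    # Made-up sample data: one book with two "scanned" pages.
    scans = {"Example Book": [b"Scanned page one", b"Robots flip pages"]}
    library = {title: digitize_book(pages, fake_ocr) for title, pages in scans.items()}
    index = build_index(library)
    print(search(index, "pages"))
```

In a real project the `ocr` argument would be backed by an actual OCR engine or by manual text entry, and the index would be stored in a search service rather than in memory.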
Due to copyright issues, most scanned books are those that are out of copyright. Google Books, however, also scans books that are still in copyright unless the publisher specifically excludes them.
External links
- Wired article on Amazon book scanning
- USA Today article on robotic book scanning machine
- New York Times article on book scanning and the universal library
- Quality criteria and testing procedures for book scanners