Google on Tuesday unveiled an ambitious plan to bring information locked away in print form in libraries into its searchable index.
As previously reported by internetnews.com, Google has developed a method of scanning printed material so it can be used to answer searchers’ queries along with Web content. Mountain View, Calif.-based Google hopes to patent its proprietary method of scanning books and creating a searchable database.
The service went live Monday night with snippets from a few books. Google has already begun the digitization process in collaboration with the University of Michigan, Harvard, Stanford and Oxford Universities, and the New York Public Library.
“The goal is to unlock the wealth of information that’s only available offline and bring it online,” said Google Director of Product Management Susan Wojcicki. “We believe we’ll make information accessible that previously was not available to users, for example, out-of-print books only available on the library shelf now will be available to Google users.” The project will take years to complete, she said.
As it becomes available, print material will be incorporated into regular Web searches. Whenever a book contains content that matches the search terms, links to that book will be included in the search results. Searchers can click on the book title to see the page that contains the search terms, along with other information about the book. They can then search for other topics within that book.
For books that are in print and under copyright, search results will include links to where the books can be purchased, as well as whether they’re available in a local library. Wojcicki said this will give books increased visibility and can potentially increase book sales.
“We have agreements with publishers where we are able to show a certain percentage of the book,” said Google product manager Adam Smith. When it has the rights, Google will show the actual page as it looks in print. For books that are in the public domain, Google will show the entire book if it exists in the database. For books under copyright where Google does not have an agreement in place, the search results will show bibliographic information for the book or a small snippet showing where in the book the answer to the query was located.
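The display rules Smith describes amount to a three-way decision based on a book's rights status. A minimal sketch of that logic follows; it is an illustration only, not Google's implementation. The names `RightsStatus`, `Book`, `viewable_percent` and `display_policy` are hypothetical, and the percentage used in the example is an assumption, since Google did not disclose the actual figure.

```python
from dataclasses import dataclass
from enum import Enum


class RightsStatus(Enum):
    """Hypothetical labels for the three cases described in the article."""
    PUBLIC_DOMAIN = "public_domain"
    PUBLISHER_AGREEMENT = "publisher_agreement"
    COPYRIGHT_NO_AGREEMENT = "copyright_no_agreement"


@dataclass
class Book:
    title: str
    rights: RightsStatus
    total_pages: int
    # Assumption: share of the book a publisher agreement allows to be shown.
    viewable_percent: int = 0


def display_policy(book: Book) -> str:
    """Return a description of what a searcher would be shown for this book."""
    if book.rights is RightsStatus.PUBLIC_DOMAIN:
        # Public-domain books can be shown in full.
        return f"full text ({book.total_pages} pages)"
    if book.rights is RightsStatus.PUBLISHER_AGREEMENT:
        # Only the agreed percentage of pages is shown, as page images.
        pages = book.total_pages * book.viewable_percent // 100
        return f"page images, up to {pages} of {book.total_pages} pages"
    # Under copyright with no agreement in place.
    return "bibliographic information or a short snippet"


if __name__ == "__main__":
    for book in (
        Book("Old Almanac", RightsStatus.PUBLIC_DOMAIN, 300),
        Book("Recent Novel", RightsStatus.PUBLISHER_AGREEMENT, 200, viewable_percent=20),
        Book("Unlicensed Title", RightsStatus.COPYRIGHT_NO_AGREEMENT, 150),
    ):
        print(f"{book.title}: {display_policy(book)}")
```

The fallback branch covers any copyrighted book without an agreement, so a title whose status is unclear would get the most restrictive treatment under this sketch.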
Google will earn revenue by showing ads against the print search results, just as it does for Web results, and book publishers will receive a share of the revenue. Google will not receive revenue from any sales generated by links to publishers’ sites.
Wojcicki and Smith said they could not comment about an automated method to gain permission to display text from a copyrighted work. The patent application on digitizing print material also covers a method for executing a permission protocol so that the publisher could authorize Google to display more text from the relevant publication.
Wojcicki said that journals and magazines could be added to the database in the future. Google’s patent application also includes a way to permit “subscription-like access” to the electronic content.
In other news, Google and Geico Insurance squared off in federal court on Monday. Geico is suing Google for trademark infringement, complaining that Google’s practice of selling the Geico name as a keyword against which to display advertising is an abuse of its trademark.