Scanning 25 million pages is a big job. Scanning the collection of the prestigious British Library is a plum job — and Microsoft has snagged it.
On Friday, Microsoft and the British Library announced a long-term strategic partnership to digitize 25 million pages from the library’s collections in 2006.
The partnership will add approximately 100,000 books to the new MSN Book Search service, announced last month, with an initial public beta expected next year.
Microsoft and the British Library said they would only scan works in the public domain. And the partnership covers not only scanning and indexing books, but also working to create the software infrastructure for the National Digital Library, an initiative announced in June.
Microsoft will help build a Digital Object Management system that will allow the library to keep track of and provide access to a wide range of non-book content, such as electronic journals, e-books and CD-ROMs in the collection.
In a statement, Lynne Brindley, chief executive of the British Library, said, “We are redefining the library in our development of the National Digital Library and are delighted to be working with Microsoft on a key part of this project. Our aim is to provide perpetual access to the intellectual output of the nation, which is increasingly digital.”
Old-fashioned books are the hot new thing in the search race, with book-scanning efforts announced this week by Amazon.com and Random House.
Yahoo also forged a partnership with the Open Content Alliance, a consortium of libraries, publishers and technology companies working with the Internet Archive to create an online trove of book content that anyone can access.
Google began it all in December 2004, with the launch of Google Print and Google Library. Those efforts turned bleeding-edge when publishers protested Google’s scanning of library books without their permission.
Google maintains that its activities are fair use, because it only shows users snippets from books that it did not get publishers’ permission to scan. The Authors Guild and the Association of American Publishers sued Google for copyright infringement.