SULAIR Has Robot for Digitizing Books
by Eleanor Brown, David Noll, and Stuart Snydman
The Stanford University Libraries & Academic Information Resources (SULAIR) operates a robotic page-turning and scanning device for the mass digitization of bound print materials. The device is the first of its kind in the world. Called the Digitizing Line (DL), this book scanning device is the centerpiece of SULAIR's broad array of on-campus digitization capabilities. The ultimate output of the DL is a searchable e-book in PDF Image and Text format, which will be made available to the Stanford community to support teaching and research.
How Does It Work?
The Digitizing Line, made by 4DigitalBooks, is an automated (robotic) book scanner that produces high-quality digital images of bound materials at throughput rates as high as 1160 pages per hour, about five times the rate of scanning books manually. The machine can turn the pages of both small and large books as well as bound newspaper volumes. Because it gently turns and flattens individual pages, the DL allows books to be scanned without removing the binding. Equipped with an I2S digital camera, the device can produce preservation-quality black-and-white, grayscale, and color TIFF images at up to 600 dpi.
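As a rough illustration of what these throughput figures imply, the short Python sketch below derives the manual scanning rate from the two numbers quoted above and estimates how long a batch of pages would take. The function names are hypothetical, and the calculation assumes the machine sustains its peak rate, which the "as high as" wording does not guarantee; the 25,000-page batch is borrowed from the annual target of the Medieval and Modern Thought project described later in this article.

```python
# Figures quoted in the article; everything else here is an illustrative assumption.
PEAK_PAGES_PER_HOUR = 1160   # peak throughput quoted for the Digitizing Line
SPEEDUP_OVER_MANUAL = 5      # "five times the rate of scanning books manually"


def manual_pages_per_hour(peak=PEAK_PAGES_PER_HOUR, speedup=SPEEDUP_OVER_MANUAL):
    """Manual scanning rate implied by the quoted peak rate and speedup."""
    return peak / speedup


def hours_to_scan(pages, pages_per_hour):
    """Hours needed to scan `pages` at a constant rate (assumes no downtime)."""
    if pages_per_hour <= 0:
        raise ValueError("pages_per_hour must be positive")
    return pages / pages_per_hour


if __name__ == "__main__":
    batch = 25_000  # pages; one year's target for one project in this article
    robot_hours = hours_to_scan(batch, PEAK_PAGES_PER_HOUR)
    manual_hours = hours_to_scan(batch, manual_pages_per_hour())
    print(f"Implied manual rate: {manual_pages_per_hour():.0f} pages/hour")
    print(f"Robot at peak rate:  {robot_hours:.1f} hours")
    print(f"Manual scanning:     {manual_hours:.1f} hours")
```

Real projects would take longer than the peak-rate estimate, since the figures above do not account for material preparation, data entry, or conversion to e-book format.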
See SULAIR's Robotic Book Scanning web site for a complete description of the digitization process, from ensuring that the material can be scanned (e.g., it's not too fragile or extraordinarily valuable), to data entry and scanning and, finally, converting the scanned material to an e-book.
Projects Using the Digitizing Line
Since the Digitizing Line began operation in March 2003, a variety of projects have used the robotic book scanner, as indicated in the following descriptions.
Stanford University Libraries' Books in the Public Domain: This project is an ongoing initiative to digitize high-circulating books held by the Stanford University Libraries that are in the public domain. The goal is to provide Stanford readers with a digital option for public domain materials that have been frequently used in the past.
SULAIR staff analyzed Stanford circulation records for books published before 1923, and thus in the public domain, to identify older titles with the highest circulation. For these older "top circulators" they assessed the feasibility of quick digitization. Material found to be fragile, or part of a large set that could not be accommodated in an initial pilot program, was not digitized at this time. Since the major criterion is Stanford readers' needs as expressed directly by past circulation records, SULAIR has not made any attempt in this pilot project to assess whether other editions might be preferred for other reasons. The first sample of titles digitized as part of this effort can be found at Stanford's ebrary site.
Medieval and Modern Thought Text Digitization Project: The goal of this ongoing project is to digitize more than 25,000 pages per year of printed reference works, source collections, and primary and secondary books in the broad area of medieval and modern thought. Current local research needs determine the material selected for digitization. A facsimile of each work and its searchable text will be created, cataloged in Socrates, and delivered over the web.
Visit the project's web site for more information. You can view the e-books digitized for this project at http://site.ebrary.com/lib/medievalandmodern/.
CSLI Linguistics and Philosophy: The DLP partnered with the Center for the Study of Language and Information (CSLI) to conduct the first live production digitization project on the robotic book scanner. The result of this effort was the digitization of approximately fifty titles published by CSLI, which are now available via Stanford's ebrary site.
CSLI Publications reports new developments in the study of language, information, logic, and computation. They publish books, lecture notes, monographs, technical reports, working papers, and conference proceedings.
Atomic Energy Commission: This project involves the digitization of all congressional hearings and committee prints of the Joint Committee on Atomic Energy. It is a pilot project to test the ability of the robotic book scanner to digitize bound government documents.
The Joint Committee on Atomic Energy existed from 1946 to 1977. The committee was created to "make continuing studies of the activities of the Atomic Energy Commission (AEC) and of problems relating to the development, use, and control of atomic energy." Through hearings and other public informational activities, the committee played a significant role in encouraging peacetime uses of atomic energy.
More Information
Please visit the Robotic Book Scanning web site, where you can learn more about this unique technology.
Note that the Robotic Book Scanning Lab cannot currently accommodate individual requests for digitization services. Please contact the library subject specialist for your disciplinary area if you have ideas for a digitization project that would utilize this lab.

