As we move forward with the Google digitization project and purchase additional digital content, Stanford University Libraries and Academic Information Resources (SULAIR) plans to implement new indexing, searching and browsing tools that will allow our users to take full advantage of digital-format materials. Here, we describe and provide examples of many of the search tools and technologies that are currently being investigated or considered by SULAIR to expand the capabilities of Stanford students and faculty in working with digital materials.
Google Book Search
Stanford's participation in Google Book Search (GBS) provides benefits at the outset. Circulation of Stanford's library books increased by close to 50 percent after digitization of the card catalog, even though the catalog gave access to only a small portion of each book's text. We anticipate that the full-text search capability that GBS is beginning to provide for our collection will increase usage even further and dramatically improve users' ability to pinpoint the exact chapter or page they need within a text.
This is true even for books that cannot be viewed on GBS due to copyright restrictions. Users may have to come to the library to physically retrieve a book, but they'll have a much better sense of which books, and which portions of those books, relate to their work. Since Google Book Search allows them to search across multiple libraries, they can also more readily identify materials to request through Interlibrary Services. GBS gives users a familiar, comfortable interface, and also provides added features, including extractions of popular passages, lists of references from Web pages, and lists of other books referencing a work.
But Google Book Search is not without issues. It offers a single search technique, with limited tools for refining searches and an opaque methodology, and its search results have problems with accuracy and precision. However, SULAIR is not relying on Google alone to provide access to these digital materials; we are also acquiring a local copy of the digital documents. We will be able to bring those materials together with other electronic materials we have purchased, licensed, or digitized ourselves, and provide a broad spectrum of search and discovery tools for those materials.
Reading
SULAIR provides a variety of e-book research tools for reading the thousands of electronic books and journals already in our collection.
These readers, as well as the several portable readers on the market, all use proprietary software that limits each book to a single reader. Going forward, we hope to develop or acquire a single reader that will provide access not just to this limited set of materials, but to the full spectrum of digital books in SULAIR.
Browsing
Users often indicate that they find books easier to browse in hard copy than in electronic form. But Stanford's Socrates catalog already offers a shelf browsing feature. If you have the call number of a book of interest, you can use the "Call Number Browse" link to see other books shelved near it. Find it on the upper right of the Advanced Search screen in Socrates.
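For readers curious how shelf browsing can work behind the scenes, here is a minimal sketch in Python. It is not the Socrates implementation. It assumes the catalog holds records already sorted by a normalized call number, and the call numbers, titles, and the `shelf_neighbors` function are all hypothetical.

```python
from bisect import bisect_left

# Hypothetical records: (normalized call number, title), kept in shelf order.
# Real call numbers need careful normalization before they sort correctly;
# these sample keys are already zero-padded for illustration.
SHELF = sorted([
    ("Z0699 .A1", "Title A"),
    ("Z0699 .B3", "Title B"),
    ("Z0699.5 .C2", "Title C"),
    ("Z0701 .D4", "Title D"),
    ("Z0711 .E5", "Title E"),
])

def shelf_neighbors(shelf, call_number, radius=2):
    """Return the records within `radius` shelf positions of `call_number`."""
    keys = [key for key, _ in shelf]
    # Binary search finds where the call number sits (or would sit) on the shelf.
    pos = bisect_left(keys, call_number)
    start = max(0, pos - radius)
    end = min(len(shelf), pos + radius + 1)
    return shelf[start:end]

neighbors = shelf_neighbors(SHELF, "Z0699.5 .C2", radius=1)
```

Because the lookup is a binary search over a sorted list, the same approach works even when the call number you enter is not in the catalog: you simply see what would be shelved around it.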
Web services greatly simplify subject browsing, and allow the use of both standard vocabularies and user-generated tags. Socrates currently allows browsing by LC headings, and we'd like to implement other controlled vocabularies, such as MeSH, as appropriate. Good experiments with user-generated tags can be found in the University of Pennsylvania's PennTags system and DLF Aquifer. In addition, graphical navigation tools offer new options for browsing. Here at Stanford, we have already implemented this feature at HighWire Press, via a topic map. The topic map is available from any individual search result on HighWire. Another example, currently in use outside of Stanford, is AquaBrowser, which links related terms.
Indexing
Taxonomic indexing, which we have implemented at HighWire Press, is a tool that extracts indexing terms directly from the texts, and then assesses relationships among the terms. It is this tool that allows the HighWire topic map to function. Because it is developed from the texts themselves, it is less structured than traditional controlled vocabulary terms. However, the two are not mutually exclusive, and each offers advantages.
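To make the idea of extracting terms and assessing their relationships more concrete, here is a rough sketch in Python. It is not the method used at HighWire Press. It assumes a very simple stand-in technique: treat each distinct non-trivial word as a term, and call two terms related when they appear in the same document. The stopword list, sample documents, and function names are all hypothetical.

```python
import re
from collections import Counter
from itertools import combinations

# A tiny illustrative stopword list; a real system would use a much larger one.
STOPWORDS = {"the", "of", "and", "in", "a", "to", "is", "for", "by", "on", "with"}

def extract_terms(text):
    """Return the distinct words of three or more letters that are not stopwords."""
    words = re.findall(r"[a-z]+", text.lower())
    return {w for w in words if w not in STOPWORDS and len(w) > 2}

def related_terms(documents):
    """Count, for each pair of terms, how many documents contain both."""
    pair_counts = Counter()
    for text in documents:
        terms = sorted(extract_terms(text))
        # Each unordered pair of terms in a document counts once.
        pair_counts.update(combinations(terms, 2))
    return pair_counts

DOCS = [
    "Insulin resistance in diabetes",
    "Diabetes and insulin therapy",
    "Therapy for hypertension",
]
pairs = related_terms(DOCS)
```

In this sample, "diabetes" and "insulin" appear together in two documents, so they would be linked more strongly than "hypertension" and "therapy", which share only one. Pair counts of this kind are one possible raw material for a topic map.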
Searching
Socrates-style searching is a given in our libraries, but we know we can do more. We know that our users find it frustrating to learn the multiple search interfaces of the more than 800 databases we currently make available, and we're actively investigating federated search technologies, which will allow users to search multiple databases with a single interface. Try the beta version of the tool that searches the 10 most popular databases in our collection. See also "Find High Quality Information More Quickly: Use Federated Search Prototypes Developed for Stanford" in this issue.
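The core idea of federated search, one query sent to many databases with the results merged into a single list, can be sketched in a few lines of Python. This is an illustration only, not the prototype mentioned above. The two "databases" are hypothetical in-memory lists, and `make_source` and `federated_search` are names invented for the example.

```python
from concurrent.futures import ThreadPoolExecutor

def make_source(name, records):
    """Build a hypothetical search function over an in-memory list of titles."""
    def search(query):
        q = query.lower()
        return [{"source": name, "title": t} for t in records if q in t.lower()]
    return search

# Stand-ins for separate databases, each of which would normally have its own interface.
SOURCES = [
    make_source("Database A", ["Protein folding review", "Climate models"]),
    make_source("Database B", ["Protein structure atlas", "Medieval trade routes"]),
]

def federated_search(query, sources=SOURCES):
    """Send one query to every source in parallel and merge the hits by title."""
    with ThreadPoolExecutor(max_workers=len(sources)) as pool:
        result_lists = list(pool.map(lambda source: source(query), sources))
    merged = [hit for hits in result_lists for hit in hits]
    return sorted(merged, key=lambda hit: hit["title"])

hits = federated_search("protein")
```

Each hit records which source it came from, so a user can see results from every database in one list and still trace each item back to its origin. A production tool would also have to handle slow or unavailable databases, duplicate records, and relevance ranking.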
For searches of the large stores of electronic texts we're developing, we're examining associative search. Associative search looks at the relationships between words in large bodies of text, and can quickly and easily locate documents that are related to one you have in hand. Try this sample of associative search. Note that associative search works best when the query itself contains a large amount of text. Try copying the text of a news article into the search engine.
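One common way to find documents related to a passage of text is to weight each word by how distinctive it is (TF-IDF) and compare documents by cosine similarity. The Python sketch below uses that technique as a stand-in. We are not claiming it is how the associative search engine under evaluation works, and the sample corpus and function names are hypothetical.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Split text into lowercase words."""
    return re.findall(r"[a-z]+", text.lower())

def build_index(corpus):
    """Return (idf, vectors) for a dict of {doc_id: text}."""
    counts = {doc_id: Counter(tokenize(text)) for doc_id, text in corpus.items()}
    n = len(corpus)
    # Document frequency: in how many documents does each term occur?
    df = Counter(term for c in counts.values() for term in c)
    # Smoothed inverse document frequency: rarer terms get larger weights.
    idf = {term: math.log((1 + n) / (1 + d)) + 1 for term, d in df.items()}
    vectors = {doc_id: {t: f * idf[t] for t, f in c.items()}
               for doc_id, c in counts.items()}
    return idf, vectors

def cosine(a, b):
    """Cosine similarity between two sparse term-weight vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm = (math.sqrt(sum(w * w for w in a.values()))
            * math.sqrt(sum(w * w for w in b.values())))
    return dot / norm if norm else 0.0

def find_related(query_text, idf, vectors):
    """Rank indexed documents by similarity to the query text, best first."""
    q = {t: f * idf[t] for t, f in Counter(tokenize(query_text)).items() if t in idf}
    scores = [(cosine(q, vec), doc_id) for doc_id, vec in vectors.items()]
    return [doc_id for score, doc_id in sorted(scores, reverse=True) if score > 0]

CORPUS = {
    "doc1": "library digitization of rare books and manuscripts",
    "doc2": "federated search across library databases",
    "doc3": "protein folding and molecular biology",
}
idf, vectors = build_index(CORPUS)
related = find_related(
    "A news story about digitization of rare manuscripts in a library", idf, vectors
)
```

The longer the query text, the more words there are to match against the collection, which is one reason pasting in a whole article tends to give better results than typing a few keywords.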
Web-based Services
Finally, with Web-based services, we can provide functionality that would be impossible with hard copy books. Tools we're currently examining include:
- Hyperlinking citations to cited references (already in place at HighWire Press)
- Alerting (HighWire Press)
- Recommendations (Amazon)
- Virtual post-it notes
We welcome input on the usefulness (or lack thereof) of each of these items, and also on the features and tools you'd like to see. Please send comments to mcalter@stanford.edu.

