October 24-26, 2007
Toronto, Ontario, Canada

Paper: The National Digital Newspaper Program: Increasing Access to Historical Newspapers through Digital Reformatting

Barbara Taranto, The New York Public Library, United States

Abstract

Six regional libraries were awarded two-year grants to help develop the initial test bed for the newspaper digital repository. This briefing will discuss some of the access and cultural lessons learned during this first award period, and some of the reasons The New York Public Library is continuing to work whole-heartedly in this area of developing interest.

Keywords: newspapers, historical, microfilm, national, preservation standards

Introduction

Since it was founded in 1895, The New York Public Library (NYPL) has been committed to building and preserving its distinctive collections and providing free and open access to these resources for the benefit of all who come in search of them. Built on a strong belief in the value of independence of thought, freedom of information, and free trade in ideas, the Library fosters an environment that enables each individual to pursue learning at his or her own personal level of interest, ability, and desire.

The collections of newspapers throughout The New York Public Library are among its most valuable and heavily used holdings. Large and frequently unique back-files constitute a rich mine of data and opinion for the historian as well as the general researcher. Many of the Library’s newspaper collections have been microfilmed for posterity. However, film may only be accessed onsite with special equipment, and is often regarded as difficult to use. While it is the currently accepted preservation standard, microfilm presents a number of limitations for researchers. Digitization addresses several of these limitations, including ease of searching, readability, and transferability. It is the first step to providing remote, worldwide access to invaluable content. Scanning of text materials, like newspapers, allows the information to be comprehensively and efficiently searched by keyword, a function that may not exist with all types of microfilm. In addition, the quality of digital materials is unaffected by the amount of user access.

As standards emerge for the long-term preservation of digital formats, it is becoming clear that transferring content from microfilm to the digital medium will provide numerous benefits to current and future researchers.

The National Digital Newspaper Program

In concert with the National Endowment for the Humanities and the Library of Congress, The New York Public Library is creating a very large-scale digital archive of historical newspapers. The Library is contributing not only its extensive holdings but also its technical expertise in preservation and the creation of digital archives.

New York’s newspapers have borne witness to a rich tapestry of historical events. Highlights of important political, social, and cultural news that occurred within the 1836-1922 time frame of the National Digital Newspaper Program are now available in digital format, with a particular emphasis on the decade of 1900-1910.

The New York Public Library’s holdings of State newspapers, in print and on microform, are extensive, with many titles that were collected solely by the Library or by only a few other libraries. While newspaper collections are dispersed throughout the Library’s research divisions, a significant portion is located in the Humanities and Social Sciences Library (HSSL), one of the four Research Libraries of The New York Public Library system. The Humanities and Social Sciences Library’s newspaper collection includes more than 350,000 reels of microfilm, tens of thousands of bound volumes, and thousands of current titles from all over the world.

The Humanities and Social Sciences Library maintains comprehensive collections of most general New York City newspapers in English and other languages. Notable collections include, among others, New York Commercial, New York Post, New York Times, and El Diario/La Prensa. Among local and neighborhood newspapers, HSSL holds complete microfilm collections of the Villager (Manhattan) and The Village Voice (Greenwich Village), while the Amsterdam News (Harlem) is accessible at the Schomburg Center for Research in Black Culture. In addition, HSSL provides access to representative U.S. newspapers in many languages from major metropolitan areas, to international newspapers with at least one title from every country when possible, and to other selected newspapers. A large number of newspapers have been preserved on microfilm by the Library’s own in-house Goldsmith Preservation Laboratory, which can be considered the premier in-house lab in the United States dealing exclusively with preservation microfilm.

Access to New York State newspapers has been significantly enhanced by the NEH-sponsored U.S. Newspaper Project (USNP), which, among other objectives, preserves historical newspapers on microfilm. Between 1988 and 1993, The New York Public Library preserved the content of nearly 500 U.S. newspaper titles with NEH funding. The titles were selected from the holdings of the NYPL Newspaper Collection, the Jewish, Asian and Middle Eastern, Slavic and Baltic language divisions, the Rare Books and Manuscripts Division, the Schomburg Center for Research in Black Culture, and the General Research Division. In addition to the microfilming preservation work, the Library also replaced the severely deteriorated holdings of more than 400 newspaper titles with microfilm produced elsewhere. The Library also cataloged approximately 7,000 titles, creating 10,000 local data records in the process; an estimated 20 to 30 percent of these titles were unique titles held only at the Library. At the conclusion of this extensive effort, the Library saw a dramatic increase in the use of its newspaper collection. The Library has also contributed to the New York Newspaper Project administered by the New York State Library, which will be completed in 2006. The New York Newspaper Project is inventorying, cataloging, and preserving New York newspaper collections throughout the state. Each title is cataloged to CONSER standards, descriptive information is added to the OCLC Union Catalog, and the holdings are entered into the USNP Union List. The project is also filming newspapers that meet criteria for research value, physical condition, and length and completeness of holdings.

In 2005, The New York Public Library was one of only five institutions selected to partner with the National Endowment for the Humanities and the Library of Congress to digitize newspapers from microfilm to create a national on-line archive of historical newspapers. Since its project began, the Library has digitized nearly 100,000 pages toward its goal of 100,253 pages from the 1900-1910 issues of The Sun and the Evening World. Each newspaper page is captured in several image and text formats, including text produced by Optical Character Recognition (OCR). The OCR text and metadata created in the project will enable searches of the text and page images.
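As a rough illustration of how OCR text can make page images searchable by keyword, the following minimal sketch builds a simple inverted index that maps each word to the pages on which it appears. It is not the method used by the project or by the Library of Congress. The function names, the page identifiers, and the sample text are all hypothetical and were invented for this example.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the OCR text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(pages):
    """Map each word to the set of page identifiers whose OCR text contains it."""
    index = defaultdict(set)
    for page_id, ocr_text in pages.items():
        for word in tokenize(ocr_text):
            index[word].add(page_id)
    return index


def search(index, query):
    """Return the sorted page identifiers that contain every word in the query."""
    words = tokenize(query)
    if not words:
        return []
    hits = set(index.get(words[0], set()))
    for word in words[1:]:
        hits &= index.get(word, set())
    return sorted(hits)


# Hypothetical page identifiers and invented OCR text, for illustration only.
pages = {
    "sun-p1": "Harbor ferry service resumes after storm",
    "evening-world-p1": "Storm delays ferry crossings in the harbor",
    "sun-p3": "New subway extension opens to the public",
}
index = build_index(pages)
print(search(index, "ferry storm"))
```

A query is matched only against whole words, so OCR misreadings in the source text would cause missed results; a production service would need to account for that.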

In March 2007 The Library of Congress published its newspaper portal Chronicling America (http://www.loc.gov/chroniclingamerica/). The service now provides access to newspapers from 7 institutions including the New York Public Library. As an outgrowth of the USNP, the National Digital Newspaper Program (NDNP) has built upon its union catalog of CONSER records to also supply a searchable database to expose microfilm newspaper holdings across the 50 United States.

There have been many lessons learned in the first two years of the NDNP pilot project. Chief among these lessons is that researchers and educators are tremendously interested in and excited about the digitization of, and searchable access to, historical newspaper content. Second, and perhaps no less interesting from a management point of view, is that, based on partners’ testimony, the transfer and transmission of large digital datasets is not an insignificant task, and that any considered effort to digitize and serve significant digital content on the web has serious organizational and operational implications. Third, and perhaps most surprising, is the hunger of the general public for rich, deep and significant content that can be delivered at the point of need and repurposed for a whole host of cultural, commercial and artistic activities.

Acknowledgements

The Library wishes to thank The National Endowment for the Humanities and the Library of Congress for their leadership on the National Digital Newspaper Program.

Cite as:

Taranto, B., The National Digital Newspaper Program: Increasing Access to Historical Newspapers through Digital Reformatting, in International Cultural Heritage Informatics Meeting (ICHIM07): Proceedings, J. Trant and D. Bearman (eds). Toronto: Archives & Museum Informatics. 2007. Published October 24, 2007 at http://www.archimuse.com/ichim07/papers/taranto/taranto.html