The digital environment is a very dynamic one. Organizations that serve today's scholars must continuously update their approaches to take advantage of new technological capabilities or, in some cases, simply to keep pace. This is particularly important at an institution like JSTOR, which has the responsibility for ensuring that researchers have reliable access to scholarship not only today, but over the very long term.
Over the past ten years, JSTOR has assembled a tremendous body of literature in the archive: more than 17 million pages of content. During this time, our specifications for the conversion and display of material have evolved. While the material JSTOR digitized in its earliest days remains accessible and useful to scholars, we recognize the need to revisit this "older" data and to bring it up to current specifications. Given the amount of material archived by JSTOR today, this is no small task. It is also an ongoing one. Even as we turn our attention to bringing material digitized in 1998 up to 2005 standards, we know that this process will be required again in several years' time, and continuously thereafter.
To take on this challenge, JSTOR is developing a team of staff dedicated to planning, implementing, and overseeing what we call retrospective or "retro" projects. JSTOR has successfully undertaken and completed a small number of retro projects in the past, and we currently have a number of projects underway. Having this team in place will enable us to pursue these efforts more quickly, while ensuring that our ongoing work of digitizing content for new collections and the annual moving wall flip continues apace.
This team, anchored by several experienced members from our Production unit, is being staffed up. One of its first charges will be to reprocess all of the titles digitized prior to 2002 to bring them up to our current specifications. This will include the introduction of image compositing and Unicode encoding and transliteration, among other features. Image compositing, already available for titles released in the past two years, allows illustration images to be "pasted" onto the full black-and-white page image, thereby creating a more faithful replication of the original page from the paper edition. Similarly, Unicode encoding and transliteration (usually done in accord with Library of Congress transliteration specifications) enable users to view author, title, and abstract information that contains non-ASCII characters in its original form (for example, in Arabic script), and to search for these same characters using a standard keyboard. More information about the use of Unicode in JSTOR can be found in the March 2003 issue of JSTORNEWS.
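To illustrate how text stored in its original form can still be found from a standard keyboard, here is a minimal Python sketch. It is not JSTOR's implementation and it does not perform Library of Congress transliteration; it uses a simpler stand-in technique, diacritic folding, which only handles accented Latin characters. The function names `fold_to_ascii` and `matches` are hypothetical.

```python
import unicodedata


def fold_to_ascii(text: str) -> str:
    """Build a search key: decompose characters, drop accents, ignore case.

    Assumption: this is plain diacritic folding, a simplified stand-in
    for transliteration. It does not romanize non-Latin scripts.
    """
    # NFKD splits an accented letter into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFKD", text)
    # Discard the combining marks and keep the base letters.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold()


def matches(query: str, stored_title: str) -> bool:
    """Return True if the folded query occurs in the folded title."""
    return fold_to_ascii(query) in fold_to_ascii(stored_title)


if __name__ == "__main__":
    title = "Études françaises"           # stored and displayed in original form
    print(matches("etudes francaises", title))
```

The stored title is never altered; only the search key is folded, so the reader still sees the original characters on screen.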
We have already started to prepare for this massive undertaking by working on a pilot group of titles. These include Science, as well as the 13 literature journals available in the Arts & Sciences I Collection. In the pilot, we are learning about the challenges inherent in the process, building a new workflow, and developing the tools necessary to replace the pre-existing data in our systems with the re-worked material without interruption to users. While we do not yet have a firm release date for this initial set of content, we do expect it will be available before the end of 2005.
A second leading "retro" initiative is also getting underway. In 2005, we will begin to implement reference linking from articles within the JSTOR archive. This means that a user reading an article in JSTOR will be able to follow linked references to other articles available within JSTOR or, in some cases, to other online resources. This will apply both to the 17 million pages of material already digitized by JSTOR and to new material digitized in the future.
This project is complex and will take several years to complete. One of the first and most difficult steps is to locate each reference on the page images and then to isolate or "parse" its constituent parts (e.g., journal, article title, and page number). This data will be used to match and link the citation to the article being referenced. Initially, we expect to provide users with links to cited journal articles within the JSTOR archive. In a later phase of the project, we will link to other online resources. We also plan to capture information about other forms of content being cited (monographs, audio files, etc.), with the expectation that more types of historical material will become available electronically and can then be linked in the future.
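The parse-then-match idea can be sketched in a few lines of Python. This is only an illustration under strong assumptions, not JSTOR's workflow: it starts from reference text that has already been extracted from the page image, it assumes one tidy citation style (author, quoted title, journal, volume, year, pages), and the pattern `CITATION`, the functions `parse_citation` and `link_citation`, the index layout, and the article identifier are all hypothetical.

```python
import re
from typing import Dict, Optional, Tuple

# Assumed citation style: Author, "Title," Journal 12 (1998): 34-56.
CITATION = re.compile(
    r'(?P<authors>[^"]+?),\s*"(?P<title>[^"]+?),?"\s*'
    r'(?P<journal>[^\d]+?)\s+(?P<volume>\d+)\s*'
    r'\((?P<year>\d{4})\):\s*(?P<first_page>\d+)'
)

# Hypothetical index: (journal, volume, first page) -> article identifier.
ArticleIndex = Dict[Tuple[str, str, str], str]


def parse_citation(text: str) -> Optional[Dict[str, str]]:
    """Split one reference string into its constituent parts, or return None."""
    match = CITATION.search(text)
    if match is None:
        return None
    return {name: value.strip() for name, value in match.groupdict().items()}


def link_citation(text: str, index: ArticleIndex) -> Optional[str]:
    """Return the identifier of the cited article if it is in the index."""
    parts = parse_citation(text)
    if parts is None:
        return None
    key = (parts["journal"].casefold(), parts["volume"], parts["first_page"])
    return index.get(key)


if __name__ == "__main__":
    index = {("journal of examples", "12", "34"): "article-0001"}
    reference = 'J. Smith, "On Archives," Journal of Examples 12 (1998): 34-56.'
    print(parse_citation(reference))
    print(link_citation(reference, index))
```

References that do not parse, or that parse but have no entry in the index, simply return `None`. That mirrors the phased approach described above: citations to material that is not yet available can be recorded now and linked later.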
We are excited to move these retro projects forward, as they will greatly enhance the usability of the literature in the JSTOR archive and, in some instances, create an even more faithful replication of the original published works.
Last updated on September 8, 2006
©2000-2007 JSTOR