DKP Introduction: I noted yesterday that the National Endowment for the Humanities recently awarded Jon Frey, Assistant Professor of Art History & Visual Culture at Michigan State University, a major grant for the digital implementation of an open-source application known as the Archaeological Resource Cataloging System (ARCS). I asked Jon more about what his teams have been doing at Isthmia and what they hope to accomplish with the grant. He kindly agreed to provide the following overview of the work of Michigan State University and Ohio State University in recent years.
First of all, thanks to David for inviting me to post to Corinthian Matters as the forum he has created gives me an opportunity to write more candidly about our efforts to build an online collaborative workspace for the utilization and organization of digitized archaeological documentation. I tend to feel a bit awkward trying to describe this project more formally as if it has always followed a linear research plan with clearly defined goals and expectations. Rather, in the spirit of a weekend DIY project—and I think ARCS fits into that category in many respects—I’ve been learning as I go, largely through trial and error, but also through the helpful advice of far more experienced neighbors in what I have found to be a very welcoming and encouraging digital archaeological community. This is very much a good thing, as my own feelings about this project oscillate at unpredictable intervals between the fear that ARCS is nothing new (“good for you, you built a VRE!”) and the hope that this project will enable many smaller archaeological projects to share their evidence in a way that respects both their limited resources and the unique ways in which they have organized their recording systems.
History of the Project
The project as a whole began over five years ago with the digitization of notebooks at the Ohio State University Excavations at Isthmia. Yet far from following a clearly defined, institutional plan, this project served a much less lofty, personal goal. More than anything else, I was tired of returning to America at the end of the summer only to discover that I had failed to record a key piece of information and would have to wait until the following season to continue my research. By keeping all of these notebooks on a hard drive, I could eliminate this problem. At some point though, it became apparent that by relying on digital copies of these documents, I had effectively removed them from the information network in which they had been designed to function. This is because the document archive at Isthmia—as at most excavations and surveys—is essentially an analog form of a relational database. Depending on their research question, individuals may consult field diaries, photographs, maps, drawings, descriptions of individual artifacts, or informal reports, all of which, ideally, reference one another according to a pre-determined system.
Figure. Working at the Isthmia archives
Such systems have been refined over decades and have become quite effective at aiding in the retrieval of information, but are not without their inefficiencies and idiosyncrasies. As the work of individuals who are at different levels of experience—frequently the case at projects that also serve as field schools—certain documents may be incomplete or contain errors. Moreover, as artifacts themselves, archaeological records may deteriorate, be misplaced or become lost altogether. Thus, as most archaeologists know, gathering primary information is typically an immersive experience that requires as much time-consuming physical activity as mental. Moreover, most are also familiar with the fact that such archival work rarely reaches a successful conclusion without the helpful intervention of another, more experienced individual who is familiar with all of the peculiarities of a project’s documentation system.
Bearing all this in mind, I soon became interested in exploring how one might build a digital version of an archaeological archive that improves upon this system rather than replaces it altogether. A brief survey of other digital archaeology projects and services revealed a number of ongoing efforts to address related issues, but such initiatives appeared to be more concerned with the standardization and secure storage of archival quality digital data than with the utilization of that data in a virtual research environment. In addition, the use of such services was significantly easier for projects that had been “born digital” or possessed the financial resources to employ full time archivists or independent companies to digitize their entire archive at once.
As a result, with colleagues at the MSU College of Arts and Letters Academic Technology Office I began to develop an open source solution that would allow an archaeological project to create a digital workspace where documents could be collected, curated and shared according to an organizational scheme defined by the individual project. With the assistance of an NEH Digital Humanities Startup Grant in 2011, we created the Archaeological Resource Cataloging System (ARCS), which can be accessed at the present moment at http://arcs.cal.msu.edu.
The goals outlined in the NEH proposal seemed modest at the time, but in hindsight, were too ambitious. We offered to build a program that would:
- Interface with Digital Asset Management systems like ResourceSpace and Omeka
- Work on PC and mobile devices
- Be easily modified to suit different archaeological projects
- Allow a variety of file types and data types
- Augment but not replace digitized documents through the use of keyword tags and links to stable URIs
- Be open-source and free to use
As the project began, we soon learned that we could not reasonably achieve the first two objectives within the grant period. Thus we resorted to the creation of our own database and optimized the site to work best on PC devices running Google Chrome. In addition, the complexities involved in building a version of ARCS to be tested using data from Isthmia made it difficult to maintain a separate code base that was not specific to any one project. There were also a number of issues that we discovered we needed to address before ARCS could become a useful system. To begin with, there was the question of who exactly would be carrying out the work of uploading and curating the information. Then there was the question of what metadata standard and terminology we would use in order to make the documents presented through ARCS easily searchable and relatable to other resources.
In order to address the labor issue, we adopted a “crowd-sourcing” approach, but this presented its own challenges. A great deal of time was devoted to devising and implementing the type of user access and control measures that are typical of all digital projects that have resorted to volunteer workers to achieve their goals. The metadata issue was less easily solved. While Dublin Core appeared to be the best solution, we soon discovered that this schema did not apply to archaeological documentation as well as we would have hoped. Quite often the 15 core elements had to be translated into descriptive categories at Isthmia that merely seemed the best fit. Other aspects of archaeological documentation were left completely unaddressed. The end result was the creation of a metadata schema for Isthmia that was more complex and idiosyncratic than the system already in use at the excavation. Finally, the development of a list of approved terminology and formats for these metadata fields has proven to be a challenge in and of itself.
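The mismatch between Dublin Core and an excavation's own categories can be made concrete with a small crosswalk. What follows is only a sketch under stated assumptions: the fifteen element names are those of the Dublin Core element set, but the local field names, the mapping choices, and the sample record are invented for illustration and are not the actual Isthmia or ARCS schema.

```python
# The 15 elements of the Dublin Core Metadata Element Set.
DUBLIN_CORE = (
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
)

# Hypothetical crosswalk from local recording fields to Dublin Core.
# These field names are assumptions made for this example only.
CROSSWALK = {
    "notebook_number": "identifier",
    "excavator": "creator",
    "trench": "coverage",
    "season": "date",
    "page_summary": "description",
}


def to_dublin_core(record):
    """Split a local record into mapped Dublin Core values and leftovers."""
    mapped, unmapped = {}, {}
    for field_name, value in record.items():
        element = CROSSWALK.get(field_name)
        if element is None:
            # No Dublin Core element is a reasonable fit for this field.
            unmapped[field_name] = value
        else:
            # Several local fields may share one element, so keep a list.
            mapped.setdefault(element, []).append(value)
    return mapped, unmapped


# Invented sample record, not real excavation data.
record = {
    "notebook_number": "NB-001",
    "excavator": "A. Example",
    "trench": "Trench 1",
    "stratigraphic_unit": "Unit 5",
}
mapped, unmapped = to_dublin_core(record)
```

In this toy case `unmapped` ends up holding the stratigraphic field, which is the kind of archaeological detail that a "best fit" mapping leaves unaddressed.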
These issues aside, the beta version of ARCS should still be seen as a successful demonstration of the advantages of presenting primary archaeological documentation as digitally augmented evidence. This is seen most clearly in the case of the field notebooks with which this digitization project began. On the one hand, a simple digital image of a notebook page cannot be easily parsed by a computer and thus made machine searchable.
A 1970 notebook from the Isthmia Archives
On the other hand, electronic transcriptions (even when carried out in accordance with TEI standards) do not fully capture the dynamic and organic character of these documents with their photographs, drawings, and handwritten notes, often made by several different individuals over time. Yet, when a notebook page is presented as an image, supplemented by user-generated keywords and hyperlinks to other digital resources, the result is the best of both worlds.
Notebook as it appears in ARCS
The main governing principle throughout the development process has been to electronically update, but not replace the traditional operating procedures common to most archaeological archives. Thus the front page offers the user the opportunity to consult evidence by type (notebooks, maps and plans, cataloged artifacts, reports, etc.) just as these documents are physically arranged at an archive or library.
Front page of ARCS
While users may search for a specific reference at any time, the “resource view” interface also allows for a visual scan of the evidence, just as one might fan through the pages of a book or a series of index cards or drawings.
When a user has identified the information they seek, hyperlinks offer them the chance to follow digitally the cross references that already exist in the original documents. Moreover, just as one might gather together several different types of documents as part of their research, ARCS allows users to create digital collections to which they can return at any time.
All documents and collections have stable URIs so this information can be shared between users as well. Also, because work at an archive often involves conversation with colleagues and consultation with experts, each document on ARCS has an associated discussion forum, where users can ask questions or provide answers.
Finally, because excavations and surveys, even those that are not currently engaged in fieldwork, continue to grow and generate evidence in both traditional and digital formats, ARCS is equipped with a simple drag and drop upload feature. While they are encouraged to provide as much information as possible about the resource they are creating, at the very least users must define a title and type for the resource. In this way, large batches of information can be uploaded at once and left on the system to be cataloged, tagged, and linked to other data later.
Upload page in ARCS
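The "title and type now, everything else later" rule for uploads can be sketched in a few lines. This is an illustration only, not the ARCS source code: the `Resource` class, its field names, and the `is_cataloged` check are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Resource:
    """Hypothetical upload record: only title and type are required."""
    title: str
    resource_type: str
    keywords: list = field(default_factory=list)
    links: list = field(default_factory=list)

    def __post_init__(self):
        # Reject uploads that lack the two mandatory values.
        if not self.title.strip():
            raise ValueError("a resource needs a title")
        if not self.resource_type.strip():
            raise ValueError("a resource needs a type")

    @property
    def is_cataloged(self):
        # Treat a resource as cataloged once it has any tag or link.
        return bool(self.keywords or self.links)


# Upload a batch with minimal information, then tag one item later.
batch = [Resource(f"Notebook page {n}", "notebook") for n in range(1, 4)]
batch[0].keywords.append("example keyword")

# Everything still waiting to be cataloged, tagged, or linked.
pending = [r.title for r in batch if not r.is_cataloged]
```

Keeping the required fields to a minimum is what lets a large batch go onto the system in one pass, with the descriptive work deferred to whoever catalogs it later.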
The version of ARCS currently in use at Isthmia continues to grow. At present the system contains nearly 7,300 unique resources, ranging from digital copies of all notebooks, to notecards representing all inventoried artifacts, to a representative sample of drawings, plans, and type-written reports. Other documents are added each season as they are scanned and processed. As a matter of conservation and preservation alone, this is an important step for the OSU Isthmia Excavations. At the same time though, any of these resources can now be organized into collections and shared with interested researchers in a matter of minutes. Thus requests for information from the Isthmia archives are now beginning to be met by means of an email containing a link to the relevant digital resource. But most significantly, the ARCS system has allowed a smaller project like Isthmia to “go digital” on its own terms (literally and figuratively) and budget without relying on its better-funded peer institutions to share their source code and resources.
In addition, the ARCS project has also produced an unexpected, but no less important, outcome. As a teaching tool, this online resource has been used not only as a way to provide undergraduate students with unprecedented access to primary archaeological documentation but also as a way to encourage them to contribute in a meaningful way to its creation. For the past three years, students enrolled in Prof. Timothy Gregory’s online classical archaeology courses at OSU have been presented with the full body of documentation associated with the excavation of a number of individual trenches at Isthmia, which they then use to generate archaeological reports of their own. For the past five years, students participating in my own study abroad program and courses at MSU have taken a lead role in scanning, processing, uploading and annotating the documents themselves. The process is not always perfect—asking undergraduate students in Greece to perform up to the standards of a professional archivist is at times a real challenge—but in the end, the results are generally reliable. In any case, such activities challenge students not only to make sense of several, potentially conflicting forms of evidence, but also to see the practices and assumptions that underlie the interpretations of the past that are often taken for granted. This is exactly the type of “doing history” that is now held to form the foundation of effective teaching strategies in undergraduate education (see, for example, the discussion in T. Mills Kelly’s recent book on Teaching History in the Digital Age).
Future Directions
While the source code is now freely available on GitHub, there is still much to be done before ARCS can be easily implemented at a wider range of archaeological projects. This is why I am excited that, in collaboration with Ethan Watrall at the MATRIX Center for Digital Humanities and Social Sciences and with the funding of an NEH Digital Implementation Grant, we are now able to continue with this project. Some of the more significant improvements that we have proposed are as follows:
- Because the creation of the underlying ARCS database had represented a stop-gap measure when integration with other data management systems proved too difficult, we plan to implement the KORA Digital Repository and Publishing Platform. This will improve the speed and efficiency of keyword searches as well as the overall organization of the data that is studied through ARCS.
- Inasmuch as it became clear in the early stages of development that ARCS could not (and probably should not) serve as an archival solution, we will be developing an export utility that will properly format the data created and augmented within this system according to the standards required for data storage with services such as the Archaeology Data Service (ADS) and the Digital Archaeological Record (tDAR). This export utility will also allow for the transfer of data generated in ARCS to other software applications such as Microsoft Access and ArcGIS for higher order statistical and geospatial analysis. In addition, because many projects—especially those that have transitioned from traditional analog to digital recording practices—have already created their own databases or other forms of machine-readable information, we will develop an import utility so that this evidence can be organized, augmented and shared through ARCS.
- Because the import and export of different types of data will require a standard format for ease in identification, we will adopt the use of the ArchaeoCore metadata standard, developed at the Fiske Kimball Fine Arts Library at the University of Virginia specifically for use in archaeological contexts. We expect that, in keeping with the work of the Linked Ancient World Data Institute, the use of ArchaeoCore will allow data to be shared between archaeological projects without requiring each individual project to redesign its recording system to fit a universal standard.
- Having implemented these changes in the version of ARCS already in use at Isthmia, we will begin to collaborate with William Caraher and Amy Papalexandrou at the Princeton Polis Expedition Medieval Monuments Project, Adam Rabinowitz at the Preserve of Tauric Chersonesos Excavations at Chersonesos, and Kim Shelton at the UC Berkeley Excavations at Nemea in order to test the ability of the ARCS system to adapt to different recording systems for archaeological data. This will involve the creation of an installation wizard that can be used to customize ARCS to suit a particular project’s unique recording system as well as an ontology mapping tool to aid in the sharing of data between projects.
Given my experience in the first phase of this project, it is reasonable to assume that we will encounter some obstacles along the way. Likewise, it would be foolish to think that ARCS will offer a solution to all of the long standing issues associated with the transition to digital techniques for gathering archaeological evidence. For example, we at the OSU Isthmia excavations have maintained some traditional techniques but have adopted certain innovations so that the resulting mix of traditional, handwritten notebooks and artifact catalogues alongside digital images, illustrations and databases requires a concerted effort to coordinate. But at the same time, I think it is reasonable to hope that through the development of ARCS, it may be possible to achieve the elusive goal of sharing archaeological evidence between and among sites in a way that nevertheless respects the unique identity of each project’s system for recording and interpreting its evidence. In this way, it may be possible to follow the lead of survey archaeologists in adopting a regional view of the ancient world, but with a degree of detail that is typically the strength of an excavation.