Preserving, cataloging, finding, and citing games

About game scholarship tools

When I was still a graduate student on the East Coast, I was invited to participate in a workshop at the UC Humanities Research Institute. David Goldberg, UCHRI's director, asked me to imagine a technical innovation that would benefit the kind of humanities work I do. My response was to describe a framework that would make citation of software "states" as easy as current citation of a book's page numbers or a film's time code.

It was not until my time as a professor that I actually had the opportunity to begin to pursue this vision. And by then my views had been significantly expanded, in large part through work with a core team who collaborated with me on two projects, one funded by the NEH and the other by IMLS.

For both I worked closely with EIS PhD student Eric Kaltman, UCSC librarian Christy Caldwell, and Stanford librarian Henry Lowood. There was an expanded UCSC/Stanford team for the GAMECIP project, including Marcia Barrett, Glynn Edwards, Greta de Groat, Peter Chan, and greater involvement from my EIS co-director, Michael Mateas. Together, these projects have done foundational work on the entire "pipeline" necessary for future historical and interpretive work on video games by scholars, game creators, and members of the interested public.

The first stage in this pipeline is the preservation of archival material. For future research, this must include more than simply the executable code. Our NEH-funded work focused on building a model of the different sorts of potentially archivable material produced during the game development process, the potential significance of each, and the challenges each presents for current archival methods. The results were published as a white paper for the NEH and then condensed into a conference paper, which was nominated for a best paper award at Foundations of Digital Games.

The next stage in the pipeline is bringing material into collections, often through cataloging. If you have ever tried to look up a game in a library catalog, you probably know that game cataloging is a mess. Important parts of titles are often removed, necessary technical information (like the game's platform) is often missing, a strange set of genres is sometimes employed, and so on. The problem here is twofold. First, catalogers don't understand what is important and where in the record it should be placed. Second, catalogers have lacked a set of "controlled vocabularies" for key information, especially technical information like platform names and media types. An important focus in the GAMECIP project was to address these deficits. We developed and published a core metadata schema for game cataloging, which was adopted by the major group of catalogers in the area (OLAC) in June 2015. We also developed controlled vocabularies for game platforms and media formats, which led to adoption by the Library of Congress in November 2015. A proposal for changes to the MARC record format, to allow our vocabularies to be used as specified in our cataloging recommendations, was adopted unanimously by that standard's advisory committee (chaired by representatives of the Library of Congress, Library and Archives Canada, British Library, and Deutsche Nationalbibliothek) in January 2016, and then by its steering committee in March of that year.

Once games are in collections, they often remain quite difficult to find. In fact, this "discovery" problem plagues all games more than a few years old. The GAMECIP project also took on this challenge, and added EIS PhD student James Ryan in order to do so. Our first effort was to create a corpus of the nearly 12,000 games that had non-stub entries in Wikipedia at that time. From this we first created a tool called GameNet, which provides a summary of each game and a list of the most-related and least-related games (created by doing latent semantic analysis on the corpus). Next we created a tool called GameSage, which takes a free-text description of a game (perhaps a game that isn't fully remembered, or a game idea being considered for development) and shows GameNet's most- and least-related games for it. We tried these tools out with students in an introductory game design course and an introductory game history course, with positive results. We also made them available online, resulting in press coverage and use by the general public. These were both effective search tools, but not great tools for browsing, so we also explored interactive visualization. A number of our visualizations have been shown as conference demos, but the most impressive is another tool that we have released publicly: GameSpace. This is a reduction of our model of game relatedness down to three dimensions, presented as a galaxy of stars (games), which users can navigate using familiar game controls. We see this as an example of a "playful tool," and have published a technical report and received press coverage (including an article in New Scientist).

The last stages of the pipeline are actual investigation of games and citation of what is found. To explore this, the GAMECIP project added EIS PhD student Joseph C Osborn to the team. Together we developed GISST: the Game and Interactive Software Scholarship Toolkit. This project began with working to define which aspects of games (and software more generally) scholars might most wish to explore and cite in manners ill-supported by current tools. We identified (1) objective reference to game objects, (2) enacted reference to game performances, and (3) internal reference to sub-structures within both. Our software enables the creation, collection, and sharing of game reference data, gameplay video, and emulated system states. Our web-based emulation approach also allows us to embed system states into web-based academic publications, enabling the new type of scholarly citation that I described at that UCHRI meeting years before. The first journal article to embed a version of GISST was published in Digital Humanities Quarterly. A final publication, in Game Studies, argues for our project's approach to determining what to include in game citations, based on the nature of the argument being made (as opposed to then-current guidelines for Game Studies, which were the same for all types of arguments and were driven by an attempt to bracket many technical aspects of games).

With that I believe the GAMECIP project achieved my initial goal: to research, propose, and demonstrate transformations to the entire pipeline of game preservation, cataloging, discovery, investigation, and citation. I'm happy to say that further research by Kaltman, Osborn, and EIS alum Adam Smith is continuing to improve on what we did in EIS.

Noah Wardrip-Fruin