Questioning authority: creation, use, and distribution of linked data in digital humanities

Lindsay Kistler Mattock and Anu Thapa

Libraries and archives have long been authoritative sources supplying corpora of published texts, archival documents, and bibliographic data for scholarly consumption and production. Digital humanists are now entering this space, transforming traditional resources from library stacks and archival collections into data stores and creating projects that serve as both scholarship and scholarly resource. As the exchange between traditional knowledge source and digital humanities continues, issues around metadata, ontologies, database architecture, and linked data are entering the purview of humanities scholars and library and information science professionals alike. This chapter examines Mapping the Independent Media Community (MIMC), a data-driven project that employs digital humanities methodologies to interrogate the art world surrounding independent film and video production and exhibition in the latter decades of the twentieth century. As a case study, MIMC illustrates the technical issues and process of building data-dependent projects and will address the ethical issues of ontology development and dissemination of data sets as Linked Open Data. Linked Open Data offer an extensible and open platform for sharing data on the open web. Reflecting the utopian ideas of Tim Berners-Lee, the unbounded nature of this data structure promises opportunities to broadly represent information and knowledge from multiple perspectives. However, in practice, the implementation of Linked Data raises questions regarding standardization, knowledge representation, and the limitations of authority.

Mapping the Independent Media Community (MIMC)

MIMC is an ongoing data-driven project concerned with understanding the production, distribution, and exhibition networks of the independent film and video culture in the United States in the 1970s. The growing availability of 16 mm film and video technology during the late 1960s aided in the production of media by amateur, independent, underground, and other non-commercial media creators. By the mid-1970s, those concerned with supporting independent media production began advocating for a national network of organizations that would broadly support the efforts of artists, giving rise to the Media Arts Center Movement. The history of the independent film and video movement, particularly the role of distributors, museums, media arts centres, funding agencies, and other organizations that supported the production of these works, remains underrepresented in film and media scholarship. Challenging conventional notions that the experimental film and media movement was an endeavour of a few individuals, MIMC applies digital humanities methodologies to this history. The project uses archival data to map and visualize the complicated network of artists, institutions, and resources that enabled and supported the exhibition, production, and distribution of independent media in the United States and abroad.

The MIMC database builds from The Film and Video Maker’s Travel Sheet, a monthly publication produced by the Film Section of Pittsburgh’s Carnegie Museum of Art between 1973 and 1987.1 Imagined as a social networking tool for media artists, each Travel Sheet contained three sections. The first included contact information for artists and a list of tour dates or their interest in booking dates to screen their work (Table 6.1). This mechanism was designed to advertise screenings, secure additional bookings during tours, or to signal to programmers that artists were interested in traveling with their work.

The second section, “Organizations”, served the opposite purpose, and listed the contact information for the organizations across the country that were willing to host film and video makers for lectures, screenings, and other events. The third section listed new film and video work along with distribution information (Table 6.2). In many cases, these works were self-distributed by makers or by the organizations listed in the Travel Sheet. This networking tool served artists, curators, programmers, and anyone with a general interest in independent and avant-garde media.

TABLE 6.1 Examples of entries from the “Tours” section of the May 1977 Film and Video Makers Travel Sheet (full addresses redacted)

Doris Chase
Chelsea Hotel, New York, NY
May 5, 6, London, England
May 10, 11, 12, Oslo, Norway
May 17, 18, Stockholm, Sweden
May 23, 24, Vienna, Austria
May 26, 27, Paris, France
May 31, June 1, Madrid, Spain
June 6, 7, Lisbon, Portugal
Above tour for State Dept. with films and videotapes.

Stan Brakhage
Rollinsville, Colorado
Oct. 20, Carnegie Institute, Pittsburgh, PA.
Oct. 21, 22, Pittsburgh Film-Makers, PA.

Tobe J. Carey
Willow, NY
Thru Oct. Available for lecture/screenings with videotapes and films. Program of Personal Visions includes: The Birthday Variations, Road Kills, Zadie and the Bar Mitzvah, The True Story of Set Your Chickens Free, Giving Birth, and Down in the Dumps.

TABLE 6.2 Examples of entries from the “New Films and Videos” section of the May 1977 Film and Video Makers Travel Sheet (full addresses redacted)

Doris Totten Chase
Dance Ten (1977) 16mm, color, sound, 8 min “A duet of Jonathan Hollander (choreographer/dancer) and a video synthesized image of one of my kinetic sculptures for dance”
D. Chase, Chelsea Hotel, New York, NY

George Griffin
Thumbnail Sketches (1977) 16mm, color, sound, 7 min.
Serious Business Co., 1609 Jaynes St., Berkeley, CA

In its own way, the Travel Sheet functioned as an open-source database, making it an ideal DH project. The entries in the issues were self-generated by those openly contributing their information to the publication. Film section curator Sally Dixon and her successor, Bill Judson, did not vet the entries in the monthly Travel Sheet. Unlike the mechanisms for collecting film and video works within museums and archives, the Travel Sheet was open to anyone with a desire to share their information. The very design of the publication invited this participation, with the cover serving as a detachable form for reporting the details regarding new works, tour dates, or updated address information.

The first issues of the Travel Sheet in late 1973 contain a handful of hand-typed entries on the front and back of a single sheet of 11 x 17 paper. Over these first three issues, the Travel Sheet included 92 different events, 35 organizations, and 59 individuals (makers and programmers). Over the next three years of regular publication, these numbers grew exponentially. The year-end totals for 1976 included 672 events, 240 organizations, and 378 individuals. The physical size of the Travel Sheet continued to grow over the decade and a half of its publication. By its final year, the publication had expanded into a broadside format. It included advertisements, photographs, and stylized cover art, which demonstrated a growth in participation by artists, media arts organizations, and subscribers. At present, the MIMC database contains data from 1973 to 1977 representing 2,012 events, 1,540 organizations, and 2,502 individuals.

Historically, the Travel Sheet served as a networking tool, utilizing the postal networks to connect filmmakers, museum curators, festival programmers, media scholars, and film/video distributors. As a historical resource, it provides unprecedented access to these networks and an understanding of how these actors worked together to legitimize independent film and video as an art form and to build a nation-wide support system. In order to trace and map these networks over time, the MIMC project first transformed and reimagined this data as a digital database.

Building the MIMC database

The first task in the creation of the MIMC database was to pay careful attention to the goals of the Travel Sheet itself. The purpose of the Travel Sheet is clearly stated on the cover of each issue: “to encourage, facilitate wider use of exhibition and lecture tours by film and video makers”. Sally Dixon, the first curator of the Film Section, advocated for standardizing honoraria and travel reimbursement for visiting makers at museums and art galleries. Curators and programmers at other organizations followed suit, matching their support for film and video artists and making tours a viable method of building audiences, and funding the creation of new work. Dixon was also an influential player in the larger Media Arts Center Movement, a grassroots movement aiming to build a network of resource centres that would support film and video production, distribution, exhibition, preservation, and study across the United States (Renan 1973-1974, p. 7). The non-coastal regions of the nation, which were not perceived to be as robust as the media cultures situated in New York City and San Francisco, were of particular concern to the network. Dixon described the Film Section as a model for media arts centers working with sister facility Pittsburgh Filmmakers to support makers in “all areas of their work, from funding films through studying them” (Dixon 1975, p. 28). The movement gained momentum throughout the 1970s, leading to several national conferences, recognition from the National Endowment for the Arts Media Arts Program, and the formation of the National Alliance of Media Arts Centers in 1980.

The Travel Sheet connected members of a vast community of media makers and the organizations that supported them during the formative years of the Media Arts Center Movement. The data, even from the first few years of the Travel Sheet, provide a means for understanding the social networks that emerged between artists and their supporters and to trace the global exchange of this media over time. MIMC aims to historicize the Media Arts Center Movement and to carry its influence into the present, tracing the impact of the movement on the collection and preservation of independent film and video in museums and archival collections. In order to use the tools for a large-scale analysis, the data from the Travel Sheet were first entered into a relational database.

The MIMC database attempted to mirror the Travel Sheet structure, which contains records for individuals, organizations, events, works, distributors, and contact information. The task of converting the publication into digital form appeared to be a simple one-to-one mapping of information into a data table. The data are organized into clear categories—makers, organizations, and works— and reported in standardized forms—first and last name, street address, city, state, country, postal code, and so on. As a representation of the organization of the Travel Sheet, each record was also linked to the individual issue in which it was reported. This allows for the interrogation of the Travel Sheet as a resource, permitting queries by issue. The database also allows for a comprehensive understanding of the data across Travel Sheets, permitting searches that cannot be easily accomplished by reading issues in a traditional manner. In database form, the information can be studied in aggregate and over time. The MIMC database affords new questions regarding the connections between artists and media arts centres, curatorial practices of programmers, and relationships between artists creating and screening work together.
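The structure described above can be sketched as a small relational schema. This is only an illustration, not the project's actual data model: every table and column name is an assumption, and the sample rows are drawn from the Stan Brakhage entry in Table 6.1. The point it shows is that linking each record to the issue that reported it makes queries by issue possible.

```python
import sqlite3

# Hypothetical schema: every table and column name here is an assumption
# made for illustration, not the MIMC project's actual data model.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE issue (id INTEGER PRIMARY KEY, label TEXT NOT NULL);
CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE organization (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE event (
    id INTEGER PRIMARY KEY,
    person_id INTEGER NOT NULL REFERENCES person(id),
    organization_id INTEGER NOT NULL REFERENCES organization(id),
    issue_id INTEGER NOT NULL REFERENCES issue(id),  -- issue that reported it
    date_text TEXT NOT NULL                          -- date as printed
);
""")

# Sample rows taken from the Tours entry for Stan Brakhage (Table 6.1).
conn.execute("INSERT INTO issue VALUES (1, 'May 1977')")
conn.execute("INSERT INTO person VALUES (1, 'Stan Brakhage')")
conn.executemany(
    "INSERT INTO organization VALUES (?, ?)",
    [(1, "Carnegie Institute"), (2, "Pittsburgh Film-Makers")],
)
conn.executemany(
    "INSERT INTO event VALUES (?, ?, ?, ?, ?)",
    [(1, 1, 1, 1, "Oct. 20"), (2, 1, 2, 1, "Oct. 21, 22")],
)


def events_in_issue(label):
    """Return (person, organization, date) rows reported in one issue."""
    return conn.execute(
        """
        SELECT p.name, o.name, e.date_text
        FROM event e
        JOIN person p ON p.id = e.person_id
        JOIN organization o ON o.id = e.organization_id
        JOIN issue i ON i.id = e.issue_id
        WHERE i.label = ?
        ORDER BY e.id
        """,
        (label,),
    ).fetchall()


rows = events_in_issue("May 1977")
```

Dropping the `WHERE` clause and grouping by person or organization would give the aggregate, cross-issue view that reading individual issues cannot easily provide.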

However, the Travel Sheet data are not without their limitations, and presented significant complications for the database. For one, the definition of institutions and their roles in the media arts network presented a challenge for categorizing the institutions in the MIMC database. This difficulty of categorization is perhaps best illustrated by the concept of the media arts centre. Media arts centre is used as a catchall term that includes “institutions that provide facilities for making, studying, exhibiting, preserving, and distributing . . . those film and video works that are independent, as well as all the other film and television recognized as art” (Haller 1979, p. i). The moniker has been used interchangeably with regional media centre, major media centre, media centre, or simply MAC. These terms may include globally recognized cultural heritage institutions, such as the Museum of Modern Art in New York City, as well as regional media centres with a focus on education, like St. Paul, Minnesota’s Film in the Cities. They may support a few hundred artists, or they may be smaller artist collectives and access centres operating at a local level and offering services to a few dozen members. As J. Ronald Green (1982) observes, media arts centres served a wide range of functions but were tailored to serve the region in which they were located. Carnegie’s Film Section identified as a unit within a museum and supported the exhibition and preservation of film and video art through its programming and collecting efforts. The personal networks between curators Dixon and Judson and the wider art world connected artists to funding sources, media arts centres, and others that could facilitate the creation of film and video works. The boundaries between these roles are fluid and present little guidance for marking organizations as collectors, distributors, or production facilities. The terms archive, museum, gallery, and theatre are also dynamic, refusing to serve as clear markers of function when attempting to categorize the organizations in the database.

Likewise, the categorization of the works advertised on the Travel Sheets posed yet another issue for the MIMC database. “Independent” was used as an umbrella term to designate noncommercial moving image production in either film or video. The Independent Film Community, a report from the first meeting of media arts centres in 1977, suggested that the retention of control over the creative process was the defining characteristic of film and video makers working independently (Feinstein 1977, p. 9). However, the report further divides independent media into three broad genres: documentary, narrative, and avant-garde. Reductively, documentary and narrative work can be distinguished as non-fiction and fiction, respectively. Avant-garde, however, is defined as an aesthetic mode that may overlap the two genres. Furthermore, this report makes no attempt to address the distinctions between the avant-garde and underground, another commonly used descriptor in this space.

In his attempt to define the underground film, Sheldon Renan, director of the California-based media arts centre Pacific Film Archives, reflected that “definitions are risky”. An underground film is defined similarly to independent works as “a film conceived and made essentially by one person and is a personal statement by that person” (1967, p. 17). Like the avant-garde, these works “dissent radically in form, or in technique, or in content, or perhaps in all three”. They are created on small budgets and screened outside the commercial distribution networks. This is another space in which terminology is fluid; avant-garde, experimental, independent, and underground are all used interchangeably to describe works produced in this mode. Renan further cautions, “[m]any film-makers disavow the word underground, not liking its intimations of seediness and illegality”, and acknowledges the “inadequacy” of the terms, arguing that these designators are “attempts to group together various kinds of films that don’t fit into other categories” (pp. 22-23, emphasis in the original).

The makers, organizations, and works in the MIMC database defy easy categorization. Even seemingly static geopolitical boundaries posed a challenge for the database. As filmmakers travelled through space and time, the MIMC database had to reckon with historical shifts such as the merging of East and West Germany, as well as the split of the former Yugoslavia. Media arts centres and those who worked in these spaces are situated in the liminal spaces between the boundaries of filmmaker, artist, curator, museum, archive, and makerspace. The terms used to designate these spaces and the film and video works that emerged from them are residual categories, slipping into the boundaries of “other”, “none of the above”, “miscellaneous”, and “etc.” (Bowker & Star 2000, p. 149). However, databases, as representational forms, demand categorization.

Categorization is an inherently political act, shaped by the data sources, research questions, the worldview and biases of scholars, and the metadata standards that we employ. During data entry each piece of information must be sorted, classified, and recorded into a discrete field in a data table. This work facilitates the search and retrieval of data, but also demands decisive acts of describing and arranging. While these acts may be determined by the limitations of the available toolset or an easy-at-hand description based on the data points, they may unintentionally reify existing power structures. As such, scholars advocate for a careful examination of standards and tools that are used to implement them. As Bonnie Ruberg et al. (2018, p. 119) write:

A DH scholar may write an object description for many reasons, but first and foremost that description functions as a marker so that the object may be retrieved later. Whether they are encoding a line of text using the Text Encoding Initiative’s markup specification to identify the speech of a character for programmatic manipulation or creating searchable metadata tags for a digital library, a researcher must make choices about how to describe an object within the taxonomical affordances of the available toolset. Such choices, however are far from obvious or mechanical, and they cannot go unexamined.

Scholars engaging data and the tools that are employed for the organization of information have similarly addressed the risks of failing to read against the grain and to interrogate the gaps and silences in the standardized description tools developed for DH practice (Bowker 2005; Bowker & Star 2000; Gitelman 2013; Lampland & Star 2009).

Rather than adopting a standardized ontology, the database for the MIMC Project reflects the structure of the Travel Sheet. Those participating in the Travel Sheet and sharing their data in the publication were part of a community of practice that was subverting standards set by the larger art world. At the time of its publication, film and video were still on the margins of artistic practice. The Travel Sheet disrupted standard curatorial practices that created barriers of access for film and video makers in traditional art spaces. Artists subverted commercial distribution and exhibition mechanisms seeking to maintain artist control over their creations. The Travel Sheet broke the boundaries between curator and artist and offered a new mode of engagement in spaces that were previously closed to these makers. The MIMC database models the Travel Sheet as an information construct mirroring and respecting the mode of representation that filmmakers and media arts organizations chose for themselves. The examples above show that maintaining the integrity of this self-created and shared data was not an easy process, and reveal much about the act of categorization.

Building new networks

As the MIMC project moves beyond the transformation of the Travel Sheet as a static information source to a searchable database, the question of data sharing is at the forefront. In its current form, users can access the database through the Heurist platform or .csv formatted tables. Given this format, Linked Open Data holds the most potential for MIMC data sharing and analysis. Like a database, Linked Data is a structure for containing data. Built on the construct of hyperlinks, Linked Data serves as the architecture of the Semantic Web. Tim Berners-Lee (2009) writes that Linked Data “isn’t just about putting data on the web”, but about creating meaningful connections between data points. Much like the Travel Sheet, Berners-Lee’s vision of Linked Data is about building an information network within which resources can be shared and built upon by a community of users.

Similar to mark-up languages HTML and XML, Linked Data is a platform-independent means of sharing information via the World Wide Web. Linked Open Data can be generated using the simplest text editing software and translated by web browsers using open standards. Like HTML, Linked Data uses the concept of the URI, or Uniform Resource Identifier. URIs are unique identifiers assigned to discrete pieces of information (concepts, events, people, places, etc.) that can be shared and referenced like hyperlinks. According to Berners-Lee (2009), “good” Linked Data should be: (1) openly available on the web, (2) structured data and machine readable, (3) in a non-proprietary format, (4) using open standards, such as RDF, and (5) linked to other data sets available in a similar form. RDF, the Resource Description Framework, is a simple and open data structure, based on the subject, predicate, object model. Each piece of information in this RDF “triple” is represented by a URI. Take, for example, this triple from the Getty Union List of Artist Names (ULAN):

This simple string of three URIs is difficult to make meaning of at first glance, but the links represented here provide us with key pieces of information regarding filmmaker Stan Brakhage. The first URI is the subject. The “agent” defined by the unique number identifier at the end of this string is Stan Brakhage. The second URI, the predicate, tells us about the relationship between the first and third URIs. In this case, this URI represents the concept of the “preferred nationality”. The third URI, or the object, represents the identifier for the nationality term “American (North American)” in the Getty ontology.

Similar to fields in a database, the unique identifiers are reusable to create relationships between different data points. The URI for Stan Brakhage can be reused in a number of contexts to further describe this artist. Likewise, the URI for nationality can be reused across various artists in the Getty ULAN, linking Brakhage to the other American artists in the catalogue. These references can also be used across data sets, as Berners-Lee intended. The Virtual International Authority File (VIAF) aggregates global data sets, providing a means for linking disparate sources of Linked Data. Brakhage’s URI in VIAF links Getty’s description of Brakhage to representations of the artist from across the web and the globe. Through VIAF, information about Brakhage in the National Library of Brazil links to his Wikipedia page, the Getty ULAN, and other libraries and collecting organizations like the Library of Congress. Using Linked Data, the MIMC database can be enhanced by these other data sources, adding information not available in the Travel Sheet but represented in these other data sets. Likewise, MIMC can provide gap-filling information that may not be available in these other data sets.
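The subject, predicate, object pattern and the reuse of URIs can be made concrete with a short sketch. It models a triple as a Python named tuple and serializes it in the N-Triples style, one statement per line with each URI in angle brackets. The URIs are placeholders on the reserved example.org domain, not the actual Getty identifiers, and the names `Triple`, `to_ntriples`, and `subjects_with` are introduced here for illustration only.

```python
from typing import List, NamedTuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


def to_ntriples(t: Triple) -> str:
    """Serialize a triple of URIs as one N-Triples statement."""
    return f"<{t.subject}> <{t.predicate}> <{t.obj}> ."


# Placeholder URIs (assumptions) standing in for an agent, a
# "preferred nationality" property, and a nationality term.
BRAKHAGE = "http://example.org/agent/stan-brakhage"
NATIONALITY = "http://example.org/property/preferredNationality"
AMERICAN = "http://example.org/term/american"

triples = [
    Triple(BRAKHAGE, NATIONALITY, AMERICAN),
    # The same predicate and object URIs reused for a second,
    # purely hypothetical agent.
    Triple("http://example.org/agent/another-artist", NATIONALITY, AMERICAN),
]


def subjects_with(predicate: str, obj: str, data: List[Triple]) -> List[str]:
    """List every subject linked to `obj` through `predicate`."""
    return [t.subject for t in data if t.predicate == predicate and t.obj == obj]


line = to_ntriples(triples[0])
linked = subjects_with(NATIONALITY, AMERICAN, triples)
```

Because both statements share the same nationality URI, a single lookup on that URI returns both agents. This is the mechanism by which reused identifiers link records within and across data sets.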

Data "messiness" and linked data

The example above raises several limitations of Linked Data. The first is that these data are structured. Much like a database, RDF as a digital construct dictates the syntax and form of the data. It also relies on the adoption of metadata schemas and ontologies to describe the information. In the example above, Getty uses a local ontology (Getty Vocabulary Program or GVP) to describe the data in the ULAN, similar to the way that the MIMC database uses bespoke metadata to model the Travel Sheet.2 VIAF uses Friend of a Friend (FOAF), an ontology used to describe relationships between people, and SKOS, a generic schema used to build relationships between concepts.3 The use of these standard schemas ensures that information expressed in Linked Data can be interpreted and shared. The standards can be mapped across one another to match similar concepts, suggesting a shared understanding of the information being described.

However, even generic data standards written for broad interpretation are not neutral, and adopt a particular worldview. Dublin Core, another popular data standard, is based on 15 basic elements intended to be implemented across a variety of contexts. The 15 core elements (contributor, coverage, creator, date, description, format, identifier, language, publisher, relation, rights, source, subject, title, and type) share their origins with standard library descriptions.4 Dublin Core presumes a description of a resource. Each metadata standard or ontology has been designed to model information in a specific way and represents a particular way of knowing.

Feminist and queer critiques of data highlight the places where these standards begin to break, foregrounding the gaps and silences in these systems of representation (Losh & Wernimont 2018; D’Ignazio & Klein 2020). These critiques privilege the messiness of data and resist the binary and hierarchical relationships of the data models offered by ontologies and database systems. RDF and Linked Data are understood as less rigid constructs that allow for data complexities to be expressed. No longer relegated to a single box in a database table, the linkages created between concepts and ideas with Linked Data represent the complex relationships between seemingly discrete pieces of information (Ruberg et al. 2018, p. 121). VIAF provides a clear example. Unlike the Getty ULAN that originated from a project seeking to describe the holdings of the cultural heritage organization, VIAF is an aggregator of data, merging data sets from libraries, archives, and museums around the globe (Baca 2014). The “Alternate Name Forms” for filmmaker Stan Brakhage demonstrate the power in these linkages to pull together all forms of Brakhage’s name, from his name at birth (Robert Sanders), the name given to him by his adoptive parents (James Stanley Brakhage), to representations of his name in Cyrillic and Kanji. Linked Data can account for the messiness of data, allowing multiple representations of the same information to be bound together on the web. While data must be structured, adopting clear standards of description so that the information can be interpreted and shared, the linkages created between disparate data sources can represent the intersectionality of concepts, ideas, and knowledge. However, data must be openly available and actively connected if we are to explore the true affordances of Linked Data.

Authority and linked data

While Linked Data and the Semantic web promise to democratize description and representation, VIAF illustrates the second issue that we would like to raise, the question of authority. Libraries and archives use authority records alongside metadata schema and ontologies. Similar to metadata schemas that organize information into discrete boxes, authority control is a mechanism that ensures that data can be reliably queried and found. Much like metadata standards and ontologies, authority control creates consistency across data sets to ensure the quality of search results (Taylor & Joudrey 2016, p. 249). Preferred terms, like “preferred nationality” in the above example from the Getty, are indicators of the use of authority. The preference suggested here is not the preference of the subject of the description, but of the Getty. The Getty ULAN also names the “preferred” identifier for Brakhage’s role in the art world as “artist”, although filmmaker is listed as a secondary term.

Databases, including the one used for MIMC, use a form of authority control locally, selecting an authoritative form of a discrete piece of information as the representative form. Such selective processes privilege one value over its variants. For example, over the history of the Travel Sheet, artists did not always report their names consistently when sharing their information. Artists may list a middle initial in some issues, an honorific in others, or may appear with a completely different name due to typographical or layout errors in the Travel Sheet. In the examples shown in Tables 6.1 and 6.2, Doris Chase appears as “Doris Chase” in the “Tours” section, but “Doris Totten Chase” in the “New Films and Videos” section. The database records Doris Totten Chase as the authoritative record and Doris Chase as a variant form. Organizations’ names exhibit a similar pattern of variation. Pittsburgh Filmmakers appears as Pittsburgh Film-makers, Pittsburgh Filmmakers, and Pittsburgh Filmmakers, Inc. When data from the Travel Sheet are entered into the database, these variations are normalized and the entry is listed with the authoritative version of the name, “Pittsburgh Filmmakers”. This uniformity of data enhances discoverability and disambiguates between similar records (curator Jane Smith of New York vs. videomaker Jane Smith of Los Angeles). Similarly, libraries and other collecting institutions standardize the descriptive terminology used to catalogue records and the variant forms of names for persons, families, organizations, institutions, and works (Taylor & Joudrey 2016, p. 250).
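Local authority control of this kind can be sketched as a lookup from variant forms to one authoritative form. The sketch below is a minimal illustration rather than the MIMC implementation: the names `AUTHORITY`, `LOOKUP`, and `normalize` are hypothetical, the matching rule (ignoring case and extra whitespace) is an assumption, and the sample values come from the examples above.

```python
# Hypothetical authority table: each authoritative form maps to the
# variant forms under which it appears in the source.
AUTHORITY = {
    "Doris Totten Chase": ["Doris Chase"],
    "Pittsburgh Filmmakers": [
        "Pittsburgh Film-makers",
        "Pittsburgh Filmmakers, Inc.",
    ],
}


def _key(name: str) -> str:
    """Comparison key: collapse whitespace and ignore case (an assumption)."""
    return " ".join(name.split()).casefold()


# Invert the table so that every known form points to its authoritative form.
LOOKUP = {}
for preferred, variants in AUTHORITY.items():
    for form in [preferred, *variants]:
        LOOKUP[_key(form)] = preferred


def normalize(name: str) -> str:
    """Return the authoritative form, or the input unchanged if unknown."""
    return LOOKUP.get(_key(name), name)
```

A lookup keyed on the name string alone cannot separate two different people who share a name, such as the two Jane Smiths. That kind of disambiguation needs a separate identifier for each record, which is the role URIs play in Linked Data.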

The process of data standardization for linking data sets further highlights the fact that archives, libraries, and other collecting institutions continue to remain the gatekeepers to knowledge, often to the detriment of small, locally sourced data sets. VIAF, for example, aggregates multiple global data sets but sets very clear boundaries on participation. Because VIAF is a service provided by OCLC, the Online Computer Library Center, those contributing to the data set must be vetted by the membership of VIAF. To be a contributing member, an institution must be one of the following:

  • a National Library, Archives, or Museum (LAM),
  • a large or trans-national LAM cooperative,
  • or another “widely known and respected cultural heritage institution that hosts valuable authority data”.5

VIAF does approve “Other Data Providers” based on the size, scope, and source of the data set, but contributions must be reviewed and authorized before they are introduced into the catalogue. At present, there are few examples of scholarly projects contributing to the VIAF authority records. These experimental efforts have focused on diversifying the language representation in the records, rather than adding new entries to the authority file (Smith-Yoshimura 2013). Digital humanities projects like MIMC may use authorities like VIAF but do not meet the current qualifications to contribute. The process is similar for other traditional sources of authority, including the Getty ULAN. The Getty will accept data from the Carnegie Museum of Art, source of the Travel Sheet, as an archival institution, but will not consider submissions from an unaffiliated scholar. Authority is conferred on the institution and not the archival holdings.

Such restrictions within Linked Data re-establish the authority of traditional repositories despite the fact that smaller projects, like the MIMC, may hold gap-filling information. Testing the utility of these authorities for the MIMC project, the authors cross-referenced a subset of names from the February 1977 issue of the Travel Sheet with four authority files: VIAF, Getty ULAN, Library of Congress Names Authority, and Wikidata. The February 1977 issue contains 191 unique names representing film and video makers, curators, programmers, distributors, and other individuals associated with the independent and avant-garde media arts. We found only nine records (5%) in all four authority files, while more than half (55%) were not represented in any of them. Figure 6.1 is shaded to illustrate the disparity between the available Linked Data and the MIMC database. These ratios suggest that only 1,127 of the 2,502 names in the MIMC database would be found in these major authority records. Here, it is important to remember that the current MIMC database contains data from the first 5 years of the Travel Sheet. Participation in the Travel Sheet continued to increase over the next decade of its publication, suggesting that the number of records in the MIMC database will continue to grow, widening this gap between those with authority data and those without.
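The cross-referencing and the extrapolation above can be sketched in a few lines. The `coverage` function and its toy inputs are hypothetical and do not reproduce the authors' procedure. The arithmetic uses only figures reported in the text. Applying the rounded share of names found (100% minus 55%) to the 2,502 individuals gives roughly 1,126. The reported 1,127 presumably reflects the unrounded sample counts, which the text does not give.

```python
def coverage(names, authority_files):
    """Split names into: found in every file, in some but not all, in none."""
    files = list(authority_files.values())
    in_all = {n for n in names if all(n in f for f in files)}
    in_any = {n for n in names if any(n in f for f in files)}
    return in_all, in_any - in_all, set(names) - in_any


# Toy data (assumption): three made-up names checked against four files.
toy_names = {"A", "B", "C"}
toy_files = {
    "VIAF": {"A", "B"},
    "Getty ULAN": {"A"},
    "LoC Names": {"A", "B"},
    "Wikidata": {"A"},
}
in_all, in_some, in_none = coverage(toy_names, toy_files)


def pct(part: int, whole: int) -> int:
    """Percentage rounded to the nearest whole number."""
    return round(100 * part / whole)


SAMPLE = 191      # unique names in the February 1977 issue
IN_ALL_FOUR = 9   # names found in all four authority files
DATABASE = 2502   # individuals currently in the MIMC database

share_in_all = pct(IN_ALL_FOUR, SAMPLE)       # rounds to 5
found_share = 1 - 0.55                        # rounded share found in at least one file
estimate = round(found_share * DATABASE)      # about 1,126 with rounded inputs
```

Sets keep the three outcomes (in every file, in some, in none) mutually exclusive, which a breakdown like the one in Figure 6.1 requires.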

FIGURE 6.1 MIMC records represented across authority files.

Source: VIAF, Getty, LoC Authorities, and Wikidata

Authority records are generated by libraries and collecting institutions based on the principle of literary warrant; that is, authority terms and subject headings are determined by the resources catalogued in collections (Glushko 2016, p. 285). The individuals from the MIMC database appear in the authority records because they have created works collected and catalogued by libraries, archives, and museums. The artists listed in the Getty ULAN (40 of the 191 records tested, or 21%) represent those with film or video works that are considered “art” and accessioned into museum collections. Those included in the Library of Congress authority records (57 of 191, or 30%) have authored publications associated with their names. Filmmaker George Semsel, for example, represents this subset of individuals from the MIMC database. We found a record for Semsel in the VIAF and Library of Congress authority records, but his name does not appear in the Getty ULAN. A closer look at his authority files reveals that his records are linked to his work as a film scholar who authored several monographs, not to his work as a filmmaker. Semsel’s works are preserved at the media arts centre Pittsburgh Filmmakers, an organization that is not recognized as an archive or museum and does not have a catalogue of its holdings, and therefore cannot contribute to the authority records. Semsel’s contributions to the Travel Sheet provide insight into his work as a filmmaker, while the authoritative sources of Linked Data represent only his scholarship.
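The per-authority counts reported here amount to set comparisons between the sampled names and each authority file. The following Python sketch shows one way such a tally might be computed; it is not the procedure the authors used, which the text does not describe in detail. The names "Maker A" to "Maker D" and their assignments are hypothetical placeholders, and the helper names `coverage_report` and `percent` are our own. The sketch assumes exact string matches, whereas real name matching would require normalizing variant spellings.

```python
from collections import Counter


def coverage_report(sample_names, authority_files):
    """Count matches per authority file, plus names found in all or none."""
    sample = set(sample_names)
    matches = {
        label: sample & set(names) for label, names in authority_files.items()
    }
    per_file = {label: len(found) for label, found in matches.items()}
    # Number of authority files in which each sampled name appears.
    hits = Counter(name for found in matches.values() for name in found)
    in_all = sum(1 for name in sample if hits[name] == len(authority_files))
    in_none = sum(1 for name in sample if hits[name] == 0)
    return per_file, in_all, in_none


def percent(count, total):
    """Whole-number percentage, in the form reported in the text."""
    return round(100 * count / total)


# HYPOTHETICAL placeholder data, not drawn from the Travel Sheet.
sample = ["Maker A", "Maker B", "Maker C", "Maker D"]
authority_files = {
    "VIAF": {"Maker A", "Maker B"},
    "Getty ULAN": {"Maker A"},
    "LoC Names": {"Maker A", "Maker B"},
    "Wikidata": {"Maker A", "Maker C"},
}

per_file, in_all, in_none = coverage_report(sample, authority_files)
for label, count in per_file.items():
    print(f"{label}: {count} of {len(sample)} ({percent(count, len(sample))}%)")
print(f"In all four files: {in_all}; in none: {in_none}")

# The same helper applied to the counts reported for the 191-name sample.
print(percent(40, 191), percent(57, 191), percent(9, 191))
```

Applied to the reported counts, the `percent` helper returns 21, 30, and 5, matching the rounded figures given for the Getty ULAN, the Library of Congress, and the names found in all four files.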

Authority in traditional sources is established by who is published, what is written about, how it is written about, what is considered “art”, and what is collected by libraries, museums, and archives. Reflecting on Derrida’s Archive Fever, Geoffrey Bowker (2005, p. 18) observes, “What is stored in the archive is not facts, but disaggregated classifications that can at will be reassembled to take the form of facts about the world”. These resources frame how the art world is understood and who will be remembered. The Travel Sheet (and therefore the MIMC database) offers new insights, understandings, and a means for beginning to fill the gaps found in these authoritative sources.

Conclusions: the limitations of authority

Mirroring the utopian ideals of Linked Data, the Travel Sheet was grounded in the principles of open participation, fostering discovery, and working outside the boundaries of the established authorities. The data it contained were reported in a form that could be easily interpreted by others and shared through the communications networks that connected those most interested in using the information. Further, the Travel Sheet subverted the authority of gatekeepers to the art world, directly connecting makers with the organizations that would support their work.

As the MIMC case study illustrates, a linked semantic web that aims for a democratic system of knowledge generation and consumption demands technical expertise, resources, and infrastructure that are not accessible to all. As DH scholars, we must consider the power structures inherent in data generation, management, and curation. The very act of creating a database raises issues around the ethics of categorization. These issues crystallize when self-generated data points must be modified to suit the affordances and limitations of mechanical tools. Modifying self-reported data points such as “filmmaker” to “creator”, in the case of the MIMC database, might not be so egregious, but the stakes are much higher when representing fluid and dynamic concepts such as gender, sexuality, and racial identity information as static data points. The danger of the loss of messy information in data sets highlights the central conflict in the term “digital humanities”. If humanities, as a field, is concerned with the nuances of human experiences and identities, then the fundamental nature of digital technology, as a binary system of ones and zeros, complicates the integration of the digital with the human. Digital humanities must resist the urge to quantify or classify myriad human experiences into one.

Even when databases seek to retain the integrity of self-reported data sources, linking data presents yet another hurdle in data dissemination. As this chapter demonstrates, inasmuch as Linked Data offers reasonable solutions to the issues of data interoperability and reuse, it also operates by deferring to established authority repositories such as museums, archives, and libraries. In so doing, the process can reify the same kind of gatekeeping and power dynamics present within the structures of archives and museums. The complicated processes of linking create new kinds of division in contemporary digital humanities practices. Contrary to the goal of knowledge democratization, there exists a stratification based on one’s access to data creation and management. Even if data generation has been relatively easy, linking data so as to ensure its use is a rather complex and resource-draining endeavour that only big knowledge-based institutions, such as universities, archives, and museums, are able to undertake. As practitioners in the field, it is up to us to navigate these challenges and strategize new methods through which the utopian promise of a democratic system of knowledge can inch closer towards reality.


Notes

  • 1 Digitized issues of the Travel Sheet are available at
  • 2 Getty Vocabulary Ontology. [Viewed 28 April 2019]. Available from: http://vocab.getty.edu/ontology
  • 3 FOAF Vocabulary Specification, 14 January 2014. [Viewed 28 April 2019]. Available from:; SKOS Simple Knowledge Organization System Namespace Document, 18 August 2009. [Viewed 28 April 2019]. Available from: www.w3.org/2009/08/skos-reference/skos.html#
  • 4 Dublin Core Metadata Initiative, 16 June 2012. [Viewed 28 April 2019]. Available from:
  • 5 OCLC, VIAF Admission Criteria. [Viewed 28 April 2019]. Available from: www.oclc.org/content/dam/oclc/viaf/VIAF%20Admission%20Criteria.pdf
  • 6 Individual test searches for titles of film and video works also suggest that the MIMC database holds a broader representation of artworks.


References

Baca, M., (2014). Fear of authority? Authority control and thesaurus building for art and material culture information. Cataloging & Classification Quarterly. 38(3-4), 143-151.

Berners-Lee, T., (2009). Linked data. Design Issues [online]. 18 June. [Viewed 17 April 2019]. Available from:

Bowker, G. C., (2005). Memory practices in the sciences. Cambridge, MA: MIT Press.

Bowker, G. C. and Star, S. L., (2000). Sorting things out: Classification and its consequences. Cambridge, MA: MIT Press.

D’Ignazio, C. and Klein, L., (2020). Data feminism. Cambridge, MA: MIT Press.

Dixon, S., (1975). The expanding film section. Carnegie Magazine. January, p. 28.

Feinstein, P., ed., (1977). The independent film community: A report on the status of independent film in the United States. New York: Committee on Film and Television Resources and Services.

Gitelman, L., ed., (2013). ‘Raw data’ is an oxymoron. Cambridge, MA: MIT Press.

Glushko, R., (2016). The discipline of organizing: Professional edition. 4th Edition. Sebastopol, CA: O’Reilly Media.

Green, J. R., (1982). Film and not-for-profit media institutions. In: S. Thomas, ed. Film/culture: Explorations of cinema in its social context. Metuchen, NJ: Scarecrow Press, pp. 37-59.

Haller, R., (1979). The 1979 national conference of media arts centers report. New York, NY: Foundation of Independent Video and Film.

Lampland, M. and Star, S. L., eds., (2009). Standards and their stories: How quantifying, classifying, and formalizing practices shape everyday life. Ithaca, NY: Cornell University Press.

Losh, E. and Wernimont, J., eds., (2018). Bodies of information: Intersectional feminism and digital humanities. Minneapolis, MN: University of Minnesota Press.

Renan, S. J., (1967). An introduction to the American underground film. New York: E. P. Dutton & Co.

Renan, S. J., (1973-1974). The concept of regional film centers. Sightlines. 7(3), 7-9.

Ruberg, B., Boyd, J. and Howe, J., (2018). Toward a queer digital humanities. In: E. Losh and J. Wernimont, eds. Bodies of information: Intersectional feminism and digital humanities. Minneapolis, MN: University of Minnesota Press, pp. 108-128.

Smith-Yoshimura, K., (2013). Irreconcilable differences? Name authority control & humanities scholarship. Hanging Together, OCLC Research Blog [online], 27 March. [Viewed 28 April 2019]. Available from:

Taylor, A. G. and Joudrey, D. N., eds., (2016). The organization of information. 4th Revised Edition. Santa Barbara, CA: Libraries Unlimited.
