That was the topic discussed recently by OCLC Research Library Partners metadata managers, initiated by Stephen Hearn of the University of Minnesota. Can we still insist on using the authorized access point as the primary identifier? It is scary to imagine that we have to build authorized access points for titles in a “work” focused environment. Other communities are putting together separate pieces of information to help select the correct name or title. Dates are not always the most informative choice for the user. Libraries receive an influx of records where we have no control over the authorized form of the name anyway. Other environments make use of identifiers. Wikipedia, IMDb and MusicBrainz differentiate entities and then prompt you for more information. We have an opportunity to work with a larger community.
Do we still need an authorized access point as a “primary identifier”? Let’s distinguish identifiers from their associated labels. Access points rely on unique text strings to distinguish them from other access points. A unique identifier could be associated with an aggregate of other attributes that would enable users to distinguish one entity from another. Ideally, we could take advantage of the identifiers and attributes from other, non-library sources. Wikidata, for example, aggregates a variety of identifiers as well as labels in different languages, as pictured above.
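To make the distinction between an identifier and its labels concrete, here is a minimal sketch in Python. It is an illustration only and does not reflect any actual Wikidata, VIAF, or OCLC data model. The `Entity` class, its field names, and the `ex:` identifiers are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    """An entity keyed by an opaque identifier rather than a unique text string."""

    identifier: str  # hypothetical local identifier, not a real scheme
    labels: dict[str, str] = field(default_factory=dict)      # language code -> label
    attributes: dict[str, str] = field(default_factory=dict)  # context that helps users disambiguate
    same_as: set[str] = field(default_factory=set)            # identifiers from other sources

    def label_for(self, language: str, fallback: str = "en") -> str:
        """Return the label in the requested language, else the fallback, else the identifier."""
        if language in self.labels:
            return self.labels[language]
        if fallback in self.labels:
            return self.labels[fallback]
        return self.identifier


# Illustrative record: the identifiers are placeholders, not real ones.
tolstoy = Entity(
    identifier="ex:person/1",
    labels={"en": "Leo Tolstoy", "ru": "Лев Толстой", "fr": "Léon Tolstoï"},
    attributes={"occupation": "novelist"},
    same_as={"ex-other:abc123"},
)

print(tolstoy.label_for("ru"))  # label chosen for a Russian-language context
print(tolstoy.label_for("de"))  # no German label, so the English fallback is used
```

In this sketch no label is preferred over another; the display form is chosen by context, and the identifier alone carries identity.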
The library community has started to move towards the use of identifiers by adding them in subfield $0 of heading fields. OCLC algorithmically added FAST (Faceted Application of Subject Terminology) headings, with their identifiers, to all WorldCat records that had an LC subject heading. Other communities have started including VIAF (Virtual International Authority File) cluster identifiers in their entity descriptions. Providing contextual information is more important than providing one unique label. Labels could differ depending on the community (various spellings of names and terms, different languages and writing systems, different disciplines) without requiring that one form be preferred over another.
Catalogers have long added value by supplying information about relationships. RDA attributes have spurred libraries to move toward contextualization. We now have ways of making that information more understandable to users. As those capabilities continue to evolve, the need for unique strings could diminish.
NACO is a valuable program but not everyone is able to contribute. Even in institutions that are NACO contributors, only staff who have received the requisite training can create LC/NACO authority records. The volume of names without authority control is increasing, especially as academic institutions commit to providing a comprehensive overview of their researchers’ output, often stored in separate local databases or scholar profile systems. NACO-level work isn’t sustainable beyond MARC records.
Could Wikidata be an alternative channel for contributing information about entities? Adding names or information about entities to Wikidata could be a very low-barrier way for non-NACO staff to supplement NACO contributions. For example, the University of Miami’s RAMP (Remixing Archival Metadata Project) generates Wikipedia pages out of archival descriptions (discussed in the 2014 OCLC Research Webinar, Beyond EAD). Encouraging contributions to Wikidata could also tap the expertise within our communities.
Envisioning the future: The authorized access point was designed for a closed, MARC-based environment. Its time has come and gone. We already see examples of “identifier hubs” that aggregate multiple identifiers referring to the same entity. More work is needed to establish “same as” relationships among different identifiers and to add identifiers to our large legacy databases that can point to one or more of these “identifier hubs.” We need technology that can integrate the metadata from all the sources that generated the identifiers, filtered according to the context. We could start by focusing on identifiers rather than labels as a means to concatenate result sets. Greater functionality for identifiers would drive the value proposition for datasets that merge them and provide correlations among the various sources.
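As a rough illustration of what establishing “same as” relationships among identifiers could involve, the sketch below groups identifiers into clusters from pairwise assertions using a union-find structure. It is one possible technique, not a description of how any existing identifier hub works. The function name `build_hubs` and all identifier values are hypothetical placeholders.

```python
from collections import defaultdict


def build_hubs(same_as_pairs):
    """Group identifiers into clusters from pairwise 'same as' assertions."""
    parent = {}

    def find(x):
        # Locate the representative of x's cluster, shortening the path as we go.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        # Merge the clusters containing a and b.
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    for a, b in same_as_pairs:
        union(a, b)

    clusters = defaultdict(set)
    for identifier in list(parent):
        clusters[find(identifier)].add(identifier)
    return list(clusters.values())


# Placeholder identifiers; the prefixes only suggest which source each came from.
pairs = [
    ("viaf:A", "lcnaf:B"),
    ("lcnaf:B", "wikidata:C"),
    ("viaf:X", "isni:Y"),
]
hubs = build_hubs(pairs)
for hub in hubs:
    print(sorted(hub))
```

Because “same as” is treated as transitive here, a single incorrect assertion can merge two distinct entities, which is one reason such links would need review before being applied to large legacy databases.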