That was the topic discussed recently by OCLC Research Library Partners metadata managers, initiated by Philip Schreur of Stanford. We were fortunate that staff from several BIBFRAME testers participated: Columbia, Cornell, George Washington University, Princeton, Stanford and University of Washington. They shared their experiences and tips with others who are still monitoring BIBFRAME developments.
Much of the testers’ focus has been on data evaluation and identifying problems or errors in converting MARC records to BIBFRAME using either the BIBFRAME Comparison Service or Transformation Service. Some have started to create BIBFRAME data from scratch using the BIBFRAME Editor. This raised a concern among managers about how much time and staffing was needed to conduct this testing. Several institutions have followed Stanford’s advice and enrolled staff in the Library Juice Academy series to gain competency in XML and RDF-based systems, a good skill set to have for digital library and linked data work, not just for BIBFRAME. Others are taking Zepheira’s Linked Data and BIBFRAME Practical Practitioner Training course. The Music Library Association’s Bibliographic Control Committee has created a BIBFRAME Task Force focusing on how LC’s MARC-to-BIBFRAME converter handles music materials.
Rather than looking at how MARC data looks in BIBFRAME, people should be thinking about how RDA (Resource Description and Access) works with BIBFRAME. We shouldn’t be too concerned if BIBFRAME doesn’t handle all the MARC fields and subfields, as many are rarely used anyway. See for example Roy Tennant’s “MARC Usage in WorldCat”, which shows the fields and subfields that are actually used in WorldCat, and how they are used, by format. (Data is available by quarters in 2013 and for 1 January 2013 and 1 January 2014, now issued annually.) Caveat: A field/subfield might be used rarely, but be very important when it occurs. For example, a Participant/Performer note (511) is mostly used in visual materials and recordings; for maps, scale is incredibly important. People agreed the focus should be on the most frequently used fields first.
Moving beyond MARC gives libraries an opportunity to identify entities as “things not strings”. RDA was considered “way too stringy” for linked data. The metadata managers mentioned the desire to use various identifiers, including id.loc.gov, FAST, ISNI, ORCID, VIAF and OCLC WorkIDs. Sometimes transcribed data would still be useful, e.g., a place of publication that has changed names. Many still questioned how authority data fits into BIBFRAME (we had a separate discussion earlier this year on Implications of BIBFRAME Authorities). Core vocabularies need to be maintained and extended in one place so that everyone can take advantage of each other’s work.
Several noted “floundering” due to insufficient information about how the BIBFRAME model was to be applied. In particular, it is not always clear how to differentiate FRBR “works” from BIBFRAME “works”. There may never be a consensus on what a “work” is between “FRBR and non-FRBR people”. Concentrate instead on identifying the relationships among entities. If you have an English translation linked to a German translation linked to a work originally published in Danish, does it really matter whether you consider the translations separate works or expressions?
Will we still have the concept of “database of record”? Stanford currently has two databases of record, one for the ILS and one for the digital library. A triple store will become the database of record for materials not expressed in MARC or MODS. This raised the question of developing a converter for MODS used by digital collections. Columbia, LC and Stanford have been collaborating on mapping MODS to BIBFRAME. Colorado College has done some sample MODS to BIBFRAME transformations.
How do managers justify the time and effort spent on BIBFRAME testing to administrators and other colleagues? Currently we do not have new services built upon linked data to demonstrate the value of this investment. The use cases developed by the Linked Data for Libraries project offer a vision of what could be done in a linked data environment that can’t be done now. A user interface is needed to show others what the new data will look like; pulling data from external resources is the most compelling use case.
Tips offered:
- The LC Converter has a steep learning curve; to convert MARC data into BIBFRAME, use Terry Reese’s MARCEdit MARCNext Bibframe Testbed, which also converts EAD (Encoded Archival Description) files. See Terry’s blog post introducing the MARCNext toolkit.
- Use Turtle rather than XML to look at records (less verbose).
- Use subfield $0 (authority record control number) when including identifiers in MARC access points (several requested that OCLC start using $0 in WorldCat records).
Karen Smith-Yoshimura, senior program officer, topics related to creating and managing metadata with a focus on large research libraries and multilingual requirements. Karen retired from OCLC November 2020.