The two main reasons the 72 linked data projects/services described in the survey consume linked data are: to enhance their own data by consuming linked data from other sources (37) and to provide a richer experience for users (35). Other reasons, in descending order: more effective internal metadata management; to experiment with combining different types of data into a single triple store; heard about linked data and wanted to try it out using linked data sources; a wish for greater accuracy and scope in search results; to improve Search Engine Optimization (SEO); and to meet a grant requirement.
The ways projects/services are using linked data sources (in order of the most frequently cited):
- Enrich bibliographic metadata or descriptions
- As a reference source
- Harmonize data from multiple sources
- Automate authority control
- Enrich an application
- Dataset discovery
The linked data sources that are used the most:
- id.loc.gov – 30
- DBpedia – 25
- GeoNames – 25
- VIAF – 24
Here’s the alphabetical list of the sources used; those that include uses by FAST, VIAF, WorldCat.org and WorldCat.org Works are asterisked.
|Source||# of Projects||FAST||VIAF||WorldCat.org||WorldCat.org Works|
|British National Bibliography||3||*|
|Canadian Subject Headings||2|
|Dewey Decimal Classification||5||*|
|RDF Book Mashup||1|
|The European Library||5|
The other linked data sources consumed include:
- Bibliothèque nationale de France’s data.bnf.fr, an aggregation of its catalogs and the Gallica digital library
- Deutsche Nationalbibliothek’s Linked Data Service
- GEMET, GEneral Multilingual Environmental Thesaurus
- Heritage Data’s SENESCHAL (Semantic ENrichment Enabling Sustainability of arCHAeological Links), a set of linked data vocabularies for cultural heritage
- HISCO, History of Work Information System
- Hispana, an aggregation of digital collections of archives, libraries and museums from Spanish digital repositories
- Lexvo for languages
- Logainm.ie, place names database of Ireland
- Nomisma.org, providing URIs for concepts unique to numismatics
- Pleiades Gazetteer of Ancient Places, a community-built gazetteer and graph of ancient places in the Greek and Roman world
- Rådata nå!, Norwegian name authority file, one of the first to be available as linked open data.
- United Nations Food and Agriculture Organization’s AGROVOC
Asked whether there were other data sources they wished were available as linked data but aren’t yet, respondents noted:
- More authority files or thesauri (requested by several) or multilingual subject vocabulary
- [U.S.] Federal agencies’ data
- Grant data
- Individual artworks and digital objects from archaeological or museum databases
- Researcher identifiers from smaller data stores
Barriers or challenges encountered in using linked data resources included:
- Size of RDF dumps; volatility of data formats of dumps; lack of availability of dumps; lack of authority control within the dumps; issues with level of specificity in terms of trying to match concepts.
- What is published to the Internet as Linked Data is not always reusable… Linked data without context is almost useless.
- Many services presented as Linked Data aren’t really Linked Data.
- It’s difficult to get other institutions to do their own harmonization between objects and concepts.
- Lots of handcrafting at the moment, not many off the shelf tools that are useful for visualisation.
- Mapping of vocabulary requires a lot of manual work.
- Matching, disambiguating and aligning source data and the linked data resources.
- Not all resources that we would like to use as linked data are represented as URIs. Semantics that can represent library bibliographic data are not established yet.
- It always requires time to understand how the data are structured before using it.
- Disambiguation of terms across different languages is difficult.
- DBpedia resources are not stable. URIs and structure for resource description would change.
- The creation of controlled vocabularies in SKOS seems less intuitive than we’d like.
- Service reliability has been a factor with some resources.
- Unstable endpoints, datasets not being updated.
[Originally posted 2014-09-01, updated 2014-09-04]
Coming next: Linked Data Survey results–Why and what institutions are publishing
Karen Smith-Yoshimura, senior program officer, worked on topics related to creating and managing metadata with a focus on large research libraries and multilingual requirements. Karen retired from OCLC in November 2020.