Science uses the art of observation to unearth truth. Sometimes that observation is focused minutely on one small constituent of a much larger ecosystem, and larger truths can emerge from such a narrow view. This brings me to my latest metadata investigation, which is about as minutely focused as it is possible to be within the library metadata world.
I decided to look at the life of a single MARC subfield, in this case the lowly 034 $2. The 034 field is “Coded Cartographic Mathematical Data”. The $2 subfield, which was proposed and adopted in 2006, is where one can record the source of the data in the 034. Values are to come from a specified list of source codes.
From my “MARC Usage in WorldCat” work, I already knew that as of last January there were about 2.4 million records with an 034 field. I also knew that the $2 subfield of the 034 appeared only 1,976 times. Of course, a year had passed, so that figure was likely low.
So the first thing I did was to grab all of the 034 $2 subfields and count how many times each source code had been used. Since the point of my exercise was not to show errors, I folded entries with typos into the codes they were meant to be, and counted as “errors” only those entries that were clearly in the wrong place in the field:
3868 | bound |
2539 | gooearth |
1069 | geoapn |
215 | geonet |
157 | geonames |
129 | pnosa2011 |
46 | other |
26 | gnis |
26 | ERRORS |
17 | cga |
5 | local |
3 | gnrnsw |
3 | aadcg |
1 | wikiped |
1 | gettytgn |
1 | geoapn geonames |
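For readers who want to try a similar tally, here is a minimal sketch of the counting step in Python. It is not the job I actually ran: the record layout (a list of tag and subfield pairs), the function name `count_034_sources`, the lowercasing step, and the misspellings in `TYPO_FIXES` are all assumptions made up for illustration.

```python
from collections import Counter

# Hypothetical typo corrections; the real misspellings are not listed in this post.
TYPO_FIXES = {"gooearht": "gooearth", "geoanp": "geoapn"}


def count_034_sources(records):
    """Tally the $2 values found in every 034 field.

    Each record is assumed to be a list of (tag, subfields) pairs, where
    subfields is a list of (code, value) pairs. This layout is an
    illustration only, not a real MARC parser.
    """
    counts = Counter()
    for record in records:
        for tag, subfields in record:
            if tag != "034":
                continue
            for code, value in subfields:
                if code == "2":
                    # Normalize whitespace and case, then fold known typos
                    # into the code they were meant to be.
                    cleaned = value.strip().lower()
                    counts[TYPO_FIXES.get(cleaned, cleaned)] += 1
    return counts


# Tiny made-up sample: three records with an 034 $2, one record without.
sample_records = [
    [("034", [("a", "a"), ("2", "bound")]), ("040", [("a", "AU@")])],
    [("034", [("a", "a"), ("2", "gooearht")]), ("040", [("a", "J9U")])],
    [("034", [("a", "a"), ("2", "gooearth")]), ("040", [("a", "J9U")])],
    [("245", [("a", "A map without coordinates")])],
]

# Print in the same "count | code |" layout as the list above.
for source, total in count_034_sources(sample_records).most_common():
    print(f"{total} | {source} |")
```

Keeping the typo corrections in one dictionary makes it easy to see exactly which entries were merged, and to rerun the count without merging if the errors themselves become the point of interest.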
I then wanted to find out who was using this subfield, so I ran a job to extract the 040 $a, the “original cataloging agency”, and totaled the occurrences. It turns out the vast majority come from five institutions:
2471 National Library of Israel (J9U)
1632 Libraries Australia (AU@)
1076 British Library (UKMGB)
885 Pennsylvania State University (UPM)
799 Cambridge University (UkCU)
Then it drops off rather precipitously from there:
213 Agency for the Legal Deposit Libraries (Scotland) (StEdALDL)
206 New York Public (NYP)
117 Commonwealth Libraries, Bureau of State Library, Pennsylvania (PHA)
101 Yale University, Beinecke Rare Book and Manuscript Library (CtY-BR)
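The agency tally can be sketched the same way. Again, this is a hedged illustration rather than the actual job: the record layout and the name `agencies_using_034_source` are hypothetical, and the sketch assumes each record is counted once if it contains at least one 034 $2.

```python
from collections import Counter


def agencies_using_034_source(records):
    """Count 040 $a values across records that contain an 034 $2.

    Records use the same assumed layout as before: a list of
    (tag, subfields) pairs, with subfields as (code, value) pairs.
    """
    counts = Counter()
    for record in records:
        # Skip any record that has no 034 field carrying a $2 subfield.
        has_source = any(
            tag == "034" and any(code == "2" for code, _ in subfields)
            for tag, subfields in record
        )
        if not has_source:
            continue
        for tag, subfields in record:
            if tag == "040":
                for code, value in subfields:
                    if code == "a":
                        counts[value] += 1
    return counts


# Made-up sample: two J9U records with an 034 $2, one AU@ record with one,
# and one UPM record whose 034 has no $2 and so is not counted.
sample_records = [
    [("034", [("2", "gooearth")]), ("040", [("a", "J9U")])],
    [("034", [("2", "gooearth")]), ("040", [("a", "J9U")])],
    [("034", [("2", "bound")]), ("040", [("a", "AU@")])],
    [("034", [("a", "a")]), ("040", [("a", "UPM")])],
]

for agency, total in agencies_using_034_source(sample_records).most_common():
    print(total, agency)
```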
Curious about how the main user of this element was using it, I contacted the National Library of Israel. They were kind enough to reply to my odd query:
We have added geographic coordinates to records that describe ketubot, Jewish marriage contracts. The contracts almost always include the geographic location where the wedding takes place.
Using Google Earth ($2 gooearth), we added the coordinates with the intention of enabling the display of a Google map in this website.
I don’t believe that the site fully achieves their intended goal yet, but you can at least start to get an idea of how this data is going to be used. So even a lowly subfield can have higher aspirations for impact than may seem warranted at first.
Roy Tennant works on projects related to improving the technological infrastructure of libraries, museums, and archives.
Susan, thanks for pointing this out! I’ve suggested that we change our documentation to mimic LC’s, which points directly to that list of codes.
I think one reason this subfield isn’t being commonly used is an issue of workflow or confusion in where to get the codes. The 034 help page, http://www.oclc.org/bibformats/en/0xx/034.html, currently points to a list of relator codes rather than the list of cartographic source codes hyperlinked in your article, http://www.loc.gov/standards/sourcelist/cartographic-data.html.
Thanks for the interesting article!