I have been doing a lot of editing of worksets datamined from WorldCat to identify the titles of original works (in the original language and script) and all their associated translations, with their respective translators. This work is part of our Multilingual Bibliographic Structure activity. From these results we add work-translation “xR” records to the Virtual International Authority File (VIAF).
Now MARC has lots of fields that, if used, would make this easy. Alas, such is not the case. Some of the challenges I’ve encountered:
- Lots of records for translations lack a language code (tag 041) that would announce they are translations with the use of a first indicator of “1” (item is or includes a translation). Whether a record is a translation needs to be inferred from other information parsed from the record.
- The record for a translation lacks any information about what the original title was. There may be a note like, “Translated from the Chinese.” If the translation is in a language I can’t read, like Croatian or Serbian, I’ll use Google Translate and with luck, there will be a key word that resembles a word from one of the titles written by that author. I can then associate the translation record with the original work.
- In the absence of other information, our algorithm inferred that if a title matched and more holdings were attached to records in one language than in any other, then that language must be the original language of the title. That works well enough most of the time, but I ran across Japanese authors whose works are often translated into Chinese (far more than they are translated into English). It seems OCLC members hold more of the Chinese translations than the Japanese originals, which makes sense since more people can read Chinese than Japanese. But it meant I had to find the original Japanese title and flag the Chinese title as a translation. You can see one of these results from the works/expressions associated with the Japanese mystery writer Higashino Keigo under “Uniform Title Links”.
- When a title has been translated into the same language multiple times, it becomes even more important to identify the translator of each to help people decide which translation they may prefer. Only a subset of records for translations include an added entry for the translator, and those that do often lack a relator term ($e) or a relator code ($4, which I prefer since it’s language-neutral). To identify the translator, we parsed the statement of responsibility, which meant we had to refer to a long table of all the possible ways “translator” might appear in different languages (a partial list introduces this blog post). The biggest challenge: the records for translations that listed the translator as the author! That’s when I asked my colleagues in Quality Control to also change the WorldCat records: to make the author of the original work the personal name main entry and move the translator into an added entry.
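Three of the heuristics in the list above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the production code: the dictionary-based record shape, the field name `f041_ind1`, the function names, and the three translator patterns are all hypothetical stand-ins (the real table of translator phrases is far longer), and the sample names and holdings counts in the usage note are invented.

```python
import re
from collections import Counter

# Hypothetical, simplified record shape (not a real MARC library's API), e.g.:
# {"lang": "chi", "holdings": 120, "f041_ind1": "1", "responsibility": "..."}

# A tiny illustrative sample of translator phrases; the real table is much longer.
TRANSLATOR_PATTERNS = [
    re.compile(r"translated (?:from the \w+ )?by ([^;/]+)", re.IGNORECASE),
    re.compile(r"traduit (?:de l'\w+ |du \w+ )?par ([^;/]+)", re.IGNORECASE),
    re.compile(r"übersetzt von ([^;/]+)", re.IGNORECASE),
]


def is_flagged_translation(record):
    """True only when tag 041 is present with a first indicator of "1"."""
    return record.get("f041_ind1") == "1"


def guess_original_language(records):
    """Holdings heuristic: pick the language with the most total holdings.

    This is a guess, not a fact; it misfires when a translation is more
    widely held than the original.
    """
    holdings = Counter()
    for record in records:
        holdings[record["lang"]] += record.get("holdings", 0)
    if not holdings:
        return None
    return holdings.most_common(1)[0][0]


def extract_translator(responsibility):
    """Return the translator named in a statement of responsibility, if any."""
    for pattern in TRANSLATOR_PATTERNS:
        match = pattern.search(responsibility)
        if match:
            # Drop surrounding spaces and the trailing full stop.
            return match.group(1).strip(" .")
    return None
```

Given matching-title records with 40 holdings in Japanese and 150 in Chinese, `guess_original_language` returns `"chi"`, which is exactly the misfire described above for Japanese authors and why those results needed manual review. Likewise, `extract_translator("John Roe ; translated by Jane Doe.")` returns `"Jane Doe"`, but only because “translated by” happens to be in the pattern table.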
The relationship of a work (with an author) to translations (with their respective translators) is relatively straightforward once we’ve identified the correct pieces. I’ve appended below a diagram using one of the Chinese classical novels written in the Ming dynasty as my example. It’s been translated into dozens of languages, multiple times into English, and I include just a subset here. The diagram shows reciprocal links from the work (in blue) to each translation (in red), and from each translation back to the work it is a translation of.
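The reciprocal linking can also be sketched as a small data structure. The class and function names here are hypothetical, and this is not how VIAF stores its records; it only illustrates that each link is recorded in both directions. The usage note picks *Journey to the West* as a Ming dynasty example for illustration, which is not necessarily the novel shown in the diagram.

```python
from dataclasses import dataclass, field
from typing import List, Optional


# eq=False keeps identity comparison, which avoids recursing through the
# circular work <-> translation references when comparing objects.
@dataclass(eq=False)
class Work:
    title: str
    author: str
    language: str
    translations: List["Translation"] = field(default_factory=list)


@dataclass(eq=False)
class Translation:
    title: str
    translator: str
    language: str
    translation_of: Optional[Work] = None


def link(work, translation):
    """Record the relationship in both directions."""
    work.translations.append(translation)
    translation.translation_of = work


def translators_for(work, language):
    """List the translators of a work into one target language."""
    return [t.translator for t in work.translations if t.language == language]
```

For example, after creating `Work("西遊記", "Wu Cheng'en", "chi")` and linking two English translations to it, *Monkey* by Arthur Waley and *The Journey to the West* by Anthony C. Yu, `translators_for(work, "eng")` returns both translators, and each translation's `translation_of` points back to the same work.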
Karen Smith-Yoshimura, senior program officer, worked on topics related to creating and managing metadata, with a focus on large research libraries and multilingual requirements. Karen retired from OCLC in November 2020.