Hanging Together

the OCLC Research blog


Transcription vs. Transliteration

February 25, 2015 - by Karen Smith-Yoshimura

War and Peace for HT blog 2015-02

This post is co-authored by Karen Coombs, OCLC Senior Product Analyst

Our virtual dialog began with Karen C’s tweet:

Tweet for HT blog 2015-02

But Karen S-Y couldn’t respond in just the 140 characters Twitter allows. Instead she sent an email:

“Transcribing transliteration” from a piece is almost an oxymoron. It rarely occurs. Transliteration by definition is converting one writing system (e.g., Chinese characters) into another writing system (e.g., Latin-script characters, or romanization). Catalogers in Anglo-American countries will transliterate non-Latin titles using ALA/LC romanization for the writing system on the piece; other countries may use other transliteration schemes.

You will generally find transliterated titles whenever there is a non-Latin title (in MARC, the non-Latin title is stored in an 880 field linked to the 245 title field). But OCLC doesn’t support all scripts, and not everyone takes advantage of the scripts OCLC does support. For example, we support Cyrillic, but only 10% of all Russian-language titles in WorldCat have the Cyrillic that appears on the piece. The ALA/LC romanization for Cyrillic is distinctly different from the ISO standard used by almost everyone else, so where we rely only on the romanized strings, the same title in Cyrillic may be represented by different clusters using different transliteration schemes. (In the graphic that precedes this entry, two romanizations are shown for the Russian “War and Peace”.)

In general, it’s better to rely on the non-Latin script title if we have it than on any transliteration that may also be in the record. The non-Latin script titles will be transcribed from the piece, whereas any transliteration will be supplied by a cataloger and may or may not match the transliteration supplied by another cataloger…
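To illustrate why romanized strings from different schemes fail to cluster together, here is a minimal Python sketch that romanizes the Russian title of “War and Peace” two ways. The mapping tables are assumptions made for this illustration: they cover only the letters in this one title, and the `romanize` helper is hypothetical, not part of any OCLC tool.

```python
# Partial, illustrative tables: only the letters needed for this one title.
# The two schemes differ here in a single letter, Cyrillic "й".
ALA_LC = {
    "в": "v", "о": "o", "н": "n", "а": "a",
    "и": "i", "м": "m", "р": "r",
    "й": "\u012d",  # i with breve, as in ALA-LC
}
ISO_9 = {**ALA_LC, "й": "j"}  # ISO 9 uses "j" for the same letter


def romanize(text, table):
    """Replace each Cyrillic letter using the table; pass other characters through."""
    out = []
    for ch in text:
        mapped = table.get(ch.lower(), ch)
        out.append(mapped.capitalize() if ch.isupper() else mapped)
    return "".join(out)


title = "Война и мир"
ala_lc = romanize(title, ALA_LC)
iso_9 = romanize(title, ISO_9)

print(ala_lc)           # romanized with the ALA-LC sample table
print(iso_9)            # romanized with the ISO 9 sample table
print(ala_lc == iso_9)  # the two strings do not match
```

A single differing letter is enough to make the two strings unequal, which is why matching on the Cyrillic title, when it is present, is more dependable than matching on either romanization.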

Karen C. wrote back: 

I think you answered the question the user was asking when you said that “The ALA/LC romanization for Cyrillic is distinctly different from the ISO standard used by almost everyone else, so where we rely only on the romanized strings, the same title (with the same Cyrillic string) will be represented by different clusters using different transliteration schemes.”

The user asked, “Your API returns texts in Russian in a strange transliteration format. As I see, it’s not ISO-9. For example, this text: “Oni vernulis? na rodnui?u? planetu, gde za vremi?a? mezhzve?znogo pole?ta proshlo bol?she sta let i vse? tak izmenilos?, chto Zemli?a? stala chuzhoi? im”. Please, can you tell me, how to convert this format into correct Cyrillic?”

At least I understand the why now.

Karen S-Y commented:

Russian also happens to be a case where there is almost a one-to-one correspondence between the romanized text and its Cyrillic counterpart. That is why most libraries didn’t bother adding the Cyrillic. Since the system requires that you also enter the romanization whenever you enter non-Latin script, adding it represents “double work.”
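Because the correspondence is nearly one-to-one, converting a romanized string back to Cyrillic can be sketched as a longest-match table lookup. The Python below is a rough illustration under several assumptions: the table is partial (just enough for the opening words of the user's sample), the `to_cyrillic` helper is hypothetical, and the input is assumed to carry intact diacritics. In the user's sample, the question marks appear to stand in for diacritics and the soft-sign mark that were lost in encoding, so they would need to be restored before a conversion like this could work.

```python
TIE = "\u0361"  # combining double inverted breve, used here for ALA-LC ligatures

# Partial ALA-LC-to-Cyrillic table (assumption: covers only this example).
TO_CYRILLIC = {
    "i" + TIE + "u": "ю",
    "i" + TIE + "a": "я",
    "zh": "ж",
    "\u02b9": "ь",  # modifier letter prime stands for the soft sign
    "a": "а", "d": "д", "e": "е", "i": "и", "l": "л", "n": "н",
    "o": "о", "p": "п", "r": "р", "s": "с", "t": "т", "u": "у", "v": "в",
}
# Try longer keys first so that multi-character sequences win over single letters.
KEYS = sorted(TO_CYRILLIC, key=len, reverse=True)


def to_cyrillic(text):
    """Convert romanized text to Cyrillic by greedy longest-match lookup."""
    out = []
    i = 0
    while i < len(text):
        for key in KEYS:
            if text[i:i + len(key)].lower() == key:
                cyr = TO_CYRILLIC[key]
                out.append(cyr.upper() if text[i].isupper() else cyr)
                i += len(key)
                break
        else:
            # No table entry: keep spaces, punctuation, and unknown letters as they are.
            out.append(text[i])
            i += 1
    return "".join(out)


romanized = "Oni vernulis\u02b9 na rodnui" + TIE + "u planetu"
cyrillic = to_cyrillic(romanized)
print(cyrillic)
```

The correspondence is "almost" rather than fully one-to-one, so a real converter would need the complete romanization table plus handling for the few ambiguous letter sequences.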

This prompted Karen C. to ask:

Does the MARC record have any way to tell you if a title was romanized?

Karen S-Y answered:

By inference, yes.

If the language code is for a language not written in Latin characters, and there is no 880 field in the MARC record, then the non-English information in the record is by definition all romanized (assuming the language of cataloging is English).
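The inference described above can be sketched in a few lines of Python. This is an illustration only: the record is reduced to its 008 field (which carries the language code in character positions 35-37) and a plain list of field tags, the list of language codes is a small incomplete sample, and the function names are hypothetical.

```python
# Assumption: a small, incomplete sample of MARC language codes for
# languages normally written in non-Latin scripts.
NON_LATIN_LANGUAGE_CODES = {
    "rus", "ukr", "bul", "chi", "jpn", "kor",
    "ara", "heb", "per", "gre", "tha", "hin",
}


def language_code(field_008):
    """Return the three-letter language code from 008 positions 35-37."""
    return field_008[35:38]


def is_romanized_only(field_008, tags):
    """Infer that a record is romanized only: the language is written in a
    non-Latin script and the record has no 880 (alternate graphic) fields."""
    return (
        language_code(field_008) in NON_LATIN_LANGUAGE_CODES
        and "880" not in tags
    )


# Made-up 008 values: 35 filler characters, the language code, then 2 more.
russian_008 = " " * 35 + "rus" + "  "
english_008 = " " * 35 + "eng" + "  "

print(is_romanized_only(russian_008, ["245", "260"]))         # no 880 present
print(is_romanized_only(russian_008, ["245", "260", "880"]))  # 880 present
print(is_romanized_only(english_008, ["245", "260"]))         # Latin-script language
```

A check like this is only as good as its language list; languages that are written in more than one script would need separate handling.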

The following table covers the top 15 languages in WorldCat that are written in non-Latin scripts WorldCat supports. For each language, it shows the percentage of records that include the original script (transcribed from the piece) and the percentage that have transliteration only (supplied by the cataloger). Most records for languages written in Cyrillic and Indic scripts contain transliterations only.

 Top 15 languages in WorldCat written in non-Latin character sets

Top 15 Languages in WC Written in Non-Latin Scripts Table 2015-02

Karen Smith-Yoshimura

Karen Smith-Yoshimura, senior program officer, worked on topics related to creating and managing metadata with a focus on large research libraries and multilingual requirements. Karen retired from OCLC in November 2020.

© 2024 OCLC || ISSN 2771-4802