Mark Davis blogged today on the Official Google Blog that Unicode passed a new milestone last December: for the first time, Unicode became the most frequent encoding found on Web pages, “overtaking both ASCII and Western European encodings—and by coincidence, within 10 days of each other.” The accompanying graph shows the speed of the Unicode uptake.
“You can see a long-term decline in pages encoded in ASCII (unaccented letters A through Z). More recently, there’s been a significant drop in the use of encodings covering only Western European letters (ASCII and a few accented letters like Ä, Ç, and Ø). We’re seeing similar declines in other language-specific encodings. Unicode, on the other hand, is showing a sharp increase in usage.”
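A quick sketch (not from the post) of what the quoted encoding distinction means at the byte level, using Python's standard codecs. The sample string is hypothetical, chosen to include the accented letters the quote mentions:

```python
# Illustrative sketch: how ASCII, a Western European encoding (Latin-1),
# and Unicode's UTF-8 differ at the byte level.

text = "Ä, Ç, and Ø"  # accented letters cited in the quote

# ASCII covers only unaccented A-Z, so accented letters cannot be encoded.
try:
    text.encode("ascii")
except UnicodeEncodeError:
    print("ASCII cannot represent this text")

# Latin-1 (ISO-8859-1), a Western European encoding: one byte per character.
latin1_bytes = text.encode("latin-1")

# UTF-8, the dominant Unicode encoding: accented letters take two bytes each,
# but every script in Unicode can be represented.
utf8_bytes = text.encode("utf-8")

print(len(latin1_bytes), len(utf8_bytes))

# Decoding bytes with the wrong encoding produces the garbled "mojibake"
# familiar from mismatched Web pages.
print(utf8_bytes.decode("latin-1"))
```

The last line is exactly the kind of garbling that arises when a page is served with a declared encoding that doesn't match its actual bytes.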
RLG was a founding member of the Unicode Consortium, and I have had the pleasure of seeing its uptake in the library community. With the widespread adoption of Unicode, I’ve seen far fewer instances of bibliographic records or Web pages in Chinese and Japanese scripts garbled by incompatible encodings. It’s gratifying to see the investment in making the world’s languages accessible to everyone pay off. So easy to take it for granted…
Karen Smith-Yoshimura, senior program officer, works on topics related to creating and managing metadata with a focus on large research libraries and multilingual requirements.