3 Comments on “The Variation and the Damage Done”

  1. At the Consortium of European Research Libraries, where we build up the Heritage of the Printed Book database (6 million records as of this month), we also find that data conversions and system migrations have caused a lot of havoc, which we aim to put right to ensure the best possible search and retrieval experience.

  2. Part of the problem: our data entry programs usually don’t validate variable fields, even for subfields like 775 $e which are supposed to have controlled values. Fixed fields like the 00x do have validation, generally.

    Does Connexion allow catalogers to add invalid values in controlled subfields like 775 $e?

    Validation only goes so far, of course – no help with millions of records which may already have bad data, and no way to stop catalogers from accidentally entering the *wrong* valid value… but it should be relatively cheap and easy to add to cataloging programs, and would prevent a lot of new problems from being created.

    1. Andy,
      I checked with my colleagues, who said:

      “Connexion validation will only permit valid language codes in 775 $e. If the contents of that subfield is longer or shorter than 3 characters, validation generates an error message. If the code in that subfield is not in the list of MARC language codes, validation generates an error message.”


      “If the question arises as to how we ended up with so many variations in WorldCat given that we do validate that subfield in Connexion, the answer is that validation for batchload is not as strict as that for Connexion and allows addition of records with variations when the errors are minor.”

      I hope this clarifies the situation.
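
The two checks described in the reply above (exactly three characters, and membership in the MARC language code list) can be sketched roughly as follows. This is only an illustration, not Connexion's actual implementation: the function name `validate_language_subfield` is hypothetical, the code set is a tiny sample standing in for the full MARC Code List for Languages, and treating codes as case-sensitive lowercase is an assumption.

```python
# Hypothetical sketch of the two checks quoted above for 775 $e.
# SAMPLE_LANGUAGE_CODES is a small illustrative subset; the real
# MARC Code List for Languages contains several hundred codes.
SAMPLE_LANGUAGE_CODES = frozenset({"eng", "fre", "ger", "spa", "ita", "dut", "lat"})


def validate_language_subfield(value, valid_codes=SAMPLE_LANGUAGE_CODES):
    """Return a list of error messages; an empty list means the value passed."""
    errors = []
    if len(value) != 3:
        # Check 1: the subfield must be exactly three characters long.
        errors.append(f"expected 3 characters, got {len(value)}: {value!r}")
    elif value not in valid_codes:
        # Check 2: the code must appear in the controlled list.
        # Assumption: comparison is case-sensitive, so "ENG" is rejected.
        errors.append(f"not in the language code list: {value!r}")
    return errors


if __name__ == "__main__":
    for candidate in ["eng", "english", "en", "enq", "ENG"]:
        problems = validate_language_subfield(candidate)
        print(candidate, "->", "ok" if not problems else problems)
```

As the earlier comment notes, a check like this cannot catch a valid code that is simply the wrong one for the record, and it does nothing for records that already contain bad data unless it is also run over the existing database.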
