Archive for February, 2009

‘Relentless Apocalypse?’ Manuscripts, Digital Books and Rights

Thursday, February 26th, 2009 by Jennifer

Perhaps obliquely related to Merrilee’s recent posts and Ricky’s report, the current Harper’s carries an article by Gideon Lewis-Kraus: “The Last Book Party: Publishing Drinks to a Life After Death.” That end-of-the-world book party was at the Frankfurt Book Fair last October.

The problem with publishing is the relentlessness of the apocalypse. Since the seventeenth century, catastrophe has desolated the book industry on a generational level: a censorious Catholic court, Jewish moneylenders, the War of the Spanish Succession, the Gregorian calendar (which led, in its maiden year, to a disastrous misscheduling of the Frankfurt Book Fair), the railroad, the post office.

Lewis-Kraus points out that once writers surrender their manuscripts, the “professionals” feel the “books are now, rightfully, theirs – theirs to talk about, theirs to own, and most importantly, theirs to sell…” This seems to me a seventeenth-century understanding of ‘copy’ embedded in the English origins of ‘copyright’ – manuscripts were owned by printers, not authors. This model for publishing ‘copy’ now apparently governs some digitization of unique materials.

Writers may not see it this way. At the 2008 Frankfurt Book Fair’s opening press conference, Paulo Coelho declared himself to be a “pirate of myself.” “He has set up a website, he explains, where you can download his books for free; he says that the goal for a writer is to get his books into the hands of as many readers as possible.” My personal goal as a librarian and archivist is to get my collections into the hands of as many people as possible.

Might Google and publishers be thinking the “books are now, rightfully, theirs…”? Michael Pietsch, publisher of Little, Brown: “This is all about the commodification of books. It’s a writer’s version of hell.” If this is a writer’s version of hell, is it the library’s version of purgatory? At least the writers are ostensibly at the table.

Mostly, though, in his article Lewis-Kraus paints the Frankfurt Book Fair as if it were a party filmed by Altman. It’s a fun read, and most folks will have to read it on paper. The link is here, but the OCLC Library doesn’t have a subscription, so I had to buy the magazine. Nice tactic, or are the publishers of Harper’s shooting themselves in the foot?

I googled the Frankfurt Book Fair. They report results of their 2008 survey (over 1,000 industry professionals from over 30 countries responded), including this tidbit:

Who is really in charge? When asked who was driving the move towards digitisation in the book industry, only seven per cent felt that publishers were leading the way:
• 22 per cent said that consumers were pushing the move towards digitisation
• online retailers like Amazon (21 per cent), Google (20 per cent), and the telecommunications sector (13 per cent) were not far behind
• only two per cent felt that authors were driving this aspect of the industry – and governments lagged even further behind with only one per cent

Libraries aren’t on this list. The top two concerns the respondents reported would be discussed at the fair last fall were copyright (28 per cent) and digital rights management (22 per cent).

Large-scale digitization of special collections: legal and ethical issues (part 2)

Tuesday, February 24th, 2009 by Merrilee

[In my last posting, I failed to note that Aprille Cooke McKay's excellent presentation on third party privacy can be found in the symposium Wiki]

The symposium “Legal and Ethical Implications of Large-Scale Digitization of Manuscript Collections” had two illustrative case studies, one on third-party privacy, and one on evaluating copyright status of works in archival collections.

The first was presented by Nancy Kaiser and Matt Turi (and the background materials can be found on the symposium Wiki). Nancy and Matt briefed us on three different collections, all containing materials that are in some way sensitive, and led a discussion of whether scanning these materials would be advisable. It’s both unfortunate and appropriate that the materials presented are not available on the Wiki, because each collection does present particular challenges — if you saw the materials you would know what I was talking about. One collection contains public health case studies — it is filled with health information and other information that could be considered sensitive. Another collection contains references to personnel actions. The third collection is a diary that is perhaps too revealing about the lives that intersected the diarist’s. In each case, it was not a matter of de-selecting a handful of questionable materials or omitting a series or sub-series from scanning: substantial portions, or indeed the whole collection, are in question. Redacting the materials would require more resources than the SHC (or any repository) has available. The point is moot in any case, since redacting the materials would render them useless for serious scholarship.

There was quite a bit of time for discussion by symposium participants. In each case, the audience cautioned that more time “aging” each of these collections was called for. From a scholarly perspective, these materials represent a treasure trove of information for scholars of the twentieth century. There was a tension in our discussions — it’s apparently fine to serve up these materials in the reading room to qualified scholars before they have aged. So if scholars can get themselves to the reading room, they have access to the materials. Yet many other qualified scholars are denied access because they cannot travel to where the materials live. Further, the assumption is that archivists can identify who is qualified to examine the steamy diary or the public health records, and who is not. I think this was something that made us all a little uncomfortable, even as “reasonable archivists.”

Another squirmer was how quickly we said to put these collections aside. Each collection was chosen for discussion because of its problematic nature, but I wondered: how representative are these collections of the twentieth-century collections in the SHC (or of any of our collections)? Are 20% of the collections problematic in these ways? 80%? How much important primary source material must we age, and for how long? We were discussing putting some of these materials in the cooler for 70+ years, which would mean Vietnam War era materials would become available somewhere in the neighborhood of 2030-2045. In our roles as reasonable archivists, is that level of access reasonable? I am hearing the voice of the twentieth-century scholar that I once was saying, “Um, excuse me.” (And indeed, one of the things the Extending the Reach of Southern Sources project heard loud and clear from scholars in their workshops was: “Don’t forget about the twentieth century.”)

The second case study was presented by Maggie Dickson, and was appropriately titled “Due Diligence, Futile Effort: Pursuing Copyright Holders for the Digitization of the Thomas E. Watson Papers.” This is a great piece of work and I hope it is published soon (you can see some of the details in the Wiki). The SHC undertook a detailed analysis of a single collection, the Thomas E. Watson Papers. The results are stunning, but not surprising. The collection has approximately 8500 items, which translates into 3304 names. Bulk dates on the collection are 1880s-1920s, but the full range is 1873-1986. Analysis was done on the names represented in the collection in order to gather information that would be useful in determining copyright status, chiefly each author’s date of death. In the US, the copyright term for unpublished materials is the life of the author plus 70 years, which means that this year, works created by authors who died before 1939 are now fair game. (You start to feel kind of ghoulish when your work revolves around getting excited about knowing death dates.) From all this work (14 weeks’ worth of full-time effort), we learn that 18.4% of the collection is out of copyright and that 33.32% is in copyright. For 47.55% of the collection, they were unable to make a determination. If you had some good tools, you could automate some of this analysis, but the end result means that a third of the collection is clearly in copyright, and you could guess that some good portion of the undetermined remainder is also in copyright. If you carried this work to its diabolically logical end and sought permissions before digitizing, you would be contacting thousands of people, some of them the rights holders themselves, undoubtedly many of them heirs. Denise Troll Covey from Carnegie Mellon University has presented compellingly about work done to secure permissions for published works. (See, for example, this presentation from the 2004 Spring DLF Forum (PDF).) The work is high effort and low yield, even in an optimized situation. Working with published materials is far easier than chasing down unknown correspondents in manuscript collections, and there’s no good way to optimize the work in some of the ways that Covey has done.
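For the curious, the cutoff arithmetic behind this kind of analysis is simple to sketch. What follows is only an illustrative sketch of the life-plus-70 rule for unpublished works, not the SHC’s actual methodology; the function name and three-way status labels are my own invention:

```python
from datetime import date

# US copyright term for unpublished materials: life of the author + 70 years.
UNPUBLISHED_TERM_YEARS = 70

def copyright_status(death_year, current_year=None):
    """Rough status check for an unpublished work by a known author.

    Returns 'out of copyright', 'in copyright', or 'undetermined'
    (the last when no death date could be found -- as happened for
    nearly half the names in the Watson Papers analysis).
    """
    if current_year is None:
        current_year = date.today().year
    if death_year is None:
        return "undetermined"
    # The term runs through the end of the 70th year after death,
    # so the work is clear once that year has passed.
    if death_year + UNPUBLISHED_TERM_YEARS < current_year:
        return "out of copyright"
    return "in copyright"

# In 2009, authors who died before 1939 are clear: 1938 + 70 = 2008 < 2009.
print(copyright_status(1938, current_year=2009))  # out of copyright
print(copyright_status(1939, current_year=2009))  # in copyright
print(copyright_status(None, current_year=2009))  # undetermined
```

The hard part, of course, is not the arithmetic but populating the death dates at all — which is exactly where 47.55% of the Watson names defeated the project.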

I am grateful that the SHC has undertaken this work; I claim that we can now firmly say: this is not how a reasonable archivist would spend his time. Instead, I think we should spend our time coming up with reasonable and workable policies that allow us to acknowledge the questionable copyright status of this material, and developing responsive and responsible takedown policies for materials that are problematic. At least that’s how I think reasonable archivists should behave.

I’ll continue this series with two panel discussions, one on ethical and professional issues, and the other on legal issues.

Large-scale digitization of special collections: legal and ethical issues (part 1)

Tuesday, February 24th, 2009 by Merrilee

I was fortunate enough to attend a symposium hosted by the Southern Historical Collection, University of North Carolina, Chapel Hill. The title of the symposium pretty much sums up many of my own interests and concerns these days: Legal and Ethical Implications of Large-Scale Digitization of Manuscript Collections. The meeting was quite meaty and worthwhile, more than I can possibly tackle in a single posting. So I will be reporting on the conference serially, and speculating along the way about future directions and implications.

Some background
Archives and special collections are certainly doing a lot of good work to increase the “flow” of archival materials, both in terms of picking up the pace of describing collections and in terms of getting collections online. This is reflected both in the More Product, Less Process report and in our own Shifting Gears paper. There is an acknowledgment that digitizing collections, in whole or in part, gives greater access to collections. Some people use the term “democratizing” in reference to digitizing collections. However, there are still barriers to providing this more democratic access to collections, particularly for collections created in the twentieth century. Issues include legal considerations — not only copyright, but also in some cases compliance with US regulations such as HIPAA (Health Insurance Portability and Accountability Act) and FERPA (Family Educational Rights and Privacy Act). Our various states and jurisdictions have differing rules regarding how and when personnel records may be made public. I’m confident that in the European Union and beyond there is a similarly mystifying web of law, policy, and regulation that all conspires to dampen the intentions of the most well-meaning and public-spirited archivist. Beyond the law, we must also grapple with what is ethical. Can putting documents online damage reputations, or hurt feelings? Will digitizing collections create a chilling effect on donations of materials to special collections? Providing a democratizing level of access to the primary source material that documents the twentieth century, while balancing legal and ethical issues, is challenging at best.

When we were putting together our own symposium, Digitization Matters, we ruled two issues out of scope: copyright and the mechanics of digitization. We felt that legal issues merited a separate lengthy forum and discussion. I also think that discussions around copyright get complicated and scary, and tend to paralyze institutions into non-action. Furthermore, copyright issues are irrelevant for a large body of materials.

Now, 18 months later, institutions are moving ahead with planning and execution of digitizing materials at a rapid pace. We can no longer defer discussions around copyright. And so, this is a long winded way of saying that this symposium and ensuing discussions come at a perfect time. I hope to summarize the presentations and discussions from that day, and point you at the symposium Wiki, where many of the presentations and other materials have already been posted.

Extending the Reach of Southern Sources
Before I dive into the meat of the symposium, it’s worth saying a few words about UNC’s Mellon-funded project, “Extending the Reach of Southern Sources: Proceeding to Large-Scale Digitization of Manuscript Collections.” (Some of this may be folklore, so please accept this as a good story if I am getting the details wrong.) In the careful-what-you-wish-for category, UNC applied to Mellon to fund digitization of the Southern Historical Collection — all of it. Mellon, in the person of Don Waters, countered by asking UNC to restructure their request to address all the issues they would need to consider in developing a plan to digitize their collections. Those issues include setting priorities (taking into consideration the needs of scholars), copyright and ethics, and sustainability. Thus far, the project has focused considerable effort on gathering information in workshops with scholars, symposia such as the one I attended, and a good deal of fact-finding from those in the community. I am particularly looking forward to the reports from the scholars’ workshops. Findings will be held up against the holdings of the collection — the SHC is developing a matrix that will help to structure digitization priorities against a backdrop of need, resources, and risk.

Kicking it off: what about third-party privacy?
Aprille Cooke McKay kicked off the conference with a talk on third-party privacy issues. The talk addressed mostly legal, but also ethical issues, and brought in some field reports that we can look to in assessing risk. Aprille started out by referring to the SAA Code of Ethics, which says (in part): “Archivists protect the privacy rights of donors and individuals or groups who are the subject of records.” We have an ethical obligation, but also a legal obligation to uphold any duty of confidentiality, particularly when expressed in a donor agreement (a contract between the donor and the institution).

That’s a clearcut case of what must be done, but what to do in the case where the donor breached their legal duty by donating stolen or classified materials? In the case of Brown & Williamson v. Regents of California, a judge ruled that the materials stolen from Brown & Williamson by “Mr. Butts” and made available as part of the Tobacco Archive (now the Legacy Tobacco Documents Library) are of such public value that they didn’t need to be returned (or taken down once digitized). In any case, the breach is the donor’s liability, not the repository’s. There is little case law in this area.

Part of the reason that archives are so seldom (visibly) sued for disclosing private information is that paper documents are obscure — if there is “tale telling” it’s done by those who use the documents (scholars or journalists). However, in a digital environment, documents are far less obscure. Here, we must establish standards of care, which are currently lacking. And here Aprille said what I think was the key phrase for the rest of the conference: what would a reasonable archivist do?

The presentation also covered other areas of interest (or areas that should be of interest to reasonable archivists). What defines “private”? How might the passage of time soften the definition of what is reasonably private, and how does the law view privacy? (It is not covered by US federal law, and state laws generally do not favor the overly sensitive.) What constitutes defamation? (Here again, documentation showing what the reasonable archivist would do is needed.) Libel (also a matter of state law) has a short statute of limitations and a low win rate for plaintiffs. She also explored approaches that help to mitigate risk, such as “aging” material, creating good takedown policies, being respectful of complaints, and developing a contingency fund to cover litigation. There are not a lot of great case studies in this area, and we need to be alert for them. Some brave institution could go forward on behalf of the community and create case law.

My takeaways from this presentation were two-fold. In many ways, we have been protected by the relative obscurity of paper documents held in discrete physical locations. I think that in many cases our current practices would not hold up to the scrutiny of protecting privacy, and that conversations about what to digitize and what not to digitize will cause us to reflect on what we collect and give access to in the reading room as well as on the screen. To return to Aprille’s query: what would a reasonable archivist do? That is the question we all need to face. This reflection is a good thing for the archival community, and I look forward to the continued dialog.

And, as for the title of this post, we clearly need some shorthand for this problem. New acronyms, anyone? LSD:LE?

New at the Arcade: a multi-player Catalog

Monday, February 23rd, 2009 by Günter

The Brooklyn Museum, the Frick and the Museum of Modern Art have launched a shared catalog called “Arcade” with 800K records – take it for a spin here, and read all about it in this press release.

It’s a development I’ve been watching closely over the years, from the initial Mellon planning grant, to the creation of the NYARC consortium (which also includes the Met), to the nascent efforts to involve other local public and academic libraries in collaborative work. OCLC Research supported this effort by analyzing the holdings of the four NYARC libraries, which provided a further impetus for joint work: when he crunched the numbers, my colleague Brian Lavoie found that 83% of the combined NYARC holdings were held by only a single library (find the full report here [pdf]). From this vantage point, providing better access to the combined holdings of these libraries creates a tremendously enriched resource.

A statistic of this sort (although not directly drawn from our report) also made it into the coverage of the New York Times, again testifying how this kind of evidence is seen as a major motivator:

“What’s interesting is that there is only about a 10 percent overlap in titles between the holdings of the museums,” said Anne L. Poulet, director of the Frick Collection, which runs the Frick Art Reference Library.

Kudos to all of those involved in launching Arcade! I am looking forward to seeing the collaborative relationship among these New York City libraries deepen further as they continue their quest to better serve their users and reap economies of scale along the way.

The Future Now: LAM @ U of Calgary

Friday, February 20th, 2009 by Günter

Last Thursday, I was on the phone and online by 6:30am California time to see Tom Hickerson’s presentation “Convergence of Knowledge and Culture: Calgary’s Design for the Future,” webcast as part of our Distinguished Seminar Series. Calgary has gone further than most places in moving towards library, archive and museum (LAM) convergence: in 2006, a new department called “Libraries & Cultural Resources” administratively integrated a vast swath of the campus LAMs, and with the building of the Taylor Family Digital Library, many of these resources will physically come together in the same space at the very heart of the University of Calgary. Within the building, Tom envisions “an overlap so seamless that students won’t know who provided the service they’re consuming.”

Tom was careful to point out that proximity or a shared space are not a guarantee for convergence. Cross-departmental teams are currently working on detailed plans for collections, research, technology, staffing, etc. to ensure that the necessary cultural change happens alongside the plans for moving into the new building. One of Tom’s quotes I will remember: “As long as all budgets are separate, you are working against each other.” Among the key factors to set convergence on the right track he mentioned a top-down mandate, as well as an integrated management structure, both catalysts we also identified in our “Beyond the Silos of the LAMs” [pdf] report.

Between the lines, I also thought I noticed some rather sizeable incentives: the Nickle Arts Museum, for example, will benefit from the move into the new facility by being exposed to an estimated stream of foot traffic of some 12K students/day right outside its space – one would imagine that this would increase the current visitorship of 25K/year significantly.

If you want to see how the future is built, you can watch the Taylor Family Digital Library grow on this webcam. If you want to see Tom Hickerson’s talk, the webcast should be linked to from here shortly.

We’re On The Case

Tuesday, February 17th, 2009 by Roy

Yesterday during what is celebrated as the President’s Day holiday in the U.S. (in honor of Washington and Lincoln, both with birthdays in February), I admit to being a bit disgruntled at having to show up for work. It was fairly obvious from the lack of traffic during my drive in, as well as the empty garage and office building, that we were some of the very few who were working.

You see, OCLC only provides seven holidays:

  1. New Year’s Day
  2. Memorial Day
  3. Independence Day
  4. Labor Day
  5. Thanksgiving Day
  6. Day after Thanksgiving
  7. Christmas Day

No President’s Day, no Martin Luther King, Jr. Day, no Columbus Day, that’s it.

But then I had a good conference call with colleagues here and in Dublin, Ohio. And I plowed through some email, read some documents, and got a start on a presentation. And I enjoyed lunch with colleagues I don’t often see who were here from Dublin and Seattle. And my day was capped off with an introduction to what will be our new platform for partner collaboration.

After all that I accomplished and experienced during the day I was in a considerably better mood on my way home, and it occurred to me that as a non-profit we need to make sure our members get as much value for their investment as we can possibly create.

So I no longer care if you were off skiing (Northern hemisphere), lying on the beach (Southern hemisphere), watching a movie, or hanging out with your family. We were right where we belonged. On the case, creating value, working for you. Besides, it just makes Memorial Day look all that much better.

“Things don’t really get moving until a page is turned.”

Tuesday, February 10th, 2009 by Jennifer

I’ve slipped the draft of my survey of user studies into a drawer, walked away from my desk, and crossed the Bay to CODEX, a veritable orgy of international arts of the book.

This morning at the Berkeley Art Museum Emily McVarish shared her latest book, The Square, with over 250 artists, collectors, museum curators and librarians. She described the square as a public space riddled with hand-held technology. Has the city square – the grid of daily life – been replaced by the screen? Figures – derived from video clips of people walking streets talking on cellphones – move through the (squarish) pages of McVarish’s new book.

Are both the book and the city commons “breaking down into heterogeneous intangibles?” she wondered.

I was in the audience with a clump of RLG colleagues from Yale, Stanford and LC. They teased me, “Did they let you out?” I asked, “Aren’t artists’ books at the nexus of libraries, manuscripts and museums?” Librarians, archivists, collectors, curators and creators all recognize this paradox, since they collect the same stuff for different contexts.

Thinking about McVarish’s work, I hesitate to present here a crude SAT-test syllogism: the synthesis of book and art is analogous to the relationship of page and screen. Earl Collier, on the CODEX board, silently waved his notebook, pencil, and PDA phone. “I like ‘em both,” he said.

The CODEX bookfair, held on afternoons during the symposium, is astounding. This beauty, creativity and global material culture can embody the antithesis of LAM silos. At least until you start cataloging…

The distributed Centre: A Museum of Britishness

Monday, February 9th, 2009 by Günter

I’ve recently subscribed to the MLA News e-mail list, on which the Museums, Libraries and Archives Council of the UK pushes out a newsletter once or twice a month. It’s a very interesting view into the UK LAM community, and attendant political developments. The latest installment features an entry (online here) on planning for a Museum Centre for British History, and much to my surprise, I found out that under that heading, you won’t find the details for a massive new building project.

The MLA has consulted a range of leading experts, and has recommended the idea of a federated organisation that could draw on the collections in museums, libraries, archives and heritage sites across the UK to capture the story of Britain. The proposed centre would co-ordinate research and scholarship, plan thematic programmes, and schedule shows and events in places across the country.

Roy Clare, the MLA Chief Executive, is quoted as saying:

The Museum Centre for British History would be a national federated organisation (including museums, universities, scholars, research institutions etc), supported by a very small staff working in existing premises, that would pull together research, planning and programming around the theme of Britain’s story.

Pulling together the strands of British history from within existing libraries, archives and museums provides a unique proof-of-concept for how these entities can collaborate to tell this story. I am looking forward to watching this new Centre take shape, and to the lessons it will hold for substantive collaboration.

Herbert’s Adventures In Linking

Thursday, February 5th, 2009 by John

The title of this post is my homage to another famous Belgian.

I have been posting from the 9th International Bielefeld Conference in Germany. In yesterday’s closing keynote, Herbert Van de Sompel gave a most unusual presentation. Preparing, on his return to the Los Alamos National Laboratory, for a six-month sabbatical, he used the occasion to review the work he and his various teams have done over the past 10 years or so – and bravely assessed the success or otherwise of the various major initiatives in which he has been involved – SFX, OpenURL, OAI-PMH, OAI-ORE and MESUR (not for the acronymically faint-hearted). Incidentally, the 10-year boundary was as much accident as design. With the exception of one slide (pictured) showing his various project clusters, he had not prepared a new presentation, but instead paced around in front of a succession of old ones – some looking pretty dated – displayed in fabulous detail on the gigantic screen in the Bielefeld Convention Centre main hall. With a plea for more work on digital preservation, he noted that he had discovered that his PowerPoint presentations more than 10 years old were no longer readable.

The SFX development work, done at the University of Ghent, has resulted in some 1,700 SFX servers installed worldwide, which link – at a conservative estimate – to some 3 million items every day. Less successful, in his view, was the OpenURL NISO standard. It took three years to achieve, and – despite his ambitious intentions at the time – is still used almost exclusively for journal article linking. Reflecting on this, he remarked that the library community finds it hard to get its standards adopted outwith the library realm.

Herbert was also ambivalent about OAI-PMH. The systemic change predicted at the time of its development has not happened, and may never happen. He remarked that ‘Discovery today is defined by Google’, and in that context PMH did not do a good job because it is based on metadata. Ranking is based on who points at you (see my earlier post on the Webometrics ranking). ‘No one points at metadata records’. But it still provides a good means of synchronising XML-formatted metadata between databases.

He feels that we are moving on from a central concern with journal articles in any case. ‘What do we care about the literature any more? It’s all about the data (and let’s make sure that the data does not go the way of the literature!)’. He offered some reflections on institutional repositories in passing. They are not ends in themselves (though often seem to be). There is a difference between their typical application in the US and in Europe. European libraries use them more for storing traditional academic papers – versions of the articles which appear in peer-reviewed journals. In the US, there is a tendency to use them for ‘all that other stuff’. They are relatively unpopulated due to the fact that authors find it hard to care once they have had the paper accepted by their intended journal. But the other problem is workflow. Most repositories require deposit procedures which are outwith faculty workflows. Worse – content is being deposited by faculty all over the web – on YouTube’s SciTV, on blogs, in flickr. They have no time left for less attractive hubs. We need a button with the simplicity and embeddedness of the SFX resolver button to be present in these environments before we will truly optimise harvesting of content into the repository. There is a challenge …

The ORE work learned lessons from PMH. PMH did not address web architecture primitives. That was why Google rejected the protocol. It did not fit with their URI-crawling world view. ORE therefore used the architecture of the web as the platform for interoperability.

As for the MESUR project, directed by his compatriot Johan Bollen, Herbert described it as ‘phenomenal’. MESUR took the view that citations as a measure of impact were appropriate for the paper-based world. But now we should assess network-based metrics (the best known of which is Google’s PageRank). A billion usage events were collected to test the hypothesis that network metric data contains valuable data on impact. The hypothesis, he believes, was proved correct. There is structure there, and the ability to derive usable metrics. Indeed, the correlations produced by MESUR reached the fairly radical conclusion that the citation analysis data we have been using for decades is an outlier when compared with network-based methods.

Overall then, more plus points than negatives. And not only was his audience not inclined to criticise, but he was urged to stay and complete his presentation even though it ran over his allotted time by about 20 minutes at the end of an intensive day. How many people in our profession could discuss their work with reference to so many iconic projects? He concluded with a simple message – which he had come to see clearly as he prepared this review: we do what we do in order to optimise the time of researchers. Some recent studies, such as the UK Research Information Network’s Activities, costs and funding flows in scholarly communications (discussed earlier in the conference by Michael Jubb, Director of RIN), and the more recent JISC report, Economic Implications of Alternative Scholarly Publishing Models: Exploring the costs and benefits, express researcher time in cash terms. It amounts to billions of pounds each year.

How much money has been saved, and so made available for further research, by the projects developed and overseen by Herbert and his colleagues? That is optimisation to be proud of.

Optimisers, One and All

Thursday, February 5th, 2009 by John

Librarian can be a fragile and even uncomfortable designation in today’s world. Nonetheless, as our roles continue to expand, change and develop, librarian as an anchoring designation may become all the more necessary. We could easily imagine it sitting at the centre of a mind-map, with dozens of roles spidering out of it. On Tuesday, the first day of the 9th International Bielefeld Conference here in Germany, Wendy Pradt Lougee listed some new capacities that she would like to see in entrants to the profession. One of them was Leverager – a word that sits awkwardly in UK English, where ‘leverage’ is used as a verb far less often than in US English. One that I might add, on the basis of at least two of yesterday’s presentations, is Optimiser.

Isidro Aguillo, Director of Madrid’s Cybermetrics Lab (CCHS-CSIC), spoke about the optimisation of university websites. CCHS-CSIC publishes the international Webometrics Ranking of World Universities.

Isidro discussed the new indicators of institutional web presence strength that his group is developing, classified into three types. Impact (link visibility and analysis) and Usage (visits, downloads) are well known. More challenging is Activity (number of web pages and documents; number of academic papers in Scholar and other databases; frequency of invocation of researchers’ names; distribution of content and its translation into other languages; blogmetrics). Activity indicators are becoming more important, and librarians may have particular expertise to offer as their universities seek to optimise their web presence along these lines.

The Cybermetrics Lab provides a Decalogue of good practices in institutional web positioning. I provide here an edited version.

The following recommendations are intended to advise universities and R&D institutions worldwide on maintaining an adequate web presence. Their websites should correctly represent their resources, activities and global performance, giving visitors a true picture of the institution. We encourage medium- and long-term projects that give priority to publishing large volumes of quality content under Open Access models.
1. URL naming
Each institution should choose a unique institutional domain that can be used by all of its websites. It is very important to avoid changing the institutional domain, as a change generates confusion and has a devastating effect on visibility values. Alternative or mirror domains should be avoided, even when they redirect to the preferred one. Well-known acronyms are fine, but the institution should consider including descriptive words, such as the name of the city, in the domain name.
2. Contents: Create
A large web presence is possible only with the effort of a large group of authors. The best way to achieve this is to allow a large proportion of staff, researchers and graduate students to be potential authors.
3. Contents: Convert
Many important resources exist only in non-electronic formats and can easily be converted to web pages. Most universities also have a long record of activities that can be published on historical websites.
4. Interlinking
The Web is a hypertextual corpus, with links connecting pages. If your content is little known (through poor design, limited information, or a minority language), small in volume, or of low quality, the site will probably receive few links from other sites. Measuring and classifying the links you receive from others can be revealing.
5. Language, especially English
The Web audience is truly global, so do not think only locally. Language versions, especially in English, are mandatory not only for the main pages but also for selected sections and particularly for scientific documents.
6. Rich and media files
Although HTML is the standard format for web pages, it is sometimes better to use rich file formats such as Adobe PDF or MS Word DOC, as they allow better distribution of documents.
7. Search engine-friendly designs
Avoid cumbersome navigation menus based on Flash, Java or JavaScript, which can block robot access. Deeply nested directories or complex interlinking can also block robots. Databases and even highly dynamic pages can be invisible to some search engines, so use directories or static pages instead.
8. Popularity and statistics
The number of visits is important, but it is just as important to monitor their origin and distribution, and the reasons visitors reach your websites.
9. Archiving and persistence
Maintaining a copy of old or outdated material on the site should be mandatory. Relevant information is sometimes lost when a site is redesigned or simply updated, and there is no easy way to recover the vanished pages.
10. Standards for enriching sites
The use of meaningful titles and descriptive meta tags can increase the visibility of pages.
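As a practical check of item 10 of the Decalogue (and, in passing, the robot-friendliness of item 7), here is a hedged sketch using only Python’s standard library to audit whether a page carries a meaningful title and a description meta tag; the page content below is invented for illustration:

```python
from html.parser import HTMLParser

class MetaAudit(HTMLParser):
    """Collect the <title> text and any <meta name="..." content="..."> tags."""
    def __init__(self):
        super().__init__()
        self.title = ""
        self.metas = {}
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and "name" in a:
            self.metas[a["name"].lower()] = a.get("content", "")

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data

# An invented institutional home page to audit.
page = """<html><head>
<title>Cybermetrics Lab - Example University</title>
<meta name="description" content="Research institute for web indicators.">
</head><body></body></html>"""

audit = MetaAudit()
audit.feed(page)
print("title:", audit.title)
print("description:", audit.metas.get("description", "(missing)"))
```

Run against real pages, a loop over an institution’s key URLs would flag any page whose title is empty or whose description meta tag is missing.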