I’ve always thought that one of the troubles with librarianship is that there are always more great ideas and projects than anyone has time for or can get funding for. As a result, we outsource projects to the people who have time and money and thus lose control over the end product. I have no idea if LibraryThing’s open source Open Shelves Classification Project is going to wind up looking like a library product or a vendor product, but I’m curious to find out. As Tim Spalding says, “You won’t be paid anything, but, hey, there’s probably a paper or two in it, right?” I haven’t seen much chatter, blog or otherwise, about this just yet but I’ll be keeping my eyes open. Whether or not this project is ultimately successful, I think it’s an interesting grass rootsy way of looking at ideas of authority and rejecting the top-down let-us-have-you-contribute-and-then-sell-it-back-to-you models we’ve been working under.
Someone asked me during one of my talks if I knew of any projects that were actually trying to open source cataloging records and the idea of authority records. I said I didn’t, not really. It’s a weird juxtaposition, the idea of authority and the idea of a collaborative project that anyone can work on and modify. I knew there were some folks at the Internet Archive working on something along those lines, but the project was under wraps for quite some time. Now, it’s not. It’s called Open Library and it’s in demo mode. You can examine it, and I encourage you to do that and give lots of feedback to the developers. Make sure to check the “about the librarianship” page.
Imagine a library that collected all the world’s information about all the world’s books and made it available for everyone to view and update. We’re building that library.
You know it’s true. I’ve known Eli Jacobowitz since before he was born and now he’s a smartie techie type with a newish blog about technology and education. Though he admits “IANAL”, he has written a nice post about why cataloging both sucks and rules and talks about the future of cataloging in a world where there is much much more information than there are “trained professionals” to help people make sense of it. Ultimately, the answer lies in standards, and this librarians already know.
Eventually, robots might catalog for us. (Librarians shudder.) What we now know is just how far away that is – bot catalogers will need much better AI than currently exists. But in order for this project to even be possible, we have to make our data bot-readable. That means implementing some of the cataloging technologies invented and refined by librarians over the centuries.
We need to standardize meta-data format and content. Digital resources need not only meta-data but also meta-meta-data describing the standards they conform with. Catalog and search solutions need to read this information and pass it on when communicating with other systems.
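Here’s a rough sketch of what Eli’s “meta-meta-data” could look like in practice: a record that says which standard it follows, and a reader that checks the record against that declaration before passing it along intact. The field names are the fifteen Dublin Core 1.1 elements; everything else (the `conforms_to` key, the schema identifier strings, the function names) is made up for illustration and not taken from any real catalog system.

```python
import json

# The fifteen element names of the Dublin Core Metadata Element Set 1.1.
DUBLIN_CORE_1_1 = {
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
}

# Hypothetical registry: (schema name, version) -> allowed field names.
KNOWN_SCHEMAS = {("dublin-core", "1.1"): DUBLIN_CORE_1_1}

# Example record. "conforms_to" is the meta-meta-data: it describes the
# standard the metadata follows. The key name is an assumption.
record = {
    "conforms_to": {"schema": "dublin-core", "version": "1.1"},
    "metadata": {"title": "Example title", "language": "ga"},
}


def read_record(raw: str) -> dict:
    """Parse a record and check its fields against the standard it declares."""
    rec = json.loads(raw)
    declared = rec["conforms_to"]
    key = (declared["schema"], declared["version"])
    if key not in KNOWN_SCHEMAS:
        raise ValueError(f"unknown standard: {key}")
    unknown = set(rec["metadata"]) - KNOWN_SCHEMAS[key]
    if unknown:
        raise ValueError(f"fields not in declared standard: {sorted(unknown)}")
    return rec


def forward_record(rec: dict) -> str:
    """Serialise for another system, keeping the standards declaration."""
    return json.dumps(rec, sort_keys=True)
```

The point of the design is only that the declaration travels with the record, so the next system down the line doesn’t have to guess what it’s looking at.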
I read a lot of blogs, but I don’t always follow through and read their links. I appreciate it when people whose opinions I trust can summarize long things for me. Sometimes I summarize those things for other people. Tonight I am reading the twenty-four page report More on What is Going on at the Library of Congress prepared for AFSCME 2910 by Thomas Mann. You can find it linked off of this page, if you really like reading these sorts of things.
He makes a lot of interesting points that other scholarly types have been trying to make in more clunky fashion for quite some time. In short, libraries that still exist for the purpose of furthering scholarship are having a harder time doing it, both because of the shift towards electronic resources and the “it’s all on Google” mentality and because our own institutions (LoC I am looking in your direction) seem to want to actively dismantle some of the better tools we have for organizing and accessing knowledge. I’m just pulling out a small part of this, but really you should go read the whole thing. Some people might take umbrage, but one of my favorite things about this particular presentation of the issues is that Mann really seems to have a well-researched opinion that he wants to get across without insulting anyone, having a hissy fit, or saying that other people are losers or idiots. It’s clear that he has a take on things, one that others would disagree with, but he lets his metaphors and ideas speak for themselves, even as he’s responding to people who I assume were disagreeing with his last paper on the subject.
The Continuing Need for Reference Librarians
What catalogs and portals cannot do, however, what classified bookstacks cannot do, what Internet search engines cannot do, what federated searching cannot do–these things can be done by reference librarians who, far beyond the capacity of any “under the hood programming,” are able to provide researchers with expert guidance on the full range of options available to them for their particular topics, in an intelligent sequence of use, with the best search options and sources segregated from thousands of blind alleys, dead ends, and mountains of unwanted irrelevancies.
Reference work, in other words, is not just a nice “add on” optional service; in its dual function of providing point-of-use instruction and overview classes it is integral to the efficient use of research libraries and to the promotion of scholarship in general. It cannot be replaced by “under the hood” programming improvements in library catalogs or portals, especially when such programming dumbs down multiple complex systems to a lowest common denominator of keyword searching–and also fails to search the vast arrays of resources that are not digitized at all.
Books are getting lost. When they’re lost, people don’t check them out. When people don’t check them out, we think people don’t like them. When we think people don’t like them, we sometimes weed them (if we can find them). Why is this happening? Bad cataloging, especially of books written in non-English languages. What’s going on, and how is rampant copy-cataloging making the problem worse?
Recently, [researcher Joyce] Flynn checked Harvard’s less-than-25-year-old computer-based catalogue system, and discovered that many – perhaps most – of the Gaelic and Irish books with Na … titles are miscatalogued and so, in this odd way, are half-missing. That catalogue system is now the only way the public can access titles in the Harvard College Library collections.
“The issue goes beyond just Harvard’s Widener Library,” Flynn says. “Because Widener is often the first North American library to acquire and catalogue an obscure foreign language title, Widener’s cataloguing data frequently become the standard for libraries that acquire the book later.”
Former eternal co-editor K.R. is working on another book, this time about cataloging. Read her request for submissions and write to her with your good ideas.
There’s a lot of talk going on lately about whether cataloging as it has been done really matters in the age of Google and keyword searching. I’ve been reading about it a lot, both online and in the print materials sent to me by the Sanford Berman postal express, including his back and forth letters to the head cataloger at LC. Sometimes it seems that everyone starts with the same data point, but still arrives at different conclusions. So, the OCLC team [who have a dog in this fight] tell us that “Ordinary people do not search subject headings, Berman or LCSH. They search key words.” which I think many people agree with. Then we read Thomas Mann [another dog-holder] who has a longish article in Library Journal about scholarly research and the ancillary functions of subject headings as more than just entry points to the information held in a catalog.
Keyword search algorithms, no matter how sophisticated their “relevance ranking” capabilities, cannot turn exactly specified words into conceptual categories. They cannot provide the linkages and webs of relationships to other terms (in a variety of languages, too), nor map out in any systematic manner the range of unanticipated aspects of a subject. Keyword searches cannot segregate the desired terms in relevant contexts distinct from the same terms used in irrelevant contexts.
In contrast, LC cataloging and classification—done by professional librarians rather than computer programs—accomplish exactly these functions that are so critical to scholarship. The search mechanisms created by librarians enable systematic searching, not merely desultory information seeking.
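To make Mann’s distinction concrete, here’s a toy sketch of the two kinds of search. The titles, subject headings, and “see” references below are all invented for illustration; they are not real LCSH entries, and the function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    title: str
    subjects: tuple[str, ...]


# Toy catalog. Titles and headings are made up, not real catalog data.
CATALOG = [
    Record("Mercury in the food chain", ("Mercury (element)",)),
    Record("Orbit of Mercury", ("Mercury (planet)",)),
    Record("Quicksilver poisoning in hatters", ("Mercury (element)",)),
]

# "See" references: variant terms a searcher might type, mapped to the
# controlled heading a cataloger assigned.
SEE_REFERENCES = {
    "quicksilver": "Mercury (element)",
    "hydrargyrum": "Mercury (element)",
}


def keyword_search(word: str) -> list[str]:
    """Match the exact word against titles, with no notion of context."""
    w = word.lower()
    return [r.title for r in CATALOG if w in r.title.lower()]


def subject_search(term: str) -> list[str]:
    """Resolve a variant term to its heading, then match on the heading."""
    heading = SEE_REFERENCES.get(term.lower(), term)
    return [r.title for r in CATALOG if heading in r.subjects]
```

In this toy catalog, `keyword_search("mercury")` lumps the planet in with the element and misses the quicksilver book entirely, while `subject_search("quicksilver")` finds both books about the element and nothing else. A real catalog is enormously more complicated, but that’s the gist of the argument.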
We all know Google is useful and is changing the way the average person searches for information. However, when we start to discuss whether Google is changing the way the average researcher does scholarship, then I think we have to be a lot more careful about understanding its [proprietary] mechanisms and thinking about what Google’s goals for Google are as well.
I was poking around on Amazon.com today and noticed two things:
- They have changed my name from Jessamyn Charity West to Jessamyn West, which means that clicking on my name gets you all the books by the other Jessamyn West. I can only imagine why this happened. To be fair, I complained and they changed it back to how it was before, but not before telling me that this sort of munging of author names was “a feature” of their system. The change is recent; the Google cache still contains my full name.
- Amazon’s Statistically Improbable Phrases, which are a whole new approach to the sticky issue of “aboutness.” Add to this the existing tools of concordance and readability and you’ve got two things: 1) strong “keeping up with the Joneses” pressure to submit to the Inside the Book program, and 2) the beginnings of cataloging by robots.
This all came to me a day after getting a fat envelope from Sandy Berman which included, among other things, some articles he had written about “bibliocide by cataloging” where subject headings assigned by OCLC or LoC or OCLC member libraries and passed down to thousands of libraries via copy cataloging are so vague as to be essentially useless as finding aids. Do these Amazon features solve this problem or compound it? Eli also expands a bit on what I said about Google a few days ago; these issues are not disconnected.
“Why catalog in-house? Why catalog locally? And why not outsource the whole operation? Because critical, creative catalogers within individual systems are the last and only bulwarks against the often error-laden, access-limiting, and alienating records produced by giant, distant, and essentially unaccountable networks and vendors.”