One million books scanned at UMich

The Chronicle of Higher Ed has a short blurb, and the dean of libraries, Paul Courant, has a longer post on his own blog.

The University of Michigan has hit the “one million books scanned” milestone. As far as I know, Michigan is the first library to have one million books from its own collections digitized and available for search (and, when in the public domain, available for viewing).

For more about the scanning project generally, including some insight into why people call it controversial, there’s a good long article from Campus Technology (link to printable version, all on one page) which goes into the logistics of the scanning program in some depth.

When it comes down to it, then, this brave new world of book search probably needs to be understood as Book Search 1.0. And maybe participants should not get so hung up on quality that they obstruct the flow of an astounding amount of information. Right now, say many, the conveyor belt is running and the goal is to manage quantity, knowing that with time the rest of what’s important will follow. Certainly, there’s little doubt that in five years or so, Book Search as defined by Google will be very different. The lawsuits will have been resolved, the copyright issues sorted out, the standards settled, the technologies more broadly available, the integration more transparent.

wrap-up before the wrap-up

As you know, I usually post the list of what I’ve read at around this time, but by my own admission I haven’t read enough this year, so I will be adding a few more leftover links in this space and posting a “best of” list in a day or two. First of all, check out what I saw in Boston.

sexy librarian?

It’s an ad for Sony’s “Reader Digital Book,” one of a zillion plastered all over the subway and train stations of Boston. I find it vaguely annoying, mostly because I find the commodification of reading annoying. The implication that some stupid computer is sexier than a real live person to help you with all your information needs? Stupid. Here are some other things left over from my inbox.

  • Well, this was in my literal mailbox… I never renewed my ALA membership after 2006. Last week I got a “Hey former member, maybe you’d like to reconsider?” piece of junk mail from them. I’ve been very happy with my VLA contributions and interactions, more so than I ever was with ALA. While I’m happy to see the good things that ALA is doing, the fact that I did basically everything I could to get off of their spammy mailing lists and email lists, only to continue to hear from them, is a bit disheartening. That said, my ALA website logins still work despite my not having paid them a thing. It all balances.
  • The Michigan University Librarian has a blog. Not a lot there but I really enjoyed the first post: Being in Bed with Google.
  • Washington state is the latest battlefield in the “let’s cut library positions in schools to save money” debacle. There is a very organized group called Fund Our Future Washington that is trying to stop this problem before it starts. Here’s an LA Times article with more information and a good recent supportive editorial from the Seattle Times.
  • I am revising my review policy. People mostly don’t read it anyhow. In short, I am reading less and have less time for unsolicited books. While I still like to receive books that people think I may like, I do not want to set expectations inappropriately. The short form is: if you will be upset if I do not read your book, please do not send it to me.

That’s it until the booklist. Happy New Year!

How to “Get More from the Web than Google Will Tell You”

CIO, the companion website to CIO magazine, talked to me a few weeks ago about what people who only know how to search Google are missing out on, especially in a business and market research context. Here’s the article: Six Techniques to Get More from the Web than Google Will Tell You.

I don’t talk about it much lately, but when I was fresh out of library school I did some work doing market research and other miscellaneous stuff for a recruiter who worked a lot for Amazon.com and it was fascinating to look at the questions she’d ask to try to help them find the right person for the job. I had a sort of sideways approach to some of the topics we researched and that seemed to help her find good people. I like getting to talk to people about the importance of primary source material and the difference between going to a library’s list of good links on their website and talking to the librarian (in person or over IM) directly. I have mixed experiences talking to reporters but I was really happy with how Margaret Locher, an MLS holder herself, represented the things that Ann Cullen from Simmons and I told her.

“authorities” and strap-on sex

On my fridge I have a photocopy of a letter that Sandy Berman sent to the Library of Congress this August suggesting that they establish dildoes as an LCSH. I got many fascinating photocopies along with it as supporting evidence. I enjoy being on Sandy’s mailing list. Today, vickiep from del.icio.us sent me a link to “strap-on sex” as a new Library of Congress subject heading. Hooray! Unfortunately, links that go into the Library of Congress Authorities searches aren’t permanent, but I was able to replicate the search and find the listing for dildoes in the weekly list for September 26th. Of particular interest to me is that the authority record for strap-on sex contains Wikipedia, Google and “LC database” as notes in the 670 field. Update: Tim at LibraryThing has a post showing the record.

Libraries Shun Deals to Place Books on Web, really?

Quick quiz: when you read a headline like the one in the New York Times today, Libraries Shun Deals to Place Books on Web, do you think that the libraries involved are

a) sticking up for free access to information
b) prohibiting free access to information

Now read the article and tell me if you feel the same way. The article was also published in the International Herald Tribune with the title Research libraries close their books to Google and Microsoft which was where I read it at FreeGovInfo yesterday.

friday evening linkdump of sorts

So, I don’t make you all sit through my del.icio.us links auto-posting, but sometimes I have a few unrelated things to share that don’t really have their own full posts to go along with them. So here are a few things that are only sort of library related that I think you might be interested in.

The Decoration of Houses - book shelving chapter

In the days when furniture was defined as “that which may be carried about,” the natural bookcase was a chest with a strong lock. These chests, packed with precious manuscripts, followed the prince or noble from one castle to another, and were even carried after him into camp. Before the invention of printing, when twenty or thirty books formed an exceptionally large library, and many great personages were content with the possession of one volume, such ambulant bookcases were sufficient for the requirements of the most eager bibliophile.

I enjoyed Henry Petroski’s treatise on book shelving called The Book on the Bookshelf. I am also enjoying Edith Wharton’s 1897 chapter on a similar topic. [thanks will!]

A few things that didn’t make it to the carnival…

There was so much good stuff in the Carnival yesterday, that I didn’t append some of my favorite links from the week, but here they are.

- Two links about Google Books. One is Scott Boren’s long piece on LISNews about full text searching in books: what you can search and how you can search it. Great, well-researched piece. The second is Julia Tryon’s contribution to FreeGovInfo concerning the amount of government information available via Google Books. Google provides no statistics. This will be part of an ongoing project she’ll be working on there. Stay tuned.

When looking at the search results in Google for publisher field has GPO, I found 141,600 items, only 82,487 of which were available in the full view. And although it is nice to think that we have the full text for 82,487 documents, not all of them can be used. I randomly picked a title to see how it looked and chose the Statistical Abstract for 1954. The pages were clear enough to read easily but on every even numbered page part of the right hand column was chopped off.

- Also from FreeGovInfo comes this analysis of Google Video’s closing and what happened to all those DRMed video files that people supposedly “purchased.” Please read Part I: DRM Killed the Files and also Part II: Why the Google Video story should scare you.

- Karen Schneider has been writing some great stuff lately. It’s been fun to see her getting into what I see as the more technical side of librarianing because her explanations of techie stuff are clear and free of nonsense while still being readable and engaging. Her article in Library Journal Lots of Librarians Can Keep Stuff Safe about LOCKSS and Portico really helped me understand the fairly complicated world of e-journal archiving.

- Brian Herzog’s always-excellent blog has pulled some Reader’s Advisory suggestions off of ME-LIBS, the Maine Libraries discussion list, and added his own commentary. Brian also made a custom book review search using Google’s custom search function. Very very nice. I’d love to see someone toss together a page of Google Custom Searches that were useful to librarians. Has anyone done this? I’ve already made a Custom Ego Search but that’s not the same thing.

Despite my Very Large Skepticism of Google in general, the tool itself is very easy to set up and is potentially extremely useful (especially for librarians). Basically, it lets you limit searching to a select group of websites - in this case, book review websites
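For anyone who wants a feel for what "limit searching to a select group of websites" means in practice, here is a rough sketch in Python. It is not how Google's custom search tool works internally; it only approximates the idea by appending ordinary `site:` filters to a query. The function name and the review site domains are made-up placeholders.

```python
def restrict_to_sites(query, sites):
    """Approximate a site-limited search by appending site: filters.

    This imitates the effect of a custom search engine with plain
    search operators; it is an illustration, not Google's mechanism.
    """
    if not sites:
        return query
    site_filter = " OR ".join(f"site:{site}" for site in sites)
    return f"{query} ({site_filter})"


# Hypothetical placeholder domains standing in for book review websites.
REVIEW_SITES = ["reviews.example.com", "bookblog.example.org"]

print(restrict_to_sites("the road review", REVIEW_SITES))
```

The point of the custom search tool is that you set up the list of sites once and never have to type the filters yourself.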

Google Answerers, a tally

Now that Google Answers is no longer an active project it’s easy to use Google itself to do some tabulating of who was actually doing what there. Using a pretty simple query the folks over at Web Owls have compiled a list of roughly how many questions each Google Answer Person answered. You can see me way down the list at 24. What’s interesting, to me, is how few people worked for such a high profile project, and how few people answered the bulk of all the questions to Google Answers. Interestingly, almost 40 of them are working over at Uclue which seems to have almost the same structure externally speaking as GA did.

Google Co-op @ your library

There have been a few Custom Search Engines made lately using Google Co-op. Let’s look at a few of them. I did searches for librarians, im, and jessamyn.

LISZEN — library blog search engine. Sexy wiki title list. Works like Google, looks like Google, keeps my settings from Google so I see results in sets of 50, nice! Actually, something weird is happening. If I search for a word like librarians, I only get the top 10 results, Google’s standard results. If I add a refinement by clicking on a link, then I get results 1-50, which matches my custom number of search results. However, there is something weird about the set-up: I can’t see any results after the top ten or so. The rest disappear into what I assume is an iframe, and I can’t get to the next page of results or see the nav at the bottom that would take me to the other results. Big trouble. Results seem to be sorted by currency instead of relevance. Results are also returned under the attractive but large header, which means you’re going to do a bit of scrolling. The results refinement is a little clunky. Limiting to “special libraries” just adds the string “more:special_libraries” to the query string, and I also got weird results limiting to academic libraries. The refinements are all set up like radio buttons, so you can only use one of them at a time, and the interface is a little counterintuitive. I like the set of blogs represented.
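To make the refinement behavior concrete, here is a small sketch in Python of what "adds the string to the query string" looks like. The `more:` label is the one I saw in the search box; the base URL and the `q` parameter name are assumptions for illustration, not LISZEN's actual address or parameters.

```python
from urllib.parse import urlencode


def add_refinement(query, label):
    """Append a refinement label to the search terms, as the radio buttons appear to do."""
    return f"{query} more:{label}"


def search_url(base_url, query):
    """Build a results URL; the 'q' parameter name is an assumption."""
    return f"{base_url}?{urlencode({'q': query})}"


refined = add_refinement("librarians", "special_libraries")
# search.example.org is a placeholder, not the real search engine address.
url = search_url("https://search.example.org/results", refined)

print(refined)
print(url)
```

Since the refinement is just more text in the query, it makes sense that only one can be picked at a time from radio buttons, even though nothing in principle stops a query from carrying more than one label.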

Librarian’s E-Library from ALA, or “Vetted resources on Libraries and Librarianship from the American Library Association (ALA) Library” — I think every time we call something an e-something it’s subtly implying that the normal version of that thing is not electronic. So when we say ebook we are saying that if it’s electronic, it’s not a normal book. I think this may be the wrong way to think about it. Their interface is nowhere near as sexy as LISZEN’s, but it does seem like they are trying to add some neat widgets like a list of some representative sites and other contributors. The results list is clean and looks just like Google’s, with a little touch of red to remind you that you’re searching ALA’s version. Results seem to be sorted by relevance. I didn’t see any blogs represented except for official type blogs, so the results here complement LISZEN’s fairly well.

A few others I saw in the line-up include the ARL Libraries Search, Phil Bradley’s Librarian Weblogs and the Library 2.0 Feed Search.