Open Library – Making inroads and headway in all 50 states

I regularly trot out Open Library as an example of a project that is nice and library-like while also being attractive and usable and, at the same time, pushing the envelope of “how to be a library” in ways that are dignifying to patrons and librarians alike. I was delighted to read this article about the results of a recent meeting where ALL state librarians voted unanimously to form an alliance with the Internet Archive’s Open Library project.

[Oregon state librarian] Scheppke said this allows libraries the chance to envision digitizing everything in their collection, from books about local history to works by local authors.

“If that doesn’t happen who knows when those books will become ebooks, maybe never,” Scheppke said. “That’s what really appeals to the state librarians; it’s a solution we haven’t had up until now to have a much more complete ebook collection,” he said.

The interface is us – what people think about ebooks

This is shaping up to be the year that people really start seeing ebooks and libraries as things that can go together. ReadWriteWeb just made this post about the Internet Archive getting into the ebook lending business, both via its collection of freely available ebooks and via a pilot program with a small subset of libraries. This is terrific. It is also confusing. I followed the links in the press release and on the Internet Archive site itself and could not figure out exactly how I’d go about borrowing a book if I was a part of a member library (I have a Boston Public Library card). That said, wow, the interface itself is a knockout and just made me want to click around and mess with it.

Oddly, the minor problem I had, and it is minor, is the same complaint voiced by people who have used OverDrive via their own library to try to read ebooks. This reporter from the Wall Street Journal explains the headache that is trying to search OverDrive for available titles, meaning those that are actually available for checkout. In order to check out and download an ebook, which I eventually did, I had to

- Search Open Library for ebooks
- Find one with a “borrow” icon next to it. OL also offers DAISY format for people who are visually impaired as well as many books that can be read locally.
- Get redirected to a search on OverDrive’s site saying “nothing available.” Redo search on OverDrive’s site to find this title available.
- Click WorldCat’s “find in a library” option and type in my zipcode
- Figure out that book is or is not available from my local library. Start again.
- When I find a book that is available, click through to my local library catalog & click “add to cart” to return to OverDrive (if book is available, which it sometimes isn’t)
- Take side trip to download Adobe Digital Editions (much less painful than previous OverDrive software experience)
- Proceed to “checkout” on OverDrive after entering a library card number that I think will work
- Download book. Read book.

So, not terribly bad, and I think better interfaces and interactions between websites will make this process much more seamless. Right now I had to interact with Open Library, OverDrive, WorldCat, my library’s branded OverDrive page and my library catalog. At several stages during this process there are varying levels of “availability” of an item. Specifically:

- Book is shown in Open Library but is not available at a library I have access to.
- Book is available at a library I have access to, but not in the format I am looking for.
- Book is available at a library I have access to in the format I want but has been “checked out.”

Currently there is no one way to do a search for an ebook and have a result say “Yes we have it, it’s in this format, and it’s available NOW.” I am optimistic that it is a matter of time before this is working, and Open Library is currently making this work better than anyone else. Update: the Palm Beach County Library has a really nice interface that makes it a lot more clear what’s there and what’s actually available.
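The three availability states above, plus the answer I actually want, can be sketched as a small check. This is only an illustration: the `Holding` record, its fields, and the sample data are all made up here, and none of it reflects how Open Library, OverDrive, or WorldCat actually expose their data.

```python
from dataclasses import dataclass
from enum import Enum


class Availability(Enum):
    NOT_AT_MY_LIBRARY = "shown in Open Library, but not at a library I can use"
    WRONG_FORMAT = "at my library, but not in the format I want"
    CHECKED_OUT = "at my library in my format, but checked out"
    AVAILABLE_NOW = "yes we have it, in this format, available NOW"


@dataclass
class Holding:
    # Hypothetical record: one copy of a title held by one library.
    library: str
    ebook_format: str
    checked_out: bool


def availability(holdings, my_libraries, wanted_format):
    """Collapse a title's holdings into a single answer for one patron."""
    mine = [h for h in holdings if h.library in my_libraries]
    if not mine:
        return Availability.NOT_AT_MY_LIBRARY
    right_format = [h for h in mine if h.ebook_format == wanted_format]
    if not right_format:
        return Availability.WRONG_FORMAT
    if all(h.checked_out for h in right_format):
        return Availability.CHECKED_OUT
    return Availability.AVAILABLE_NOW


# Made-up sample data, for illustration only.
holdings = [
    Holding("Boston Public Library", "epub", checked_out=True),
    Holding("Boston Public Library", "pdf", checked_out=False),
]
print(availability(holdings, {"Boston Public Library"}, "epub").value)
```

The point of the sketch is that the patron should get one answer per title, instead of discovering each of these states on a different website.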

Moon Letters from The Cataloguer’s Desk


Before there was Braille, there was Moon. Check out these photos from some antiquarian Moon books. More on Moon. This post was made the same day that the Internet Archive announced that they have one million books available in DAISY format for blind and visually disabled folks. Not just talk, here’s the list of them. Image is from this book. [via]

OCLC kerfuffle, summarized in a way I agree with

Stefano Mazzocchi has a summary of the issues in the new OCLC policy dispute. Worth reading, mostly free of handwaving. [thanks peter]

What is up with OCLC?

This all started with a little wink-wink posting about OCLC from Tim over at LibraryThing, which was the first I’d heard about OCLC’s policy changes. As someone who doesn’t interact with OCLC or their data too much, I didn’t really understand this and had to wait for some clarification posts to understand both what was going on and how it affected people and projects like LibraryThing and Open Library. The upshot as I understand it is that OCLC is basically saying “Sure you can share your records, but not with people or organizations who materially compete with us.” That’s my summary anyhow. Here’s the non-legalese policy on the OCLC site. Here’s the more legalese version. Here’s a wiki version of the changes between the “old” new policy and the new policy. Isn’t technology grand? Karen Calhoun, a VP over at OCLC, has written a defense of the new policy on her own blog; there is some lively discussion happening in the comments. There is also this podcast of Roy Tennant and Karen Calhoun talking with Richard Wallis from Talis (whose business model is also potentially affected by this policy change) about the ramifications of this change.

So, the policy OCLC has put up has been revised somewhat, doesn’t go into effect until February, and gives people a lot of time to think about what, if anything, they want to do about this. Tim Spalding has a business model that is compromised by OCLC’s refusal to let their members share these records. The Open Library project is also possibly compromised, and Aaron Swartz has written two posts about the policy change: Stealing Your Library: The OCLC Powergrab and OCLC On The Run. He also directs people to the Stop OCLC Petition if you’d like to sign on to ask OCLC to repeal these changes. More community discussion is taking place at MetaFilter, Inside Higher Ed, and Slashdot, and code4lib is maintaining a wiki with links to more commentary. I’m still catching up on the back and forth and may write more later, but it’s interesting to watch this unfold.

why I’d try an API

A few neat announcements in libraryland concerning data or connectors being made more open and available. These two examples may not seem as linked as they are.

  • LibraryThing releases (sort of) (almost) a million book covers, free for your use, under most circumstances. You can also cache the covers locally as long as you don’t do it in such a way that you support LT competitors. While I understand why this isn’t linked with the Open Library project, I’d love to see it get there in the future sometime. update: John Miedema reminds me in the comments that I’d meant to also link to the openbook WordPress plugin for people using WordPress.
  • WorldCat released their search API over the weekend. As with many OCLC things, this is great news for their member libraries and not that great for anyone else, but it’s a real step towards letting (their) people get at their data, not just their web pages. You can get some details, in slightly dense format, on this page.

Open Library, really open. Aaron Swartz discusses.

David Weinberger blogs about Aaron Swartz talking at the Berkman Center about the Open Library project. Pay close attention to the Q and A and think about this in terms of the Google Books post/article from yesterday. Who is really in favor of openness? Who talks the most about openness? Want to help? They still need programmers. And book lovers.

Q: Why won’t OCLC give you the data?
A: We’d take it in any form. We’d be willing to pay. Getting through the library bureaucracy is difficult…
A: (terry) You need to find the right person at OCLC
A: We’ve talked with them at a high level and they won’t give us any information. Too bad since they’re a non-profit. Library records are not copyrightable. OCLC contractually binds libraries.

job opening: data munger needed for Open Library project

I rarely post links to jobs here because it seems to me that most postings for library jobs are more or less the same. This one is different. The Open Library project, which I linked to here before, is looking for some new folks. You’d be working with a fun team of geniuses, most notably Karen Coyle, who is the chief librarian of the project. Telecommuting is an option. Interested? Read the job description, then email Aaron and tell him you heard about it here.

Tasks include: working with our chief librarian, Karen Coyle, to implement algorithms to do data merging and other processing tasks; writing scrapers and crawlers to grab various data sources; writing importers to parse this data into something that can be imported into our database; and managing all the people who want to help us with these tasks.
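As a rough idea of what the “data merging” part of that job could look like, here is a minimal sketch that merges duplicate book records from different sources. Everything in it is an assumption for illustration: the record fields, the choice of a normalized ISBN as the match key, and the first-non-empty-value-wins rule. It is not the Open Library project’s actual algorithm.

```python
def normalize_isbn(raw):
    """Drop hyphens and spaces; uppercase a trailing 'x' check digit."""
    return "".join(ch for ch in raw if ch.isalnum()).upper()


def merge_records(records):
    """Merge records that share an ISBN, keeping the first non-empty
    value seen for each field (a deliberately naive rule)."""
    merged = {}
    for record in records:
        key = normalize_isbn(record["isbn"])
        target = merged.setdefault(key, {"isbn": key})
        for field, value in record.items():
            if field == "isbn":
                continue
            if value and not target.get(field):
                target[field] = value
    return list(merged.values())


# Made-up records standing in for two different data sources.
records = [
    {"isbn": "0-123-45678-9", "title": "An Example Book", "author": ""},
    {"isbn": "0123456789", "title": "", "author": "A. N. Author"},
]
for book in merge_records(records):
    print(book)
```

Real catalog data is much messier than this (missing ISBNs, multiple editions, conflicting values), which is presumably why the posting asks for someone to work with a librarian on it.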

Announcing Open Library

Someone asked me during one of my talks if I knew of any projects that were actually trying to open source cataloging records and the idea of authority records. I said I didn’t, not really. It’s a weird juxtaposition, the idea of authority and the idea of a collaborative project that anyone can work on and modify. I knew there were some folks at the Internet Archive working on something along those lines, but the project was under wraps for quite some time. Now, it’s not. It’s called Open Library and it’s in demo mode. You can examine it and I encourage you to do that and give lots of feedback to the developers. Make sure to check the “about the librarianship” page.

Imagine a library that collected all the world’s information about all the world’s books and made it available for everyone to view and update. We’re building that library.

Open Library/Open Content Alliance announcement from Archive.org

Hi. This is the presentation that Andrea and I are watching right now in San Francisco. The Open Library. Brewster Kahle is talking now and doing a book scanning demonstration. I like how he says “librarians” a lot.

Vision of an Open Library

The Web is So post-1996, what about older content?

Everyone is part of it: Amazon helps “expand the bookstore” but we’re looking for inclusivity.

“A great library for the published works of humankind, accessible to all… everybody involved… libraries LIVE based on the publishing system, they will be involved.”

3 to 4 billion of the 12 billion that libraries spend every year goes to publishing. Let’s have more of that go to fairly compensating everyone.

“For the near term, we’re making books from books.” It’s hard to digitize a book that looks like the original, this is the proof that it can work.

1. Selection. librarians choose books. Start with out of copyright materials, work towards in print, orphans next. “we’re not going to run out”
2. Scanning. 500 dpi “scribe system” 30-60 min per book. “we can read a 2 pt typeface, straight on” metadata, saved to archive
3. Cataloging. Use library data and coordinate between scanning centers using MetaFetch. Groups like RLG are coordinating.
4. Copyright. Copyright law is “a little confusing” Evidence based interface allows a Q&A “is this book under copyright” interrogation. Many books not re-registered copyright-wise. Already scanned copyright renewal records into a searchable database. Larry Lessig is bringing a suit re: orphan works and whether they can be in the virtual library. Other for-profits are working back the other way. It’s “tricky but doable”
5. Storage. 6 GB per book, hard to scale. Built a petabyte-scale machine “petabox” [I saw it] low power, runs cool, “set top boxes” not full computers with OSes etc. Object is not to have one box in an earthquake zone, but distributed system in flood zones & elsewhere.
6. Readers. Software. Check it out at openlibrary.org. UC librarians chose early set of books already scanned. Also looking into PDFs for printing. Also working with lulu.com for print on demand. Also, you can listen to these books.
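To get a feel for the scanning and storage numbers above, here is some back-of-the-envelope arithmetic. It uses only the figures from my notes (30-60 minutes and 6 GB per book); the decimal definitions of gigabyte and petabyte and the one-million-book target are my own assumptions for illustration, not numbers from the presentation.

```python
GB = 10**9    # decimal gigabyte (assumption)
PB = 10**15   # decimal petabyte (assumption)

BYTES_PER_BOOK = 6 * GB        # "6 GB per book" from the notes
MINUTES_PER_BOOK = (30, 60)    # "30-60 min per book" from the notes


def books_per_petabyte(bytes_per_book=BYTES_PER_BOOK):
    """How many whole books fit in one petabyte of storage."""
    return PB // bytes_per_book


def storage_needed_pb(n_books, bytes_per_book=BYTES_PER_BOOK):
    """Petabytes needed to store n_books."""
    return n_books * bytes_per_book / PB


def scanner_hours(n_books, minutes_per_book):
    """Total scanner time in hours at a given per-book scanning time."""
    return n_books * minutes_per_book / 60


n_books = 1_000_000  # hypothetical target, for illustration only
print(books_per_petabyte())
print(storage_needed_pb(n_books))
print([scanner_hours(n_books, m) for m in MINUTES_PER_BOOK])
```

At these figures a single petabyte holds well under two hundred thousand books, which makes it easier to see why storage is described as “hard to scale.”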

Other mentioned projects: ICDL, Internet Archive Bookmobile [buck a book!]. BookShare will use this content for access for the blind. $100 laptop will be integrating books from this project onto their laptops [big news!]. Open Content Alliance to create protocols and formats.

Brewster Kahle: “I don’t know what it will be like to have books from our libraries injected into our culture again, but I’d like to see it”

“Knowledge for the World” is the mantra that all the funders [on and off the podium, 30 seconds each: Smithsonian (museums/content), Yahoo, Sloan Foundation (funding), Johns Hopkins (content/tech), RLG (cataloging), Adobe (display/doc formatting), HP (scan), LizardTech (data compression), Lulu.com (printing), MSN Search (search/funding) etc]

Guy from Yahoo “Finally a library I won’t get thrown out of” and “Find, use, share, and expand all human knowledge”

Andrea has more, including some links that I missed.