Fair! Google Books case dismissed.

original ferris wheel - from the Open Library

Karen Coyle has done an excellent write-up of this, so I will refer you there.

The full impact of this ruling is impossible (for me) to predict, but there are many among us who are breathing a great sigh of relief today. This opens the door for us to rethink digital scholarship based on materials produced before information was in digital form.

Folks can read the actual ruling (pdf) if they’d like. This is a very big deal. Thanks to folks who worked so hard on getting us to this place. I’ll add a few links here as they come in.

  • Kenneth Crews, Columbia Copyright Advisory Office: “This ruling joins court decisions about HathiTrust and electronic reserves in demonstrating that even extensive digitization can be within fair use where the social benefits are strong and the harm to rightsholders is constrained. There will be more to come as we transition into a new era of copyright, technology, and even reading.”
  • Brandon Butler, ARL Policy Notes blog: “The decision is a victory not only for transformative, non-consumptive search, but also for serving ‘traditionally underserved’ libraries and their users, including disabled patrons.”
  • Paul Alan Levy: “This ruling provides a road map that allows any other entity to follow in Google’s path.”
  • Timothy Lee, Washington Post: “Many innovative media technologies involve aggregating or indexing copyrighted content. Today’s ruling is the clearest statement yet that such projects fall on the right side of the fair use line.”
  • Mike Masnick at Techdirt: “It all comes together in making a very strong argument that Google’s book scanning promotes the progress of the arts and sciences just like copyright is supposed to do.”
  • InfoDocket also has an updating list of links to discussion of the decision.

Google Books ngrams – on Hegel and Hitler and OCR

So hey, this is interesting. I’ve skipped a lot of the Google Books ebookstore stuff lately because I’m honestly not sure what to make of it. And I don’t buy books anyhow. But a friend mentioned this Google Labs Ngram viewer, a fun tool that lets you search the full corpus of the Google Books databases. Here’s a New York Times article about it, and data geeks should read the article Quantitative Analysis of Culture Using Millions of Digitized Books (free reg. required – click for PDF ILL) or nose around in the datasets. I did my own dopey search, pictured above – Hegel vs. Hitler. And here’s what’s interesting. The big jump in the late 1940s is fairly predictable, but who was talking about Hitler in 1620?

I clicked through and poked around some and here’s what I found. No one was talking about Hitler. OCR is, as you know, imperfect. So the words that Google Books’ optical character recognition thought of as “Hitler” were actually words like “Ruler” and “bitter” and “herbe.” How about that?
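For the curious, here is a minimal sketch of how close those words are to “Hitler” letter for letter. It is only an illustration: `edit_distance` is a hypothetical helper written for this example, and plain Levenshtein distance is an assumed stand-in for “looks similar.” It is not how Google Books’ OCR works, since real OCR errors come from how letters look on a scanned page rather than from how many letters differ.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: the minimum number of single-character
    insertions, deletions, and substitutions that turn a into b.
    Hypothetical helper for illustration; compares case-insensitively."""
    a, b = a.lower(), b.lower()
    # previous[j] holds the distance between the first i-1 letters of a
    # and the first j letters of b.
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute, or match for free
            ))
        previous = current
    return previous[-1]


# The three words named above as the real text behind the 1620 hits.
for word in ["Ruler", "bitter", "herbe"]:
    print(word, edit_distance("Hitler", word))
```

A small number means only a few letters separate the two words, which gives a rough feel for how a smudged or oddly typeset word in a 17th-century book could be read as something else entirely.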

possibly the best library hoax

Jean Nepomucene Auguste Pichauld, Comte de Fortsas, was a man with a singular passion. He collected books of which only one copy was known to exist…. [W]hen he died on September 1, 1839 he possessed only fifty-two books, but each of them was absolutely unique. His heir, not sharing the old man’s passion for book collecting, arranged for an auction to sell off the library

Compelling, no? The auction really happened; the rest of it is made up, the creation of a local antiquarian having a bit of a practical joke. Read more at blacksundae, or see the auction catalog, itself a rarity, on Google Books.

DIY Book Scanner, details

I’ve mentioned Daniel Reetz’s DIY portable book scanner here before. It’s a great combination of an interesting thing to look at, an interesting project to contemplate and a bit of a gauntlet tossed down as far as bigger questions of why we leave scanning up to the big companies, etc. At the end of my Tiny Tech talks I usually mention it as something in the realm of the possible, even if in a Dream Big way. Daniel was at D is for Digitize last month — a conference I missed because I was in Nevada — and I noticed some interesting back and forth about his scanner project show up in the Library Law blog.

a few late summer links

I’ve been scooting around a little bit lately and here are some things that have been crossing my virtual desk. I’ve also dealt with two WordPress issues [a hack! and an outdated sidebar navigation element] and I’ve upgraded to the latest version of WordPress. If you’re on a summer schedule, I’d suggest upgrading before things get hectic.

some copyright visualization

With the Google Books settlement coming up, a lot of people have been talking about copyright. I think this is generally speaking a really good thing. Here are some useful visualizations that may help you get your head around it.

- From the Financial Times is this article about what the Google business model could mean for out of print books and orphan works. According to their graphic [above] there are a lot of books with unclear status in US libraries that we should be concerned about.
- From ALA’s Copyright Advisory Network (a project of the Office for Information Technology Policy) come a few helpful tools for looking at copyright as it pertains to libraries.

EFF takes on Google Books privacy issues

Normally I’m not much of a joiner, but… “EFF is gathering a group of authors (or their heirs or assigns) who are concerned about the Google Book Search settlement and its effect on the privacy and anonymity of readers. This page provides basic information for authors and publishers who are considering whether to join our group.”

You can join too, if you’d like.

Cornell removes restrictions on public domain repros

An ongoing debate in the copyright wars is whether an institution that is making reproductions of public domain materials available should be allowed to dictate terms (usually involving payment) for use of those items. We all know that libraries need money. It’s also true that having digital copies of rare materials available helps preserve the original items. So, if I want to download a public domain book from Google Books — say John Cotton Dana’s book A Library Primer — I get usage guidelines from Google attached to the pdf I’ve downloaded.

Usage guidelines
Google is proud to partner with libraries to digitize public domain materials and make them widely accessible. Public domain books belong to the public and we are merely their custodians. Nevertheless, this work is expensive, so in order to keep providing this resource, we have taken steps to prevent abuse by commercial parties, including placing technical restrictions on automated querying.

We also ask that you:
+ Make non-commercial use of the files: We designed Google Book Search for use by individuals, and we request that you use these files for personal, non-commercial purposes.
+ Refrain from automated querying: Do not send automated queries of any sort to Google’s system: If you are conducting research on machine translation, optical character recognition or other areas where access to a large amount of text is helpful, please contact us. We encourage the use of public domain materials for these purposes and may be able to help.
+ Maintain attribution: The Google “watermark” you see on each file is essential for informing people about this project and helping them find additional materials through Google Book Search. Please do not remove it.
+ Keep it legal: Whatever your use, remember that you are responsible for ensuring that what you are doing is legal. Do not assume that just because we believe a book is in the public domain for users in the United States, that the work is also in the public domain for users in other countries. Whether a book is still in copyright varies from country to country, and we can’t offer guidance on whether any specific use of any specific book is allowed. Please do not assume that a book’s appearance in Google Book Search means it can be used in any manner anywhere in the world. Copyright infringement liability can be quite severe.

These are all “suggestions” as near as I can tell. As with the Chicken Coupon fiasco of a few days ago, the implied threat that comes along with this item puts a bit of a damper on the joy that is the public domain. Bleh. We’ve seen other big corporations and libraries doing this as well.

However, this post is mostly to say “Yay” about Cornell’s decision to remove all restrictions on the use of its public domain reproductions. Here’s their press release about it and here is the web page with the new policy. What’s their reasoning? Well, among other things, it’s hard to support a mission of open access and at the same time go out of your way to make materials more difficult to get ahold of and interact with. You can see some of Cornell’s 70,000 public domain items at the Internet Archive.

unintended consequences of Google Books project

I was lucky enough to catch Brewster Kahle talking with Amy Goodman on Democracy Now on my drive home from NJLA. I feel like I’m pretty up on what’s going on with Google and the Internet Archive and book scanning. What I didn’t know is how Google’s agreements with libraries are hindering the IA’s access, not because of the contracts, but just because of differing priorities. The video and transcript are now available online.

AMY GOODMAN: Explain what you mean when you say it’s not legally required. You mean in the contract, what they have with Google? And so, if Google was here, they’d say, “We didn’t say they couldn’t give it to Internet Archive. That’s their prerogative.”

BREWSTER KAHLE: Correct, that basically Google didn’t put it in their contract. Yet from a library’s perspective, why have a book scanned twice? It’s wear and tear on the books. If they think that -- and they wouldn’t have signed it if they didn’t think that the Google thing was a good idea. But now that they’ve signed this with Google, they don’t want it scanned again. And this is a problem, because the books, even the out-of-copyright books, are locked up perpetually.

A few things going on, googley and otherwise

I’ve been reading more, typing less. My super-bloggy friends told me last year sometime that a lot of their friends were blogging less and Twittering more. I was surprised to hear that since it hadn’t really trickled down to my neck of the woods yet, but lately it has. While I still stay on top of my RSS feeds, I suspect that I can only do that because people are blogging less. I don’t know if they’re twittering more, having babies, buying houses or doing something else. I know what I’ve been doing: reading.

I’ve also been travelling which is probably not a totally fun thing to read about [if I could delete everyone's tweets from airports, I would -- unless they're me looking for someone to hang out with when my flight has been delayed] but I go through periods of educating, followed by periods of learning, etc. I also made a resolution to myself for this year to write new talks (some similar slides okay, all similar slides against the rules) so when I give talks, they’re more work but also better, I think. I’ll be doing a 2.0 talk in upstate New York for NCLS and then a few talks at NJLA next week. Lots of writing, good stuff to pass on.

What’s been really on my mind lately is the Google Books settlement. I happen to be lucky that an old time friend of mine from the blogger days, James Grimmelmann, is one of the major players in the “explain this to everyone” field day that is going on. He’s also a keen legal mind and a great writer so it’s been a joy to read what he and others have been writing. Here are some links to essays that may help you understand things.