A few things that didn’t make it to the carnival…

There was so much good stuff in the Carnival yesterday that I left out some of my favorite links from the week, but here they are.

– Two links about Google Books. One is Scott Boren’s long piece on LISNews about full-text searching in books: what you can search and how you can search it. It’s a great, well-researched piece. The second is Julia Tryon’s contribution to FreeGovInfo concerning the amount of government information available via Google Books. Google provides no statistics. This will be part of an ongoing project she’ll be working on there; stay tuned.

When looking at the search results in Google for publisher field has GPO, I found 141,600 items, only 82,487 of which were available in the full view. And although it is nice to think that we have the full text for 82,487 documents, not all of them can be used. I randomly picked a title to see how it looked and chose the Statistical Abstract for 1954. The pages were clear enough to read easily but on every even numbered page part of the right hand column was chopped off.

– Also from FreeGovInfo comes this analysis of Google Video’s closing and what happened to all those DRMed video files that people supposedly “purchased.” Please read Part I: DRM Killed the Files and also Part II: Why the Google Video story should scare you.

– Karen Schneider has been writing some great stuff lately. It’s been fun to see her getting into what I see as the more technical side of librarianing because her explanations of techie stuff are clear and free of nonsense while still being readable and engaging. Her Library Journal article, Lots of Librarians Can Keep Stuff Safe, about LOCKSS and Portico, really helped me understand the fairly complicated world of e-journal archiving.

– Brian Herzog’s always-excellent blog has pulled some Reader’s Advisory suggestions off of ME-LIBS, the Maine Libraries discussion list, and added his own commentary. Brian also made a custom book review search using Google’s custom search function. Very very nice. I’d love to see someone toss together a page of Google Custom Searches that were useful to librarians. Has anyone done this? I’ve already made a Custom Ego Search but that’s not the same thing.

Despite my Very Large Skepticism of Google in general, the tool itself is very easy to set up and is potentially extremely useful (especially for librarians). Basically, it lets you limit searching to a select group of websites – in this case, book review websites
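For anyone who wants to see the “limit searching to a select group of websites” idea in miniature, here is a rough sketch. It is not how Google Custom Search is built or configured; it just assembles an ordinary search string using the `site:` operator, which is the by-hand version of the same trick. The function name and the example.com domains are made up for illustration.

```python
from typing import Sequence


def build_site_limited_query(terms: str, sites: Sequence[str]) -> str:
    """Return a search string restricted to the given sites.

    Hypothetical helper: it only builds text to paste into a search box,
    using the site: operator joined with OR. It does not call any API.
    """
    if not sites:
        # No restriction requested, so search everywhere.
        return terms
    site_filter = " OR ".join(f"site:{site}" for site in sites)
    return f"{terms} {site_filter}"


# Placeholder domains standing in for a list of book review sites.
review_sites = ["reviews.example.com", "books.example.org"]
print(build_site_limited_query("travels of marco polo", review_sites))
```

A custom search engine saves you from retyping that list every time, which is most of the appeal.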

Why non-scaling solutions are bad for public access to resources

Google Books has an enormous amount of material. This is good. However, they paint copyright restrictions with a wide brush and err on the side of protecting copyright holders. So, most content on Google Books that was published after 1923 is restricted (possibly all, but definitely most). This may or may not be good for most people, but it’s certainly bad in some specific instances, like with government documents. These are in the public domain and yet you can only see “snippets” on Google Books. Rick Prelinger described this phenomenon last year. The problem still exists. The concern, apparently, is that copyrighted material may appear within these documents (hearings especially), and since Google can’t spare the humans to do the due diligence, we all suffer with restricted access. [freegovinfo]

heavy meta parking lot

I like books because they tend towards linearity and being one little knowledge parcel of something. However, more and more when I read (latest book: Book of Lists, ’90s edition) I have a little index card that I use as a bookmark (a card catalog card, actually) that I make notes on. The notes often turn into Google searches, del.icio.us links, MetaFilter posts and emails to my Mom. My books become more than themselves by being dissected and shared.

So, this has been the theme for this weekend, a weekend that had me teaching my Mom how to use Greasemonkey scripts to show more photos on her main Flickr page. I also taught her how to use Grab to do screen captures, how to take long shutter photos with her camera and why del.icio.us is considered “social.” She even discovered she had fans on del.icio.us, what fun! Three other things that sprang up, regarding the meta level of things.

  1. Flickr Machine Tags – tagging is great, but most people agree that some sort of structured taxonomy complementing a folksonomy is a stronger and more useful way to make information findable. Enter machine tags. Also known as “triple tags,” they add an almost faceted layer of classification to Flickr, but still in a totally “roll your own” way. So, for example, I took a picture of my Mom. She is also a Flickr user. In the past, I could add a tag that said “Mom” or “Muffet” (her user name), but there would be no way to explicitly link her Flickr identity to the Flickr picture of her except with a clunky HTML link, which makes sense to a human reader but isn’t super clear to a machine. If you check the picture I linked to, it has a new sort of tag, flickr:user=muffet, which you create just like a normal tag, but it has parts to it: a namespace (flickr), a predicate (user) and a value (muffet). Right now it’s the Wild West as far as what you can build into machine tags (see hoodie:color=orange). There aren’t really any standards or even accepted practices, but there are a lot of people doing a lot of talking and it’s an exciting time to be into taxonomies.
  2. Ed “superpatron” Vielmetti and I have been sending del.icio.us mail this evening. This diagram should explain everything.
  3. Back to books for a second. How great would it be if, while you were reading a book, you could have a graphical representation of the places talked about? Well, one of the rocket scientists over at Google Book Search is building just that sort of tool. Their post Books:Mapped explains a little of how it works. The about page of the book on Google Books will have a map, if one is available. Here is an example from David Foster Wallace’s book Girl with Curious Hair or, perhaps more dramatically, The Travels of Marco Polo.
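Since machine tags are meant to be read by machines, here is a minimal sketch of pulling one apart. It assumes the namespace:predicate=value shape of the flickr:user=muffet example above; the pattern, the `MachineTag` type and the function name are my own illustration, not Flickr’s actual parsing rules, which may be looser or stricter.

```python
import re
from typing import NamedTuple, Optional


class MachineTag(NamedTuple):
    namespace: str
    predicate: str
    value: str


# Assumed shape: namespace:predicate=value, where namespace and predicate
# start with a letter and the value is whatever follows the equals sign.
_MACHINE_TAG = re.compile(r"^([A-Za-z][A-Za-z0-9_]*):([A-Za-z][A-Za-z0-9_]*)=(.+)$")


def parse_machine_tag(tag: str) -> Optional[MachineTag]:
    """Split a tag into its three parts, or return None for a plain tag."""
    match = _MACHINE_TAG.match(tag.strip())
    if match is None:
        return None
    return MachineTag(*match.groups())


# Tags from the photo described above, plain and machine mixed together.
for tag in ["Mom", "flickr:user=muffet", "hoodie:color=orange"]:
    print(tag, "->", parse_machine_tag(tag))
```

Plain tags like “Mom” come back as None, so a script can treat the two kinds differently, for example by grouping photos on the hoodie namespace or looking up the user named in flickr:user.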

Google’s contract with UC and UM libraries for digitizing project reviewed

Now that Google’s digitizing contracts with two libraries have been made public, they can be compared and contrasted. Techie librarian Karen Coyle compares and comments. “[A]ccess is to be restricted to ‘those persons having a need to access such materials’ which is about the vaguest access condition that I can imagine.” [experimenting with digg today]

google book search, what about govdocs?

One of the things I really enjoyed about the Internet Archive Open Library project was the software they used to attempt to determine whether works they were scanning were or were not under copyright. It was an elaborate set of questions and answers with access to some copyright databases. In contrast, unless I’m mistaken, Google Books just draws a line at 1923 and assumes everything after that date is in copyright. This includes government information, which as you know is made with tax dollars and is generally in the public domain. So why does Google Book Search treat all post-1923 books as under copyright? Just over-cautious?