LACMA launches new collection site with 20k public domain images

The Los Angeles County Museum of Art said on their Tumblr on Friday: “Dear Tumblr-verse, Merry Christmas: we just gave you 20,000 high-resolution images, for free. Now we have just one question: what are you going to do with them?” This announcement is the next step in LACMA’s ongoing experiment to open up more of their collections to the public, via the public domain. They have more discussion and explanation on their WordPress blog. Do any search on their new collections website and you can limit your results to only those with unrestricted images. And then you can take those images and do… whatever you want. There is still a wordy Terms of Use page that people may want to dig through, but the upshot is that folks should go use these photos, for anything. Stick them in Wikipedia, use them on your flyers and blog posts, use them for your album covers, put them on a t-shirt. Thanks for trusting the public, LACMA. Lovely stuff. Here’s the pull quote from their website that sums up why they did this.

Why would a museum give away images of its art? As Michael Govan often says, it’s because our mission is to care for and share those works of art with the broadest possible public. The logical, radical extension of that is to open up our treasure trove of images. When we first launched our early experiment with giving images away online, we heard a resoundingly positive response from many quarters: school teachers, parents, graduate students, journalists and the occasional creative person interested in printing their own Mother’s Day cards. So far, we have yet to hear of a situation where one of our public domain artworks has been misused or abused.

interview with Michael Barera, Ford Presidential Library’s new Wikipedian in Residence


White campaign tab with “WIN” in bold, red letters accompanied by a small red fish.

I had read with interest the articles that came out recently about the Gerald R. Ford Presidential Library getting a Wikipedian in Residence. For more info, see this short article about the library’s exhibits coordinator, Bettina Cousineau, talking about the library’s participation in the GLAM-Wiki Initiative (Galleries, Libraries, Archives, Museums with Wikipedia), and a little more about the Wikipedian in Residence program.

I think this program is nifty and I was excited this time because the WiR is a Master’s student at the University of Michigan’s iSchool. I dropped him a line and asked if he wouldn’t mind answering a few questions. Here is a small Q&A (done over email) with Michael Barera about his new internship.

JW: The Ann Arbor Journal says you’ve been a Wikipedian since 2001. Is that a typo or have you been an editor there for over ten years? In any case, what first brought you to Wikipedia or the Wikimedia school of websites? What is your favorite thing about working on Wikipedia?

MB: 2001 isn’t exactly the true year that I started on Wikipedia: I found the site first in 2005, and made my first edit in 2006. 2001 is the year of the oldest photograph that I have uploaded to Wikimedia Commons, so in a way my contributions go back to 2001, although I didn’t edit Wikipedia or Commons until 2006. I was actually introduced to Wikipedia by my high school Western Civilization teacher in 2005, which is interesting because most people don’t have such an academic entry into the site: perhaps he was part of the reason why I’ve always taken it seriously.

For the first year or so, before I made my first edit, I used Wikipedia essentially as an extension of my social studies textbook: I’ve always loved how much more inclusive it is than the mainstream social studies curriculum in this country. My favorite thing about working on Wikipedia is sharing everything I’ve created or contributed with everyone in the world. We all chip in a little, and because of the CC-BY-SA and GFDL licenses, everyone gets to share and enjoy in the totality, all without ads or paywalls or subscriptions. I love the fact that it really is “the free encyclopedia”, both in the “gratis” and “libre” senses of the word.

JW: You went to UMich for your undergrad work and now you’re pursuing your Masters at the School of Information. Is this internship a natural outgrowth of what you planned to do at the iSchool or is it more of a side hobby that turned into a big deal? What are your interest areas at the iSchool?

MB: The beautiful thing is that it is both part of my career plan at SI and an outgrowth of a multi-year hobby. That’s why it is so perfect for me, because it allows me to use both my U of M bachelor’s degree (which has a concentration in History) and my knowledge and experience with Wikipedia, all in one package. In terms of my areas of interest at SI, I am specializing in Archives and Records Management (and maybe dual-specializing in Preservation of Information as well), but I’ve really enjoyed everything I’ve taken so far, from human interaction in information retrieval to Python programming to dead media. SI really is a perfect fit for me!

JW: Sort of a silly question but are you literally “in residence” meaning that you get to go work at the library? Or is it more of a virtual residency?

MB: I’m literally “in residence” at the Library four hours per week, but as you know Wikipedia can’t be confined to just one place at a certain time, so there is plenty of spill-over above and beyond these four hours. It is rather interesting to have an internship that literally bleeds into my free time, but I love editing Wikipedia, so I can’t complain!

JW: This project seems like it’s sort of a trial partnership experiment for both Wikipedia and a US cultural institution. What are you hoping will come out of this partnership in addition to the stated goals of making more of the library’s public domain holdings available via Wikipedia?

MB: Well, to be fair, a number of US cultural institutions have already had Wikipedians in Residence: the National Archives and Records Administration, the Children’s Museum in Indianapolis, Consumer Reports, and the Smithsonian Institution have all beaten the Ford Presidential Library and Museum to the punch. For me, the biggest goals of my internship (in addition to the obvious desire to improve content on Wikipedia) are to foster and maintain a relationship between the Wikimedia movement and the Ford as well as to encourage content experts, like the people I work with at the Ford, to create Wikipedia accounts and to become Wikipedians themselves. I know it can be daunting at first, but there are lots of long-time users who are happy to give their help and guidance, myself included. We won’t bite the newcomers!

JW: Do you feel a little odd about being in a fishbowl with all of your Wikipedia edits and actions being visible or is this par for the course for you? What do you think is people’s largest misunderstanding about Wikipedia?

MB: Well, all of my Wikipedia edits and actions have always been visible (that’s the nature of the MediaWiki software), and while there is certainly an upsurge in media attention and awareness about the internship or me specifically, I don’t think that there has been a dramatic increase in the number of people paging through my edits or watching my talkpage. On Wikipedia, I still feel like a private citizen: I think most of the media attention has been at a very basic level, and I think some of it struggles to grasp the nuances of what I am doing or even the structure of Wikipedia itself, which brings me to your last question. In terms of people’s largest misunderstanding about Wikipedia, I think it is the simple fact that we are an encyclopedia: a tertiary source without original research. We are not a blog or a forum for anyone to post whatever he or she wants to post, but rather a dedicated and thoughtful group of “collectors” trying to assemble the world’s best encyclopedia piece by piece, bit by bit.

I think we sometimes get lumped in with other social media sites, like Facebook and Twitter, and while there are a few commonalities (like the fact each is made up of user-generated content), Wikipedia really is a lot more like Britannica than it is like a blog, at least in terms of the content itself and the work that goes on behind the scenes.

[these are follow-up questions from a few days after our initial exchange]

MB: I’ve always loved how much more inclusive it is than the mainstream social studies curriculum in this country.

JW: I’m with you there. Are there any particular examples that stand out to you?

MB: During my elementary, middle, and high school careers, I discovered that my history/social studies education was essentially a history of Western Europe and North America. While the curriculum has improved dramatically in terms of coverage of Native Americans, African Americans, and Asian Americans in the last few decades, there is very little Latin American, Eastern European, African, Asian, or Oceanian history taught at the primary or secondary levels in this country (and just about all of it directly impacts the United States, typically in negative ways, such as Vietnam’s one cameo appearance in American history during the Vietnam War). I think the heart of this issue is that the old belief that history is “national myth-making” is still alive and well in this country, at least below the post-secondary level.

On the other hand, I absolutely loved how different history is at the college level: as an undergrad at the University of Michigan, it was refreshing to take history courses covering nearly every corner of the world that both attempted to show that country’s perspective and then critique it at the same time. My modern French history (1871-present) and Soviet/Russian history classes were the best examples, and I would highly recommend my professors, Joshua Cole and Ronald Grigor Suny, to anyone: they do it the right way, and I for one wish I had more exposure to that kind of “real history” when I was younger. Long story short, Wikipedia is much more like this post-secondary, “real history” than “national myth-making”, so I always enjoyed how much more objective Wikipedia is (although not perfectly objective, of course).

JW: One of the things that has been challenging for me in Wikipedia outreach is trying to convince people that they don’t need to get someone to do the editing, that they can be bold and dive in. Do you have any particular approach to trying to get people to get comfortable making their own edits?

MB: My advice for getting people to start contributing is simple. The next time our hypothetical potential editor is on Wikipedia, I would encourage him or her to create an account and then just stay logged in while reading articles. Anytime he or she spots a small error, such as a typo or punctuation issue, he or she should just go ahead and change it. Actually, an account isn’t even needed: readers can (on most articles) make such minor corrections without an account, too. Still, this notion of starting small is the real key, in my opinion: just start with the little things and become comfortable with the editing interface (and the notion of editing a wiki itself), and eventually that new editor will feel comfortable making larger and more substantial edits. That’s how it was for me many years ago.

JW: Are there other online reference sources (crowdsourced or not) that are your “go to” sites when you are trying to do research either for Wikipedia or your other projects?

MB: The resources I use for referencing Wikipedia articles are broad and diverse, and they range widely from topic to topic, as is to be expected. One commonality, though, is that I use a lot of newspaper and journal articles: in most cases, they are reliable secondary sources that are very good at establishing the core facts that lie at the heart of the Wikipedia article. One hint for maintaining NPOV is to try to recognize the different sources and balance them with each other. For example, on the article on the 2001 Michigan vs. Michigan State football game, I made sure to use both the U of M and MSU athletic departments’ press releases and game notes.

And, in an even better example from my work on the article Queens of Noise (The Runaways’ sophomore album from 1977), I tried to effectively balance multiple perspectives on the content, including the recollections of Jackie Fox and direct quotes about specific songs and events from Joan Jett, Cherie Currie, and Kim Fowley. Most interestingly, that article includes two separate (and contradictory) accounts of why Jett sang lead vocals instead of Currie on one of the songs, one given by Fox and the other by Currie. The key is to make it clear who is saying what where, and so like the “real history” taught in colleges and universities across the nation (and the world), the article has become an effort to show the different perspectives in conversation with each other instead of just giving one point of view (as is the case with “national myth-making”).

JW: Cheers and thanks for doing this for me.

MB: My pleasure! Thanks for the interview, and take care!

on public domain and “public domain”

There has been a lot of great writing about copyright and access to our cultural and intellectual history in the weeks since Aaron Swartz’s death. I have been retreading some of my old favorite haunts to see if there was stuff I didn’t know about the status of access to online information especially in the public domain (pre-1923 in the US) era.

I talk like a broken record about how I think the best thing that libraries can do, academic libraries in particular, is to make sure that their public domain content is as freely accessible as possible. This is an affirmative decision that Cornell University made in 2009 and I think it was the right decision at the right time and that more libraries should do this. Some backstory on this.

So, if I wanted to share an image from a book that Cornell has made available, I have to check the guidelines link above and then I can link to the image, you can go see it and then you can link to the image and do whatever you want with it, including sell it. This is public domain. The time and money that went into making a digital copy of this image have been borne by the Internet Archive and Cornell University. The rights page on the item itself (which I can download in a variety of formats) is clear and easy to understand.

Compare and contrast JSTOR. Now let me be clear: I am aware that JSTOR is a (non-profit) business and Cornell is a university, and I am not saying that JSTOR should just make all of their public domain things free for everyone (though that would be nice). I am just outlining the differences as I see them in accessing content there. I had heard that there were a lot of journals on JSTOR that were freely available even to unaffiliated people like myself. I decided to go looking for them. I found two different programs: the Register and Read program (where registered users can access a certain number of JSTOR documents for free) and the Early Journal Content program. There’s no front door, that I saw, to the EJC program; you have to search JSTOR first and then limit your search to “only content I can access.” Not super-intuitive, but okay. And I’m not trying to be a pill, but doing a search on the about.jstor.org site for “public domain” gets you zero results, though the same is true when searching for “early journal content” and also for “librarian.” Actually, I get the same results when I search their site for JSTOR. Something is broken; I have written them an email. [update: they fixed it!]

So I go to JSTOR and do a similar search, looking for only “content I can access,” and pick up the first thing that’s pre-1923, which is an article about Aboriginal fire making from American Anthropologist in 1890. I click through and agree to the Terms of Service, which is almost 9000 words long. Only the last 260 words really apply to EJC. Basically I’ve agreed to use it non-commercially (librarian.net accepts no advertising, I am in the clear) and not scrape their content with bots or other devices. I’ve also seemingly acquiesced to credit them and to use the stable URL, though that doesn’t let me deep-link to the page with the image on it, so I’ve crossed my fingers and deep-linked anyhow. I’m still not sure what I would do, contact JSTOR I guess, if I wanted to use this document in a for-profit project. Being curious, I poked around to see if I could find this public domain document elsewhere and sure enough, I could.

At that point, I quit looking. I found a copy that was free to use. This, however, meant that I had to be good at searching, quite persistent and not willing to take “Maybe” as an answer to “Can I use this content?” I know that when I was writing my book my publishers would not have taken maybe for an answer; they were not even that thrilled to take Wikimedia Commons’ public domain assertions.

As librarians, I feel we have to be prepared to find content that is freely usable for our patrons, not just content that is mostly freely usable or content where people are unlikely to come after you. As much as I’m personally okay being a test case for some sort of “Yeah I didn’t read all 9000 words on the JSTOR terms and conditions, please feel free to take me to jail” case, realistically that will not happen. Realistically the real threat of jail is scary and terrible and expensive. Realistically people bend and decide it’s not so bad because they think it’s the best they can do. I think we can probably do better than that.

A good old fashioned linkdump


Public domain photograph by: US Navy, National Science Foundation.

I’m back at home after meeting with a lot of terrific librarians in four different states. March is the busy month and after last month my plan is “not getting on a plane more than once a month for work.” I’ll be speaking with my good friend Michael Stephens at the Indiana Library Federation District Six conference next week. I’ll do a wrap-up of the talks I’ve been giving sometime later, but news for me is mostly having more free time to actually attend things and not just speak at them. Getting to go to programs at the Tennessee Library Association conference and the National Library of Medicine’s New England Region one-day conference about social justice has really helped me connect with what other people are doing in some of the same areas I’m interested in. It’s sort of important to not just be a lone voice in the wilderness about some of this stuff, so in addition to the SXSW stuff (and talking to a great bunch of library school students in Columbia, Missouri) getting to attend library events as an audience member has been a highlight of the past few weeks.

However, I’ve been backed up on “stuff I read that I think other people might like to read.” Try as I may, Twitter is still for hot potato stuff [i.e. Google's April Fools Joke specifically, I felt, for librarians] and not for things that I think merit more thoughtful or wordy presentation. So, as I enter the first Thursday in over a month where I get to hang out at home all day, I’m catching up, not on reading because there is tons of time for reading while traveling, but on passing some links around. So, here are some things you might like to read, from the past few months, newest first.

Cornell removes restrictions on public domain repros

An ongoing debate in the copyright wars is whether an institution that is making reproductions of public domain materials available should be allowed to dictate terms (usually involving payment) for use of those items. We all know that libraries need money. It’s also true that having digital copies of rare materials available helps preserve the original items. So, if I want to download a public domain book from Google Books — say John Cotton Dana’s book A Library Primer — I get usage guidelines from Google attached to the pdf I’ve downloaded.

Usage guidelines
Google is proud to partner with libraries to digitize public domain materials and make them widely accessible. Public domain books belong to the public and we are merely their custodians. Nevertheless, this work is expensive, so in order to keep providing this resource, we have taken steps to prevent abuse by commercial parties, including placing technical restrictions on automated querying.

We also ask that you:
+ Make non-commercial use of the files. We designed Google Book Search for use by individuals, and we request that you use these files for personal, non-commercial purposes.
+ Refrain from automated querying. Do not send automated queries of any sort to Google’s system: If you are conducting research on machine translation, optical character recognition or other areas where access to a large amount of text is helpful, please contact us. We encourage the use of public domain materials for these purposes and may be able to help.
+ Maintain attribution. The Google “watermark” you see on each file is essential for informing people about this project and helping them find additional materials through Google Book Search. Please do not remove it.
+ Keep it legal. Whatever your use, remember that you are responsible for ensuring that what you are doing is legal. Do not assume that just because we believe a book is in the public domain for users in the United States, that the work is also in the public domain for users in other countries. Whether a book is still in copyright varies from country to country, and we can’t offer guidance on whether any specific use of any specific book is allowed. Please do not assume that a book’s appearance in Google Book Search means it can be used in any manner anywhere in the world. Copyright infringement liability can be quite severe.

These are all “suggestions” as near as I can tell. As with the Chicken Coupon fiasco of a few days ago, the implied threat that comes along with this item puts a bit of a damper on the joy that is the public domain. Bleh. We’ve seen other big corporations and libraries doing this as well.

However, this post is mostly to say “Yay” about Cornell’s decision to remove all restrictions on the use of its public domain reproductions. Here’s their press release about it and here is the web page with the new policy. What’s their reasoning? Well, among other things, it’s hard to support a mission of open access and at the same time go out of your way to make materials more difficult to get ahold of and interact with. You can see some of Cornell’s 70,000 public domain items at the Internet Archive.

Working towards more public books, fewer orphan works

Public domain determination becomes clearer cut, more books entering the public domain thanks to … Google? Jacob Kramer-Duffield explains how Google, Project Gutenberg, and the Distributed Proofreaders put their book-scanning and OCR-ing smarts into trying to solve the thorny orphan works problem by determining which out-of-print books have had their copyrights renewed and which haven’t. Neat. [via joho]

a small foray into Google Books

You can use the date operator to browse public domain books in Google Books. I’m not entirely sure why the covers of some of these books remain under copyright. Any ideas? I’ve also noticed a few scanning errors and some pretty neat finds like this one which gives the name of every librarian in the US and Canada working in a library holding over 1,000 volumes. Google Books clearly uses keyword indexing to make these books searchable. How great would it be to have this one in a database? You can see a few images that I particularly liked over at Flickr.