Library tech

On the Commons

November 23rd, 2015, by Reuben Schrader

Put a bird on it

A big old owl face.

Detail of Morepork, Spiloglaux novae-zealandiae and Laughing owl, Sceloglaux novae-zelandiae, by John Gerrard Keulemans. See it on Flickr.

Our collection of entirely free images on Flickr just got 3,500 high-resolution additions. Thanks, Python!

Everything we’ve uploaded to that account is yours to use in whatever way you want. These are images that have no copyright, donor restrictions, or anything else that might get in the way. They’ve also been uploaded at the highest resolution we have.

You can download the images, use them online, print them, turn them into lovely tea towels or linoleum.

(We’d love it if you linked back to us when you do, but you don’t have to.)

The new uploads came from our free download pool, which is full of – you guessed it – images you can download for free. So why did I spend far too long writing terrible code to pipe them all up to Flickr?

First, there’s a ton of people on Flickr who are never going to come to our site. Why not get our stuff right in front of their faces?

Secondly, our interface for downloading these images isn’t great. You need to log in with RealMe and take the image you want through our whole image ordering process. On Flickr, zooming in to see details or downloading the size you want is far easier.

Lastly, I wanted to show that sharing our open content is possible at a large scale. I’m hoping this is going to make it easier for us to open up and share more from the collections.

Details of five images uploaded to Flickr.

Details of Greymouth and Kumara Tramway; Wellington Corporation Tramways ticket; Sarah Ann Featon, Yellow Kowhai; Soldiers repairing a car; Wellington Physical Training School.

What’s next?

We’ll be releasing more images, at the collection level, as they’re checked and cleared. That’ll probably mean repeatedly needing to upload dozens (or thousands?) of images in one go.

When that happens I’ll have a much cleaner script that is easier to use and more reliable. From start to finish, it should:

  • Go over the list of images we’ve picked out for inclusion

  • Check if an image has already been uploaded, discard it if so

  • Get the list of ID numbers we’re working with and send it off to NDHA so we can get the high-res tiffs

  • Rename the files with their DigitalNZ identifiers

  • Smoothly handle authentication with Flickr

  • Grab all the metadata

  • Sort images into their collections

  • Upload!

  • Mark each upload as done so we can run it in batches
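To give a feel for the last step, here is a minimal sketch of running uploads in batches and marking each one as done. It is not the script on GitHub: the function names, the JSON log file, and the `upload` callable are all hypothetical stand-ins, and the real Flickr upload would go where `upload(image_id)` is called.

```python
import json
from pathlib import Path


def load_done(log_path):
    """Return the set of image IDs already marked as uploaded."""
    path = Path(log_path)
    if not path.exists():
        return set()
    return set(json.loads(path.read_text()))


def mark_done(log_path, image_id):
    """Record one finished upload so a later run can skip it."""
    done = load_done(log_path)
    done.add(image_id)
    Path(log_path).write_text(json.dumps(sorted(done)))


def run_batch(candidates, log_path, upload, batch_size=100):
    """Upload up to batch_size images that are not yet marked as done.

    `upload` is a hypothetical callable that takes one image ID; it
    stands in for whatever actually sends the file to Flickr.
    """
    done = load_done(log_path)
    pending = [image_id for image_id in candidates if image_id not in done]
    uploaded = []
    for image_id in pending[:batch_size]:
        upload(image_id)
        # Write the log after every upload, so a crash mid-batch
        # does not lose track of what already went up.
        mark_done(log_path, image_id)
        uploaded.append(image_id)
    return uploaded
```

Running `run_batch` again with the same list picks up where the last run stopped, because anything already in the log is skipped.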

Have a look at the code on GitHub

Read on for technical muckery!

Bleep bloop

Moving the files and all their info had several steps:

  • Identify the free downloads

  • Check if any were already on Flickr, and see what was on Flickr but not in the Free Downloads

  • Get the source files

  • Get their metadata

  • Upload them with all that metadata

And to avoid getting lethal clicking finger strain or dying of boredom, do it all automatically 3,500 times over.
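The second step, comparing what is in the free download pool with what is already on Flickr, comes down to set arithmetic. The sketch below assumes both sides have already been reduced to lists of shared ID numbers; the function name and the example IDs are made up for illustration.

```python
def compare_pools(free_download_ids, flickr_ids):
    """Split IDs into three groups by where they appear.

    Both arguments are iterables of identifiers that mean the same
    thing on each side (an assumption of this sketch).
    """
    free = set(free_download_ids)
    on_flickr = set(flickr_ids)
    return {
        # In the free download pool but not yet on Flickr.
        "to_upload": sorted(free - on_flickr),
        # In both places, so safe to skip.
        "already_uploaded": sorted(free & on_flickr),
        # On Flickr but missing from the free download pool.
        "only_on_flickr": sorted(on_flickr - free),
    }


result = compare_pools([101, 102, 103], [103, 104])
print(result["to_upload"])
```

The "only_on_flickr" group is the one worth a manual look, since those images were uploaded earlier by some other route.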

2 APIs and a bunch of messy Python

To write this code I picked Python, because it’s the language I’m most familiar with, and it’s well supported with helpful modules and advice around the web.

Step one was building a list to tell me what’s actually in the free download pool. Happily, natlib.govt.nz runs off the DigitalNZ API, so that information can be queried directly.
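Building that list means asking the API for one page of records after another and keeping the IDs. The sketch below is a guess at the shape of that loop, not the code from the post. The endpoint URL, the `api_key`, `page`, `per_page` and `and[...][]` parameter names are assumptions that should be checked against the DigitalNZ API documentation, and the filter field used in the example is hypothetical, since the post doesn't say which field marks a record as a free download.

```python
from urllib.parse import urlencode

# Assumption: the v3 records search endpoint. Verify before relying on it.
BASE_URL = "https://api.digitalnz.org/v3/records.json"


def build_search_url(api_key, page=1, per_page=100, filters=None,
                     base_url=BASE_URL):
    """Build a search URL for one page of results.

    `filters` maps a field name to a required value. The field names
    are up to the caller; none are taken from the original post.
    """
    params = [("api_key", api_key), ("page", page), ("per_page", per_page)]
    for field, value in sorted((filters or {}).items()):
        params.append((f"and[{field}][]", value))
    return f"{base_url}?{urlencode(params)}"


def collect_ids(fetch_page, max_pages=1000):
    """Gather record IDs page by page until a page comes back empty.

    `fetch_page` is a hypothetical callable: given a page number, it
    returns a list of dicts that each have an "id" key. In practice it
    would request build_search_url(...) and parse the JSON response.
    """
    ids = []
    for page in range(1, max_pages + 1):
        records = fetch_page(page)
        if not records:
            break
        ids.extend(record["id"] for record in records)
    return ids
```

Keeping the network call behind `fetch_page` means the paging logic can be tried out with canned data before pointing it at the live API.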

Amy Joseph
1 December 2015 2:46pm

Great post, Reuben. Awesome outcome, and cool to see your process set out.

Reuben Schrader
1 December 2015 1:30pm

Hi Jane, that's really annoying - it never occurred to me the images would be walled off like that, so I didn't test it.

I guess we have to consider this a step in the open access direction, and think about some other ways we can make the images available.

Jane
1 December 2015 12:26pm

If these are "free to download", why does Yahoo require me to log in, or create a Yahoo account, to view the various sizes?

Jessamyn
30 November 2015 11:03am

This is the BEST. Thanks for not only doing it but explaining how and sharing the code.

Tara Calishain
29 November 2015 7:27am

Great stuff! Thanks for taking the time to document how you used Python to get this done. It'll be in today's (Saturday's) ResearchBuzz.

Siobhan Leachman
26 November 2015 6:59pm

Hi Reuben

I've been doing some machine tagging of the images following the "how to" in both http://blpublicdomain.wikispaces.com/Machine+tags and also http://biodivlib.wikispaces.com/file/view/Tagging_Tutorial_and_FAQ.pdf/514108856/Tagging_Tutorial_and_FAQ.pdf Hopefully this will ensure the images are more widely used. Thanks for doing this and may the Library continue to add to Flickr commons!

Harry Chapman
23 November 2015 9:02pm

Hi Reuben - thanks for the post and for your efforts. I strongly encourage you to upload the images to Wikimedia Commons too if possible :-)