Too many Dons
September 14th, 2007
Disambiguation is becoming an everyday word (you might have caught its starring role in Wikipedia's Disambiguation Pages) but I hate how it always comes last, after a long list of so-called "more interesting" things. So today, right here, right now, dis-am-big-u-a-tion is standing up and asking you to reflect on how it is in fact a word unlike any other, the dictionary's peacemaker, a powerbroker tucked into your lexicon. Even according to the internet, that veritable stronghold of ambiguity, the verb disambiguate has one meaning. And you have to admit it's a nice word to say out loud. If you haven't already done so today, try it and experience a little joy.
While we are still heading way out, hopefully on a journey to the point, shall we muse on why telephone books have addresses in them? They are not called "Telephone and Address Books" after all. I will accept it's handy to have the address right there by the phone number, but there are plenty of other similarly handy things that we could include if we were a phonebook publisher with time on our hands. Thinking of examples is left as an exercise for the reader - and you may use the comment button below. While you do that, I am going to suggest that the address is included mainly to assist you in determining whether this McLean, D. is really the person you want to call, thereby saving you an awkward few minutes on the phone to his namesake. In other words, it is there for disambiguation purposes.
The tools of our age are changing us, and due to the ubiquity of search engines (or perhaps just present-day shortcomings in the automated indexing of web pages, but let's not go there) we are becoming more accustomed to ambiguity. And don't get me wrong: on the streets I like ambiguity as much as the next person. Not understanding the things I encounter as I walk through this wondrous urban environment is both exciting and enriching and I wouldn't want it any other way. But when I stop at a store, normally with something in mind, I often find to my delight that the shopkeep has mastered their limited domain and removed most of the ambiguity within. I appreciate this because if, every now and again, I received a glass of milk instead of a cup of coffee when I ordered a latte, I might find myself irritated. You probably know the feeling right now.
My point, which I am going to get to, is partly about the relationship between the scale of a domain and the level of ambiguity in that domain. You can't disambiguate the world wide web any more than you can tame nature or perfectly order an urban street milieu. However, within the extensive yet finite domains of the Galleries, Libraries, Archives and Museums sector (GLAMS if you will) there is the potential to remove most ambiguity from the collection items themselves and the names used to index them. And this has long been done within each organisation, both in order to better serve the public and to meet the organisation's own collection management needs. It's a beautiful thing to behold.
By way of example, let's say a member of the public walks into our very own Alexander Turnbull Library to do some research on the unpublished items relating to Sir Donald McLean. Coincidentally (you might say), we are currently digitising about 90,000 pages of Sir Donald McLean's papers, but unable to wait, this excited member of the public has just walked into our library to do some research on him today. When they type Donald McLean into Tapuhi, they will see that there are a few different people by that name. After some Tapuhi-wrangling and/or the assistance of a librarian, they will presumably converge on a particular Donald McLean, which for this example is going to be the person we know and love as #4809:
McLean, Donald (Sir), 1820-1877: Administrator, runholder, politician, provincial superintendent. Was Crown's Protector of Aborigines, Native Land Purchase Commissioner and Minister of Native Affairs. Father of Sir Robert Donald Douglas Maclean (1852-1929). Made KCMG in 1874.
Armed with this information they are able to carry out a non-ambiguous, definitive search for all the material relating to this man, and not have to pick through material related to any of the other reasonably famous people with exactly the same name. Brilliant.
That, at least, is how it has worked for very many years. After leaving ATL our researcher walks down the road to Archives New Zealand and repeats the process there, hopefully with the help of more librarians. And so on. Seems reasonable enough. Good for librarian employment too.
But in our superconnected modern world this person now wants to do the research from home. Most of the organisations have very helpfully made their catalogues available online, so the researcher gives it a go. The trouble is that although they've saved a walk (and, ok, possibly a trip to Wellington), the disambiguation step still needs to be carried out at each organisation's website, alone and unsupported. Doesn't feel so much like the future anymore.
Ideally our user would be able to disambiguate once (by selecting which Donald McLean they're interested in from a master list) and then search across the whole sector for this exact person. Some kind of portal that would then run the search for the right Donald McLean on the various organisations' websites and compile the results in a list would be just right. While we're dreaming, let's give this Donald McLean a unique universal persistent identifier like #2349837941, so that our user can reference him easily and unambiguously in publications and conversations anywhere in the world, forever (that's a good thing, provided you're not trapped in one of those conversations).
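For the code-minded, here is a minimal sketch of that dream in Python. It is only an illustration under loud assumptions: the `Person` record, the `CROSSWALK` table, the `CATALOGUES` stand-ins and both function names are hypothetical, and so are the namesake entry and all the catalogue items. Only the identifier #2349837941, the Tapuhi number #4809 and the dates come from this post. No real catalogue works this way today, which is rather the point.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Person:
    universal_id: str  # sector-wide persistent identifier (hypothetical scheme)
    name: str
    dates: str


# Master list used for the one-time disambiguation step.
# The first entry uses the identifier and dates from the post; the second is
# an invented namesake so there is something to disambiguate from.
MASTER_LIST = [
    Person("2349837941", "McLean, Donald (Sir)", "1820-1877"),
    Person("0000000000", "McLean, Donald", "invented namesake"),
]

# Crosswalk from the universal identifier to each organisation's local one.
# "4809" is the Tapuhi number from the post; "archives-example" is made up.
CROSSWALK = {
    "2349837941": {"Tapuhi": "4809", "Archives": "archives-example"},
}

# Stand-ins for each organisation's catalogue, keyed by local identifier.
# Every item here is a placeholder, not a real record.
CATALOGUES = {
    "Tapuhi": {"4809": ["placeholder letter"], "9999": ["placeholder diary"]},
    "Archives": {"archives-example": ["placeholder file"]},
}


def candidates(query):
    """Return every person whose name contains all the words in the query."""
    tokens = query.lower().split()
    return [p for p in MASTER_LIST if all(t in p.name.lower() for t in tokens)]


def search_sector(universal_id):
    """Collect (organisation, item) pairs for exactly one person."""
    results = []
    for org, local_id in CROSSWALK.get(universal_id, {}).items():
        for item in CATALOGUES.get(org, {}).get(local_id, []):
            results.append((org, item))
    return results
```

Our researcher would call `candidates("Donald McLean")` once, pick the right Don from the list, and hand his `universal_id` to `search_sector`, which only ever returns items filed under that exact person at each organisation. All the hard work hides in the crosswalk table, which somebody has to build and maintain.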
Unfortunately, because the unpublished collections in this sector grew independently, way back, yes, before the internet, there is no standard way to refer to Sir Donald McLean - or pretty much anyone. Which, as we shall see in my next post, is No Small Matter, and what's more, Google won't save us...
Tune in next week when we will line up and knock down various solutions. In the meantime, feel free to check out the People Australia initiative and the pioneering work being carried out by the New Zealand Electronic Text Centre.