TAPUHI takes a bow

On 22 May 1992, an event of some significance occurred in the Manuscripts Reading Room of the Alexander Turnbull Library. A scrapbook compiled by suffragist and temperance campaigner Helen Lyster Nicol was issued to historian Fiona McKergow, not by recording it by hand in a book, but on a printed slip generated by a database. This scrapbook was the first of more than 299,000 items that would be issued to researchers over the ensuing 24 years.

The database in question was the Turnbull Automation Project for Unpublished Heritage Items, or TAPUHI. This inspired acronym is also a Māori word meaning to nurse or care for, and the name of a tugboat on Wellington Harbour. Both of these are appropriate for a system that has brought the collections of the Alexander Turnbull Library to researchers the world over.

Fiona was doing research for Women together: a history of women's organisations in New Zealand: ngā rōpū wāhine o te motu, edited by Anne Else, which was to be published in 1993. The Nicol scrapbook had been acquired only 18 days previously, reflecting the new ability to rapidly process material and make it findable by clients.

Solutions sought

The story began some two years previously, when the Library, led by Manuscripts curator David Colquhoun, began planning an automated system to get control over the massively expanding collection of diaries, letters, business records, writers’ papers and other documents that it had been building over the previous two decades. That collection was becoming unmanageable using the existing structure of index cards and paper inventories.

From the outset the initiative was championed by Chief Librarian Margaret Calder, who was adamant that the chosen solution had to accommodate all aspects of the management of unpublished collections: initial contact with donors; accessioning; arrangement & description; tracking item movement; and issuing items to researchers. It had to facilitate the application of the archival principles of provenance and original order. Finally, it had to minimise the amount of inputting required; information recorded in contact and accession records should be pulled through to arrangement & description records without the need to re-enter data.

Over 1990 and 1991 a small team of Library staff, under Systems Librarian John Etheridge, worked intensively to define the Library’s requirements, prepare a business case, and tailor the chosen solution to meet the Library’s specific needs. The team included Kevin Stewart, Kevin Bourke, Libby Kitchingman and Tim Lovell-Smith from the Manuscripts Section, and worked in close collaboration with Alma Hong from NZBN.

Ms Papers 4394 record: the familiar plain, white view of the TAPUHI OPAC that researchers and staff use to find and access collections.

John had started work in the Library in the late 1980s, when he was involved with setting up one of the Library’s first forays into automated systems, a donations database that started operating on one PC in the acquisitions section in September 1988. Its functions, like those of the 1987 oral history database, would ultimately be taken over by TAPUHI. He became Systems Librarian in 1990, and the major focus of his work was getting the project up and running.

A successful outcome relied on establishing the need for separate systems to manage published and unpublished collections. The successful solution, proposed by General Automation and based on their AWAIRS software, was able to manage the complex internal relationships within collections of manuscripts, photographs and other formats which are found in unpublished collections.

The system was already being used by a number of Australian galleries and museums, most notably the Mitchell Library in the State Library of NSW in Sydney. The report produced by John and his small team, recommending AWAIRS as the development environment, was sufficient for Margaret to convince NLNZ that a specialist solution was preferable to a standard library application for managing the Library’s unpublished collections.

Day one

On Monday 2 September 1991, staff began entering data. We don’t know exactly who made those first records, as staff were using a shared sign-on that effectively rendered them anonymous. We do know that they made 14 records that day, all of them contact records documenting potential acquisitions.

On Wednesday 4 September they made their first accession record, documenting the Library's acquisition of a manuscript log of the proceedings of HMS Dromedary, 1819-1821, made by midshipman Perceval Baskerville. The same day they made their first descriptive record, for the recently acquired extracts from the WW2 diary of Douglas Neil Tiffen. This was the first record to become available to the public.

Tiffen: an image of the very first descriptive record saved to TAPUHI, showing the command line interface of the staff view. Keyboard shortcuts were often the quickest way to navigate in an environment with no mouse, where only the up, down, left and right arrow keys could move the cursor. Ref: MS-Papers-4394

Some 18,000 descriptive records had been added by the time that the system, now named TAPUHI, was officially launched in June 1992 by Roger McClay, Minister for the National Library, in a function at the Tugboat on the Bay restaurant – the former tug Tapuhi. The name arose from discussions among staff. Sharon Dell, who at the time was Keeper of Collections at the Turnbull, remembers that, while acronyms were considered, the staff wanted a name that was meaningful in itself. In the end the name was chosen first and the acronym constructed to fit.

A revelation

In 1993 I encountered TAPUHI. That January I arrived back in the Library after six months' parental leave to resume my role as Curator, Photographic Archive, to find that work had forged ahead to extend the scope of the system by developing new accounts tailored to the descriptive needs of the Photographic Archive and the Drawings, Paintings and Prints collection. Inputting on the Photographic account started on 15 March, and the next day I made my first records, for a group of albums made by the Library from photographs donated in 1923 by Russell Duncan.

Pa of Ruatara at Rangihoua, Bay of Islands, photographed 22 July 1914 by Russell James Duncan. Ref: PA1-o-141-87

I still remember the excitement of making those first records. The realisation that I could take information gleaned from widely scattered index cards, donation books and annual reports, and turn it for the first time into a permanent record of the provenance, physical identity and intellectual content of this group of images was quite literally thrilling.

The knowledge that on the basis of those initial accession and descriptive records for the group, others could go forward and describe each album, and beyond that the individual pages within each album, and that all of these records rested within a robust, linked hierarchical structure which clearly defined their inter-relationship, was a revelation.

The fact that we could also now record each time one of those items was issued to a researcher, each time a copy negative was made (no digitisation yet!) and each time it required conservation treatment, was an added bonus.

For the first time, I realised that it was possible to gain intellectual control over a collection of images which at that time included over 100,000 glass negatives and 1000 albums, and to recover the provenance and original order of collections which had been separated into subject and format sequences over the past 50 years.

The system continued to be developed throughout the 1990s. At the end of 1993 it took over the functions of the 1988 donations database, and a conservation account, which allowed remedial treatments to be recorded, was added. In 1997 an Oral History account was added. In June 2000 a web interface was launched, enabling the manuscript and pictorial collections to be searched online. In 2014 the Oral History account became available on the internet.

In the 24 years since TAPUHI was launched nearly 800,000 letters, diaries, photographs, watercolours, oral history interviews and other items have been described on TAPUHI. More than 18,000 people have been registered as readers, and nearly 300,000 collection items have been issued to them for study in the reading rooms. It has revolutionised the way we think about managing our unpublished collections.

Spooler: a view of TAPUHI's spooler, which records commands sent to the printer; for labels, that printer is an old dot matrix machine.

The next chapter

Over the years, however, TAPUHI’s shortcomings have become apparent. Inputting into the text-based, command line user interface of a quarter-century-old legacy application is a barrier to those who have been brought up on point-and-click, window-based systems. It does not meet 21st-century standards of data security and privacy protection. It does not play nicely with modern IT systems.

Its structure of separate accounts, admirably suited to recording the characteristics of the various formats and the variegated scraps of information associated with them, necessitated the splitting of multi-format collections, thereby breaking the chain of provenance and original order essential to derive the full richness of meaning inherent in them. It cannot accommodate international coding standards such as Encoded Archival Description (EAD) and Encoded Archival Context (EAC), which have been developed to make archival descriptions and finding aids available on the internet.

Therefore, the Library will shortly be introducing a new collection management system for unpublished collections. We look forward to sharing these developments, including the name of the new system, with you soon.

By John Sullivan

John is the Curatorial Services Leader in the Alexander Turnbull Library.

Sara Phillipps Barham October 9th at 1:35PM

So many memories in one blog! Thanks John. A great read.

Charles Dawson November 3rd at 10:26AM

Thank you John; neat information and a testament to the importance of librarians and ATL's work, and to institutional memory.

Jan Rivers November 4th at 9:36PM

I was working at AWA Computers in Ponsonby Road in Auckland in 1991 when the tender from the Turnbull Library for what was to become Tapuhi arrived. The acronym AWAIRS stood for AWA Information Retrieval System, and responding to the tender with AWAIRS was a one-off for me, as the other software that I provided technical support for was the URICA library system. Both were developed in the operating environment called Pick, as was the local government software supplied by the company, which ran everything from dog registration to asset management. (According to the Wikipedia article, Pick persists in some specialised environments.) Answering the tender was the easiest I had ever done, as the answer to almost every question was that AWAIRS would comply fully.

This was because Pick could be used for any text-based application and was stronger than most in managing lots of data. It was also one of the few operating systems that could genuinely support a proper relational database; these were at that stage in their infancy. So the general tenor of the questions and answers was, as I recall, along these lines. Can we have a field for this and that kind of date? Answer: yes. Can we have a text field with these characteristics? Answer: yes. Can we represent the relationship between the main record, a holdings record and a copy record, and use the data to search from holdings to main and vice versa? Answer: yes.

Among the features that lifted Pick above the competition was the ability, within the same field of a record, to represent sub-values and even sub-sub-values. They were delimited using non-printable control characters (character codes 252, 253 and 254). The benefit of this feature was that it allowed the database to represent zero, one or multiple headings in the same field, which was useful for subject, author and notes fields, for example. Text fields could be of almost unlimited length, and Pick also had an enquiry language, called variously ACCESS or ENGLISH, which allowed sophisticated reporting well before other databases were able to do anything similar.
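The nested-value idea described above can be sketched in a few lines of Python. This is only an illustration, not AWAIRS or Pick code: the function names and the sample record are hypothetical, and the labels given to the three delimiters (attribute, value and sub-value marks for codes 254, 253 and 252) follow common Pick terminology rather than anything stated in the comment.

```python
# Delimiter codes as mentioned above; the names are an assumption based on
# common Pick terminology.
ATTRIBUTE_MARK = chr(254)  # separates fields (attributes) within a record
VALUE_MARK = chr(253)      # separates multiple values within one field
SUBVALUE_MARK = chr(252)   # separates sub-values within one value


def build_record(fields: list[list[list[str]]]) -> str:
    """Join fields -> values -> sub-values into one delimited string."""
    return ATTRIBUTE_MARK.join(
        VALUE_MARK.join(SUBVALUE_MARK.join(subvalues) for subvalues in values)
        for values in fields
    )


def parse_record(raw: str) -> list[list[list[str]]]:
    """Split a delimited string back into fields -> values -> sub-values."""
    return [
        [value.split(SUBVALUE_MARK) for value in field.split(VALUE_MARK)]
        for field in raw.split(ATTRIBUTE_MARK)
    ]


# Hypothetical record: one title, two subject headings (the second with a
# subdivision), and an empty notes field.
sample = build_record([
    [["Scrapbook"]],
    [["Suffrage"], ["Temperance", "New Zealand"]],
    [[""]],
])
parsed = parse_record(sample)
subjects = parsed[1]  # a single field holding two headings
```

Because the whole record is one string, a field can hold zero, one or many headings without any change to the record layout, which is the flexibility the paragraph above describes.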

Fields that held normalised data (title, author, publisher, subject and so on) were indexed using a truncated version of the field with some letters missing. As I recall, this feature sometimes caused problems in URICA, with lists of returned headings not reliably appearing in alphabetical order, but I seem to remember that AWAIRS had overcome this problem by using a unique, perhaps numeric, index for each heading. Finally, it was extremely compact and efficient to operate. I remember running entire libraries of perhaps 18 to 25 terminals on an IBM server machine that had a grand total of 1 MB of random access memory. Notable but less important was the representation of dates, which meant that Pick would never have had any problems with the Y2K issue. Dates were stored as a number of days, positive or negative, relative to a fixed reference date (31 December 1967, as commonly documented for Pick).
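The date scheme is easy to illustrate. The following Python sketch stores a date as a signed count of days from a reference date. Pick implementations are commonly documented as counting from 31 December 1967; that epoch is an assumption here, so it is exposed as a parameter, and the function names are hypothetical.

```python
from datetime import date, timedelta

# Assumption: day zero is 31 December 1967, the reference date commonly
# documented for Pick. Pass a different epoch if your system differs.
PICK_EPOCH = date(1967, 12, 31)


def to_internal(d: date, epoch: date = PICK_EPOCH) -> int:
    """Return the signed number of days between the epoch and d."""
    return (d - epoch).days


def from_internal(days: int, epoch: date = PICK_EPOCH) -> date:
    """Convert a signed day count back into a calendar date."""
    return epoch + timedelta(days=days)


# The century rollover is just the next integer, so nothing special happens.
last_day_of_1999 = to_internal(date(1999, 12, 31))
first_day_of_2000 = to_internal(date(2000, 1, 1))

# Dates before the epoch are simply negative numbers.
before_epoch = to_internal(date(1893, 9, 19))
```

Since the stored value is a plain integer with no two-digit year anywhere in it, moving from 1999 to 2000 is no different from moving between any other two consecutive days.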

AWA Computers in New Zealand has disappeared, and while the Australian company still exists, none of this history (which was for about a decade quite significant) is recorded in their organisational history. http://www.awa.com.au/about-awa/our-history/

For a while in the early 90s the software really was ahead of the game in terms of the smart ways that the databases could be configured, manipulated and reported on. From reading the Wikipedia article about Pick, it appears that time and again licensing disputes were at the base of ongoing problems, which is a shame because, released into the open source environment, it would have been an attractive application environment.

I wasn’t involved in the installation. I think that colleagues from Australia must have been drafted in for this purpose. Working in the IT industry had its advantages for someone newly arrived from the UK (travelling around the country, smart hotels, trips to Australia and to the University of the South Pacific in Fiji, and the opportunity to meet a diverse set of interesting and committed librarians in all parts of the sector across New Zealand), but working to contracts rather than needs, and the commercial smoke and mirrors of selling, didn’t fit well with me.

Having said that, I would entirely concur with John that at the outset and for many years Tapuhi was able to manipulate information in a way that few other systems could manage. All the best with the exciting next steps.
Jan Rivers