Preserving digital objects
Technical information about how we store and preserve born digital and digitised items in the National Digital Heritage Archive.
National Digital Heritage Archive
Born digital and digitised items created or collected by the Library are held in the National Digital Heritage Archive (NDHA), our system for long-term digital preservation. It uses the ExLibris Rosetta application.
This page gives the technical details of how the NDHA maintains object integrity over time, prevents errors from being introduced into the digital files we hold, and provides appropriate access.
Processing and maintenance
When materials are deposited into the NDHA, the system uses checksums at several stages to verify the integrity of the materials as they are processed. Checksums are also used on an ongoing basis to help us detect file degradation.
The NDHA primarily uses three types of checksums: MD5, SHA-1, and CRC32.
Stage 1: At point of deposit
Unless already provided by the depositor, the NDHA application calculates checksums (MD5) when the files are being uploaded to our servers. The checksums are stored together with the materials to allow comparison as processing continues.
Stage 2: Validation
As the materials pass through the next stage, the system generates the three checksums (MD5, SHA-1, and CRC32) for each file. The MD5 checksum obtained for each file at the deposit stage is compared to the newly generated MD5 checksum. If the system finds any inconsistencies, it routes the materials to a specialised area, and NDHA staff alert the relevant National Library staff.
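The comparison at this stage can be sketched as follows. This is a simplified illustration under stated assumptions, not the NDHA's actual workflow: the function `validate_batch`, the in-memory `files` mapping, and the "quarantined" label standing in for the specialised area are all hypothetical.

```python
import hashlib


def md5_hex(data: bytes) -> str:
    """Return the MD5 digest of data as a lowercase hex string."""
    return hashlib.md5(data).hexdigest()


def validate_batch(
    files: dict[str, bytes], deposit_md5: dict[str, str]
) -> dict[str, list[str]]:
    """Recompute each file's MD5 and compare it with the value from deposit.

    Files whose checksum is missing or does not match are set aside
    (assumed here to mean routing to a review area) instead of proceeding.
    """
    result: dict[str, list[str]] = {"passed": [], "quarantined": []}
    for name, data in files.items():
        expected = deposit_md5.get(name)
        if expected is not None and expected.lower() == md5_hex(data):
            result["passed"].append(name)
        else:
            result["quarantined"].append(name)
    return result
```

In this sketch a single mismatch only sets aside the affected file; whether a real system holds back the whole deposit is a policy decision the example does not model.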
Stage 3: Permanent repository
When the materials are moved into our permanent repository, the system confirms that they have been successfully stored by comparing the MD5 checksum again. The three checksums generated during validation are also captured within a metadata file. That metadata file is stored together with the materials in our permanent repository, and the MD5 checksum for each file is stored in the database. This allows for a three-way validation of the digital collections through the database, the metadata files and the actual files.
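One way to picture the three-way validation is as a pairwise comparison of the MD5 value from each source. The sketch below is an assumption about how such a check could be written, not a description of the NDHA's code; `three_way_check` and its parameters are hypothetical names.

```python
import hashlib


def three_way_check(file_bytes: bytes, metadata_md5: str, database_md5: str) -> list[str]:
    """Compare the MD5 of the stored file, the metadata file and the database.

    Returns a list describing each pair that disagrees; an empty list
    means all three sources agree.
    """
    actual = hashlib.md5(file_bytes).hexdigest()
    metadata_md5 = metadata_md5.lower()
    database_md5 = database_md5.lower()

    problems = []
    if actual != metadata_md5:
        problems.append("file vs metadata")
    if actual != database_md5:
        problems.append("file vs database")
    if metadata_md5 != database_md5:
        problems.append("metadata vs database")
    return problems
```

Having three sources helps locate a fault: if only one source disagrees with the other two, that source is the likely point of corruption.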
The NDHA performs an annual checksum comparison for all files stored in the permanent repository, including the metadata files. Any issues found are reported back to the relevant National Library business units, with any remedial solution signed off and all information documented and archived securely within the Library’s electronic records management system. A provenance note is also included in the object's metadata describing the issue and any mitigation undertaken.
All NDHA storage areas are subject to daily, weekly, and monthly backups. Our permanent repository is backed up continuously, with multiple copies of the material backed up onto both disk and tape, some of which are sent offsite to be stored securely.
During the continuous backup, our storage system also uses checksums to verify that all copies are made successfully, and the checksums are stored as metadata inside the storage system.
Access and delivery
All objects sent to our permanent repository have access policies associated with them. A policy can be based on an IP range or a particular group of people, or it can restrict access to a certain number of people. These access policies are enforced whenever anyone (including staff members) requests to see these digital objects.
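To make the idea concrete, here is a minimal sketch of a policy check covering the IP range and group criteria. It is not how Rosetta models access rights: the `AccessPolicy` class, its fields, and the rule that meeting either criterion grants access are assumptions for illustration. The limit on the number of people is left out because this page does not say how it is counted.

```python
import ipaddress
from dataclasses import dataclass, field


@dataclass
class AccessPolicy:
    """Hypothetical policy: allow by network range and/or group membership."""

    allowed_networks: list[str] = field(default_factory=list)
    allowed_groups: set[str] = field(default_factory=set)

    def permits(self, client_ip: str, user_groups: set[str]) -> bool:
        """Return True if the request matches an allowed network or group."""
        ip = ipaddress.ip_address(client_ip)
        in_range = any(
            ip in ipaddress.ip_network(net) for net in self.allowed_networks
        )
        in_group = bool(self.allowed_groups & user_groups)
        # Assumption: satisfying either criterion is enough to grant access.
        return in_range or in_group


# Example using a reserved documentation address range.
reading_room = AccessPolicy(allowed_networks=["192.0.2.0/24"])
```

Because the check runs on every request, staff requests would pass through the same `permits` call as any other user's.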
The NDHA application is only available to a controlled list of staff members. Our system backend, including servers and physical storage, is available only to a handful of staff members looking after our system and infrastructure.
Get in touch if you have questions about how we store born digital and digitised items or legal deposit.