Survey data and analysis


7. Survey data and analysis

All percentages given in the following discussion have been calculated from the total number of survey respondents (290), regardless of how many respondents answered each question, and rounded to the nearest whole number. The percentages shown in the graphs and charts have been generated from the number of respondents who answered the specific question, which varies because respondents were able to skip questions. In most cases the difference is slight.
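As an illustration of the two calculations, the following minimal sketch uses counts reported later in this section (113 respondents aware of international initiatives, 174 not aware). It assumes those were the only two possible answers to that question, so that 287 respondents answered it; the function name `pct` and the round-half-up rule are illustrative choices, not a description of the survey tool.

```python
import math

TOTAL_RESPONDENTS = 290  # total survey respondents, as stated in the text


def pct(count, base=TOTAL_RESPONDENTS):
    """Percentage of `base`, rounded to the nearest whole number (halves round up)."""
    return math.floor(100 * count / base + 0.5)


aware = 113      # aware of international initiatives (section 7.2)
not_aware = 174  # not aware (section 7.5)

# Assumption: only yes/no answers were possible, so 113 + 174 answered the question.
answered = aware + not_aware

print(pct(aware))                # discussion text: share of all 290 respondents
print(pct(not_aware))            # same base of 290
print(pct(not_aware, answered))  # chart: share of those who answered the question
```

Under these assumptions the "no" share differs by one percentage point between the two bases, which is the kind of slight difference referred to above.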

7.1 Demographic

The purpose of these questions was to establish the respondent’s subject interest and career position in order to explore trends that may develop later on in the survey. The survey was sent to disproportionally more individuals in education and this is reflected in the response rate (nearly a quarter of the total survey recipients were from education).

Chart: Do you consider yourself early, mid, or late career? Early, 24%; Mid, 45%; Late, 31%.
Figure 1: Overall respondent career stage.

The highest numbers of respondents came from the disciplines of:

  • Education (22%)
  • Literature (10%)
  • Economics (7%)
  • History (6%)
  • Other, specified by comments (15%)

The remaining respondents were scattered across disciplines. The full list can be found in appendix 2 below.

The Library was curious to see whether there was a relationship between the career positioning of academics and the answers to the survey. The survey tool provided the ability to filter results based on these demographics. Figure 1 shows a spread of career stages across the respondents. This was reflected in the results across the survey, suggesting that career positioning makes little difference to awareness and use of web archives.

7.2 Awareness of archived websites

International vs local awareness

The purpose of these questions was to get a general indication of the awareness levels of web archives by New Zealand researchers.

Chart: Did you know that there are international initiatives to archive copies of websites and blogs to support research? Yes, 39%; No, 61%.
Figure 2: Awareness of international web archives.
Chart: Did you know that copies of New Zealand websites and blogs are archived by NLNZ and made accessible to support research? Yes, 23%; No, 77%.
Figure 3: Awareness of local web archives.

Figures 2 and 3 show the relationship between awareness of international and New Zealand web archiving initiatives.

  • 113 respondents (39%) were aware of the international initiatives concerning web archives
  • 67 respondents (23%) were aware of the New Zealand Web Archive
  • 59 of the respondents (20%) were aware of both

The gap between awareness of international and New Zealand web archiving initiatives is possibly due to a combination of factors:

  • The relatively high profile of the Internet Archive (6)
  • NLNZ has done little to promote its web collection
  • The New Zealand Web Archive is integrated into the search and discovery of all other collection items held by the Library and does not have a separate profile or branch, like the collections in Australia, the UK, and USA do

Despite the lack of promotion or branding of the New Zealand Web Archive, it is encouraging that 23% are still aware of the collection. The 77% gap represents an opportunity for promotion to potential users of the web collection.

There was a slight discrepancy between the number of respondents who indicated that they were aware of international web archiving initiatives (113) and those discussed in section 7.3 who had used an international source of archived websites (128). This discrepancy is likely due to the broad wording of question three, and it does not affect the subsequent survey results.

7.3 Use of Archived Websites for Research

The Library knows that there is academic use of archived websites. Our analytics revealed a significant spike in access to certain archived websites in 2013, to the extent that three archived websites were the most highly accessed collection items in the entire Library for that calendar year. Further research revealed that these three archived websites were included in the reading lists for a range of university courses. The research attempted to get quantitative data on the numbers and percentages of academics who had used the New Zealand Web Archive for their research.

Some issues with the data

Much, though not all, of the data for the 'have you used?' questions is inconsistent and therefore inconclusive. Before discussing the valuable data from this section, it is first necessary to explain the inconsistencies and issues.

  • 30 respondents (10%) answered yes to question 5: ‘Have you used the NLNZ website or catalogue to find archived websites?’
  • 116 respondents (40%) indicated they had used NLNZ as a source for archived websites, when answering question 6 ‘how often do you use the following sources of archived websites as part of your research?’, and being presented with options for NLNZ, Internet Archive, National Library of Australia, UK Web Archive, and the US Library of Congress Web Archive (figure 5).
  • Comments from question 6 included references to Papers Past (New Zealand’s online newspaper archive), Google Scholar and other non-web archive resources, suggesting that respondents confused archived websites with archived items that are available via the web.
  • The Library recognised the possibility of this confusion and provided definitions and links to the external sources within the questions to try and reinforce what an archived website is.

Possible reasons for the apparent confusion over what an archived website is, when applied to these two questions, include:

  • The paucity of promotion to academics on what the New Zealand Web Archive is; how it can be accessed; and the possibilities for using archived websites as a research resource.
  • The Library does not provide its web collection as a separate resource, like other countries. (7) Rather, access to the New Zealand Web Archive is made through the catalogue search along with almost all of the other resources of the Library, via the main Library website. It is perhaps understandable that researchers are confused between archived websites and archived collections available via the website.

The difficulty in defining and explaining a web archive was also encountered by the BnF in their research in 2012 and the participants of the Jisc research in 2010. The BnF found that the confusion amongst their researchers was due to the existence of an ‘archives’ section in many websites or blogs and the history of changes on websites such as Wikipedia.

This research has perhaps uncovered that there is a case for a separate ‘New Zealand Web Archive’ presence on the web, where the collection can be promoted, accessed and used.

Useful data on use of the New Zealand Web Archive

Despite the inconsistencies and confusion articulated above, some useful analysis and conclusions can still be drawn from the data.

Figure 4 is filtered to show the usage rates of the New Zealand Web Archive among the 67 respondents who had stated in question four that they were aware of the New Zealand Web Archive. We used the filters to remove answers from respondents who appeared confused, leaving 21 respondents who knew about the New Zealand Web Archive AND had used it. This represents only 7% of the total survey population. A further 39 respondents (13%) knew about the New Zealand Web Archive but HAD NOT used it.

  • Many respondents demonstrated confusion between an archived website, and archived items that are available via the web.
  • With filters applied, 21 respondents (7%) can be reliably found to be aware of and have used the New Zealand Web Archive.
Chart: Have you used the NLNZ website or catalogue to find archived websites? Yes, 21; No, 39.
Figure 4: Usage of the New Zealand Web Archive by those who knew about it.

New Zealand use of international web archives

The Library was also interested to find out the extent to which researchers use web archives outside of New Zealand. Figures 5 and 6 show an active use of web archives from sources other than the New Zealand Web Archive. The NLNZ figures in figure 5 are not useful, given the confusion over accessing archived content via the National Library’s website. However, the other sources cited are not open to the same confusion. While the Internet Archive has some additional content, the others are exclusively sources for archived websites.

  • 88 respondents (30%) stated that they used the Internet Archive at some point, including 14 (5%) using it often
  • 66 respondents (23%) stated that they used the National Library of Australia’s Web Archive at some point
  • 60 respondents (21%) said they used the US Library of Congress Web Archive
  • And 44 (15%) said that they used the UK Web Archive at some point.

In total 128 respondents (44%) indicated that at some point they used one of the international web archives that we provided as options. This figure is very encouraging and gives an indication of the potential number of researchers who could use the New Zealand Web Archive.

Figure 6 further demonstrates this point by showing that, of the 161 respondents (56%) who have never used the NLNZ web collection, there are still a number of academics exclusively sourcing archived websites for their research outside of New Zealand.

  • 29 of these respondents had used the Internet Archive (10%)
  • 16 had used the US Library of Congress Web Archive (6%)
  • 11 had used the National Library of Australia (4%)
  • And 7 had used the UK Web Archive (2%)

Again, there can be no confusion over this data, as it relates to sites that almost exclusively provide archived websites. These results are encouraging: they show some level of demand for, and use of, archived websites as a resource for research.

Chart: How often do you use the following sources of archived websites as part of your research? NLNZ is used most, followed by Internet Archive, US Library of Congress Web Archive, National Library of Australia, and UK Web Archive.
Figure 5: Total usage by all respondents.
Chart: How often do you use the following sources of archived websites as part of your research? Filtered to those who have never used NLNZ. Small percentages have used the other archives, particularly the Internet Archive.
Figure 6: Filtered responses showing those who have never used the NLNZ website as a source of archived websites.

What is not clear is whether academics are going to places like the Internet Archive for content they could otherwise get from the NLNZ web collection. While there is some duplication of content between NLNZ and the Internet Archive, NLNZ selective harvesting tends to be deeper, and is therefore more likely to be a complete harvest of the website than the shallower, broader harvesting approach of the Internet Archive.

These results again point to a lack of awareness of the New Zealand Web Archive, or alternatively show that there are gaps in the New Zealand Web Archive that are preventing researchers from using it.

The results also raise the question of whether there is value in combining, through aggregation or some other model, the web collections of various jurisdictions. Further research could identify the information seeking behaviour of academics looking for web archives and whether the same search is being conducted across all available web archive services.

“Now I am aware of the NLNZ resource and the LOC and UK ones, I may use them, though the Internet Archive seems to go back further in time and be the most comprehensive.”

7.4 Searching and Accessing Archived Websites

This is the first piece of research investigating how researchers want to access the New Zealand Web Archive. Currently, websites selectively archived by the Library can be searched in the catalogue record by keyword, title, subject or name searches. The archived website is then viewed in a browser via a link in the catalogue record.

For this survey section the respondents were automatically separated based on their answers to question five (section 7.3) regarding whether they had previously used the Library website or catalogue to find archived websites, or not. This was in order to identify any differences between the needs of actual and potential users of the web collection. Those who had not used the web collection before were understandably more uncertain than those who had. However, on the whole the difference between the two user groups was not compelling; this is shown below in figures 7 and 8.

Chart: To what extent do you agree with the following statements on how to find content in archived websites? Data available in table below.
Figure 7: Respondents with experience using the New Zealand Web Archive.
Question 7 | Strongly agree | Agree | Disagree | Strongly disagree | Don't know | Total
I would prefer to find archived websites as part of my search for any item in the NLNZ collections (current approach) | 26% (11) | 43% (18) | 10% (4) | 0% (0) | 21% (9) | 42
I would prefer to be given archived websites as a data set that I could search using my own research tools (e.g. data mining) | 10% (4) | 44% (18) | 20% (8) | 2% (1) | 24% (10) | 41
I would prefer a full text search across all archived websites (e.g. like Papers Past) | 31% (13) | 62% (26) | 0% (0) | 0% (0) | 7% (3) | 42
I would prefer subject collections of archived websites to be available for my specific research or teaching area | 32% (13) | 46% (19) | 7% (3) | 0% (0) | 15% (6) | 41
I would prefer to go directly from a live website to archived versions of that site | 24% (10) | 40% (17) | 17% (7) | 0% (0) | 19% (8) | 42
For items collected by NLNZ, I would prefer to find them all together in a dedicated NZ archived website collection | 29% (12) | 40% (17) | 5% (2) | 0% (0) | 26% (11) | 42
I would prefer to use a URL search in the NLNZ catalogue to find archived websites | 12% (5) | 24% (10) | 21% (9) | 10% (4) | 33% (14) | 42
Chart: If you were to access content in archived websites at NLNZ, to what extent do you agree with the following statements? Data available in table below.
Figure 8: Respondents who have not used the New Zealand Web Archive before.
Question 7 | Strongly agree | Agree | Disagree | Strongly disagree | Don't know | Total
I would prefer to find archived websites as part of my search for any item in the NLNZ collections (current approach) | 21% (45) | 46% (99) | 8% (18) | 1% (3) | 23% (50) | 215
I would prefer to be given archived websites as a data set that I could search using my own research tools (e.g. data mining) | 12% (25) | 32% (69) | 16% (34) | 3% (6) | 38% (81) | 215
I would prefer a full text search across all archived websites (e.g. like Papers Past) | 29% (63) | 49% (108) | 2% (4) | 0% (1) | 20% (43) | 219
I would prefer subject collections of archived websites to be available for my specific research or teaching area | 20% (42) | 46% (99) | 10% (21) | 3% (6) | 21% (46) | 214
I would prefer to go directly from a live website to archived versions of that site | 22% (47) | 39% (83) | 9% (19) | 2% (5) | 28% (60) | 214
For items collected by NLNZ, I would prefer to find them all together in a dedicated NZ archived website collection | 18% (38) | 44% (95) | 9% (19) | 1% (3) | 29% (62) | 217
I would prefer to use a URL search in the NLNZ catalogue to find archived websites | 7% (16) | 23% (50) | 19% (40) | 2% (5) | 49% (105) | 216

Both user groups were strongly in favour of using a full text search across all archived websites.

  • 171 respondents who had not used the New Zealand Web Archive agreed or strongly agreed to a preference for a full text search.
  • 39 respondents who had experience using the New Zealand Web Archive agreed or strongly agreed to a preference for a full text search.
  • In total 72% of the survey respondents showed a preference for full text searching.
  • Overall, only 5 respondents disagreed with a full text search, making up 2% of the total survey population.
  • No one with experience using the New Zealand Web Archive disagreed with a full text search.
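The totals in the list above can be reproduced from the full text search rows of the tables for figures 7 and 8. The sketch below is illustrative only: the variable names are hypothetical, the counts are copied from the two tables, and the percentages use the base of 290 described at the start of this section.

```python
import math

TOTAL_RESPONDENTS = 290  # base used for percentages in the discussion text

# Full text search row: (strongly agree, agree, disagree, strongly disagree)
experienced = (13, 26, 0, 0)       # figure 7: have used the New Zealand Web Archive
not_experienced = (63, 108, 4, 1)  # figure 8: have not used it


def pct_of_total(count):
    """Share of all respondents, rounded to the nearest whole number."""
    return math.floor(100 * count / TOTAL_RESPONDENTS + 0.5)


in_favour = sum(experienced[:2]) + sum(not_experienced[:2])
against = sum(experienced[2:]) + sum(not_experienced[2:])

print(in_favour, pct_of_total(in_favour))  # agree or strongly agree, both groups
print(against, pct_of_total(against))      # disagree or strongly disagree, both groups
```

"Don't know" answers are left out of both totals, which is why the two shares do not add up to 100%.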

The respondents also indicated a desire for full text searching later on in the survey, regarding access to the domain harvests (see section 7.8). The preference for full text search access to web archives is also reflected internationally. The participants in the Netherlands study preferred full text searches and regarded searches restricted to URL alone as a limitation. (8) The Portuguese study held that the prevalence of Google has resulted in an expectation for the availability of full text searching on the internet. (9)

On the live web, search engines rank and display results according to specific algorithms and assumptions about the user’s needs; a full text search in a library catalogue, however, does not (and should not) make these assumptions. This difference can result in ‘conflicting expectations’ about the presentation of full text search results in a web archive. (10)

The findings show that there is demand for full text searching and that the current access the Library provides is limiting our researchers. This is an opportunity for the Library to modify the current catalogue record which does not reflect the wants of researchers who desire full text indexing of the website itself.

The least popular method of finding content in archived websites is by a URL search in the Library catalogue.

  • 66 respondents who had not used the New Zealand Web Archive agreed or strongly agreed to a preference for a URL search.
  • 15 respondents who had experience using the New Zealand Web Archive agreed or strongly agreed to a preference for a URL search.
  • In total 28% of the survey respondents showed a preference for a URL search.
  • URL searching also received the most disagree or strongly disagree responses of all of the given options.
  • Proportionally the URL search method was rejected more strongly by those who had experience searching the New Zealand Web Archive.

These findings show similarities to the Portuguese study, which also found full text searching to be preferred over URL, but with URL searching still considered a somewhat popular means of accessing their web archives. At 28% of respondents, there is still a moderate number of researchers who would prefer this method of access. This was also the response that had the largest number of respondents who did not know. Perhaps the findings show that researchers do not want to be limited by the search and access methods, or that some are simply used to the Internet Archive, which requires a URL search.

“The more ways of finding and using these resources the better.”

7.5 Using Archived Websites for Teaching

The Library’s analytics show that there are several tertiary courses using archived websites as a teaching resource and, as a result, the Library sought more information about how and why archived websites are being used in a learning environment. The Library also wanted to know what barriers are being faced by those wanting to use archived websites in their teaching, in order to assist the use of its web collection in teaching and learning in New Zealand.

Some issues with the data

As with section 7.3, there are also some inconsistencies in the data for this question due to confusion around using the NLNZ website to access archived websites. The Library therefore decided to filter out any data concerning the New Zealand Web Archive, and to focus this question entirely on the use of international web archives in teaching and learning in New Zealand.

  • 128 respondents indicated that at some point they used any one of the international web archives that we provided as options (National Library of Australia, U.K. Web Archive, U.S. Library of Congress Web Archive and the Internet Archive).
  • While the Internet Archive has some additional material the other options are exclusively sources of archived websites and therefore we are confident of the results that they show regarding the use of archived websites in teaching.

The use of international web archives in teaching in New Zealand

As previously mentioned in this report, the three most highly accessed items in the Library’s entire collections in 2013 were archived websites, and the Library is aware that archived websites are being used in teaching in New Zealand. Even when disregarding all responses for the New Zealand Web Archive, we discovered that:

  • 43 respondents have used archived websites or blogs from international sources as a resource in their teaching.
  • This means that 34% of the respondents who use international web archives are using them as a resource in their teaching.
  • In total these 43 respondents represent 15% of the total survey population.

The Library is pleased with this finding and is encouraged to see such a high level of engagement with web archives in a learning environment. The Library was also interested to find out how archived websites are used as a resource in teaching. The 43 respondents who had used international web archives in their teaching were asked two further questions regarding how and why they use this resource.

  • 31 respondents said that they use archived websites in their teaching by providing a link to archived websites.
  • 26 respondents said they use screenshots of archived websites in their presentations.
  • And 16 respondents said that they include archived websites in the reading lists that they provide to their students.

These findings show quantitative evidence of the use of archived websites in New Zealand classrooms. These 43 respondents were also asked why they used archived websites in their teaching.

  • 30 said that they use archived websites in their teaching because they want their students to use a wide range of resources.
  • 24 said it was because archived websites contain the only content they can find on a topic (as the content is no longer available in print or online).

Some of the comments from this question were particularly insightful:

“Important for students to find ways to access ‘invisible’ materials, to see what has come before, and to recognise that failing to store or archive materials used in research may be affected when such pages disappear (often). Also useful to recognise sites that change or delete information that may be unflattering.”

“To show how available information changes given political and economic contexts; to demonstrate the fluidity and malleability of online content, and to encourage students to consider how information placed online is bother ephemeral and long-lasting; to show how (for instance) a particular organisation might change its content to reflect popular or institutional attitudes towards things or in reaction to criticism; to show how information is sometimes ‘hidden’ or taken out of the public domain.”

“Sometimes an abandoned site offers a particularly dated - but still popular – interpretation of a specific data set. This helps students hone their analytical skills by letting them sift out the nonsense.”

The answers from these 43 respondents regarding how and why they use archived websites in their teaching are enlightening and certainly of interest to the Library. The respondents express a real value in the role archived websites play in their teaching and the learning of their students.

All of the respondents were asked what the barriers and frustrations were to using archived websites in their teaching. The Library has filtered these responses to show only the responses of the 128 respondents who had used international web archives.

  • Uncertainty regarding copyright and what to do with an archived website was the biggest barrier to using archived websites, agreed upon by 53 respondents.
  • The next biggest barrier or frustration, agreed upon by 52 respondents, was that archived websites relevant to their course are difficult or time consuming to find.
  • Two comments mentioned that university processes requiring copyright clearances were a barrier to their use.

Overall these results show that there are some significant barriers to the use of archived websites in teaching and there are some improvements that could be made in delivering the service.

When considering that 174 of the total survey respondents had no idea of the international initiatives concerning web archives, and 219 respondents did not know that the Library archives New Zealand websites, it is clear that the biggest barrier to the use of archived websites in teaching is awareness of their existence.

The use of international web archives as a resource in teaching in New Zealand indicates that the New Zealand Web Archive has the potential to be of as much or even more use in a New Zealand learning environment.

7.6 Content of the New Zealand Web Archive

It is currently not possible or practical to collect every website, blog or social media account relevant to New Zealand, so the Library prioritises certain subjects or themes. The purpose of this section was to seek feedback on the Library’s priorities and identify any gaps.

Chart: How important are the existing NZ Web Collection subject priorities to you in your research? Government ranks highly, as does History, Māori, Events, and Community Groups. Music and Sport and Leisure rank low.
Figure 9: Respondent ranking of the New Zealand Web Archive subject priorities.

All subject priorities received some acknowledgement of importance from respondents. Comments highlighted a potential gap in the collection to support research relating to commercial activity, e.g. online purchasing or commercial websites. The Library does collect some of this web content as part of the bi-annual domain harvests (see section 7.8); however, these findings suggest some demand outside of current collecting activity.

It is interesting that 191 respondents (66%) considered government archived websites either somewhat or very important to their research. Information online from government is created as a record, covered by the Public Records Act 2005, and is likely to surface in 20-25 years’ time at Archives New Zealand, via the public archive process, as a record rather than as a web archive. However, it is clear there is a research demand for such information now. Does this research suggest that the web is shortening expectations of availability of public information through archives? Further research and analysis is required to explore the extent to which web archives are shifting the relationship between research libraries and public archives through what is collected, when, by whom and for what purpose.

“The 2008 national led government removed a number of significant websites in their first year. These websites held research and other valuable documents that don’t exist elsewhere. Even the TEAC site was archived by the labour led government. As policy changes it is important for the government to present themselves as they want. But key decisions will be lost to the mists of time if these websites are not preserved somewhere.”

7.7 The Importance of Social Media

An increasing amount of social, political, and cultural interactions are occurring on public social media channels. The Library is currently investigating methods for collecting social media and was seeking an indication of the research value of archived social media.

Chart: Would archives of social media be a useful resource for your current or future research? Yes, 114; No, 47; Don't know, 105.
Figure 10: Respondent acknowledgment of the importance of social media archives.

114 respondents believed that archives of social media would be useful to their current or future research. This figure is much higher than anticipated, and provides confidence that the Library can proceed with its social media archive plans with knowledge of a known demand from the academic community.

This section overwhelmingly had the most comments of the entire survey (77) which reveals the interest in this topic by researchers. The general trend of the comments was positive and encouraging or else reflected uncertainty about the process. There were few comments suggesting that social media content is not useful for research. Representative examples include:

“Social media is increasingly a research focus and to be able to access material that is no longer available, especially on controversial topics (such as Treaty issues, racism, sexism, etc.) would be very valuable.”

“I am interested in how social media are used by Māori, and the impact of this on tradition tikanga. I am also interested in how social media is used to comment on news.”

“From my perspective it would certainly be very useful to archive the social media activities of public figures, such as politicians, ministries, political parties, lobbyists, NGOs, etc. as this forms an increasingly important part of their strategy of public communications and dissemination of information.”

“There is some potential for value here, but also great potential for a load of rubbish to be collected, and searching through that would be fraught with problems of (a) finding stuff, and (b) knowing whether it had any credibility.”

International research on archiving social media has been limited. The BnF research stated that material shared on social media “leaves the domain of publication and becomes more that of conversation,” and that it is ‘improper’ to archive much of the information held on social media, as often it is the trace of actions that could be easily performed in a street or shop. (11)

It is possible that researcher awareness of the value of social media archives has changed in only three years. It is also possible that the New Zealand context, where evidence of social media interaction had significant media coverage in a recent general election, is different to the French context of 2012. Differences in contexts and methodologies make it difficult to compare researcher demand across the two studies; however the data in New Zealand reflects an encouraging demand for archives of social media as a research resource.

Demand for different types of social media

Respondents who answered that an archive of social media would be useful to their current or future research, and those who did not know, were asked an additional question on the specific types of social media that they would find valuable to be archived.

  • Video channels were clearly considered the most important medium of social media to be archived, with 166 respondents agreeing or strongly agreeing.
  • The second most important medium of social media to archive was discussion forums, with 124 respondents agreeing or strongly agreeing.
Chart: How much do you agree or disagree with the following statements? 'Archived, publicly available (blank) will be important to my current or future research'. Data available in table below.
Figure 11: Respondent preference for archives of specific forms of social media.
Question 18 | Strongly agree | Agree | Disagree | Strongly disagree | Total
Personal identity channels (e.g. Facebook, Myspace, Linkedin) | 9% (19) | 40% (84) | 41% (86) | 11% (23) | 212
Microblogging channels (e.g. Twitter) | 8% (16) | 40% (84) | 42% (88) | 10% (22) | 210
Video channels (e.g. YouTube, Vimeo) | 28% (59) | 50% (107) | 18% (38) | 4% (9) | 213
Photo sharing channels (e.g. Instagram, Flickr) | 11% (23) | 34% (70) | 47% (97) | 9% (18) | 208
Audio channels (e.g. Bandcamp, Soundcloud) | 8% (16) | 32% (66) | 48% (98) | 12% (25) | 205
Discussion forums (e.g. 4chan, Reddit) | 9% (18) | 51% (106) | 34% (70) | 6% (13) | 207
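The ranking discussed in this section follows from adding the strongly agree and agree counts for each channel in the table for figure 11. The sketch below is illustrative only: the dictionary and variable names are hypothetical, and the counts are copied from the table.

```python
# (strongly agree, agree) counts per channel, copied from the figure 11 table
agreement = {
    "Personal identity channels": (19, 84),
    "Microblogging channels": (16, 84),
    "Video channels": (59, 107),
    "Photo sharing channels": (23, 70),
    "Audio channels": (16, 66),
    "Discussion forums": (18, 106),
}

# Rank channels by the number of respondents who agreed or strongly agreed.
ranked = sorted(
    ((sum(counts), name) for name, counts in agreement.items()),
    reverse=True,
)

for total, name in ranked:
    print(f"{total:>3}  {name}")
```

Ranking on raw counts rather than percentages ignores the small differences in the number of respondents who answered each row (205 to 213).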

The Library was slightly surprised that video channels such as YouTube and Vimeo clearly outranked all other mediums in terms of importance to researchers. Collecting online video content comes with significant resource and legal challenges that can prohibit collecting; however, this demand by researchers shows that the Library is justified in its attempts to work through these challenges.

Another interesting finding from this section was how highly the research community valued discussion forums. Discussion forums were ranked second after video channels, with 124 respondents agreeing or strongly agreeing that they will be important to their current or future research. Currently the Library is collecting discussion from sites such as theatre reviews, specialist sport and recreation communities (e.g. climbing magazines) and gamer forums. These communities are primarily online-only and are therefore difficult to document in any other way.

Aside from documenting communities, discussion forums offer the opportunity to document contemporary social phenomena, such as trolling or shaming, so that researchers can further understand the impact of the Internet on New Zealand culture. From a technical perspective, discussion forums are difficult to harvest due to the dynamic nature of the content. It is very encouraging to receive quantitative evidence that discussion forums are considered to be of such potential value to researchers. This is an area the Library will explore in more detail.

Responses regarding the value of microblogging and personal identity channels were almost evenly divided between those for and those against. The Library is unsure why half the respondents were opposed to the idea of archiving microblogs and personal identity channels. Possible reasons discussed in the research team included: little precedent for academic use of these channels; the tools required for the computational methods needed to research these sources are not yet easily accessible; or, perhaps, contemporary uncertainty about the relationship between access to perceived personal correspondence and privacy concerns, even for publicly available information, is being reflected in these answers.

Archiving social media is a new activity, underpinned by only a couple of known research projects. In their 2012 project, the BnF question the value of archiving social media, citing difficulty distinguishing between the public, or ‘published’ domain, and the private personas and conversations of individuals, with the latter having questionable value for archival purposes due to it resembling more of a personal conversation. (12) However, the 2010 Jisc paper promotes the value of social media archives, mentioning the U.S. Library of Congress Twitter archive and an add-on which can create a personal Facebook archive. (13)

Collecting social media is a new area, with only a limited number of known programmes in place. While the Library previously considered that archiving the social media of today was for the benefit of future researchers, there is some evidence from our respondents to suggest that the demand is now.

7.8 Domain harvests

NLNZ regularly collects copies of the New Zealand internet from the .nz domain as part of the Whole of Domain web harvest programme. The Library has sets collected in 2008, 2010 and 2013, and the 2015 set is being collected in January 2015. Each set contains at least 150 million pages, and the whole collection comprises over 30TB of data, approximately 1 billion files, and over 600 million URL captures. The Library wants to provide access to these data sets online and sought information in this section that would assist this process. Generally we were pleased that so many people answered the domain harvest question, considering how technical it is.

Chart: How would you prefer to access the domain harvest collections? Data available in table below.
Figure 12: Respondents' preferred access to the domain harvest collections.
Question 19 | Strongly agree | Agree | Disagree | Strongly disagree | Don't know | Total
I would prefer access through an online full text search | 31% (78) | 41% (105) | 2% (5) | 0% (1) | 26% (66) | 255
I would prefer to access the full datasets | 11% (25) | 30% (69) | 15% (34) | 1% (3) | 42% (96) | 227

The popularity of full text searching was demonstrated here as well.

  • 183 respondents (63%) indicated that they would prefer access to the domain harvests by a full text search.
  • Only 6 respondents disagreed or strongly disagreed with this method.
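
As noted at the start of this section, percentages in the discussion are calculated from all 290 survey respondents, while the percentages in the tables and charts are calculated from the number who answered each question. The sketch below illustrates the difference for the full text search result. The counts come from the Question 19 table; the helper name `percentage` is an illustrative assumption rather than anything provided by the survey tool.

```python
TOTAL_RESPONDENTS = 290   # all survey respondents (base used in the discussion)
ANSWERED_FULL_TEXT = 255  # respondents who answered the full text search row


def percentage(count, base):
    """Return count as a percentage of base, rounded to the nearest whole number."""
    return round(100 * count / base)


# Strongly agree (78) plus agree (105) for full text search.
in_favour = 78 + 105

print(in_favour, "respondents in favour")
print(percentage(in_favour, TOTAL_RESPONDENTS), "% of all respondents")
print(percentage(in_favour, ANSWERED_FULL_TEXT), "% of those who answered")
```

The same 183 respondents represent 63% of all respondents but a larger share of those who actually answered the question, which is why figures in the text and in the table can differ.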

Full text search capability has become an established norm in research and something that academics have come to expect. This feedback expresses the view that when the Library is able to make this information available, it needs to be in a database that people can actually use.

  • 96 respondents stated that they did not know whether they would prefer to access the domain harvest collections as full datasets.

This number is unsurprising given that computational analysis is new in the humanities and social science disciplines, and this question was targeted at individuals (such as those in digital humanities) who would understand the potential of access to the full datasets.

Overall, what this section showed was that the New Zealand domain harvests hold an unrealised potential for New Zealand researchers and the Library needs to put in some work to make the domain harvests accessible.

7.9 The research value of archived websites

Generally the Library archives items for long term future use. However, the results of this section show that researchers want immediate access to archived websites, blogs, social media, domain harvests and other archived web content, and that they see this content as a valuable and important resource both now and in the future.

  • 211 respondents (73%) believe it is important for New Zealand websites and blogs to be archived.
  • 6 respondents (2%) disagree.
  • 42 respondents (14%) do not know.
Chart: Is it important for New Zealand websites and blogs to be archived? Yes, 211; No, 6; Don't know, 42.
Figure 13: Acknowledging the importance of archiving New Zealand websites and blogs.

“Websites reveal the social currents and preoccupations of their day.”

Throughout the survey researchers from some disciplines (such as ancient history) noted that their subject area could not benefit from the use and availability of archived websites. The data from this question shows that the respondents value archives of New Zealand websites, regardless of whether they are personally useful for their research or not.

Our research shows that New Zealand researchers have a present need for the Library’s collection of archived websites.

  • 147 respondents (51%) indicated that our collection of archived websites is important for their current research (within the next five years).

This number contrasts with our expectations and with the BnF research, which found that while their researchers did not have an immediate need for web archives themselves, they thought it would be of use to their students. (14)

Chart: Is the NLNZ collection of archived websites important for current, medium, or long term future research in your discipline? Half consider important for long term research, slightly more for medium term or current research.
Figure 14: The importance of archived websites to research over time.
Chart: Could the NLNZ domain harvests be important for current, medium, or long term future research in your field? Half consider important for long term research, slightly more for medium term or current research, though less than in the previous question.
Figure 15: The importance of domain harvests to research over time.

The final comments and feedback that we received at the end of the survey were largely very positive. Most respondents were grateful to have been made aware of this service and many said that they will think about using the Library’s Web Archive in the future:

“I did not know this resource existed, but I am excited to delve into it.”

“I think it’s important to get the word out to researchers that these services exist. I wasn’t aware of them prior to this survey.”


Citations

  1. Dougherty et al. Researcher Engagement with Web Archives: State of the Art, p. 14.
  2. See Australia (pandora.nla.gov.au); the UK (webarchive.org.uk/ukwa); and the USA (lcweb2.loc.gov/diglib/lcwa/html/lcwa-home.html).
  3. Ras and van Bussel. Web Archiving User Survey, pp. 14, 18.
  4. Costa and Silva. "Understanding the Information Needs of Web Archive Users", pp. 9-10.
  5. Andrew Jackson. "Building a 'Historical Search Engine' is no easy thing", UK Web Archive Blog, 19 February 2015.
  6. Stirling and Chevallier. "Web Archives for Researchers: Representations, Expectations and Potential Uses", Section 3.
  7. Stirling and Chevallier. Section 3.
  8. Dougherty et al. Researcher Engagement with Web Archives: State of the Art, pp. 14, 20.
  9. Stirling and Chevallier. Section 2.