Alexander Turnbull Library: Harvested Twitter data relating to Te Matatini

Date
15 January - 27 February 2019
By
Alexander Turnbull Library
Reference
ATL-Group-00590
Description

Twitter crawl conducted by staff at the Alexander Turnbull Library relating to the Te Matatini kapa haka festival held in Te Whanganui-a-Tara (Wellington) from 21 - 24 February 2019.

This dataset contains 5,581 tweets related to the Te Matatini festival. They were collected between 15 January and 27 February 2019 from the public Twitter API using Twarc.

The Te Matatini Society promotes Māori performing arts, including a biennial National Kapa Haka Festival. The 2019 festival was won by Ngā Tumanako of Tāmaki Makaurau, with Te Pikikotuku o Ngāti Rongomai and Te Kapa Haka o Te Whānau a Apanui placing second and third, respectively.

Quantity: 1 data set(s). 668 digital image(s). 5 Electronic document(s).

Processing information: The data from the crawls was combined and deduplicated. Images in the dataset were harvested separately by the Library using a Python script.

The Library unshortened shortened URLs in the Tweets, and conducted a WARC crawl to capture HTML pages referred to in the Tweets.
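
The combining and deduplication step mentioned in the processing information can be illustrated with a minimal sketch. This is not the Library's script: it assumes the crawls are stored as line-oriented JSON with one tweet object per line, and that each object carries an `id_str` field as in Twitter API v1.1 tweet objects. The function name and the sample records are invented for illustration.

```python
import json


def dedupe_tweets(lines):
    """Yield each tweet once, keyed on its ID, from an iterable of JSON lines."""
    seen = set()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines between records
        tweet = json.loads(line)
        tweet_id = tweet["id_str"]  # assumed field name (v1.1 tweet object)
        if tweet_id not in seen:
            seen.add(tweet_id)
            yield tweet


# Two overlapping crawls, held in memory for the example (invented records).
crawl_a = ['{"id_str": "1", "text": "a"}', '{"id_str": "2", "text": "b"}']
crawl_b = ['{"id_str": "2", "text": "b"}', '{"id_str": "3", "text": "c"}']

unique = list(dedupe_tweets(crawl_a + crawl_b))
print(len(unique))
```

Keying on the tweet ID rather than on the whole JSON line means a tweet captured in two crawls is still treated as one record even if volatile fields such as retweet counts differ between captures.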

Additional description

JSON dataset requires computational methods for access (e.g. Python and a script editor). Access copies for some material are available as text and CSV files.

Access restrictions
Partly restricted material - Some material only available in the Katherine Mansfield Reading Room.
Format
1 data set(s), 668 digital image(s), 5 Electronic document(s), Data sets, Social media
There are 4 items in total.

Copyright

Unknown
There are 4 items in this group.
Online Other

ReadMe text file for the Twitter harvest relating to Te Matatini

Date: February 2019

From: Alexander Turnbull Library: Harvested Twitter data relating to Te Matatini

Reference: WADL-0036

Description: ReadMe text file documenting the Library's search criteria and tools used to generate the Library's Twitter harvest relating to the 2019 Te Matatini kapa haka festival. Quantity: 1 Electronic document(s).

Online Other

Tweet ID text file for the Twitter harvest relating to Te Matatini

Date: February 2019

From: Alexander Turnbull Library: Harvested Twitter data relating to Te Matatini

Reference: WADL-0037

Description: Text file containing a list of all Tweet IDs from the Library's Twitter crawl relating to Te Matatini in February 2019. This dataset can be reconstituted (rehydrated) from the Twitter Developer API using the tweet IDs. Tools for doing so include Twarc, Social Feed Manager, and DocNow, all available on GitHub. Because tweets may be deleted or made private, hydrating from the tweet IDs may not produce the same dataset. Quantity: 1 Electronic document(s).
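
Because rehydration may return fewer tweets than the ID list contains, it can be useful to check which IDs did not come back. The following is a minimal sketch of that check, not a feature of any of the tools named above. It assumes the ID file holds one tweet ID per line and that rehydrated tweet objects carry an `id_str` field; the function names and sample values are hypothetical.

```python
def read_ids(lines):
    """Return the tweet IDs from an iterable of lines, ignoring blank lines."""
    return [line.strip() for line in lines if line.strip()]


def missing_ids(original_ids, hydrated_tweets):
    """Return the IDs from the original list that rehydration did not return."""
    returned = {tweet["id_str"] for tweet in hydrated_tweets}  # assumed field name
    return [tweet_id for tweet_id in original_ids if tweet_id not in returned]


# Invented example: three archived IDs, of which only two could be rehydrated.
ids = read_ids(["101\n", "102\n", "\n", "103\n"])
hydrated = [{"id_str": "101"}, {"id_str": "103"}]

missing = missing_ids(ids, hydrated)
print(missing)
```

In practice `read_ids` would be given an open file handle for the ID text file, and `hydrated` would be the tweet objects returned by whichever hydration tool is used.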

Other

Tweet objects JSON file and Twitter crawl hashed and unhashed text file for the Twitter...

Date: February 2019

From: Alexander Turnbull Library: Harvested Twitter data relating to Te Matatini

Reference: WADL-0039

Description: Tweet objects JavaScript Object Notation (JSON) file for the Library's Twitter harvest relating to the 2019 Te Matatini kapa haka festival. JSON file generated by the Twitter API containing attribute data for tweets, including author, message, unique ID, timestamp, and geolocation data (if shared by the user), plus text file access copies of the data generated by the Library, hashed and unhashed. Quantity: 3 Electronic document(s).
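
As an illustration of how the attributes listed above might be read from the JSON file, the sketch below flattens tweet objects into CSV rows. It is not the method the Library used to produce its access copies. It assumes line-oriented JSON and the Twitter API v1.1 field names `id_str`, `created_at`, `user.screen_name`, `full_text` (or `text`), and `coordinates`; the column names, function names, and sample record are invented.

```python
import csv
import io
import json

# Output columns; these names are chosen for the example only.
FIELDS = ["id", "created_at", "author", "text", "coordinates"]


def tweet_to_row(tweet):
    """Pick out author, message, ID, timestamp, and geolocation from one tweet."""
    coords = tweet.get("coordinates") or {}  # null when the user shared no location
    return {
        "id": tweet.get("id_str", ""),
        "created_at": tweet.get("created_at", ""),
        "author": (tweet.get("user") or {}).get("screen_name", ""),
        "text": tweet.get("full_text") or tweet.get("text", ""),
        "coordinates": json.dumps(coords.get("coordinates")) if coords else "",
    }


def write_csv(json_lines, out):
    """Write one CSV row per tweet to the file-like object and return the row count."""
    writer = csv.DictWriter(out, fieldnames=FIELDS)
    writer.writeheader()
    count = 0
    for line in json_lines:
        if line.strip():
            writer.writerow(tweet_to_row(json.loads(line)))
            count += 1
    return count


# Invented sample record standing in for one line of the JSON file.
sample = [
    '{"id_str": "1", "created_at": "Thu Feb 21 00:00:00 +0000 2019", '
    '"user": {"screen_name": "example_user"}, "full_text": "Kia ora", '
    '"coordinates": null}'
]
buffer = io.StringIO()
rows_written = write_csv(sample, buffer)
print(buffer.getvalue())
```

To work on the real file, the list `sample` would be replaced by an open file handle and `buffer` by a file opened for writing with `newline=""`, as the `csv` module documentation recommends.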

Other

Image files for the Twitter harvest relating to Te Matatini

Date: February 2019

From: Alexander Turnbull Library: Harvested Twitter data relating to Te Matatini

Reference: WADL-0038

Description: Individual image and media files from harvested Tweets from the Library's Twitter harvest relating to the 2019 Te Matatini kapa haka festival. Quantity: 668 digital image(s).