trove-web-archives

CURRENT VERSION: v1.0.0

This repository includes information on finding, understanding, and using Pandora's collections of archived web pages.

Pandora has been selecting websites and online resources for preservation since 1996. It has assembled a collection of more than 80,000 titles, organised into subjects and collections. The archived websites are now part of the Australian Web Archive (AWA), which combines the selected titles with broader domain harvests and is searchable through Trove. However, Pandora's curated collections offer a useful entry point for researchers trying to find websites relating to particular topics or events.

The Web Archives section of the GLAM Workbench provides documentation, tools, and examples to help you work with data from a range of web archives, including the Australian Web Archive. The title URLs obtained through Pandora can be used to retrieve additional data from the AWA for analysis.
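
As a rough illustration of how a title URL might be turned into a request for archived captures, here is a minimal sketch. It assumes the archive exposes Wayback-style replay addresses of the form `<base>/<timestamp>/<original url>`. The `ARCHIVE_BASE` value is a placeholder and the `build_capture_url` helper is hypothetical; neither is taken from this repository or from the AWA documentation, so check the Web Archives section of the GLAM Workbench for the real endpoints.

```python
# Placeholder only: replace with the archive's real replay endpoint.
ARCHIVE_BASE = "https://example.org/archive"


def build_capture_url(title_url: str, timestamp: str = "", base: str = ARCHIVE_BASE) -> str:
    """Build a Wayback-style address: <base>/<timestamp>/<original url>.

    `timestamp` is an optional run of up to 14 digits (YYYYMMDDhhmmss).
    Partial values such as "2010" are allowed; how an archive resolves
    them depends on the archive.
    """
    if timestamp and not (timestamp.isdigit() and len(timestamp) <= 14):
        raise ValueError("timestamp must be up to 14 digits (YYYYMMDDhhmmss)")
    parts = [base.rstrip("/")]
    if timestamp:
        parts.append(timestamp)
    parts.append(title_url)
    return "/".join(parts)


# Example with a made-up title URL.
print(build_capture_url("http://example.com/", "2010"))
```

Keeping the base address as a parameter means the same helper can be pointed at any archive that follows this URL convention.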

For more information and documentation see the Trove web archive collections (Pandora) section of the GLAM Workbench.

Notebooks

Associated datasets


Created by Tim Sherratt for the GLAM Workbench