AI for Cultural Heritage Hub (ArCH)

a black and white photo of a geometric object

Hanna Barakat & Cambridge Diversity Fund from Better Images of AI

Cambridge’s GLAM institutions (galleries, libraries, archives, garden and museums) house millions of objects from across the globe, representing an unparalleled repository of cultural and natural history. However, challenges such as analogue formats, handwritten records, fragmented objects, multilingual sources and complex surfaces make much of this data difficult to access.

To address these challenges, the AI for Cultural Heritage Hub (ArCH) will draw on the convening power of Cambridge’s distributed network of collections to create a secure workspace and a Community of Practice that empower non-technical users (practitioners and academics) to analyse cultural heritage data with AI tools.

By encouraging collaboration between curators, researchers, IT professionals, and AI experts, the new hub will prototype adaptive AI solutions to enhance understanding of collections and identify a selection of AI tools to address these challenges.

This project is funded by ai@cam and the Accelerate Programme for Scientific Discovery, made possible by a donation from Schmidt Sciences.

It is led by the Cambridge University Library Research Institute, in collaboration with the Department of Applied Mathematics and Theoretical Physics and the Collections, Connections, Communities Strategic Research Initiative at the University of Cambridge.

Addressing cultural heritage challenges

ArCH’s six case studies will test the ability of AI methodologies to address three cultural heritage challenges. 

Solving these challenges will serve researchers and wider society by benefitting cultural heritage practitioners, expert users and all those engaging with cultural heritage.

Challenge 1: Unlocking inaccessible data

Three of the case studies will address the challenge of unlocking inaccessible data by applying AI transcription and computer vision (CV) tools to digitised documents. 

Case Study 1: AI tools will be used to convert analogue Cambridge University Library catalogue cards into online records. This has the potential to make thousands of rare books and maps discoverable, a task that would otherwise take years.

Case Studies 2 and 3: Historical handwritten biodiversity records from the University Museum of Zoology registers and specimen labels from the University Herbarium will be turned into machine-readable datasets. As well as deepening our understanding of these collections, this has enormous potential for biological research and the nature-human interface.

Left: Handwritten register from the Museum of Zoology (UMZC 1867-1902 register). Right: Specimen from Cambridge University Herbarium (CGE00081874).

Catalogue records © Alice the Camera / Cambridge University Library

Challenge 2: Reconstructing fragmentary or dispersed cultural objects

Two further case studies will investigate how AI can assist with the reconstruction of fragmentary or dispersed cultural objects, to transform our understanding of them and their context.

Case Study 4: This case study will test the ability of AI tools to reconstruct the position of unplaced papyrus fragments from the Book of the Dead of Ramose, an ancient document held at the Fitzwilliam Museum, by analysing fibre patterns.

Case Study 5: This case study investigates the potential of machine learning (ML) and computer vision tools to fill in missing text and analyse Mesoamerican symbols found in a sixteenth-century Nahuatl-Latin lectionary held in the Bible Society Collection at Cambridge University Library.

Microscopic study of the white material covering text in the Nahuatl-Latin lectionary (CUL BFBS Ms 375, f. 156v). Photographed by Flavia Fiorillo, CUL Centre for Cultural Heritage, January 2025.

Some of the unplaced vignette fragments from the Papyrus of Ramose (P. Cambridge E.2.1922). Photographs by Joel Sams © Fitzwilliam Museum Cambridge, 2024.

Challenge 3: Integrating expert cultural knowledge into AI algorithms

Case Study 6 will investigate the use of LVM tools trained on small, bespoke datasets of specific types of cultural heritage artefacts, integrating expert, practitioner and community knowledge.

A detailed digital collage, inspired by the aesthetic of medieval manuscript illustrations, depicts a vibrant scene of construction workers building a structure. Traditional elements, such as figures in Persian gowns, wooden ladders, ropes, and woven sacks, are interwoven with modern technological motifs like circuit boards, QR codes, and cloud icons. Bright, glowing, golden networks of interconnected nodes and symbols of digital technology overlay the scene, blending the past and present. On the left, Arabic calligraphy is imprinted onto the paper. The composition uses contrasting textures and colours to juxtapose and interweave the themes and eras.

Shady Sharify from Better Images of AI

Engage with ArCH

Read the ai@cam blog post introducing the project and its aims with Project Lead Amelie Roper.

Explore how and why the ArCH workspace was developed in this blog post by Lead Software Developer Jennie Fletcher.

Sign up to the ArCH mailing list to keep up to date with the project’s progress and for opportunities to engage with its work.

Follow the project’s progress through related blog posts.

Watch the ArCH project team's presentation at the RLUK Digital Shift Forum (November 2025).

Project team

Dr Amelie Roper

Jennie Fletcher

Tuan Pham

Dr Suzanne Paul

Professor Sam Brockington

Dr Helen Strudwick

Wallace Peaslee

Huw Jones

Mathew Lowe

Dr Maya Indira Ganesh

Dr Irene Galandra

Dr Joshua Fitzgerald

Dr Anna Breger

Amparo Gimeno-Sanjuan