tl;dr

Exploring and Connecting Collections Through Multimodal Linking

[Warning: this is a public draft and always in flux. Some sentences may not be appropriate for (syntactically and semantically) sensitive viewers. Other reminders are written for a very small audience of one, i.e. yours truly.]

Introduction

This investigation looks closely at the affordances of multimodal models for exploring and linking museum collections. We collaborate with the communications team and scrutinise a curated dataset of communications objects, selected from the Science Museum Group and National Museum Scotland. In total this collection comprises around 5132 objects (images and descriptions): not immensely large, but large enough for exploration and experimentation. Time and resources permitting, we aim to scale up the technique we developed to the complete collection. Nothing we do here is specific to communications objects, and the methods we propose work at the level of a national collection.

We focus on using multimodal embeddings for search as well as record linking. In both scenarios we ask a rather basic question: what do we gain? Put differently, what are the gains (and costs) of using multimodal models? What do we find that otherwise remains hidden in the database? What connections can we forge?
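
To make the two scenarios a little more concrete, here is a minimal sketch of how search and record linking can both be framed as nearest-neighbour lookups over embeddings. It is an illustration under stated assumptions, not the pipeline used in this project: the record identifiers, the three-dimensional toy vectors, the function names (`cosine`, `search`, `link_records`) and the 0.9 threshold are all hypothetical, and in practice the vectors would come from a multimodal model applied to the images and descriptions.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat an all-zero vector as similar to nothing
    return dot / (norm_a * norm_b)

def search(query_vec, index, k=3):
    """Return the k records in `index` most similar to query_vec, best first."""
    scored = [(cosine(query_vec, vec), record_id) for record_id, vec in index.items()]
    scored.sort(reverse=True)
    return [(record_id, score) for score, record_id in scored[:k]]

def link_records(source, target, threshold=0.9):
    """For each source record, propose its best target match if it clears the threshold."""
    links = []
    for source_id, vec in source.items():
        matches = search(vec, target, k=1)
        if matches and matches[0][1] >= threshold:
            links.append((source_id, matches[0][0], matches[0][1]))
    return links

# Hypothetical, hand-made vectors standing in for real image/text embeddings.
collection_a = {"a:telephone": [0.9, 0.1, 0.0], "a:radio": [0.1, 0.9, 0.1]}
collection_b = {"b:handset": [0.8, 0.2, 0.0], "b:telegraph key": [0.0, 0.1, 0.9]}

# Search: a query vector (e.g. an embedded text prompt) against one collection.
print(search([1.0, 0.0, 0.0], collection_a, k=1))

# Linking: candidate connections between records in two collections.
print(link_records(collection_a, collection_b))
```

The threshold is the main design choice in this sketch: set it low and more candidate links surface for a curator to review, set it high and fewer but safer links remain. That trade-off is one way of framing the gains and costs asked about above.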

Overall Aims (tl;dr version)

What is it that we want to achieve?

Technical Objectives (tl;dr version)