3-16 August

CONGRUENCE ENGINE/HERITAGE CONNECTOR MEETING

The Congruence Engine and Heritage Connector meeting was extremely helpful for understanding the relationship between the two projects. I was impressed by the reflections from John, Jane, Kalyan and Jamie, particularly around legacy and value (when we link collections, what value is sought, and for whom? How can we design valuable interfaces?), input/output data (what input data can be used beyond the catalogue data itself?) and training/participation (how can we map the skills needed in terms of technical, curatorial and historical knowledge? How much technical and how much human expertise is needed?). Interestingly, we noticed that Congruence Engine has already started to address some of these reflections: we are exploring the connectivity of the data itself (and not only the metadata contained in the catalogue descriptions), and through the concept of the Social Machine, which we recognise as the key legacy of the project, we are exploring how to harness human capability and highly specialised technical knowledge, and trying to understand the motivations for human-machine interaction. Reflecting on the two projects was extremely clarifying and prompted a series of key conversations which might shape our future work. I look forward to future discussions on this.

ORAL HISTORY INVESTIGATION

In the last Investigation Meeting on 7th August, we had a very productive discussion on the Oral History investigation, reflecting on the extent to which the work done so far can be understood as a proof of concept. We agreed that there is still a piece of work to be done to link the Oral Histories with museum objects: this would require including collection data in the vectorised space. Because the agreement between the UoL and the SM has not been signed yet, the easiest and quickest way to do this is to collect the former Snibston Museum collection data from Leicestershire County Museums, and I will explore this opportunity with Colin over the coming weeks. We also reflected on the fact that the visual exploratory tool has been developed to a proof-of-concept stage, and that more work would be needed to bring it to a prototyping stage (by fine-tuning the model and further co-designing the interface). We reflected on the importance of involving the Congruence Engine historians and museum partners at this stage, in order to get their insights on potential historical and museum applications and to wrap up this phase of work. The investigation team is drafting a paper, and reflecting on the historiographical and museological implications will be a substantial part of our work. We then agreed to organise an extended in-person session in early November in Leicester, at the Institute for Digital Culture, with some of the CE Co-Is, researchers and curators who were involved in the textile pilot, to help finish off this aspect of the work.

Stef, Alex, Arran, Daniel and I then had a follow-up meeting last week to start planning the day, and I have drafted a summary of the workshop, which I will share with Arran and Tim by the end of this week.

FOLK SONGS INVESTIGATION

During this week’s drop-in, I discussed with Kunika and Arran a series of questions that I have for the next stage of our work:

Is it possible to train the Machine Learning Model to recognise more than one non-traditional NER category? (In developing our annotation schema, Daniel and I defined 14 categories, some of which are not traditional NER entities.)
How much data would we need to train the model?

Kunika suggested that the second question in particular is key at this stage and, interestingly, that it has emerged from the investigation itself: to train a generic model we know how much data we need, but for a different corpus (highly specialised and dialect-based) we don’t have this information, and we need to answer the question through the investigation itself. She suggested that we run a small active-training experiment with the data we have, to understand how much Label Studio can learn from a small corpus of annotated data. She said we can do this with the Colab version of Label Studio: we would only need a set of non-annotated folk songs, and we could use Daniel’s mining corpus for this.
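As a rough illustration of how the "how much data do we need?" question could be explored, the sketch below builds a simple learning curve: it trains on progressively larger slices of annotated sentences and scores each model on a held-out set. Everything here is an assumption made for illustration: the tagger is a deliberately naive most-frequent-label baseline (not Label Studio, and not the model the team will use), the function names are hypothetical, and the example annotations and the category names "TOOL" and "ROLE" are invented placeholders rather than project data.

```python
from collections import Counter, defaultdict


def train_baseline(sentences):
    """Learn the most frequent label seen for each token.

    A naive stand-in for a real NER model, used only to show the shape
    of the experiment.
    """
    counts = defaultdict(Counter)
    for sentence in sentences:
        for token, label in sentence:
            counts[token.lower()][label] += 1
    return {token: c.most_common(1)[0][0] for token, c in counts.items()}


def entity_recall(model, sentences):
    """Share of annotated (non-"O") tokens that the model labels correctly."""
    total = correct = 0
    for sentence in sentences:
        for token, label in sentence:
            if label == "O":
                continue
            total += 1
            if model.get(token.lower(), "O") == label:
                correct += 1
    return correct / total if total else 0.0


def learning_curve(train, held_out, sizes):
    """Train on the first n sentences for each n in sizes; score on held_out."""
    return [(n, entity_recall(train_baseline(train[:n]), held_out)) for n in sizes]


# Invented placeholder annotations, NOT taken from the project corpora.
# "TOOL" and "ROLE" are hypothetical category names.
train = [
    [("the", "O"), ("loom", "TOOL"), ("is", "O"), ("still", "O")],
    [("a", "O"), ("weaver", "ROLE"), ("sang", "O")],
    [("the", "O"), ("shuttle", "TOOL"), ("flew", "O")],
    [("the", "O"), ("spinner", "ROLE"), ("and", "O"), ("the", "O"), ("loom", "TOOL")],
]
held_out = [
    [("the", "O"), ("weaver", "ROLE"), ("left", "O"), ("the", "O"), ("loom", "TOOL")],
    [("a", "O"), ("spinner", "ROLE"), ("with", "O"), ("a", "O"), ("shuttle", "TOOL")],
]

curve = learning_curve(train, held_out, sizes=[1, 2, 3, 4])
for n, score in curve:
    print(f"{n} annotated sentences -> entity recall {score:.2f}")
```

With a real model and corpus, the point at which the curve flattens would give a first estimate of how many annotated songs are worth producing before returns diminish.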

We also discussed the fact that the textile and mining corpora have different features: the textile folk songs collection is a smaller sample which was created in close connection with Jennifer Reid’s knowledge and expertise on Lancashire dialect and textile industrial heritage, whereas Daniel’s mining corpus is more comprehensive in terms of geographical area (which was the aim of the Lloyd anthology itself). Consequently, the list of terms extracted from the mining songs is longer but less densely annotated (the opposite is true for the textile corpus). Arran pointed out that differences across datasets can enrich the investigation’s perspectives and allow us to look at different aspects: the textile corpus can help us understand the value of working with the Lancashire dialect, and Jen’s comments can show a trajectory for future applications to other similar songs.

I also discussed with Kunika the opportunity to have an in-person drop-in in September, and I provisionally booked Dana Study for Monday 18th September from 10:00 to 14:00 for an in-person session on OpenRefine. Kunika will share this with the team.

Alongside the drop-in, I also had a very interesting chat with Alex Fizpatrick, who was particularly impressed by the potential of the folk songs investigation to be a starting point for future digital participation practices. We noticed that the topic (folk songs) and the technique we are exploring (annotation) are both very well suited to extending the work we are doing with Jennifer, and that the human participation element of the investigation is already quite strong. I hope we can continue this conversation in the coming months and bring Alex F. further into this investigation. It was also interesting to hear about her background and her first thoughts on the project: I feel very close to her approach and I hope we will have the opportunity to work together in the coming months. Before Congruence Engine, I developed digital participatory practices in the Dolomites, and I would love to think about how we can bring more of this element into our investigations.

OTHER TEXTILE INVESTIGATIONS

I had a very interesting chat with Alex Appleton, in which we shared our work. I found his taxonomy of the wool industry extremely helpful, and I started imagining potential applications in connection with both the oral history and the folk songs work. I think we could potentially run a small reconciliation experiment in OpenRefine, reconciling the terms we are extracting from the folk songs with his taxonomy. Alex, for his part, was interested in having a closer look at the list of terms we are extracting from the textile folk songs, as this could potentially enrich the taxonomy with folk-generated terms. I sent him the preliminary list and we agreed that I will keep him updated on the future development of the investigation.
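To make the idea of reconciling song terms against a taxonomy a little more concrete, here is a minimal sketch of the matching step using approximate string matching from Python's standard library. It is not OpenRefine's reconciliation service and makes no claim about how OpenRefine scores candidates; the `reconcile` function, the similarity cutoff of 0.8, and all the terms and taxonomy labels are invented placeholders, not entries from the folk songs list or from Alex's taxonomy.

```python
from difflib import get_close_matches


def reconcile(terms, taxonomy, cutoff=0.8):
    """Map each extracted term to its closest taxonomy label.

    Returns None for a term when no label is similar enough, which flags
    it as a candidate new (folk-generated) term for manual review.
    """
    by_lower = {label.lower(): label for label in taxonomy}
    result = {}
    for term in terms:
        candidates = get_close_matches(
            term.lower(), list(by_lower), n=1, cutoff=cutoff
        )
        result[term] = by_lower[candidates[0]] if candidates else None
    return result


# Invented placeholder data, for illustration only.
taxonomy = ["Weaver", "Spinning frame", "Shuttle"]
terms = ["weyver", "shuttle", "clogs"]  # "weyver" is a made-up spelling variant

matches = reconcile(terms, taxonomy)
for term, label in matches.items():
    print(f"{term!r} -> {label!r}")
```

Terms that come back unmatched are arguably the most interesting output here, since they are the ones that might enrich the taxonomy rather than simply map onto it.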

I also had a series of conversations with Alex B., Helen and Tim about a potential new textile investigation led by Will, who is interested in working on the Gregs family, on slave trade funding of the textile sector, and on the linen collection from the Northern Ireland Museums. We noticed that such a topic could offer a practical approach for the decolonisation working group, and I scheduled a meeting to discuss this opportunity with Will and Nina next week.