When Knowledge Becomes Data

A conversation about heritage data management

Person using Arches platform at a desktop computer

By Jeffrey Levin, Cole Calhoun

Nov 02, 2022

Digital data management plays a key role in the conservation field.

Data produced during conservation fieldwork must be thoughtfully managed so that it is sustained, shared, and accessible well into the future. Digitally managing cultural heritage data presents many challenges, including defining the data, managing expectations, funding, keeping pace with advances in technology, and more.

These topics and more were discussed in a roundtable of conservation professionals led by Catherine Patterson, scientist at the Getty Conservation Institute, and Jeffrey Levin, editor of Conservation Perspectives, The GCI Newsletter. This is an excerpt of the full interview in that publication, which can be read online.

Mahmoud Abdelrazek, trained in geology, computational archaeology, and database management, is a senior research data consultant working in advanced research computing at University College London (UCL).

Joe Padfield is a principal scientist at the National Gallery in London, where he is responsible for several of the National Gallery’s collaborative national and international external research projects.

Mario Santana-Quintero is a professor at Carleton University in Ottawa, Canada, and a Carleton Immersive Media Studio Lab faculty member, whose research focuses on digital workflows for the conservation of historic places and an ethical framework for the digital recording of historic places. He also serves as secretary general of the International Council on Monuments and Sites.

CP: What role does digital data play in the heritage field—and more specifically, what do you see as the relationship between digital data and heritage preservation itself?

MA: I’d classify the role data plays into a couple of areas. Let’s say we have a monument we want to conserve. Today we have the knowledge to do 3D capture of a site’s physical characteristics, which can enable people to experience it without having to visit it. The other sort of thing is what I’d call modeling—not as in a 3D representation but rather capturing the essence of a site and distilling it into certain attributes. Eventually this data funnels into a study. In an ideal world there would be one unified format for the 3D modeling of a site and one unified system for collecting information about a particular site. This is how I see the digital realm helping conservation. Once you collect information about a site and recognize its importance, perhaps this will support efforts to conserve its physical life. Making knowledge about a site accessible to people may lead to action that preserves the site. We can use digital systems to do this in an efficient way.

JL: There are many inputs into cultural heritage data about a particular place—its current condition, its materiality, its historical context—and you want to capture that information in a holistic way. Is one of the challenges with heritage data that there’s a tendency to look at one part of the elephant, so to speak, as opposed to seeing the entire elephant?

JP: The problem with heritage data in general is one of imagination and managing expectations. Whether it’s a broken pot or the Mona Lisa, these works are unique, and any examination of them is done at a certain point in time. Even something on a gallery wall is slowly degrading over time. As for recording an archaeological excavation, once you’ve dug it up, you can’t really do it again. You use the most effective technique available to capture all the information you can, but the reality is that you only have so many resources, so much time, and so many specialists who know how to use the equipment. There’s this expectation that digital is easy—you push a button, and it happens—but digital data requires as much work and specialist knowledge as anything else. Heritage calls out for the best available approach, but practitioners, perhaps sitting in a small studio, are thinking, “How can I possibly compete with something that’s been done on the pyramids or at the Louvre?”—and then they just do what they’ve done before. Managing expectations in the digital age is tricky. We see what can be done, and people want to engage in best practice, but then reality kicks in.

MSQ: We’re always seeking funding to get the best equipment we can. But what we’re capable of doing is one thing and what the conservators are capable of doing with the data we provide is another. For example, when we did a project for the GCI, we produced about four terabytes of data. When I brought it to the Institute, I was told that four terabytes is the entire server size. We collected huge amounts of information, but how are we going to store it—and if we continue, how are we going to preserve more? Another thing is the incredible advance in technology. There’s a new device I’m interested in purchasing, but it’s not simply expensive—in addition, the data processing is done in the cloud, which you pay for yearly. There’s this big move to cloud processing, and I don’t know how conservation institutions will deal with continuously paying for data processing.

CP: You’ve all brought up challenges in data management—expertise, the perceptions of data collectors, how to define and unify data, managing expectations, the vast advances in technology, and the costs associated with maintaining data. Are there other challenges in this area?

JP: I’d add the variables in communication. Heritage science in general covers a huge range of domains. When you’re translating anything complicated from one domain to another, there are communication problems. The difficulty with digital data is the expectation that it should be translated to and for everybody, and that it be immediate and automatic. If you’re looking at using new equipment, you need to work out how to interpret the data for the specialists on the topic, and then interpret it for anybody else and the public. How you transfer understanding of the data is a key issue.

MSQ: Another challenge is responsibility for the data. When it comes to heritage, digital data gives you some kind of power and credibility—and that credibility is very important for many organizations. How can we use those digital assets? Should I be entitled to use the data I collect to promote myself? Is the data I collect ultimately benefiting the local community that lives near that site?

JL: Are there certain things that are foundational with respect to what constitutes best practice in the creation of cultural heritage data, how the data ends up in the repositories, and how it gets managed in repositories?

MA: When you start on a project, you’ve got to have a management plan. Included in the plan should be consideration for FAIR data—Findable, Accessible, Interoperable, and Reusable. The mistake people make is assuming that data only needs to be findable, accessible, interoperable, and reusable by humans. It should be findable and so forth by a machine too, meaning the information should be structured in a consistent, logical form that a machine can read. One way digital data is lost is through the loss of the physical media it is stored on. Even if you have the actual storage, you might not have the software that can read the data. This is the case with a lot of banking systems—they try to sustain old database systems because they can’t risk moving to a new system with information possibly being lost in the migration. In heritage, we have the opportunity to define certain schema. Some institutions have their own data repository and enforce a certain metadata schema to go with it. The best known is DataCite, an organization that issues digital object identifiers and maintains its own schema. Obviously, the schema is not specific to a particular media type, but that is the trade-off between describing an object with all its attributes and keeping things general. If you want something standardized, you must be somewhere in between. A good solution is to start with a data management plan, include those attributes in it, and plan for the data to go into some sort of repository at the end of the project, along with metadata in a format that can be found and used by a machine.
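Abdelrazek’s point about machine-readable metadata can be made concrete with a small sketch. The snippet below builds a minimal metadata record modeled on the required DataCite properties (identifier, creators, titles, publisher, publication year, resource type) and writes it out as JSON so it could travel with a deposited dataset. The DOI, field values, and file name are hypothetical placeholders for illustration, not details from the interview.

    import json

    # A minimal, machine-readable metadata record sketched after the required
    # DataCite properties. All values below are hypothetical examples.
    record = {
        "identifier": {"identifier": "10.1234/example-survey-2022",
                       "identifierType": "DOI"},
        "creators": [{"name": "Example Heritage Survey Team"}],
        "titles": [{"title": "3D survey data for a historic site (example)"}],
        "publisher": "Example Institutional Repository",
        "publicationYear": "2022",
        "types": {"resourceTypeGeneral": "Dataset",
                  "resourceType": "3D survey"},
    }

    # Serializing the record as JSON keeps it findable and readable by
    # machines, independent of the software used to produce the survey data.
    with open("dataset_metadata.json", "w", encoding="utf-8") as f:
        json.dump(record, f, indent=2)

Because the record uses a shared, documented vocabulary rather than an ad hoc one, a repository or harvester can index it automatically, which is the machine-readability the FAIR principles call for.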
