
Enhancing metadata for researchers: insights from our user survey


Emma Devine | Average reading time 4 minutes

12 Nov 2024

We're working to improve metadata and make it easier for researchers to discover and identify useful datasets. In this blog post, Emma Devine, User Engagement Officer, shares insights from our recent metadata survey.

At Research Data Scotland (RDS), we strive to make evidence-based decisions to shape and develop our services with the needs of users in mind. To achieve this, we engage with users through various activities such as surveys, workshops and webinars.

In August, RDS, along with colleagues at EPCC at the University of Edinburgh, ran a user survey to understand what researchers need from metadata when applying to access secure data. Responses from 28 researchers from across the UK provided useful insights about metadata catalogues and have helped RDS establish a baseline of knowledge around different data discovery journeys for UK researchers.

So, what is metadata? 

Metadata is data about data.

There are two main forms of metadata: findable metadata and structural metadata.

  • Findable metadata refers to properties such as the publisher of a dataset, the date of publication, and the licence applied to the dataset (which enables researchers to establish copyright and specify how and in what circumstances their research data can be reused by others).
  • Structural metadata refers to the metadata which describes a dataset and is sometimes called a Data Dictionary. For example, it may include a list of variable names, the datatype of a variable (which specifies the different sizes and values that can be stored in a variable), and some basic statistical information about each numeric variable.
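To make the distinction concrete, here is a minimal sketch of how a single catalogue record might hold both forms of metadata. The class names, field names and values are all hypothetical and chosen for illustration; they do not reflect the RDS catalogue schema or any particular metadata standard.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class VariableEntry:
    """One row of a data dictionary (structural metadata)."""
    name: str
    datatype: str                      # e.g. "integer", "string", "date"
    minimum: Optional[float] = None    # basic statistics, numeric variables only
    maximum: Optional[float] = None
    mean: Optional[float] = None


@dataclass
class DatasetRecord:
    """A catalogue record combining both forms of metadata."""
    # Findable metadata
    title: str
    publisher: str
    publication_date: str              # ISO 8601 date string
    licence: str
    # Structural metadata (the data dictionary)
    variables: list[VariableEntry] = field(default_factory=list)


# Entirely made-up example values, for illustration only.
record = DatasetRecord(
    title="Example hospital admissions extract",
    publisher="Example Health Board",
    publication_date="2024-01-15",
    licence="Example licence",
    variables=[
        VariableEntry("patient_age", "integer", minimum=0, maximum=104, mean=52.3),
        VariableEntry("admission_date", "date"),
    ],
)

# Variables that carry basic statistics, i.e. the numeric ones in this sketch.
numeric_variables = [v.name for v in record.variables if v.mean is not None]
```

A researcher browsing the catalogue would see everything in this record without ever touching the underlying data, which is why its completeness matters so much when access is restricted.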

When data is held securely and access is limited, such as in Trusted Research Environments (TREs), high-quality and informative metadata is crucial, as no data analysis can be carried out until secure access has been applied for and approved.

What were we trying to achieve?

Through this survey, we wanted to find out from researchers what makes metadata useful when deciding if a dataset is suitable for their research or for linkage with another dataset, especially when access to the data itself is restricted.

This will inform two programmes of work: 

Connect 4 project

Connect 4 is a first step towards connecting data in TREs across the four UK nations, starting with the Scottish National Safe Haven and the ONS Integrated Data Service (IDS). Metadata is a key focus of the project, as researchers will need to assess whether data held in one TRE is suitable for linkage with data held in another TRE, and whether both datasets are suitable for the research question. For example, if the research question relates to Scottish residents, researchers may wish to ensure that both datasets provide enough coverage of Scotland. Find out more about the Connect 4 project.
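As a rough illustration of the coverage check described above, the sketch below compares the share of records from one nation across two datasets, using only counts that a catalogue could publish as metadata. The function names, the counts and the threshold are all assumptions made for this example; what counts as "enough" coverage depends on the research question, and this is not how Connect 4 itself assesses suitability.

```python
def region_share(region_counts, region):
    """Fraction of a dataset's records that fall in the given region."""
    total = sum(region_counts.values())
    if total == 0:
        return 0.0
    return region_counts.get(region, 0) / total


def both_cover(region, min_share, *datasets):
    """True only if every dataset meets the minimum share for the region."""
    return all(region_share(counts, region) >= min_share for counts in datasets)


# Made-up record counts by nation, as might be published in catalogue metadata.
dataset_a = {"Scotland": 4000, "England": 1000}
dataset_b = {"Scotland": 500, "England": 9000, "Wales": 500}

# The 0.25 threshold is arbitrary and used only for illustration.
suitable = both_cover("Scotland", 0.25, dataset_a, dataset_b)
```

Here the second dataset has only a small share of Scottish records, so the pair would be flagged as a poor fit for a Scotland-focused question before any access application is made.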

Platform changes to the RDS metadata catalogue

RDS has migrated our existing metadata catalogue to the MetadataWorks platform to facilitate federation with key partner organisations. Responses to the survey help us consider what changes to make as we further develop the catalogue to enhance metadata quality and discoverability. Learn more about recent improvements to the metadata catalogue.

What did researchers have to say? 

A total of 28 responses were received from researchers across the UK (16 from England, 11 from Scotland and 1 from Wales), indicating preferences around how they currently interact with data dictionaries.

Most researchers responded that when deciding what datasets to use for a project, it is important to know:

  • which variables are available
  • the amount of missing data in any given variable
  • the range of data in any given variable
  • summary statistics
  • geographic spread of data
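The items in this list can all be derived from the data by whoever holds it and then published as metadata. The following is a minimal sketch of that idea using only the Python standard library. The rows, field names and helper functions are invented for illustration, and a real data holder would also need to consider disclosure control before publishing such figures.

```python
from statistics import mean, median

# Made-up rows standing in for a dataset; None marks a missing value.
rows = [
    {"age": 34, "region": "Scotland", "income": 21000},
    {"age": 51, "region": "England", "income": None},
    {"age": None, "region": "Scotland", "income": 30000},
    {"age": 47, "region": "Wales", "income": 27000},
]


def profile_variable(rows, name):
    """Summarise one variable: missingness, range and basic statistics.

    Assumes rows is non-empty.
    """
    values = [row.get(name) for row in rows]
    present = [v for v in values if v is not None]
    summary = {
        "variable": name,
        "missing_fraction": (len(values) - len(present)) / len(values),
    }
    # Range and summary statistics only make sense for numeric variables.
    if present and all(isinstance(v, (int, float)) for v in present):
        summary.update(
            minimum=min(present),
            maximum=max(present),
            mean=mean(present),
            median=median(present),
        )
    return summary


def geographic_spread(rows, region_field):
    """Count records per region, ignoring rows with no region recorded."""
    counts = {}
    for row in rows:
        region = row.get(region_field)
        if region is not None:
            counts[region] = counts.get(region, 0) + 1
    return counts


available_variables = sorted(rows[0].keys())
age_profile = profile_variable(rows, "age")
spread = geographic_spread(rows, "region")
```

Publishing a profile like `age_profile` alongside each variable would let a researcher judge missingness and range without needing access to the rows themselves.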

Respondents generally preferred detailed and comprehensive catalogues, but a few reported frustration with the lack of consistency in cataloguing and would prefer better descriptions and guides, along with a consistent or standard approach.

Many respondents reported that the metadata catalogues they currently use are insufficient, and some pointed to missing or inaccurate metadata and unclear indications of where linkages are possible.

The responses highlighted potential negative impacts that poor metadata can have on a research project:

  • time costs for researchers, primarily through back-and-forth with TREs to find the answers to questions left open by the metadata
  • researchers may submit vaguer proposals for their projects, as uncertainty about the data means there is uncertainty around what can be achieved
  • researchers might ask for more data or variables than they need, which runs counter to data minimisation principles (which ensure that organisations collect only the data they need, for specific, legitimate purposes, and retain it for no longer than necessary)

Respondents also highlighted the importance of avoiding static and dormant records. This is a good reminder for data catalogue providers to work with those responsible for managing the data (Data Asset Owners) to ensure that existing records are kept up to date whilst the range of datasets listed continues to expand.

Overall, the responses to this survey validate assumptions about useful functionalities for the Connect 4 project to consider as it progresses. These researcher views will also guide RDS to consider how we might connect with other providers to widen our offering and provide researchers with a more comprehensive description of available datasets from a wide range of sources.

We would like to thank all the researchers who responded to this survey. We intend to get further feedback on the metadata we produce and will be planning more user engagement around metadata in the future. If you would be interested in helping RDS shape our services through activities like this, please sign up to our Engagement Contact List.
