
Synthetic data

Read about our work on synthetic data.

About synthetic data

Synthetic data is artificial data that contains no information about real people, but follows some of the same patterns as real-world data. Each piece of information in a synthetic dataset is usually designed to be plausible, but is created at random based on the structure of original, real data.

Learn more about synthetic data, how it’s used, and why it’s useful in our public-friendly data explainer.

 

Watch this video with British Sign Language (BSL) interpretation

Our work and impact

Research Data Scotland (RDS) is working with partners and others across Scotland to develop a coordinated strategy for the production and use of synthetic data in Scotland.

Our work to date includes setting up a working group, running user workshops, and establishing the RDS synthetic data fund. We are also working with other organisations across the UK synthetic data landscape to help coordinate work around public engagement.

Synthetic data workshops

Want to know more about working with synthetic data? Join one of our free workshops!

We have hosted a number of synthetic data workshops to help participants learn more about using the synthpop package in R. Workshop attendees learn to create synthetic versions of confidential individual-level data and discuss some of the implications of using low-fidelity versus high-fidelity synthetic data.
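To make the low-fidelity idea concrete, here is a minimal sketch in Python. It is not the synthpop method and not the workshop material: the function name, the toy records, and the column names are all hypothetical, invented for illustration. It assumes that "low-fidelity" means keeping each column's observed values while drawing every field independently, so relationships between columns are deliberately not preserved.

```python
import random


def synthesise_low_fidelity(records, n_rows, seed=0):
    """Build synthetic rows by sampling each column independently.

    Hypothetical helper for illustration only: every field is drawn at random
    from the values observed in that column of `records`.
    """
    rng = random.Random(seed)
    columns = list(records[0].keys())
    # Observed values per column, i.e. the empirical marginal distribution.
    observed = {col: [row[col] for row in records] for col in columns}
    # Independent draws per field: column structure is kept,
    # cross-column relationships are not.
    return [
        {col: rng.choice(observed[col]) for col in columns}
        for _ in range(n_rows)
    ]


# Made-up "original" records; these do not describe real people.
original = [
    {"age_band": "16-24", "region": "Highland", "employed": True},
    {"age_band": "25-44", "region": "Lothian", "employed": True},
    {"age_band": "45-64", "region": "Fife", "employed": False},
    {"age_band": "65+", "region": "Lothian", "employed": False},
]

synthetic = synthesise_low_fidelity(original, n_rows=5)
for row in synthetic:
    print(row)
```

Each synthetic row looks plausible on its own, but a combination of values may, by chance, match an original record. That is one reason disclosure risk still needs to be assessed for synthetic data. A higher-fidelity approach would also model how columns relate to one another.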

Sign up to our engagement contact list to register your interest in joining future workshops.

Synthetic data fund: Autumn 2023

In Autumn 2023, we launched a new fund for non-commercial organisations in Scotland to explore the use of synthetic data.

Approximately £100,000 was awarded to non-commercial organisations to help support work that fits within the remit of the RDS synthetic data strategy and particularly within the following areas:

  • Disclosure risk and information governance (IG)
  • Synthesis of data
  • Access, promotion and engagement

Find out more about the synthetic data fund

Synthetic data strategy

The RDS synthetic data strategy sets out the work that RDS will lead and fund.

The strategy has been developed collaboratively with Scotland’s four Regional Safe Havens, Public Health Scotland (PHS), the Office for National Statistics (ONS), Health Data Research UK (HDR UK), NHS National Services Scotland (NSS) and others.

As part of the strategy, RDS will:

  • Survey researchers and other users on their synthetic data requirements
  • Consult data controllers and the public to gauge their understanding of, and concerns about, synthetic data
  • Explore options with data controllers for synthetic data generation projects
  • Bring together IG expertise from different organisations
  • Map existing synthetic datasets to investigate whether those already developed can be made more widely available

To oversee and operationalise plans for synthetic data, RDS has established a Synthetic Data Working Group. The group will identify similarities and differences in synthetic data needs, governance, and access for different organisations.

Future plans

In future, RDS hopes to work with data controllers to produce synthetic datasets for training, data discovery and code development on an ongoing basis.
