Unlocking the potential of synthetic data
Reflecting on how synthetic data can plug the data gap.
Dr Lynne Adair
19 Dec 2022
Research Data Scotland
19 Oct 2023
Research Data Scotland (RDS) has created a new fund for non-commercial organisations in Scotland to explore the use of synthetic data.
[Update 24/11/2023: Applications for the Synthetic Data Fund are now closed]
RDS is pioneering an approach to using synthetic data in Scotland, recognising the untapped potential of this tool to allow researchers to test approaches while waiting for the necessary permissions to use actual data.
Synthetic data is ‘a new copy of a data set that is generated at random but made to follow the structure and some of the patterns of the original data set’ (Accelerating public policy research with synthetic data: ADR Scotland). The data can be designed to replicate the statistical properties of a real-world dataset without including identifiable information.
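As a rough illustration of that definition, the sketch below builds a synthetic copy of a tiny table by drawing each column at random from the values seen in the original. It is a deliberately simple approach chosen for illustration, not the method used by RDS or by any particular tool. The function name `synthesise`, the column names and the records are all made up for the example. Sampling columns independently keeps the structure and each column's overall distribution, but it does not preserve relationships between columns, and it does nothing to assess disclosure risk.

```python
import random


def synthesise(rows, n, seed=0):
    """Return n synthetic rows, sampling each column independently
    from the values observed in `rows` (a list of dicts).

    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    columns = list(rows[0].keys())
    # Pool of observed values per column. Sampling from a pool keeps that
    # column's overall distribution but not its links to other columns.
    pools = {c: [r[c] for r in rows] for c in columns}
    return [{c: rng.choice(pools[c]) for c in columns} for _ in range(n)]


# Made-up illustrative records, not real data.
original = [
    {"age_band": "16-34", "region": "Lothian", "visits": 1},
    {"age_band": "35-64", "region": "Fife", "visits": 3},
    {"age_band": "35-64", "region": "Lothian", "visits": 2},
    {"age_band": "65+", "region": "Highland", "visits": 5},
]

synthetic = synthesise(original, n=5)
for row in synthetic:
    print(row)
```

Every synthetic row has the same columns as the original and uses only values that occur in it, which is enough for tasks such as testing code before access to the real data is approved. More realistic synthesis methods also model how the columns relate to one another.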
Approximately £100,000 is available for non-commercial organisations to help support work that fits within the remit of the RDS synthetic data strategy and particularly within the following areas:
The RDS synthetic data strategy aims to move forward the production and use of synthetic data in Scotland to improve and speed up research access to public sector datasets.
RDS has conducted user and public engagement work on synthetic data, which can be used to open conversations with data controllers and Information Asset Owners about the benefits and use cases of synthetic data. RDS has also previously funded projects that examined synthetic data tools and produced advice for data controllers on assessing the disclosure risk of synthetic data.
Dr Lynne Adair, Data Curation Manager and RDS lead on synthetic data, said:
She added: “Pilot synthetic data synthesis projects are also of interest, to develop usable synthetic datasets for different use cases, such as researcher training, assessing data feasibility, such as to augment the metadata, and code development, and to test the feasibility and acceptability of these with researchers and data owners.”
Another area of interest might be the open-source synthetic data generation tool ‘synthpop’. RDS will be taking over the maintenance of synthpop in 2024, and proposals looking at how synthpop can be made more accessible to the wider research community would also be welcomed. Proposals to improve the functionality of synthpop, or to develop training for its users, could also be funded.
Dr Adair added: “Funding is not restricted to these areas, and we are interested in all ideas as to how synthetic data can be used to improve research access to public sector data.”
Approximately £100,000 is available for synthetic data funding. Proposed projects must be for new synthetic data work that would not otherwise happen. RDS will not subsidise work that is already being carried out or that already has a funding source.
Projects should aim to start early in 2024 and run for up to 12 months.