Summary information

Study title

Annual Survey of Hours and Earnings, 2020: Synthetic Data Pilot

Creator

Office for National Statistics

Study number / PID

9045 (UKDA)

10.5255/UKDA-SN-9045-1 (DOI)

Data access

Restricted

Series

Not available

Abstract

Abstract copyright UK Data Service and data collection copyright owner.

The Annual Survey of Hours and Earnings, 2020: Synthetic Data Pilot is a synthetic version of the Annual Survey of Hours and Earnings (ASHE) study available via Trusted Research Environments (TREs).

ASHE is one of the most extensive surveys of the earnings of individuals in the UK. Data on the wages, paid hours of work, and pension arrangements of nearly one per cent of the working population are collected. Other variables relating to age, occupation and industrial classification are also available. The ASHE sample is drawn from National Insurance records for working individuals, and the survey forms are sent to their respective employers to complete.

ASHE is available for research projects demonstrating public good to accredited or approved researchers via TREs such as the Office for National Statistics Secure Research Service (SRS) or the UK Data Service Secure Lab (at SN 6689). To access collections stored within TREs, researchers need to undergo an accreditation process. Gaining access to data in a secure environment can be time- and resource-intensive. This pilot has created a low-fidelity, low-disclosure-risk synthetic version of ASHE data, which can be made available to researchers more quickly while they wait for access to the real data.

The synthetic data were created using the Synthpop package in R. The sample method was used; this takes a simple random sample with replacement from the real values. The project was carried out between 19 December 2022 and 3 January 2023. Further information is available within the documentation. User feedback received through this pilot will help the ONS to maximise the benefits of data access and further explore the feasibility of synthesising more data in future.

Main Topics:

The ASHE synthetic data contain the same variables as ASHE for each individual, relating to wages, hours of work, pension arrangements, and...
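The sampling approach described in the abstract can be illustrated with a minimal Python sketch. This is not the ONS code, which used the Synthpop package in R. It assumes that each variable is drawn independently, with replacement, from its observed values, which would be consistent with the low-fidelity description but is an assumption here. The function name `synthesise_by_sampling`, the variable names and the toy records are all hypothetical and are not taken from ASHE.

```python
import random


def synthesise_by_sampling(records, n_rows, seed=None):
    """Build synthetic rows by sampling each variable with replacement.

    Assumption: every variable is sampled independently of the others,
    so relationships between variables are not preserved.
    """
    if not records:
        return []
    rng = random.Random(seed)
    variables = list(records[0].keys())
    # Gather the observed ("real") values for each variable.
    observed = {v: [row[v] for row in records] for v in variables}
    # Simple random sample with replacement, one variable at a time.
    columns = {v: rng.choices(observed[v], k=n_rows) for v in variables}
    # Reassemble the independently sampled columns into rows.
    return [{v: columns[v][i] for v in variables} for i in range(n_rows)]


# Toy, made-up records for illustration only (not real survey values).
real = [
    {"age": 23, "hourly_pay": 10.50, "occupation": "A"},
    {"age": 41, "hourly_pay": 18.75, "occupation": "B"},
    {"age": 58, "hourly_pay": 14.20, "occupation": "C"},
]

synthetic = synthesise_by_sampling(real, n_rows=5, seed=0)
```

Under this assumption, every synthetic value is a value that occurs in the real data, but a synthetic row can combine values from different real individuals. Marginal distributions are approximately retained while joint relationships are broken, so data of this kind suit code development and testing more than substantive analysis.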

Methodology

Data collection period

19/12/2022 - 03/01/2023

Country

United Kingdom

Time dimension

Cross-sectional (one-time) study

Analysis unit

Institutions/organisations
Individuals
National

Universe

These are synthetic data based on the same population as the main ASHE study (ASHE 2020), i.e. working individuals aged 16 years or over residing and working in the UK.

Sampling procedure

Simple random sample

Kind of data

Numeric

Data collection mode

Compilation/Synthesis

Access

Publisher

UK Data Service

Publication year

2023

Terms of data access

The Data Collection is available to UK Data Service registered users subject to the End User Licence Agreement.

Registered users must have or gain DEA Accredited Researcher Status.

Use of the data requires approval from the data owner or their nominee.

Related publications

Not available