beta.tpp schema🔗
Available on backends: TPP
This defines all the data (both primary care and externally linked) available in the OpenSAFELY-TPP backend. For more information about this backend, see the SystmOne Primary Care section.
from ehrql.tables.beta.tpp import (
addresses,
apcs_cost,
appointments,
clinical_events,
ec_cost,
emergency_care_attendances,
hospital_admissions,
household_memberships_2020,
isaric_raw,
medications,
occupation_on_covid_vaccine_record,
ons_deaths,
opa_cost,
opa_diag,
opa_proc,
open_prompt,
patients,
practice_registrations,
sgss_covid_all_tests,
vaccinations,
)
many rows per patient
addresses🔗
-
address_id
🔗
integer
-
start_date
🔗
date
-
end_date
🔗
date
-
address_type
🔗
integer
-
rural_urban_classification
🔗
integer
-
imd_rounded
🔗
integer
-
- Always >= 0, <= 32800, and a multiple of 100
-
msoa_code
🔗
string
-
- Matches regular expression:
E020[0-9]{5}
- Matches regular expression:
-
has_postcode
🔗
boolean
-
care_home_is_potential_match
🔗
boolean
-
care_home_requires_nursing
🔗
boolean
-
care_home_does_not_require_nursing
🔗
boolean
-
for_patient_on(date)
🔗
-
Return each patient's registered address as it was on the supplied date.
Where there are multiple registered addresses we prefer any which have a known postcode (though we never have access to this postcode) as this is used by TPP to cross-reference other data associated with the address, such as the MSOA or index of multiple deprevation.
Where there are multiple of these we prefer the most recently registered address and then, if there are multiple of these, the one with the longest duration. If there's stil an exact tie we choose arbitrarily based on the address ID.
View method definition
spanning_addrs = addresses.where(addresses.start_date <= date).except_where( addresses.end_date < date ) ordered_addrs = spanning_addrs.sort_by( case(when(addresses.has_postcode).then(1), default=0), addresses.start_date, addresses.end_date, addresses.address_id, ) return ordered_addrs.last_for_patient()
many rows per patient
apcs_cost🔗
many rows per patient
appointments🔗
Appointments in primary care.
You can find out more about this table in the short data report. To view it, you will need a login for OpenSAFELY Jobs and the Project Collaborator or Project Developer role for the project. The workspace shows when the code that comprises the report was run; the code itself is in the appointments-short-data-report repository on GitHub.
Tip
Querying this table is similar to using Cohort Extractor's
patients.with_gp_consultations
function. However, that function filters by
the status of the appointment. To achieve a similar result with this table:
appointments.where(
appointments.status.is_in([
"Arrived",
"In Progress",
"Finished",
"Visit",
"Waiting",
"Patient Walked Out",
])
)
-
booked_date
🔗
date
-
The date the appointment was booked
-
start_date
🔗
date
-
The date the appointment was due to start
-
status
🔗
string
-
The status of the appointment
- Possible values:
Booked
,Arrived
,Did Not Attend
,In Progress
,Finished
,Requested
,Blocked
,Visit
,Waiting
,Cancelled by Patient
,Cancelled by Unit
,Cancelled by Other Service
,No Access Visit
,Cancelled Due To Death
,Patient Walked Out
- Possible values:
many rows per patient
clinical_events🔗
many rows per patient
ec_cost🔗
many rows per patient
emergency_care_attendances🔗
-
id
🔗
integer
-
arrival_date
🔗
date
-
discharge_destination
🔗
SNOMED-CT code
-
diagnosis_01
🔗
SNOMED-CT code
-
diagnosis_02
🔗
SNOMED-CT code
-
diagnosis_03
🔗
SNOMED-CT code
-
diagnosis_04
🔗
SNOMED-CT code
-
diagnosis_05
🔗
SNOMED-CT code
-
diagnosis_06
🔗
SNOMED-CT code
-
diagnosis_07
🔗
SNOMED-CT code
-
diagnosis_08
🔗
SNOMED-CT code
-
diagnosis_09
🔗
SNOMED-CT code
-
diagnosis_10
🔗
SNOMED-CT code
-
diagnosis_11
🔗
SNOMED-CT code
-
diagnosis_12
🔗
SNOMED-CT code
-
diagnosis_13
🔗
SNOMED-CT code
-
diagnosis_14
🔗
SNOMED-CT code
-
diagnosis_15
🔗
SNOMED-CT code
-
diagnosis_16
🔗
SNOMED-CT code
-
diagnosis_17
🔗
SNOMED-CT code
-
diagnosis_18
🔗
SNOMED-CT code
-
diagnosis_19
🔗
SNOMED-CT code
-
diagnosis_20
🔗
SNOMED-CT code
-
diagnosis_21
🔗
SNOMED-CT code
-
diagnosis_22
🔗
SNOMED-CT code
-
diagnosis_23
🔗
SNOMED-CT code
-
diagnosis_24
🔗
SNOMED-CT code
many rows per patient
hospital_admissions🔗
one row per patient
household_memberships_2020🔗
Inferred household membership as of 2020-02-01, as determined by TPP using an as yet undocumented algorithm.
many rows per patient
isaric_raw🔗
A subset of the ISARIC data.
These columns are deliberately all taken as strings while in a preliminary phase. They will later change to more appropriate data types.
Descriptions taken from: CCP_REDCap_ISARIC_data_dictionary_codebook.pdf
-
age
🔗
string
-
Age
-
age_factor
🔗
string
-
TODO
-
calc_age
🔗
string
-
Calculated age (comparing date of birth with date of enrolment). May be inaccurate if a date of February 29 is used.
-
sex
🔗
string
-
Sex at birth.
-
ethnic___1
🔗
string
-
Ethnic group: Arab.
-
ethnic___2
🔗
string
-
Ethnic group: Black.
-
ethnic___3
🔗
string
-
Ethnic group: East Asian.
-
ethnic___4
🔗
string
-
Ethnic group: South Asian.
-
ethnic___5
🔗
string
-
Ethnic group: West Asian.
-
ethnic___6
🔗
string
-
Ethnic group: Latin American.
-
ethnic___7
🔗
string
-
Ethnic group: White.
-
ethnic___8
🔗
string
-
Ethnic group: Aboriginal/First Nations.
-
ethnic___9
🔗
string
-
Ethnic group: Other.
-
ethnic___10
🔗
string
-
Ethnic group: N/A.
-
covid19_vaccine
🔗
string
-
Has the patient received a Covid-19 vaccine (open label licenced product)?
-
covid19_vaccined
🔗
date
-
Date first vaccine given (Covid-19) if known.
-
covid19_vaccine2d
🔗
date
-
Date second vaccine given (Covid-19) if known.
-
covid19_vaccined_nk
🔗
string
-
First vaccine given (Covid-19) but date not known.
-
corona_ieorres
🔗
string
-
Suspected or proven infection with pathogen of public health interest.
-
coriona_ieorres2
🔗
string
-
Proven or high likelihood of infection with pathogen of public health interest.
-
coriona_ieorres3
🔗
string
-
Proven infection with pathogen of public health interest.
-
inflammatory_mss
🔗
string
-
Adult or child who meets case definition for inflammatory multi-system syndrome (MIS-C/MIS-A).
-
cestdat
🔗
date
-
Onset date of first/earliest symptom.
-
chrincard
🔗
string
-
Chronic cardiac disease, including congenital heart disease (not hypertension).
- Possible values:
YES
,NO
,Unknown
- Possible values:
-
hypertension_mhyn
🔗
string
-
Hypertension (physician diagnosed).
- Possible values:
YES
,NO
,Unknown
- Possible values:
-
chronicpul_mhyn
🔗
string
-
Chronic pulmonary disease (not asthma).
- Possible values:
YES
,NO
,Unknown
- Possible values:
-
asthma_mhyn
🔗
string
-
Asthma (physician diagnosed).
- Possible values:
YES
,NO
,Unknown
- Possible values:
-
renal_mhyn
🔗
string
-
Chronic kidney disease.
- Possible values:
YES
,NO
,Unknown
- Possible values:
-
mildliver
🔗
string
-
Mild liver disease.
- Possible values:
YES
,NO
,Unknown
- Possible values:
-
modliv
🔗
string
-
Moderate or severe liver disease
- Possible values:
YES
,NO
,Unknown
- Possible values:
-
chronicneu_mhyn
🔗
string
-
Chronic neurological disorder.
- Possible values:
YES
,NO
,Unknown
- Possible values:
-
malignantneo_mhyn
🔗
string
-
Malignant neoplasm.
- Possible values:
YES
,NO
,Unknown
- Possible values:
-
chronichaemo_mhyn
🔗
string
-
Chronic haematologic disease.
- Possible values:
YES
,NO
,Unknown
- Possible values:
-
aidshiv_mhyn
🔗
string
-
AIDS/HIV.
- Possible values:
YES
,NO
,Unknown
- Possible values:
-
obesity_mhyn
🔗
string
-
Obesity (as defined by clinical staff).
- Possible values:
YES
,NO
,Unknown
- Possible values:
-
diabetes_type_mhyn
🔗
string
-
Diabetes and type.
- Possible values:
NO
,1
,2
,N/K
- Possible values:
-
diabetescom_mhyn
🔗
string
-
Diabetes with complications.
- Possible values:
YES
,NO
,Unknown
- Possible values:
-
diabetes_mhyn
🔗
string
-
Diabetes without complications.
- Possible values:
YES
,NO
,Unknown
- Possible values:
-
rheumatologic_mhyn
🔗
string
-
Rheumatologic disorder.
- Possible values:
YES
,NO
,Unknown
- Possible values:
-
dementia_mhyn
🔗
string
-
Dementia.
- Possible values:
YES
,NO
,Unknown
- Possible values:
-
malnutrition_mhyn
🔗
string
-
Malnutrition.
- Possible values:
YES
,NO
,Unknown
- Possible values:
-
smoking_mhyn
🔗
string
-
Smoking.
- Possible values:
Yes
,Never Smoked
,Former Smoker
,N/K
- Possible values:
-
hostdat
🔗
date
-
Admission date at this facility.
-
hooccur
🔗
string
-
Transfer from other facility?
-
hostdat_transfer
🔗
date
-
Admission date at previous facility.
-
hostdat_transfernk
🔗
string
-
Admission date at previous facility not known.
-
readm_cov19
🔗
string
-
Is the patient being readmitted with Covid-19?
-
dsstdat
🔗
date
-
Date of enrolment.
-
dsstdtc
🔗
date
-
Outcome date.
many rows per patient
medications🔗
many rows per patient
occupation_on_covid_vaccine_record🔗
-
is_healthcare_worker
🔗
boolean
many rows per patient
ons_deaths🔗
-
date
🔗
date
-
Patient's date of death. Only deaths registered from February 2019 are recorded.
-
place
🔗
string
-
- Possible values:
Care Home
,Elsewhere
,Home
,Hospice
,Hospital
,Other communal establishment
- Possible values:
-
underlying_cause_of_death
🔗
ICD-10 code
-
cause_of_death_01
🔗
ICD-10 code
-
cause_of_death_02
🔗
ICD-10 code
-
cause_of_death_03
🔗
ICD-10 code
-
cause_of_death_04
🔗
ICD-10 code
-
cause_of_death_05
🔗
ICD-10 code
-
cause_of_death_06
🔗
ICD-10 code
-
cause_of_death_07
🔗
ICD-10 code
-
cause_of_death_08
🔗
ICD-10 code
-
cause_of_death_09
🔗
ICD-10 code
-
cause_of_death_10
🔗
ICD-10 code
-
cause_of_death_11
🔗
ICD-10 code
-
cause_of_death_12
🔗
ICD-10 code
-
cause_of_death_13
🔗
ICD-10 code
-
cause_of_death_14
🔗
ICD-10 code
-
cause_of_death_15
🔗
ICD-10 code
many rows per patient
opa_cost🔗
many rows per patient
opa_diag🔗
-
opa_ident
🔗
integer
-
TODO
- Never
NULL
- Never
-
primary_diagnosis_code
🔗
ICD-10 code
-
TODO
-
primary_diagnosis_code_read
🔗
CTV3 (Read v3) code
-
TODO
-
secondary_diagnosis_code_1
🔗
ICD-10 code
-
TODO
-
secondary_diagnosis_code_1_read
🔗
CTV3 (Read v3) code
-
TODO
-
appointment_date
🔗
date
-
TODO
-
referral_request_received_date
🔗
date
-
TODO
many rows per patient
opa_proc🔗
-
opa_ident
🔗
integer
-
TODO
- Never
NULL
- Never
-
primary_procedure_code
🔗
OPCS-4 code
-
TODO
-
primary_procedure_code_read
🔗
CTV3 (Read v3) code
-
TODO
-
procedure_code_1
🔗
OPCS-4 code
-
TODO
-
procedure_code_2_read
🔗
CTV3 (Read v3) code
-
TODO
-
appointment_date
🔗
date
-
TODO
-
referral_request_received_date
🔗
date
-
TODO
many rows per patient
open_prompt🔗
This table contains responses to questions from the OpenPROMPT project.
You can find out more about this table in the associated short data report. To view it, you will need a login for Level 4. The workspace shows when the code that comprises the report was run; the code itself is in the airmid-short-data-report repository on GitHub.
-
ctv3_code
🔗
CTV3 (Read v3) code
-
The response to the question, as a CTV3 code. Alternatively, if the question does not admit a CTV3 code as the response, then the question, as a CTV3 code.
- Never
NULL
- Never
-
snomedct_code
🔗
SNOMED-CT code
-
The response to the question, as a SNOMED CT code. Alternatively, if the question does not admit a SNOMED CT code as the response, then the question, as a SNOMED CT code.
-
creation_date
🔗
date
-
The date the survey was administered
- Never
NULL
- Never
-
consultation_date
🔗
date
-
The response to the question, as a date, if the question admits a date as the response. Alternatively, the date the survey was administered.
- Never
NULL
- Never
-
consultation_id
🔗
integer
-
The ID of the survey
- Never
NULL
- Never
-
numeric_value
🔗
float
-
The response to the question, as a number
one row per patient
patients🔗
Patients in primary care.
Representativeness🔗
You can find out more about the representativeness of these data in the OpenSAFELY-TPP backend in:
The OpenSAFELY Collaborative, Colm D. Andrews, Anna Schultze, Helen J. Curtis, William J. Hulme, John Tazare, Stephen J. W. Evans, et al. 2022. "OpenSAFELY: Representativeness of Electronic Health Record Platform OpenSAFELY-TPP Data Compared to the Population of England." Wellcome Open Res 2022, 7:191. https://doi.org/10.12688/wellcomeopenres.18010.1
Orphan records🔗
If a practice becomes aware that a patient has moved house, then the practice deducts, or removes, the patient's records from their register. If the patient doesn't register with a new practice within a given amount of time (normally from four to eight weeks), then the patient's records are permanently deducted and are orphan records. There are roughly 1.6 million orphan records.
-
age_on(date)
🔗
-
Patient's age as an integer, in whole elapsed calendar years, as it would be on the given date.
This method takes no account of whether the patient is alive on the given date. In particular, it may return negative values if the given date is before the patient's date of birth.
View method definition
return (date - patients.date_of_birth).years
many rows per patient
practice_registrations🔗
Each record corresponds to a patient's registration with a practice.
Only patients with a full GMS (General Medical Services) registration are included.
We have registration history for:
- all patients currently registered at a TPP practice
- all patients registered at a TPP practice any time from 1 Jan 2009 onwards:
- who have since de-registered
- who have since died
A patient can be registered with zero, one, or more than one practices at a given time. For instance, students are often registered with a practice at home and a practice at university.
-
start_date
🔗
date
-
Date patient joined practice.
- Never
NULL
- Never
-
end_date
🔗
date
-
Date patient left practice.
-
practice_pseudo_id
🔗
integer
-
Pseudonymised practice identifier.
- Never
NULL
- Never
-
practice_stp
🔗
string
-
ONS code of practice's STP (Sustainability and Transformation Partnership). STPs have been replaced by ICBs (Integrated Care Boards), and ICB codes will be available soon.
- Matches regular expression:
E540000[0-9]{2}
- Matches regular expression:
-
practice_nuts1_region_name
🔗
string
-
Name of the NUTS level 1 region of England to which the practice belongs. For more information see: https://www.ons.gov.uk/methodology/geography/ukgeographies/eurostat
- Possible values:
North East
,North West
,Yorkshire and The Humber
,East Midlands
,West Midlands
,East
,London
,South East
,South West
- Possible values:
-
for_patient_on(date)
🔗
-
Return each patient's practice registration as it was on the supplied date.
Where a patient is registered with multiple practices we prefer the most recent registration and then, if there are multiple of these, the one with the longest duration. If there's stil an exact tie we choose arbitrarily based on the practice ID.
View method definition
spanning_regs = practice_registrations.where(practice_registrations.start_date <= date).except_where( practice_registrations.end_date < date ) ordered_regs = spanning_regs.sort_by( practice_registrations.start_date, practice_registrations.end_date, practice_registrations.practice_pseudo_id, ) return ordered_regs.last_for_patient()
many rows per patient
sgss_covid_all_tests🔗
many rows per patient