Numeric Result Cleaning#
New Function: clean_numeric
The component nested functions and their impact on certain numeric fields are explained below:
| Nested Function | Purpose |
|---|---|
| _additional_check_for_known_errors | |
| _raise_if_date_in_numerical_column | |
| _merge_o2_sat_into_po2 | |
| _split_and_clean_pao2 | |
| _impute_pao2_spo2 | |
| _clean_creatinine | |
| _truncate_numerical_field | |
| _infer_ddimer_units | |
| _clean_d_dimer | |
| _merge_Ferritin_2 | |
| _check_sheffield_trops | |
Significantly Impacted Fields#
PaO2 / SpO2#
PaO2 versus SpO2:
When requesting data, NHSx asked for PaO2 (Partial Pressure of Oxygen) with the vital signs rather than the more commonly collected SpO2 (Oxygen Saturation). As a result, some centres submitted PaO2 values from arterial blood gases (ABGs) and others submitted SpO2 values (from a pulse oximeter).
Typical values of PaO2 (in kPa) and SpO2 (in %) differ by roughly a factor of 10, which makes it possible to tell apart centres that submitted a PaO2 from those that submitted an SpO2.
3 hospitals completed this field in kPa exclusively.
Other corrections:
Royal United Hospitals Bath (1) put some FiO2 values in the wrong column (e.g. 0.21, 21, 26, 38, 50) and (2) entered some blood gas values (PaO2).
Ealing Hospital and Ashford and St Peters appear to have entered FiO2 values.
Oxford University Hospitals and Liverpool Heart and Chest Hospital have entered SpO2 as a fraction rather than a percentage, i.e. 0.XX where XX is the SpO2 as a percentage.
Solutions:
Integer values from 21 to 50 inclusive are taken as FiO2 if FiO2 is blank. If FiO2 is not blank, a warning is raised.
Any common oxygen fractions <=0.5, e.g. 0.21 are assumed to be FiO2 if FiO2 is blank. If FiO2 is not blank a warning is raised.
Any values for which 0.5 <= PaO2 <= 1 holds are assumed to be SpO2 expressed as a fraction of 1 and are multiplied by 100.
The PaO2 column is then split into pao2_gas and spo2_saturation.
SpO2 is imputed from PaO2 values to merge the columns using the following equation[1]:
\[SpO_2 = \left(\frac{28.6025^3}{{PaO_{2}}^{3}}+0.99 \right)^{-1} \]
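The routing rules and the imputation equation above can be sketched as follows. This is a minimal illustration, not the `clean_numeric` implementation: the function names `route_oxygen_value` and `impute_spo2_from_pao2` are hypothetical, any value in (0, 0.5] is treated as a "common oxygen fraction" (a simplification), the warnings mentioned above are not reproduced, and the split into pao2_gas and spo2_saturation is left out because the threshold used is not stated here. The imputation applies the equation exactly as written, so it returns a fraction rather than a percentage.

```python
from typing import Optional, Tuple


def route_oxygen_value(value: float, fio2: Optional[float] = None) -> Tuple[str, float]:
    """Label a raw entry from the PaO2 column (hypothetical helper).

    Returns (label, cleaned_value), where label is one of
    "fio2", "spo2_percent" or "unresolved".
    """
    value = float(value)
    fio2_blank = fio2 is None

    # Integers from 21 to 50 inclusive look like FiO2 given as a percentage.
    looks_like_fio2_percent = value.is_integer() and 21 <= value <= 50
    # Assumption: any value in (0, 0.5] counts as a common oxygen fraction.
    looks_like_fio2_fraction = 0 < value <= 0.5

    if looks_like_fio2_percent or looks_like_fio2_fraction:
        if fio2_blank:
            return "fio2", value
        # FiO2 is already filled in, so the value is left for manual review.
        return "unresolved", value

    # Values in (0.5, 1] are read as SpO2 fractions; 0.5 itself was handled above.
    if 0.5 <= value <= 1:
        return "spo2_percent", value * 100

    return "unresolved", value


def impute_spo2_from_pao2(pao2: float) -> float:
    """Apply SpO2 = (28.6025^3 / PaO2^3 + 0.99)^-1 as written above."""
    if pao2 <= 0:
        raise ValueError("PaO2 must be positive")
    return 1.0 / (28.6025 ** 3 / pao2 ** 3 + 0.99)
```

For example, `route_oxygen_value(0.75)` is labelled as an SpO2 and rescaled to 75.0, while `route_oxygen_value(21, fio2=0.4)` stays unresolved because FiO2 is already populated.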
Creatinine#
A handful of values are less than 20 and unlikely to be in SI units (µmol/L):
Some are less than 0.5 and appear to be in mmol/L rather than µmol/L.
Others between 0.5 and 20.0 are decimals (and therefore unlikely to be µmol/L), but too large for mmol/L. These could be errors (e.g. decimal placement) or in mg/dL; however, they did not appear consistent with mg/dL. These values are clipped in the clip_numeric function.
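The value ranges described above can be expressed as a small triage helper. This is only a sketch: `classify_creatinine` and its labels are hypothetical, and it flags values rather than converting or clipping them, since the exact handling inside `clip_numeric` is not described here.

```python
def classify_creatinine(value: float) -> str:
    """Bucket a creatinine result by the ranges described above (hypothetical helper)."""
    if value < 0.5:
        # Appears to be in mmol/L rather than µmol/L.
        return "possible_mmol_per_l"
    if value < 20:
        # Too small for µmol/L, too large for mmol/L.
        return "ambiguous"
    # 20 and above is treated as a plausible µmol/L value.
    return "plausible_umol_per_l"
```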
D-Dimer#
D-Dimer may be in DDU (D-Dimer Units) or FEU (Fibrinogen Equivalent Units). The standard is now FEU and use of this was confirmed with the labs.
Some centres had values orders of magnitude different from the others as they used any of: ng/mL, μg/mL, mg/L, g/L. For all of the centres in the development data, this was checked by telephone.
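To illustrate why the choice of unit produces order-of-magnitude differences, the sketch below converts each of the listed units to ng/mL. The target unit, the function name `to_ng_per_ml` and the ASCII spelling `ug/mL` for μg/mL are assumptions made for this example; the conversion factors themselves are standard.

```python
# Multipliers that convert each listed unit to ng/mL.
# 1 μg/mL = 1 mg/L = 1,000 ng/mL and 1 g/L = 1,000,000 ng/mL.
NG_PER_ML_FACTOR = {
    "ng/mL": 1,
    "ug/mL": 1_000,  # ASCII spelling of μg/mL
    "mg/L": 1_000,
    "g/L": 1_000_000,
}


def to_ng_per_ml(value: float, unit: str) -> float:
    """Convert a D-dimer result to ng/mL (hypothetical helper)."""
    try:
        factor = NG_PER_ML_FACTOR[unit]
    except KeyError:
        raise ValueError(f"Unknown D-dimer unit: {unit!r}") from None
    return value * factor
```

A result of 0.5 mg/L and a result of 500 ng/mL are the same measurement, which is why a centre reporting in mg/L appears three orders of magnitude lower than one reporting in ng/mL.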
At some centres, it was apparent that the analyser had a maximum reportable value above which results were truncated (e.g. a centre with multiple results at exactly 10,000). The lowest such laboratory maximum identified was 10,000, so results from all centres were truncated to this value, as if every machine shared this maximum.
Truncated fields#
| Feature | Minimum | Maximum |
|---|---|---|
| crp_on_admission | 4 | |
| d-dimer_on_admission | | 10,000 |
| ferritin | 15,000 | |
| troponin_i | 10 | |
| troponin_t | 5 | 25,000 |
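Truncation of this kind amounts to clamping each value to a per-feature range. The sketch below shows the idea for a few of the features; `TRUNCATION_LIMITS` and `truncate_field` are hypothetical names rather than the actual `_truncate_numerical_field` code, and the D-dimer limit is treated as a maximum, as described in the D-Dimer section.

```python
from typing import Dict, Optional, Tuple

# (minimum, maximum) per feature; None means no limit on that side.
Limits = Tuple[Optional[float], Optional[float]]

TRUNCATION_LIMITS: Dict[str, Limits] = {
    "crp_on_admission": (4, None),
    "d-dimer_on_admission": (None, 10_000),
    "troponin_t": (5, 25_000),
}


def truncate_field(feature: str, value: float) -> float:
    """Clamp a value to the limits configured for its feature."""
    lower, upper = TRUNCATION_LIMITS[feature]
    if lower is not None and value < lower:
        return lower
    if upper is not None and value > upper:
        return upper
    return value
```

For instance, a D-dimer of 25,000 becomes 10,000 and a CRP of 1.2 becomes 4, while a troponin T of 100 is returned unchanged.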