Dates
=====

Original NCCID Cleaning Pipeline
--------------------------------

NCCID Function
: `_parse_date_columns`

1.  Converts fields expected in US date format (MM/DD/YY) into pandas datetime values.
2.  Extracts dates from entries with known errors of the form '[Text] - YYYY-MM-DD'.
3.  Parses other known errors (e.g., entries of '.' or ' ') and unknown errors as `pd.NaT`.
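
The three steps above can be sketched for a single entry as follows. This is a minimal illustration, not the NCCID implementation: the helper name `parse_us_date` and the regular expression are assumptions made for this example.

```python
import re

import pandas as pd

# Hypothetical pattern for the 'YYYY-MM-DD' part of '[Text] - YYYY-MM-DD'.
_EMBEDDED_ISO = re.compile(r"\d{4}-\d{2}-\d{2}")


def parse_us_date(value):
    """Hypothetical helper: parse one raw entry, returning pd.NaT on failure."""
    if pd.isna(value):
        return pd.NaT
    text = str(value).strip()
    # Step 2: pull the date out of a known error such as 'Unknown - 2020-04-05'.
    match = _EMBEDDED_ISO.search(text)
    if match:
        return pd.to_datetime(match.group(0), format="%Y-%m-%d", errors="coerce")
    # Step 1: parse the expected US format MM/DD/YY.
    # Step 3: anything else ('.', ' ', unknown errors) is coerced to pd.NaT.
    return pd.to_datetime(text, format="%m/%d/%y", errors="coerce")
```

Applied to a column, this could be used as `df[col].map(parse_us_date)`, where `df` and `col` stand for the data and a date column of interest.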

---

NCCIDxClean
-----------

New Function
: `parse_date_columns`

1.  Converts dates stored as numbers in the Excel date format, which previously would have been lost (set to `np.nan`).
2.  Converts swab dates for three centres to the correct format. These were identified as being in UK format rather than the expected US format.
3.  Cleans `date_of_positive_covid_swab` by changing the entry to the earliest positive swab date provided, as it was noted that for a substantial number of patients:
    -   Their 'positive' swab date was equal to their PCR result date rather than the acquisition date; and/or
    -   Their 'positive' swab date was equal to their second PCR rather than the first.
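
A rough sketch of how these three changes might look in pandas is given below. It is not the NCCIDxClean code: the helper names, the `uk_format` flag, the assumed DD/MM/YY layout for UK dates, and the idea of passing a list of candidate date columns are all assumptions for illustration. The Excel conversion uses the commonly used origin of 1899-12-30 for the 1900 date system.

```python
import math

import pandas as pd

EXCEL_ORIGIN = "1899-12-30"  # assumed: Excel 1900 date system


def parse_swab_date(value, uk_format=False):
    """Hypothetical helper: parse one raw swab-date entry."""
    if pd.isna(value):
        return pd.NaT
    text = str(value).strip()
    # Item 1: a purely numeric entry is treated as an Excel serial day count.
    try:
        serial = float(text)
    except ValueError:
        serial = None
    if serial is not None and math.isfinite(serial):
        return pd.to_datetime(serial, unit="D", origin=EXCEL_ORIGIN, errors="coerce")
    # Item 2: centres known to use UK format are parsed day-first (assumed DD/MM/YY).
    fmt = "%d/%m/%y" if uk_format else "%m/%d/%y"
    return pd.to_datetime(text, format=fmt, errors="coerce")


def earliest_positive_swab(df, candidate_cols):
    """Item 3: row-wise earliest date across candidate columns, ignoring NaT.

    Assumes candidate_cols are already datetime columns that hold only
    dates associated with a positive result.
    """
    return df[candidate_cols].min(axis=1)
```

Which columns count as candidates for the earliest positive swab date, and which centres need `uk_format=True`, depend on the dataset and are not specified here.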
