Several groups have mentioned the desire for a “date of PD diagnosis” for a PPMI participant. This would be useful for many analyses, especially for predicting phenoconversion in prodromal groups. Due to the complicated nature of defining this data point and organizing it in the PPMI data set, we thought it would make a good community project.
Problem:
The current PPMI dataset doesn’t seem to have a pre-defined data field for date of diagnosis that works longitudinally.
- “PD_Diagnosis_History” is only logged at the SC (screening) event.
- NSD-ISS stage is very sparsely populated.
- We reviewed several other fields from the curated data cut, but none that appear intended for this purpose are densely populated.
Proposed solution:
- Identify fields in the data set that could be used to inform a diagnosis milestone / date.
- Possible ideas:
  - Levodopa equivalent daily dose (LEDD > 0 and persistent) is the best timestamped proxy for clinical PD diagnosis.
  - Abnormal DaTscan reads and NSD staging can provide biological confirmation.
It’s possible that there just isn’t enough data captured to do this reliably, but that is something worth investigating.
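To make the “LEDD > 0 and persistent” idea concrete, here is a minimal sketch. Everything in it is an assumption for illustration: the function name `first_persistent_ledd`, the input format (a list of visit date and LEDD pairs for one participant), and the persistence rule of two consecutive visits with LEDD > 0. It does not use actual PPMI column names.

```python
from datetime import date

def first_persistent_ledd(visits, min_visits=2):
    """Return the start date of the first sustained run of LEDD > 0, or None.

    visits: list of (visit_date, ledd) tuples for one participant (hypothetical format).
    min_visits: consecutive visits with LEDD > 0 needed to call the run
                persistent (an assumed threshold, not a PPMI definition).
    """
    run_start = None
    run_len = 0
    for visit_date, ledd in sorted(visits, key=lambda v: v[0]):
        if ledd is None:
            # Assumption: a missing LEDD value neither extends nor breaks a run.
            continue
        if ledd > 0:
            if run_start is None:
                run_start = visit_date
            run_len += 1
            if run_len >= min_visits:
                return run_start
        else:
            # LEDD back to zero: treat the earlier exposure as a one-off trial.
            run_start, run_len = None, 0
    return None

example = [
    (date(2020, 1, 1), 0.0),
    (date(2020, 7, 1), 150.0),   # isolated exposure, not persistent
    (date(2021, 1, 1), 0.0),
    (date(2021, 7, 1), 300.0),   # start of a sustained run
    (date(2022, 1, 1), 300.0),
]
print(first_persistent_ledd(example))
```

In this made-up example the isolated 2020 exposure is ignored and the run starting at the 2021-07-01 visit is returned.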
Request for comments from others:
- Do you have a need for such a field in your research? If so, please elaborate on your use case and let us know if the above proposal might or might not be helpful for you.
- Do you have any ideas for values in the PPMI data set that could assist with this project?
- Would you like to participate in development of this derived variable?
Thanks!
Alan
Dear Alan,
This is an excellent initiative. I agree the diagnosis tracker would be a highly valuable community resource to anchor analyses around a reproducible “Day 0” per PPMI participant.
How I’d define Day 0 (initial proposal)
- Primary: clinician-confirmed PD diagnosis date (if captured).
- Secondary: sustained dopaminergic treatment start. Your LEDD idea is compelling; I’d encode it explicitly as “persistent LEDD for ≥2 visits or ≥90 days” to avoid false positives from one-off trials.
- Tertiary: abnormal DaTscan + compatible clinical visit within ±90 days (anchored to the clinical timepoint).
- Fallback: year-of-diagnosis at screening (impute mid-year) with a low-confidence flag; otherwise provide a conversion window (last “no PD” → first “PD” evidence).
Each record would carry provenance and quality fields (e.g., dx_source, dx_confidence, dx_window_start/end, notes about brief levodopa trials or non-PD indications).
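As a rough sketch of how this hierarchy and the provenance fields could fit together (all names here are hypothetical: `resolve_day0`, the source labels, the confidence levels, and July 1 as the mid-year imputation are my assumptions, not existing PPMI fields):

```python
from datetime import date
from typing import NamedTuple, Optional

class Day0(NamedTuple):
    dx_date: Optional[date]
    dx_source: str
    dx_confidence: str
    dx_window_start: Optional[date] = None
    dx_window_end: Optional[date] = None

def resolve_day0(clinician_dx=None, ledd_start=None, datscan_anchor=None,
                 dx_year=None, last_no_pd=None, first_pd=None):
    """Pick the highest-priority evidence available for one participant.

    Each argument is assumed to be derived upstream; None means "not available".
    Confidence labels are illustrative assumptions.
    """
    if clinician_dx is not None:
        return Day0(clinician_dx, "clinician", "high")
    if ledd_start is not None:
        return Day0(ledd_start, "persistent_ledd", "medium")
    if datscan_anchor is not None:
        return Day0(datscan_anchor, "datscan_plus_clinical", "medium")
    if dx_year is not None:
        # Fallback: only the year is known, so impute mid-year (July 1 assumed).
        return Day0(date(dx_year, 7, 1), "screening_year_midyear", "low")
    if first_pd is not None:
        # No usable date: report the conversion window instead of a point date.
        return Day0(None, "conversion_window", "low", last_no_pd, first_pd)
    return Day0(None, "none", "none")

print(resolve_day0(ledd_start=date(2021, 7, 1), dx_year=2020))
```

Keeping the source and confidence on every record would let downstream analyses filter to, say, high-confidence dates only, or run sensitivity checks by source.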
Clinical features & harmonization
- Motor: MDS-UPDRS III and key items (e.g., bradykinesia) plus Hoehn & Yahr transitions (0→1/2) as candidate anchors.
- Non-motor: RBDSQ/REM, UPSIT/olfaction, SCOPA-AUT, MDS-NMS, MoCA, etc.
- I’m in favor of documenting mappings to harmonize related scales so we can compare cohorts that were scored differently.
Prodromal → phenoconversion modeling
- For prodromal cohorts, compute delta/derivative features (motor and non-motor) and model phenoconversion (0/1) or time-to-conversion.
- Feature selection and dimensionality reduction + clustering to identify the strongest predictors.
- Survival setups with interval-censoring where only a window is known.
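For the interval-censoring point, a minimal sketch of how the per-participant window could be built from visit-level status. The function name `conversion_interval` and the input format (visit date plus a boolean “PD evidence” flag) are assumptions for illustration; how that flag is derived is exactly the open question in this thread.

```python
from datetime import date

def conversion_interval(visits, baseline):
    """Return (left, right) bounds on conversion time, in days since baseline.

    visits: list of (visit_date, has_pd) tuples (hypothetical format).
    left  = last visit without PD evidence (0 if there is none before conversion).
    right = first visit with PD evidence, or None if right-censored.
    """
    last_no_pd = 0
    for visit_date, has_pd in sorted(visits, key=lambda v: v[0]):
        days = (visit_date - baseline).days
        if has_pd:
            return (last_no_pd, days)
        last_no_pd = days
    return (last_no_pd, None)

baseline = date(2020, 1, 1)
visits = [
    (date(2020, 1, 1), False),
    (date(2021, 1, 1), False),
    (date(2022, 1, 1), True),
]
print(conversion_interval(visits, baseline))      # (366, 731)
print(conversion_interval(visits[:2], baseline))  # (366, None)
```

Interval-censored survival models generally take this kind of (left, right) pair per participant, with an open right bound for those who have not converted by their last visit.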
Biomarkers
- Use the clinical Day 0 to help propose biomarker cutoffs (DaTscan SBR, CSF markers, NfL), then test associations and predictive utility with calibration and decision-usefulness metrics.
I don’t yet have PPMI data access to check field coverage, but I’d love to contribute. Please let me know what you think about these ideas!
Warm regards,
Ana
Thanks for the detailed response! This is all helpful.
Will reach out when we have the opportunity to start work on this.