Useful PPMI Clinical Codes - Code Available

Hello, everyone!

As part of my project in the Data Modality and Methodology Task Force, I set out to create and share Python scripts that derive clinically and research-relevant variables from PPMI data. The project is currently in beta and is publicly available on the Research Community’s GitHub.

The currently available notebooks include:

  1. LEDD and medication-specific LEDD calculations
  2. Levodopa challenge responsiveness
  3. Medical conditions identifier
  4. Medication usage identifier

Each notebook includes a theoretical background and references to research articles that justify its utility. The goal is to enable other researchers to use this data to identify meaningful clinical correlations. I believe this will be particularly useful for those with limited clinical exposure, who may sometimes struggle to identify relevant clinical outcomes to study.

For example, a researcher working with microbiome data may find it useful to check whether a specific microbiome pattern or cluster is associated with the degree of levodopa responsiveness (which makes clinical sense, as malabsorption due to the intestinal bacterial profile may impair the levodopa response).

As another example, a researcher working with neuroimaging data may try to identify specific neuroimaging patterns that predict how LEDD evolves over time. Lastly, given how relevant some medical conditions are to PD, the scripts can also be used to check, for instance, whether a comorbidity such as diabetes is associated with worse clinical progression.

Please let me know if you find any errors in the scripts, share your thoughts on their usefulness, and suggest any additional relevant clinical data that should be included.


This is a great resource! Thanks for creating this and making it available!

Daniel, this looks awesome! I made a pull request and you’re welcome to review it and approve if you’re happy with it. :slight_smile:

I added a utils directory with a helpers.py script; its functions can be used across all notebooks. Because the date you download a file from LONI is appended to the filename, I thought it would be useful to match on the file’s basename instead. I wrote a function (get_lastest_file()) that looks in the data directory and takes the most recent file you downloaded. For example, if you downloaded MDS-UPDRS_Part_III_09Feb2025.csv and I downloaded the same file on the 27th as MDS-UPDRS_Part_III_27Feb2025.csv, the notebook now looks for MDS-UPDRS_Part_III*.csv. If more than one file matches, it takes the most recent.
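
For anyone curious how that kind of lookup can work, here is a minimal sketch. It is not the code from the pull request: the name find_latest_file, its arguments, and the assumption that every download ends in _DDMonYYYY.csv are all hypothetical, and "most recent" is decided from the date in the filename rather than the file's modification time.

```python
import re
from datetime import datetime
from pathlib import Path


def find_latest_file(data_dir, prefix):
    """Return the most recently dated '<prefix>_<DDMonYYYY>.csv' in data_dir.

    Hypothetical helper for illustration; the naming pattern is an assumption
    based on the example filenames in this thread.
    """
    # The stem must be exactly the prefix plus a date suffix such as _09Feb2025.
    pattern = re.compile(re.escape(prefix) + r"_(\d{2}[A-Za-z]{3}\d{4})")
    dated = []
    for path in Path(data_dir).glob("*.csv"):
        match = pattern.fullmatch(path.stem)
        if match is None:
            continue  # a different table, or no date suffix
        try:
            # %b expects English month abbreviations in the default C locale.
            stamp = datetime.strptime(match.group(1), "%d%b%Y")
        except ValueError:
            continue  # looked like a date but was not a valid one
        dated.append((stamp, path))
    if not dated:
        raise FileNotFoundError(f"no dated '{prefix}' CSV found in {data_dir}")
    # Compare on the parsed date only, then return the matching path.
    return max(dated, key=lambda item: item[0])[1]
```

Calling find_latest_file("data", "MDS-UPDRS_Part_III") would then return whichever matching download carries the latest date, so nobody has to edit the notebook after re-downloading from LONI. Parsing the date instead of sorting the names avoids picking 30Jan2025 over 27Feb2025.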

The other function is for future-proofing: pandas is deprecating errors='ignore' in pd.to_numeric, so it’s only a warning for now but could become an error in the future. safe_to_numeric() does the same thing, or at least it should. Feel free to double check! I wouldn’t want it to change the output file.
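
As a rough illustration of the idea (again, not the helper from the pull request), the old errors='ignore' behaviour of returning the input untouched when it cannot be converted can be approximated with a try/except. The function name to_numeric_or_unchanged is made up for this sketch.

```python
import pandas as pd


def to_numeric_or_unchanged(values):
    """Convert values to numeric, or hand them back untouched on failure.

    Hypothetical stand-in that approximates pd.to_numeric(values, errors='ignore').
    """
    try:
        # With the default errors='raise', unparseable input raises an exception.
        return pd.to_numeric(values)
    except (ValueError, TypeError):
        # Same fallback as errors='ignore': return the original input as-is.
        return values
```

A quick way to check that a replacement like this leaves the output file unchanged is to run both versions over the same columns and compare the results with pandas.testing.assert_series_equal while the deprecated option still works.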

Oh! And I added .DS_Store to the .gitignore - this is an issue I keep having since I use a Mac.

Hi, Elizabeth, and thank you for your suggestions! I see that Josh has already approved your commit. I really appreciate the changes, as they make the code easier to use!
