Newly Released PPMI Proteomics Curated Datasets

The Parkinson’s Progression Markers Initiative (PPMI) recently released a new curated proteomics data cut, which can be merged with other PPMI variables using the PATNO and EVENT_ID variables. The data cut comprises six datasets from four distinct proteomics platforms, outlined below:

| PPMI Project ID(s) | Platform/Method | Matrix | Samples | Analytes |
|---|---|---|---|---|
| 151 | SomaLogic 4K | CSF | 1,153 | 4,785 |
| 177 | DIA-MS | CSF | 2,283 | 291 |
| 177 | DIA-MS | Plasma | 949 | 186 |
| 190 | OmicEra (MS) | Urine | 1,162 | 6,487 |
| 196; 222; 9000 | Olink Explore 1536 | CSF | 799 | 1,463 |
| 196; 222; 9000 | Olink Explore 1536 | Plasma | 1,031 | 1,463 |
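
For anyone who wants to see what the PATNO/EVENT_ID merge looks like in practice, here is a minimal sketch in Python with pandas. It uses toy data rather than the real files: only PATNO and EVENT_ID come from the release, while the other column names (ANALYTE_A, SCORE), the values, and the helper name merge_on_visit are made up for illustration.

```python
import pandas as pd

# Toy stand-in for one curated proteomics table.
# Only PATNO and EVENT_ID are real PPMI variable names; ANALYTE_A is hypothetical.
proteomics = pd.DataFrame({
    "PATNO": [1001, 1001, 1002],
    "EVENT_ID": ["BL", "V04", "BL"],
    "ANALYTE_A": [0.12, 0.18, -0.40],
})

# Toy stand-in for any other PPMI table; SCORE is a hypothetical variable.
other = pd.DataFrame({
    "PATNO": [1001, 1001, 1002, 1003],
    "EVENT_ID": ["BL", "V04", "BL", "BL"],
    "SCORE": [21, 24, 15, 30],
})


def merge_on_visit(left, right):
    """Left-join two tables on participant (PATNO) and visit (EVENT_ID).

    validate="many_to_one" raises an error if the right-hand table has
    duplicate PATNO/EVENT_ID pairs, which would silently multiply rows.
    """
    return left.merge(
        right,
        on=["PATNO", "EVENT_ID"],
        how="left",
        validate="many_to_one",
    )


merged = merge_on_visit(proteomics, other)
print(merged)
```

A left join keeps every proteomics row even when the other table has no matching visit. Passing indicator=True to merge adds a _merge column that shows which rows found a match.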

LONI hosts both the raw data and the curated data files. Here are some details on how to access the curated files:

  1. If this is your first time accessing PPMI data, I recommend referring to the following thread for additional background information, including instructions on how to gain access: An Introduction to the PPMI Dataset.

  2. Log in to LONI and navigate to the Biospecimen: Proteomic Analysis section. The data cut is labeled “PPMI Proteomic Working Group Curated Datasets” (Version: 2024-05-29) and downloads as a single zip file containing 6 Excel files, 1 CSV file, and 1 PDF.

  3. Additionally, the PPMI Data User Guide serves as an introduction and reference for understanding PPMI data. It provides instructions on interpreting PPMI data, including guidance on accessing curated data cuts (Section 11) and individual project proteomics data (Section 8).
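
Once the zip from step 2 is downloaded, a quick way to confirm it contains the expected 6 Excel files, 1 CSV file, and 1 PDF is to count its entries by file extension. The sketch below uses only the Python standard library. The function name count_by_extension and the archive filename in the usage comment are hypothetical, and I am assuming the Excel files use the .xlsx extension.

```python
import zipfile
from collections import Counter
from pathlib import PurePosixPath


def count_by_extension(zip_source):
    """Return a Counter of lowercase file extensions inside a zip archive.

    zip_source may be a path or a file-like object. Directory entries
    are skipped so that only actual files are counted.
    """
    with zipfile.ZipFile(zip_source) as archive:
        return Counter(
            PurePosixPath(info.filename).suffix.lower()
            for info in archive.infolist()
            if not info.is_dir()
        )


# Usage (hypothetical filename; substitute whatever LONI gives you):
# counts = count_by_extension("ppmi_curated_proteomics.zip")
# print(counts)
```

If the counts do not match what the post describes, the download may be incomplete or a newer version of the data cut may have been published.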

Curious if anyone has plans or interest in utilizing the data from this new release? Please feel free to send me a message or reply here!

How exciting! Can’t wait for the research community to utilize this rich resource of proteomics data.

So exciting! Amazing work to get this all curated. Thanks for the directions on how to get the data!

This is very cool - I was wondering if there is an easy way to determine which of these samples also have genetic and epidemiologic data… I would assume nearly all?