The complex disease data (genotypes), including locally-restricted samples, now consists of a total of 44,831 genotyped participants (24,709 PD cases, 17,246 Controls, and 2,876 ‘Other’ phenotypes)
The monogenic disease data (whole genome sequences) now consists of a total of 2,324 sequenced participants (1,854 PD cases, 314 Controls, and 156 ‘Other’ phenotypes)
12,585 individuals who have deep clinical phenotyping information also have matching genetic information
Additional complex disease (genotyped) and monogenic disease (whole genome) samples
Introducing locally-restricted GDPR samples via the Verily Viewpoint Workbench
Introducing clinical data for ~12,000 individuals
Introducing a new ancestry group → Complex Admixture History (CAH)
Updates to quality control measures for released genotyping data
Updates to variant calling, now with DeepVariant, for released whole genome data
Wondering if anyone is planning to use, or is already using, data from this new release? If so, I would love to hear what you are (or will be) working on!
Yes! I am working on some projects with this data: a Hackathon project, other GP2 projects that I am updating with this data (multi-ancestry PRS), and some pilot studies.
Dear Josh, thanks for this extensive update on GP2 data. Just to add even more information: there are multiple projects running in parallel involving trainees and PIs across the globe. Project proposal applications are also open in GP2, so anyone with ideas for working on the data can apply.
Thanks for sharing this summary!
We are already updating some analyses, but my team is most excited about the addition of metadata, such as IBD! It opens new research avenues!
Glad to see so much interest! @joanne.trinh, do you have a link to the project proposal opportunity you're referring to? I know there is the general funding/opportunities page (GP2 Opportunities - GP2), but I'm not sure if this is it.