If you need study design, epidemiology, or metabolomics insights

Then ping me - I’m a jack of many trades as an epidemiologist in neurodegenerative diseases, from genetics to quality of life studies, and from modelling symptoms to metabolomics/biomarker analyses. I also teach various epidemiology courses - so I am happy to share insights/answer any questions (e.g. how best to tackle confounders).

F

5 Likes

Loved your post and call to collaboration, and I dearly hope you get it! As for me, I’m open to it, but first, I need to finish some projects, or I risk going crazy hahaha.

By the way, I’m deeply interested in the concept of Population Attributable Fraction. Can you suggest some good materials to learn more about this?

2 Likes

Hi @fbbriggs, I’m newer to epidemiology and related stats and have been enjoying learning more about how to apply epidemiology principles to my work.

I have recently discovered propensity score methods and weighting. With your experience, do you have any recommendations on specific propensity score weighting algorithms when it comes to survival models? I’ve been dabbling in accelerated failure time models with propensity score weighting and was curious to hear your thoughts on it.

Thanks!

Matt

1 Like

Hey @danieltds, see this example from Boston University’s SPH: The Population Attributable Fraction

Population attributable fraction, also known as population attributable risk percent (PAR%), answers the question: What proportion of the disease risk in the total population is due to a specific exposure? It answers a causal question.

Here is a screenshot of a few slides from one of my intro EPI lectures, where we are starting with information from a cohort study. In a cohort study, we can calculate the risk (incidence) of an outcome (e.g. PD) in an exposed and an unexposed group (e.g. smokers vs non-smokers) - which we can leverage to estimate risk in the general population (see the first few slides). If you have the risk of disease in the general population, then you can start at slide 73 (where AR = attributable risk, which is: risk in exposed - risk in unexposed):
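For anyone who prefers code to slides, here is a minimal sketch of that arithmetic. The risks and the smoking prevalence below are made-up numbers for illustration only, not values from the lecture, and the function name is hypothetical. It assumes the population risk can be written as a prevalence-weighted average of the risks in the exposed and unexposed groups.

```python
def population_attributable_fraction(risk_exposed, risk_unexposed, prevalence_exposed):
    """PAF = (risk in total population - risk in unexposed) / risk in total population."""
    # Risk in the total population is a prevalence-weighted average of the two group risks.
    risk_total = (prevalence_exposed * risk_exposed
                  + (1 - prevalence_exposed) * risk_unexposed)
    return (risk_total - risk_unexposed) / risk_total


# Hypothetical cohort numbers (assumptions, not real data).
risk_smokers = 0.02        # risk of the outcome among the exposed
risk_nonsmokers = 0.01     # risk of the outcome among the unexposed
prevalence_smoking = 0.25  # proportion of the population that is exposed

# Attributable risk (AR): risk in exposed - risk in unexposed.
attributable_risk = risk_smokers - risk_nonsmokers

paf = population_attributable_fraction(risk_smokers, risk_nonsmokers, prevalence_smoking)

# Levin's formula gives the same answer from the risk ratio (RR) and prevalence.
risk_ratio = risk_smokers / risk_nonsmokers
paf_levin = (prevalence_smoking * (risk_ratio - 1)
             / (1 + prevalence_smoking * (risk_ratio - 1)))

print(f"AR = {attributable_risk:.3f}")
print(f"PAF = {paf:.3f}, PAF (Levin) = {paf_levin:.3f}")
```

Both routes should agree; the first is handy when you already have the risk in the general population, the second when you only have a risk ratio and the exposure prevalence.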

Happy to chat more!

2 Likes

Hey @mattk, first welcome!

I am very reluctant to endorse propensity score methods (PSM), primarily because they are misused in clinical epidemiology as the fix-all for any question that uses slightly complex data for research. PSM is a tool to help control confounding that might influence the relationship of interest (e.g. some exposure/treatment with an outcome).

A confounder is a variable that is causally associated with the outcome (meaning if you drew the direction of the relationship, an arrow would originate from the variable and end pointing to the outcome); it must also be causally or non-causally associated with the exposure/treatment of interest; and it cannot be an intermediate between the exposure and outcome. What confounders do is muddy the observed association between an exposure and outcome.

A classic example: suppose we examine the relationship between pocket lighters and lung cancer. There will be a positive relationship that could be misinterpreted to mean that lighters increase the risk of lung cancer. However, the likely confounder is smoking status. Smokers have an increased risk of lung cancer (smoking is causally associated with lung cancer), smokers are more likely to carry lighters (smoking is associated with the exposure of interest), and being a smoker is not on the causal path between lighters and lung cancer (meaning carrying a lighter does not cause you to be a smoker). Thus, smoking status is a confounder of the relationship between lighters and lung cancer. As a result, we would need to adjust for smoking status in the regression model - and after adjusting for it, there will be no association between lighters and lung cancer.

So in studying humans, the primary hurdle of all epidemiologic studies is mitigating the effect of confounders so that we have confidence in the robustness of the relationship observed between an exposure and outcome.
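To make the lighter example concrete, here is a small simulation sketch. Every probability in it is an assumption I picked for illustration (it is not real data), and lung cancer is generated from smoking status only, so lighters have no effect by construction. The crude risk ratio should still look strongly positive, while the risk ratios within smokers and within non-smokers should sit near 1.

```python
import numpy as np


def risk_ratio(outcome, exposure):
    """Risk in the exposed divided by risk in the unexposed (boolean arrays)."""
    return outcome[exposure].mean() / outcome[~exposure].mean()


rng = np.random.default_rng(0)
n = 200_000

# Assumed probabilities, chosen only to illustrate confounding.
smoker = rng.random(n) < 0.30
# Smokers are far more likely to carry a lighter.
lighter = rng.random(n) < np.where(smoker, 0.80, 0.10)
# Lung cancer depends on smoking only; lighters play no role.
cancer = rng.random(n) < np.where(smoker, 0.10, 0.02)

# Crude (unadjusted) association between lighters and lung cancer.
crude_rr = risk_ratio(cancer, lighter)

# Stratify by the confounder: the association should vanish within each stratum.
rr_smokers = risk_ratio(cancer[smoker], lighter[smoker])
rr_nonsmokers = risk_ratio(cancer[~smoker], lighter[~smoker])

print(f"Crude RR:           {crude_rr:.2f}")
print(f"RR among smokers:   {rr_smokers:.2f}")
print(f"RR among non-smokers: {rr_nonsmokers:.2f}")
```

Stratifying is used here instead of a regression model simply because it keeps the sketch short; adjusting for smoking status in a regression is the same idea.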

So with PSM, the first step is to identify everything associated with the exposure (you regress the exposure on all of these variables) and generate a predicted probability of having the exposure; the weights used in the regression between exposure and outcome are derived from this probability (here is a general overview: https://rmdopen.bmj.com/content/5/1/e000953). Many times, people throw everything plus the kitchen sink into the model predicting the exposure to generate weights. This is the biggest concern with PSM, because some of those variables could be colliders (downstream of both the exposure and outcome), and adjusting for a collider creates a false relationship between the exposure and outcome. Others argue that because so many things are used to predict the exposure in the first step, the effects of colliders are overwhelmed and washed out (see this discussion thread: survival - Adjust for everything you have in propensity score? - Cross Validated). So from my soapbox, I caution you to be thoughtful about which variables are used to predict the exposure, from which the weights are then generated.
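As a rough sketch of those two steps, the code below uses inverse probability of treatment weighting, which is one common way to turn the predicted probability into weights and is my assumption about the flavour meant here. The data are simulated with made-up coefficients, the exposure has no effect on the outcome by construction, and `fit_logistic` is a small hypothetical helper written with NumPy, not a library function. Only true confounders (smoking and age) go into the exposure model.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def fit_logistic(X, y, n_iter=25):
    """Plain Newton-Raphson logistic regression; returns coefficients, intercept first."""
    X1 = np.column_stack([np.ones(len(X)), X])
    beta = np.zeros(X1.shape[1])
    for _ in range(n_iter):
        p = sigmoid(X1 @ beta)
        gradient = X1.T @ (y - p)
        hessian = X1.T @ (X1 * (p * (1 - p))[:, None])
        beta += np.linalg.solve(hessian, gradient)
    return beta


rng = np.random.default_rng(1)
n = 100_000

# Simulated confounders (assumed values, for illustration only).
smoker = (rng.random(n) < 0.30).astype(float)
age = rng.normal(size=n)  # standardised age
confounders = np.column_stack([smoker, age])

# Both exposure and outcome depend on the confounders; the exposure has no effect.
exposed = rng.random(n) < sigmoid(-1.0 + 1.5 * smoker + 0.5 * age)
outcome = rng.random(n) < sigmoid(-3.0 + 1.2 * smoker + 0.4 * age)

# Step 1: regress the exposure on the confounders to get the propensity score.
beta = fit_logistic(confounders, exposed.astype(float))
propensity = sigmoid(np.column_stack([np.ones(n), confounders]) @ beta)

# Step 2: inverse probability weights, 1/p for exposed and 1/(1-p) for unexposed.
weights = np.where(exposed, 1.0 / propensity, 1.0 / (1.0 - propensity))


def weighted_risk(mask):
    """Weighted risk of the outcome within the group selected by mask."""
    return np.sum(weights[mask] * outcome[mask]) / np.sum(weights[mask])


crude_rr = outcome[exposed].mean() / outcome[~exposed].mean()
weighted_rr = weighted_risk(exposed) / weighted_risk(~exposed)

print(f"Crude RR:    {crude_rr:.2f}")
print(f"Weighted RR: {weighted_rr:.2f}")
```

The crude risk ratio should be inflated by confounding, and the weighted one should fall back towards 1. This only works because the exposure model here contains the right variables; it says nothing about what happens when a collider is thrown in.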

I like this overview of causality and PSM as it touches on collider bias in plain language: https://osf.io/preprints/socarxiv/ncvqs/download

I also like this summary: Using Propensity Scores for Causal Inference: Pitfalls and Tips - PMC

Hope this helps!

3 Likes

Thank you so much @fbbriggs for this thorough and detailed response. I’ll certainly review the material you posted and approach these methods with caution. I appreciate your thoughts!

2 Likes

Hi Farren!

Thanks a bunch for the generous offer—I’m all in, haha!

I’m really interested in anxiety and depression in PD and eager to dive deeper into understanding the factors that might influence them in both patients and controls.

Right now, I’m delving into structural equation models, and so far I find them helpful in grasping a portion of the patient scenario by incorporating various variables like age at diagnosis and cognition.

However, I’m a bit uncertain if they’d prove as useful in distinguishing differences with controls. Any insights on that? Or perhaps you could recommend other models or approaches worth exploring?

Looking forward to your thoughts! Thanks

2 Likes

Are you using SEM to explore specific mediation/moderation relationships? Or more along the lines of latent variables? I guess my question is: what is your research question/study design? :)