Rating scales: cautions in study design and analysis

Some of us recently chatted about efforts to use the same rating scales across different regions and to interpret that data together. I wanted to share a couple of insights that might help when you're analyzing such data or designing a study around widely used scales.

Not every scale fits every population; cultural adaptations are important. The animals on the MoCA are not typical animals found in every region, for instance, and the letters used for verbal fluency are not the best letters for deriving words in every language. Sometimes there are culturally adapted and validated versions of the scales that should be used, but then you need to be careful when interpreting the data. For instance, the MoCA cutoff for impairment differs across countries: it's 21 in Turkey, so you can't treat someone from Turkey with a MoCA of 23 as impaired, or a decline from 26 to 24 as a crossover into the impairment zone. (A small sketch of what that participant-level handling can look like is below.)

Age matters too: as people get older they simply become slower and have more dopaminergic loss, so DaTscan norms change and the UPDRS may no longer be 0. You have to pay attention at the participant level when doing group analyses, which may be tough if you have hundreds of participants, of course, but it's important for interpreting your findings more reliably. We often just throw these in as limitations in our papers, but if we want research findings that apply to real-life settings, we need to be mindful of these shortcomings and design our analyses accordingly.

I'm sure you can point out more things to watch for. Can you share them so we can gather a nice list of cautions to use in our studies?
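To make the MoCA point concrete, here is a minimal sketch (Python/pandas) of flagging impairment against region-specific cutoffs rather than one global threshold. The column names are hypothetical, and the cutoff table only contains the two values discussed above; anything else would need to come from the validation literature for your cohorts.

```python
import pandas as pd

# Region-specific MoCA impairment cutoffs (a score below the cutoff
# counts as impaired). 26 is the commonly used original cutoff; 21 is
# the Turkish cutoff mentioned above. Other entries are up to you to
# source from each region's validation study.
MOCA_CUTOFFS = {"US": 26, "Turkey": 21}

def flag_moca_impairment(df: pd.DataFrame) -> pd.DataFrame:
    """Label impairment using each participant's regional cutoff,
    not one global threshold."""
    out = df.copy()
    cutoffs = out["region"].map(MOCA_CUTOFFS)
    if cutoffs.isna().any():
        missing = sorted(out.loc[cutoffs.isna(), "region"].unique())
        raise ValueError(f"No validated MoCA cutoff on file for: {missing}")
    out["moca_impaired"] = out["moca"] < cutoffs
    return out

# The Turkish participant scoring 23 is (correctly) not flagged,
# while the same score under a cutoff of 26 is.
df = pd.DataFrame({"region": ["US", "Turkey"], "moca": [23, 23]})
print(flag_moca_impairment(df))
```

Raising an error on unknown regions, rather than silently falling back to 26, is deliberate: the silent fallback is exactly the mistake this thread is about.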

3 Likes

This is such an important point, and it's constantly ignored, both for objective measures and especially for patient-reported outcomes (PROs). We have done some preliminary work on bias/measurement invariance in a common suite of multiple sclerosis PROs, and there was evidence of bias (meaning the PRO performs differently) as a function of age, sex, race, and SES measures. Without re-calibrating PROs and objective tools like the MoCA, we are likely comparing oranges to grapefruits to tangerines instead of oranges to oranges when comparing different populations. Sigh
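For anyone wanting to screen their own data for this, here is a minimal sketch of one standard approach: logistic-regression DIF (differential item functioning) in the style of Swaminathan and Rogers, for a single dichotomous item. All column names are hypothetical, and real invariance work on ordinal PRO items would more likely use IRT or multi-group CFA; this is just the idea in miniature.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from scipy.stats import chi2

# Logistic-regression DIF screen for one dichotomous item: after
# conditioning on the matching score, does group membership still
# predict the item response? A significant group main effect suggests
# uniform DIF; a significant score-by-group interaction suggests
# non-uniform DIF.

def dif_screen(data, item, score, group):
    m0 = smf.logit(f"{item} ~ {score}", data).fit(disp=False)
    m1 = smf.logit(f"{item} ~ {score} + C({group})", data).fit(disp=False)
    m2 = smf.logit(f"{item} ~ {score} * C({group})", data).fit(disp=False)
    lr_uni = 2 * (m1.llf - m0.llf)      # uniform DIF test statistic
    lr_non = 2 * (m2.llf - m1.llf)      # non-uniform DIF test statistic
    return {
        "p_uniform": chi2.sf(lr_uni, df=m1.df_model - m0.df_model),
        "p_nonuniform": chi2.sf(lr_non, df=m2.df_model - m1.df_model),
    }

# Toy data with built-in uniform DIF: the item is "harder" for group B
# at the same underlying score, so the screen should flag it.
rng = np.random.default_rng(0)
n = 2000
score = rng.normal(size=n)
grp = rng.choice(["A", "B"], size=n)
p = 1 / (1 + np.exp(-(score - 0.8 * (grp == "B"))))
toy = pd.DataFrame({"item1": rng.binomial(1, p), "score": score, "grp": grp})
print(dif_screen(toy, "item1", "score", "grp"))
```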

3 Likes

metallica-sad-but-true

Even if you think about objective measures, each lab has its own normative range, because values change based on how the data are collected, processed, etc. Take normal DaTscan values, say: you have sex, age, etc. to consider! We really need to be careful combining different datasets, and we have to be really on top of the methods to make sure we're interpreting things properly.
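One common way to handle this when you do have to combine datasets is to re-express each raw value against its own lab's covariate-adjusted norms before pooling. A minimal sketch, assuming each lab contributed some control participants and using hypothetical column names ("sbr" standing in for a DaTscan striatal binding ratio):

```python
import pandas as pd
import statsmodels.formula.api as smf

# Convert each raw value into a z-score against that lab's own
# age/sex-adjusted normative model (fit on its control participants),
# then pool the z-scores rather than the raw values across labs.

def lab_adjusted_z(df: pd.DataFrame, value: str = "sbr") -> pd.Series:
    z = pd.Series(index=df.index, dtype=float)
    for lab, sub in df.groupby("lab"):
        controls = sub[sub["is_control"]]
        norm = smf.ols(f"{value} ~ age + C(sex)", data=controls).fit()
        expected = norm.predict(sub)        # lab-specific expected value
        resid_sd = norm.mse_resid ** 0.5    # lab-specific residual SD
        z.loc[sub.index] = (sub[value] - expected) / resid_sd
    return z
```

The obvious caveats: each lab needs enough controls to fit a stable model, and any covariate level absent from a lab's controls will break prediction for that lab, which is arguably the right failure mode given the point of this thread.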

4 Likes