Correlation amidst estimable measurement error
Correlation attempts to measure the association between two measurable properties, like height and weight. However, all the fancy statistics that allow one to draw inferences regarding correlation assume that you’ve obtained perfect measurement of the properties of interest. The real world is fuzzy, so perfect measurement rarely happens. When measuring properties of the mind, it is typical to observe a great deal of random variability from moment to moment. Typically, this variability is averaged out and subsequently ignored. While many researchers are becoming aware that this variability is important to understand as a property of the mind itself, little attention has been paid to the consequences it has for statistical inference. This is despite Spearman’s demonstration, over a hundred years ago, that correlation coefficients obtained from error-prone measurements systematically underestimate the true correlation between the properties being correlated.
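To make the attenuation effect concrete, here is a minimal simulation sketch in Python with NumPy. It is not the code behind the paper; the function name, the parameter values, and the assumptions of normally distributed true scores and equal, independent noise on both measures are all mine, chosen only for illustration. It compares the correlation of noisy subject means against the value Spearman's attenuation formula predicts (true correlation times the square root of the product of the two reliabilities).

```python
import numpy as np

def simulate_attenuation(rho=0.6, n_subjects=20000, n_trials=10,
                         noise_sd=2.0, seed=0):
    """Return (observed r, r predicted by Spearman's attenuation formula).

    Illustrative assumptions: true scores are bivariate normal with unit
    variance, and each trial adds independent normal noise to each measure.
    """
    rng = np.random.default_rng(seed)
    cov = np.array([[1.0, rho], [rho, 1.0]])
    # One pair of true scores per subject
    true_scores = rng.multivariate_normal([0.0, 0.0], cov, size=n_subjects)
    # Moment-to-moment variability: n_trials noisy measurements per property
    noise = rng.normal(0.0, noise_sd, size=(n_subjects, n_trials, 2))
    # Average over trials, as is typically done before correlating
    observed_means = (true_scores[:, None, :] + noise).mean(axis=1)
    r_observed = np.corrcoef(observed_means[:, 0], observed_means[:, 1])[0, 1]
    # Reliability of a subject mean: true variance / (true + error variance)
    reliability = 1.0 / (1.0 + noise_sd**2 / n_trials)
    # With equal reliabilities, sqrt(rel_x * rel_y) reduces to reliability
    r_predicted = rho * reliability
    return r_observed, r_predicted

r_observed, r_predicted = simulate_attenuation()
print(f"observed r = {r_observed:.3f}, predicted r = {r_predicted:.3f}")
```

With these settings the predicted correlation is 0.6 / 1.4, well below the true value of 0.6, and the observed correlation should land close to that prediction rather than to the truth.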
This page hosts a draft of a paper I’ve already submitted to a couple of top-tier stats journals (don’t worry, not at the same time!); both liked it but rejected it, suggesting resubmission after revision. JRSS:B wanted an analytic proof of my approach, and CSDA wanted robustness tests for assumption violations. I have some further ideas for improving the work below (e.g. I mistakenly approximate the true between-Ss variance from the observed between-Ss variance, when it could be more accurately approximated by other methods; I also suspect resampling might obviate parametric assumptions), but since this began as a side project and I have relatively little formal statistics training, I’m on the lookout for a co-author able to bring the manuscript to peer-reviewed publication. In the meantime…
To anyone who actually reads it, I apologize for the tabular results; I generally prefer visual presentation of data, but found this difficult with such a large parameter space to describe (a 5×3×3×3 space explored by 4 methods with 3 performance measures). I might try visualization again before the next revision.
The take-home message from this work is that traditional tests against a null hypothesis of zero correlation are unaffected by measurement error. However, traditional tests comparing two non-zero correlations (which, as I understand it, are common in areas like principal component analysis) are affected, such that traditional statistics will be too liberal. An unexpected but neat finding is that increasing the number of participants actually exacerbates the problem, as if the statistics are becoming overconfident. My solution solves these problems… mostly. I think I can achieve a complete solution if I use a better estimate of the true between-Ss variance, but I’ll have to re-run the simulations to test this theory.
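A simplified sketch of why this pattern can arise, again in Python with NumPy. To be clear, this is not the simulation design from the paper: it is a stand-in in which a single noisy correlation is tested against a known true value using the standard Fisher z test, and the function name, noise level, and sample sizes are hypothetical choices of mine. When the true correlation is zero, attenuation leaves it at zero and the test behaves nominally; when it is non-zero, the observed correlation is biased toward zero, and a larger sample only makes the test more certain about the biased value.

```python
import numpy as np

def rejection_rate(rho, n_subjects, noise_sd=1.0, n_sims=2000, seed=0):
    """Share of simulations in which a Fisher z test rejects H0: r == rho,
    even though rho IS the true correlation of the error-free scores.

    Illustrative assumptions: unit-variance bivariate normal true scores,
    one noisy measurement per property, two-sided test at the 5% level.
    """
    rng = np.random.default_rng(seed)
    cov = np.array([[1.0, rho], [rho, 1.0]])
    z_crit = 1.959964  # two-sided 5% critical value of the standard normal
    rejections = 0
    for _ in range(n_sims):
        true_scores = rng.multivariate_normal([0.0, 0.0], cov, size=n_subjects)
        # Add independent measurement error to both properties
        observed = true_scores + rng.normal(0.0, noise_sd, size=true_scores.shape)
        r = np.corrcoef(observed[:, 0], observed[:, 1])[0, 1]
        # Fisher z statistic for the observed r against the true rho
        z = (np.arctanh(r) - np.arctanh(rho)) * np.sqrt(n_subjects - 3)
        rejections += int(abs(z) > z_crit)
    return rejections / n_sims

for rho, n in [(0.0, 50), (0.6, 30), (0.6, 100)]:
    print(f"rho={rho}, N={n}: rejection rate = {rejection_rate(rho, n):.3f}")
```

Under these assumptions the rejection rate should sit near the nominal 5% when the true correlation is zero, and climb well above it, increasingly so with more participants, when the true correlation is 0.6.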
I’ll hopefully post the code for the simulations here, as soon as I make sure it’s presentable.
Addendum: I just read a paper (PDF) that describes the application of mixed effects modeling (MEM) to correlation amidst estimable measurement error (CAEME). It would be great if MEM provides a quick analytic solution, but I’m surprised that the MEM estimate of correlation can have a different sign from the raw correlation! Hopefully I’ll understand the mechanics of this transformation better after taking a course on MEM.