*Geek Box: Correlation, Confounding, & Control

There is a tendency, when interpreting epidemiological findings, to dismiss any factor related to both an exposure and an outcome as a “confounder”. This ranges from oversimplistic to plain wrong.

To be a confounder, a variable has to be associated with the exposure but not caused by it, and independently associated with the outcome. For example, an analysis looks at coffee as the exposure and heart disease as the outcome, and finds a strong association; but the heavy coffee drinkers in the study are also heavy smokers. Coffee does not cause smoking, but they are related behaviours. In this case, controlling for smoking means the relationship between coffee and heart disease is no longer evident, i.e., smoking was the confounder.
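The coffee example can be sketched as a small simulation. This is a minimal illustration under invented assumptions, not data from any study: the probabilities, the variable names, and the `risk_difference` helper are all hypothetical. Smoking raises both the chance of heavy coffee drinking and the risk of heart disease, while coffee itself has no effect.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

# Hypothetical data-generating process (all probabilities are made up):
# smoking drives both heavy coffee drinking and heart disease.
smoker = rng.random(n) < 0.3
high_coffee = rng.random(n) < np.where(smoker, 0.7, 0.2)
disease = rng.random(n) < np.where(smoker, 0.20, 0.05)  # coffee plays no role

def risk_difference(exposed, outcome, mask=None):
    """Risk in the exposed minus risk in the unexposed, optionally within a stratum."""
    if mask is None:
        mask = np.ones_like(exposed, dtype=bool)
    return outcome[mask & exposed].mean() - outcome[mask & ~exposed].mean()

# Crude comparison ignores smoking; stratified comparisons hold it fixed.
crude = risk_difference(high_coffee, disease)
among_smokers = risk_difference(high_coffee, disease, smoker)
among_nonsmokers = risk_difference(high_coffee, disease, ~smoker)

print(f"crude risk difference:       {crude:.3f}")
print(f"within smokers:              {among_smokers:.3f}")
print(f"within non-smokers:          {among_nonsmokers:.3f}")
```

In this setup the crude comparison shows heavy coffee drinkers at clearly higher risk, while the difference within each smoking stratum is close to zero, which is what "controlling for smoking" removes.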

Many of the factors that we deem ‘confounders’ may in fact only be correlated behaviours or variables. The question to ask from a nutritional perspective is whether the dietary association is independent of non-dietary lifestyle factors. We can determine this through appropriate control of known variables which may be correlated with diet, like socio-economic status, alcohol intake, or BMI. These variables are not inherently confounders; whether they are depends on the exposure-outcome relationship we’re looking at.

A common misconception when reading such a list of covariates is to assume that all are confounders. This is incorrect: there are distinct differences between confounders [e.g., smoking] and moderating or mediating factors [e.g., fibre, fruit]. A lack of understanding of the differences between such variables is widespread in discourse surrounding nutritional epidemiology.

A fundamental difference is that a confounder may have a direct relationship with the outcome, while a moderating factor may influence the size of the effect and how fully a cause-effect relationship operates. A moderating factor does not, however, invalidate the existence of a relationship between the exposure and the outcome.

If these were true confounders, then once they were adjusted for in the statistical analysis, the exposure-outcome relationship would no longer be evident. If the exposure-outcome relationship survives this adjustment, it indicates that the effect of the exposure on the outcome is independent of these related non-dietary variables. While the caveat of epidemiology is always that “residual confounding cannot be ruled out”, in reality it can’t be ruled out in an RCT either; it’s simply that randomisation is expected to distribute unknown variables evenly between the intervention and control groups.
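A relationship that survives adjustment can be sketched with simulated data. Everything here is an assumption made for illustration: the probabilities, the choice of low socio-economic status as the correlated variable, the additive risk model, and the `exposure_coefficient` helper. The dietary exposure is correlated with low socio-economic status, but it also raises risk in its own right.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000

# Hypothetical data-generating process (all numbers are made up):
# the exposure is more common in the low-SES group, and BOTH raise risk.
low_ses = rng.random(n) < 0.4
exposure = rng.random(n) < np.where(low_ses, 0.6, 0.3)
risk = 0.05 + 0.05 * exposure + 0.10 * low_ses
disease = rng.random(n) < risk

def exposure_coefficient(outcome, *columns):
    """Least-squares fit of outcome on an intercept plus the given columns.

    Returns the coefficient of the first column, which here is the exposure.
    """
    X = np.column_stack([np.ones(len(outcome)), *columns]).astype(float)
    beta, *_ = np.linalg.lstsq(X, outcome.astype(float), rcond=None)
    return beta[1]

crude = exposure_coefficient(disease, exposure)              # no adjustment
adjusted = exposure_coefficient(disease, exposure, low_ses)  # adjusted for SES

print(f"crude exposure coefficient:    {crude:.3f}")
print(f"adjusted exposure coefficient: {adjusted:.3f}")
```

Under these assumptions, adjustment shrinks the estimate towards the exposure's own contribution to risk but does not remove it: the correlated variable inflated the crude association without accounting for all of it.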

The reality is that residual confounding means there is something we don’t know that could influence the results, which is always true. What is important to remember is that there is a lot we do know, and we can build that into an adjustment model to control for these variables. Remember: correlation does not imply confounding.