Confounding, mediation and effect modification
An advanced guide explaining three important ideas in observational research: confounding, mediation and effect modification, with examples, interpretation and common mistakes.
Structure
Problem, intuition, method, working, limitations and discussion.
Best for
Students preparing for coursework, analysis, interpretation or revision.
Use with
Learning Hub lessons, tutoring sessions or dissertation planning.
Resource guide
Problem
Students often include extra variables in regression models without understanding why. Some variables should be adjusted for because they confound the exposure-outcome relationship. Some variables should not be adjusted for because they lie on the causal pathway. Some variables change the size of the effect across subgroups and should be investigated as effect modifiers. Confounding, mediation and effect modification answer different questions, but they are often mixed together.
- Confounders are added to models without justification.
- Mediators are adjusted for when the total effect is the target.
- Effect modification is ignored even when subgroup effects may differ.
- Students treat every third variable as a confounder.
- Adjustment is confused with explanation.
- Causal diagrams are not used to guide model decisions.
- Regression models become statistically complex but conceptually unclear.
Intuition
Confounding is a mixing problem. A confounder is a common cause of the exposure and the outcome (or a proxy for one) that is not on the causal pathway between them, so it can distort the apparent association. Mediation is a mechanism problem. A mediator lies on the causal pathway and explains part of how an exposure affects an outcome. Effect modification is a heterogeneity problem. It means the exposure-outcome association differs across levels of another variable.
- Confounding asks: is the observed association distorted by a common cause?
- Mediation asks: does the exposure affect the outcome through an intermediate variable?
- Effect modification asks: is the association different in different groups?
- A confounder should usually be adjusted for when estimating an exposure effect.
- A mediator should not be adjusted for if the aim is the total effect.
- An effect modifier may require interaction terms or stratified reporting.
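The mixing problem can be illustrated with a small simulation. This is a minimal sketch under invented assumptions: the variable names `z`, `x` and `y`, the probabilities and the effect size of 2.0 are all hypothetical and not taken from any study. The common cause `z` raises both the chance of exposure `x` and the level of outcome `y`, while `x` itself has no effect on `y`.

```python
import random

def slope(xs, ys):
    """Least-squares slope of y on x (simple linear regression)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

def simulate(n=20000, seed=1):
    """Simulate (z, x, y): z causes both x and y; x has NO effect on y."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        z = 1 if rng.random() < 0.5 else 0                   # common cause
        x = 1 if rng.random() < (0.7 if z else 0.2) else 0   # exposure depends on z
        y = 2.0 * z + rng.gauss(0, 1)                        # outcome depends on z only
        rows.append((z, x, y))
    return rows

rows = simulate()

# Crude association: ignores z, so it mixes the effect of z into the slope.
crude = slope([x for _, x, _ in rows], [y for _, _, y in rows])

# Adjusted association: estimate the slope separately within each level of z.
adjusted = {}
for level in (0, 1):
    stratum = [(x, y) for z, x, y in rows if z == level]
    adjusted[level] = slope([x for x, _ in stratum], [y for _, y in stratum])

print(f"crude slope: {crude:.2f}")
for level, value in adjusted.items():
    print(f"slope within z={level}: {value:.2f}")
```

Under these settings the crude slope should come out clearly positive (roughly 1), while the slopes within each level of `z` should be close to zero. The apparent association is produced entirely by the common cause.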
Method
The correct approach begins with the research question and a causal understanding of the variables. A directed acyclic graph, or DAG, can help clarify which variables are confounders, mediators or colliders. The analyst should not decide adjustment variables only by statistical significance. Instead, variable selection should be guided by study design, subject knowledge and the causal question.
- Step 1: Define the exposure, outcome and target effect.
- Step 2: Draw a causal diagram if possible.
- Step 3: Identify common causes of the exposure and outcome as potential confounders.
- Step 4: Identify variables that are affected by the exposure and in turn affect the outcome as possible mediators.
- Step 5: Avoid adjusting for mediators when estimating the total effect.
- Step 6: Consider effect modification when the effect may differ across groups.
- Step 7: Use interaction terms or stratified estimates when investigating modification.
- Step 8: Report adjusted and unadjusted estimates where useful.
- Step 9: Explain the reason for adjustment variables.
- Step 10: Interpret results according to the causal question.
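Steps 2 to 5 can be sketched in code once a diagram has been written down. The example below is a simplified heuristic, not a full implementation of the backdoor criterion: it stores a hypothetical DAG as a dictionary of arrows and labels each third variable by its position relative to the exposure and outcome. The graph, the node names and the `classify` function are all assumptions made for illustration, and `hospital_admission` is an invented collider.

```python
def descendants(graph, node, blocked=frozenset()):
    """All nodes reachable from `node` along arrows, never entering `blocked` nodes."""
    seen = set()
    stack = [node]
    while stack:
        current = stack.pop()
        for child in graph.get(current, ()):
            if child not in seen and child not in blocked:
                seen.add(child)
                stack.append(child)
    return seen

def classify(graph, exposure, outcome):
    """Label each third variable by its structural position (heuristic only)."""
    nodes = set(graph) | {c for children in graph.values() for c in children}
    after_exposure = descendants(graph, exposure)
    after_outcome = descendants(graph, outcome)
    roles = {}
    for v in nodes - {exposure, outcome}:
        causes_exposure = exposure in descendants(graph, v)
        causes_outcome = outcome in descendants(graph, v)
        # Does v reach the outcome by some path that avoids the exposure?
        causes_outcome_directly = outcome in descendants(graph, v, blocked={exposure})
        if v in after_exposure and causes_outcome:
            roles[v] = "mediator"
        elif v in after_exposure and v in after_outcome:
            roles[v] = "collider or descendant of outcome"
        elif causes_exposure and causes_outcome_directly:
            roles[v] = "potential confounder"
        else:
            roles[v] = "other"
    return roles

# Hypothetical diagram: each key points to the variables it causes.
graph = {
    "age": ["smoking", "lung_cancer"],
    "sex": ["lung_cancer"],
    "smoking": ["lung_damage", "lung_cancer", "hospital_admission"],
    "lung_damage": ["lung_cancer"],
    "lung_cancer": ["hospital_admission"],
}

roles = classify(graph, exposure="smoking", outcome="lung_cancer")
for name in sorted(roles):
    print(f"{name}: {roles[name]}")
```

Note that `sex` is labelled "other" here. Effect modification is not something this structural check can detect; it has to be proposed from subject knowledge and examined with stratified estimates or interaction terms.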
Working
Suppose a study asks whether smoking is associated with lung cancer. Age may confound the association if age is related to smoking patterns and lung cancer risk. Tar deposition or lung damage may be mediators if they lie on the pathway from smoking to cancer. Sex may be an effect modifier if the smoking-cancer association differs between males and females. These variables should not all be treated in the same way.
- Exposure: smoking.
- Outcome: lung cancer.
- Potential confounder: age, if it influences both smoking and lung cancer risk.
- Potential mediator: biological damage caused by smoking.
- Potential effect modifier: sex, if the smoking effect differs by sex.
- To estimate the total effect of smoking, avoid adjusting for mediators.
- To assess effect modification, compare effects across strata or include an interaction.
- To reduce confounding, adjust for common causes of exposure and outcome.
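Comparing effects across strata can be sketched with a few lines of arithmetic. The counts below are invented purely for illustration and do not come from any real study; the dictionary layout and the `stratum_summary` function are likewise hypothetical. The sketch computes the smoking-cancer risk ratio and risk difference separately for each sex.

```python
# Hypothetical counts, invented for illustration only: (cases, total) per cell.
counts = {
    "male":   {"smoker": (90, 1000), "non_smoker": (10, 1000)},
    "female": {"smoker": (60, 1000), "non_smoker": (12, 1000)},
}

def risk(cases, total):
    """Proportion of people in a cell who developed the outcome."""
    return cases / total

def stratum_summary(cells):
    """Risk ratio and risk difference for smokers versus non-smokers."""
    risk_exposed = risk(*cells["smoker"])
    risk_unexposed = risk(*cells["non_smoker"])
    return {
        "risk_ratio": risk_exposed / risk_unexposed,
        "risk_difference": risk_exposed - risk_unexposed,
    }

summaries = {sex: stratum_summary(cells) for sex, cells in counts.items()}
for sex, s in summaries.items():
    print(f"{sex}: RR = {s['risk_ratio']:.1f}, RD = {s['risk_difference']:.3f}")
```

With these made-up numbers the risk ratio is 9.0 in males and 5.0 in females, which would suggest effect modification by sex on the ratio scale. A real analysis would also need confidence intervals or an interaction test, because stratum-specific estimates can differ by chance, and the conclusion can depend on whether a ratio or a difference scale is used.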
Limitations
Confounding, mediation and effect modification are conceptual ideas, not just statistical procedures. Regression cannot automatically identify which role a variable plays. Incorrect adjustment can increase bias, especially when adjusting for colliders or mediators. Mediation analysis often requires stronger assumptions than ordinary regression and should be interpreted carefully.
- A variable's role depends on the research question.
- The same variable can play different roles in different studies.
- Adjusting for a mediator can remove part of the effect of interest.
- Adjusting for a collider (a variable caused by both the exposure and the outcome) can introduce bias.
- Unmeasured confounding can remain after adjustment.
- Interaction terms can be underpowered in small samples.
- Mediation analysis requires careful temporal and causal assumptions.
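Two of the points above, adjusting for a mediator and adjusting for a collider, can be illustrated with a simulation. This is a sketch under invented assumptions: the variable names, probabilities and effect sizes are hypothetical, and stratifying on a binary variable stands in for regression adjustment. In the first scenario `x` affects `y` directly (1.0) and through a mediator `m`. In the second, `x` and `y` are independent but both cause a collider `c`.

```python
import random

def slope(xs, ys):
    """Least-squares slope of y on x (simple linear regression)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

rng = random.Random(7)
n = 40000

# Scenario 1: x -> m -> y plus a direct arrow x -> y (no confounding).
mediation = []
for _ in range(n):
    x = 1 if rng.random() < 0.5 else 0
    m = 1 if rng.random() < (0.8 if x else 0.2) else 0   # mediator caused by x
    y = 1.0 * x + 2.0 * m + rng.gauss(0, 1)              # direct 1.0, indirect via m
    mediation.append((x, m, y))

total_effect = slope([x for x, _, _ in mediation], [y for _, _, y in mediation])
within_m = {}
for level in (0, 1):
    stratum = [(x, y) for x, m, y in mediation if m == level]
    within_m[level] = slope([x for x, _ in stratum], [y for _, y in stratum])

# Scenario 2: x and y are independent, but both raise the chance that c = 1.
collider = []
for _ in range(n):
    x = 1 if rng.random() < 0.5 else 0
    y = rng.gauss(0, 1)
    c = 1 if x + y + rng.gauss(0, 0.5) > 1.0 else 0      # common effect of x and y
    collider.append((x, y, c))

overall = slope([x for x, _, _ in collider], [y for _, y, _ in collider])
selected = [(x, y) for x, y, c in collider if c == 1]
within_c = slope([x for x, _ in selected], [y for _, y in selected])

print(f"total effect of x: {total_effect:.2f}")
print(f"slope within mediator strata: {within_m[0]:.2f}, {within_m[1]:.2f}")
print(f"slope ignoring collider: {overall:.2f}")
print(f"slope among c=1 only: {within_c:.2f}")
```

In the first scenario the unadjusted slope should be near the total effect (about 2.2 under these settings), while the slopes within mediator strata should be near the direct effect of 1.0, so adjusting has changed what is being estimated. In the second scenario the overall slope should be near zero, but restricting to `c = 1` should produce a clearly negative slope even though `x` has no effect on `y`.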
Discussion
A strong report should explain why variables were included in the model. It should state whether the aim was to reduce confounding, investigate pathways or examine differences across subgroups. This prevents the analysis from looking like a mechanical regression exercise and shows that the student understands the scientific question behind the model.
- Explain adjustment variables using subject knowledge.
- Avoid saying a model is adjusted without saying what for and why.
- Separate confounding control from mediation analysis.
- Report effect modification clearly if subgroup effects differ.
- Avoid overinterpreting subgroup analyses from small samples.
- Use cautious language when causal assumptions are uncertain.
- Connect model decisions back to the research question.
Practical checklist
Before you apply this topic
- Have you defined the exposure?
- Have you defined the outcome?
- Have you stated the target effect?
- Have you identified possible confounders using subject knowledge?
- Have you avoided choosing confounders only by p-values?
- Have you identified possible mediators?
- Have you avoided adjusting for mediators when estimating total effects?
- Have you considered possible effect modification?
- Have you justified interaction or stratified analysis?
- Have you avoided adjusting for colliders?
- Have you explained the adjustment strategy clearly?
- Have you discussed unmeasured confounding?
Common mistakes
What to avoid
- Calling every covariate a confounder.
- Adjusting for variables only because they are statistically significant.
- Adjusting for mediators without realising the effect estimate changes meaning.
- Ignoring effect modification.
- Using subgroup analysis without explaining why.
- Interpreting interaction terms without context.
- Assuming adjustment proves causation.
- Failing to distinguish total and direct effects.
- Ignoring unmeasured confounding.
- Using regression as a substitute for causal thinking.
How this connects to learning
Use the guide as a bridge between theory and application.
A resource guide should not replace a full course or live teaching session. Instead, it helps you organise your thinking. Use it to identify what you understand, what feels unclear, and what questions you should ask before applying a method to real data.
Before a lesson
Read the intuition and problem sections to prepare.
During analysis
Use the method and checklist to guide decisions.
When writing
Use limitations and discussion to improve interpretation.