Statistics | Foundation | Resource guide

Understanding p-values, confidence intervals and effect sizes

A detailed guide explaining statistical significance, uncertainty, effect size, practical importance and how students should interpret results responsibly.

Structure

Problem, intuition, method, working, limitations and discussion.

Best for

Students preparing for coursework, analysis, interpretation or revision.

Use with

Learning Hub lessons, tutoring sessions or dissertation planning.

Resource guide

Problem

Students often learn p-values before they understand what statistical evidence really means. As a result, many reports reduce analysis to a simple rule: if p is less than 0.05, the result is important; if p is greater than 0.05, the result is not important. This is too simplistic. A p-value is only one part of interpretation. Good statistical reporting also considers the estimated effect, the confidence interval, the sample size, uncertainty, assumptions and practical meaning.

Students often treat p < 0.05 as proof that an effect is real.
Students often treat p > 0.05 as proof that there is no effect.
Confidence intervals are reported but not interpreted.
Effect sizes are ignored even when they are more informative than p-values.
Statistical significance is confused with practical or clinical importance.
Large samples can make tiny effects statistically significant.
Small samples can fail to detect important effects.
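
The last two points can be illustrated with a small calculation. The sketch below is only an illustration: the helper name two_sample_z, the mark differences, the standard deviation of 10 and the group sizes are all made-up assumptions, and it uses a normal approximation with equal group sizes and a known common standard deviation rather than a full t-test.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_z(mean_diff, sd, n_per_group):
    """Two-sided z-test for a difference in means.

    Assumes two groups of equal size and a known common SD
    (a simplification chosen for illustration).
    """
    se = sd * sqrt(2 / n_per_group)          # standard error of the difference
    z = mean_diff / se                       # test statistic
    p = 2 * (1 - NormalDist().cdf(abs(z)))   # two-sided p-value
    return se, z, p


# Hypothetical scenario A: tiny effect (0.5 marks), very large sample
_, z_large, p_large = two_sample_z(mean_diff=0.5, sd=10, n_per_group=20_000)

# Hypothetical scenario B: sizeable effect (5 marks), very small sample
_, z_small, p_small = two_sample_z(mean_diff=5.0, sd=10, n_per_group=10)

print(f"Tiny effect, n = 20000 per group: z = {z_large:.2f}, p = {p_large:.2g}")
print(f"Large effect, n = 10 per group:   z = {z_small:.2f}, p = {p_small:.2g}")
```

Under these assumptions the half-mark difference falls well below 0.05 while the five-mark difference does not, even though the second effect is ten times larger. The p-value reflects sample size as well as effect size.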

Intuition

A p-value is the probability of obtaining data at least as extreme as those observed if a particular null hypothesis (and the assumptions of the analysis) were true, so a small p-value means the data would be surprising under that hypothesis. A confidence interval gives a range of values for the effect that are reasonably compatible with the data. An effect size tells us how large the difference, association or change is. These three ideas answer different questions: the p-value addresses compatibility with a null hypothesis, the confidence interval shows uncertainty, and the effect size describes magnitude.

The p-value helps assess evidence against a null hypothesis.
The confidence interval shows uncertainty around the estimate.
The effect size tells us the size of the difference or association.
A small p-value does not automatically mean a large or important effect.
A wide confidence interval means the estimate is imprecise.
A narrow confidence interval usually means the estimate is more precise.
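
To see how precision changes with sample size, the sketch below computes a 95% confidence interval for the same mean difference at three sample sizes. Everything in it is an assumption made for illustration: the function name ci_for_mean_difference, the difference of 4 marks, the standard deviation of 12, equal group sizes and the normal approximation.

```python
from math import sqrt
from statistics import NormalDist

# Critical value for a 95% interval under the normal approximation (about 1.96)
Z95 = NormalDist().inv_cdf(0.975)


def ci_for_mean_difference(mean_diff, sd, n_per_group):
    """Approximate 95% CI for a difference in means.

    Assumes equal group sizes and a common SD (illustrative simplification).
    """
    se = sd * sqrt(2 / n_per_group)
    return mean_diff - Z95 * se, mean_diff + Z95 * se


# Same hypothetical effect, different sample sizes
for n in (25, 100, 400):
    low, high = ci_for_mean_difference(mean_diff=4.0, sd=12.0, n_per_group=n)
    print(f"n = {n:>3} per group: 95% CI {low:.1f} to {high:.1f} "
          f"(width {high - low:.1f})")
```

In this setup the interval width halves each time the sample size is multiplied by four, because the standard error shrinks with the square root of n. The estimate itself does not change; only the uncertainty around it does.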

Method

When interpreting statistical results, students should follow a structured process. First, identify the estimated effect. Second, look at the confidence interval to understand uncertainty. Third, interpret the p-value as evidence against the null hypothesis, not as proof. Fourth, consider whether the result is meaningful in the real-world context. Fifth, discuss limitations such as sample size, bias, confounding and assumptions.

Step 1: Identify what effect is being estimated.
Step 2: Interpret the direction of the effect.
Step 3: Interpret the size of the effect.
Step 4: Examine the confidence interval.
Step 5: Check whether the confidence interval includes the null value (for example, 0 for a difference or 1 for a ratio).
Step 6: Interpret the p-value carefully.
Step 7: Decide whether the effect is practically, clinically or academically meaningful.
Step 8: Discuss uncertainty and limitations.
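
The steps above can be sketched as a small helper that collects the facts a written interpretation needs. This is one possible arrangement, not a standard procedure: the function name interpret, its parameters and the idea of a smallest_meaningful threshold are assumptions introduced here, and the threshold must come from subject knowledge rather than from the data.

```python
def interpret(estimate, ci_low, ci_high, p_value,
              null_value=0.0, smallest_meaningful=None):
    """Collect the pieces needed for Steps 2 to 7 (illustrative sketch)."""
    if estimate > null_value:
        direction = "above the null value"
    elif estimate < null_value:
        direction = "below the null value"
    else:
        direction = "equal to the null value"

    summary = {
        "estimate": estimate,                                  # Step 3: size
        "direction": direction,                                # Step 2
        "ci": (ci_low, ci_high),                               # Step 4
        "ci_width": ci_high - ci_low,                          # precision
        "ci_includes_null": ci_low <= null_value <= ci_high,   # Step 5
        "p_value": p_value,                                    # Step 6
    }

    if smallest_meaningful is not None:                        # Step 7
        # Distance of each interval end from the null, measured in the
        # direction of the estimate, then compared with the threshold.
        sign = 1 if estimate >= null_value else -1
        near, far = sorted(sign * (end - null_value)
                           for end in (ci_low, ci_high))
        if near >= smallest_meaningful:
            summary["practical"] = "interval lies entirely at or beyond the threshold"
        elif far < smallest_meaningful:
            summary["practical"] = "interval lies entirely below the threshold"
        else:
            summary["practical"] = "interval spans the threshold"
    return summary


# Hypothetical use: a threshold of 3 marks is assumed to matter educationally
result = interpret(4.5, 1.2, 7.8, 0.008, smallest_meaningful=3.0)
for key, value in result.items():
    print(f"{key}: {value}")
```

Step 1 (what is being estimated) and Step 8 (limitations) cannot be computed; they depend on the study design and have to be written by the analyst.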

Working

Suppose a study compares mean exam scores between two teaching methods. The estimated mean difference is 4.5 marks, with a 95% confidence interval from 1.2 to 7.8 and a p-value of 0.008. A weak interpretation is: 'The result is significant.' A better interpretation is: 'Students taught with the new method scored on average 4.5 marks higher than students taught with the standard method. The 95% confidence interval suggests the true mean difference may plausibly be between 1.2 and 7.8 marks. The p-value provides evidence against the null hypothesis of no difference, but the practical importance depends on whether a difference of this size matters educationally.'
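
The three reported numbers are linked, and a quick check shows how. The sketch below back-calculates the standard error from the confidence interval and then recomputes the p-value. It assumes a symmetric interval built with a normal approximation; the example does not say which test was used, and a t-based analysis would give slightly different figures.

```python
from statistics import NormalDist

# Figures from the worked example
estimate, ci_low, ci_high = 4.5, 1.2, 7.8

# Assumption: the interval is estimate +/- 1.96 * SE (normal approximation)
z95 = NormalDist().inv_cdf(0.975)
se = (ci_high - ci_low) / (2 * z95)

# Test statistic and two-sided p-value against a null difference of 0
z = estimate / se
p = 2 * (1 - NormalDist().cdf(abs(z)))

print(f"Standard error: {se:.2f}")
print(f"z statistic:    {z:.2f}")
print(f"Two-sided p:    {p:.3f}")
```

Under this assumption the recomputed p-value rounds to the reported 0.008, so the estimate, interval and p-value are telling a consistent story. The interval excluding 0 and the p-value being below 0.05 are two views of the same evidence.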

For a mean difference, the effect size is the difference between group means.
For a risk ratio, the effect size is the risk in one group divided by the risk in the other.
For an odds ratio, the effect size is the odds of the outcome in one group divided by the odds in the other.
For a correlation, the effect size describes the strength and direction of the association.
For regression, the coefficient describes the expected change in the outcome per unit change in the predictor, holding any other predictors constant.
The confidence interval gives the plausible range of the effect.
The p-value helps assess evidence against the null hypothesis.
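
For ratio measures, a short calculation makes the difference between risk and odds concrete. The counts below are invented for illustration, the variable names are assumptions, and the interval uses the common log-scale normal approximation for a risk ratio, which may be unreliable when counts are small.

```python
from math import exp, log, sqrt
from statistics import NormalDist

# Hypothetical 2x2 table (made-up counts)
events_a, total_a = 30, 100   # group A: 30 of 100 had the outcome
events_b, total_b = 15, 100   # group B: 15 of 100 had the outcome

risk_a = events_a / total_a
risk_b = events_b / total_b
risk_ratio = risk_a / risk_b

odds_a = events_a / (total_a - events_a)
odds_b = events_b / (total_b - events_b)
odds_ratio = odds_a / odds_b

# Approximate 95% CI for the risk ratio, computed on the log scale
z95 = NormalDist().inv_cdf(0.975)
se_log_rr = sqrt(1 / events_a - 1 / total_a + 1 / events_b - 1 / total_b)
rr_low = exp(log(risk_ratio) - z95 * se_log_rr)
rr_high = exp(log(risk_ratio) + z95 * se_log_rr)

print(f"Risk ratio: {risk_ratio:.2f} (95% CI {rr_low:.2f} to {rr_high:.2f})")
print(f"Odds ratio: {odds_ratio:.2f}")
```

With these counts the odds ratio is further from 1 than the risk ratio, which is typical when the outcome is common. The null value for both measures is 1, not 0, so that is the value to look for inside the interval.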

Limitations

P-values, confidence intervals and effect sizes are powerful tools, but they can be misused. A p-value depends on the sample size, variability, model assumptions and analysis plan. Confidence intervals can be misleading if assumptions are violated or if the analysis ignores bias. Effect sizes can be statistically precise but still not meaningful in practice. Interpretation should therefore combine statistical evidence with subject knowledge.

A p-value is not the probability that the null hypothesis is true.
A p-value is not the probability that the result happened by chance.
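A 95% confidence interval is not a guarantee, or a 95% probability, that the true value lies inside the interval from this specific study; the 95% describes how often the method captures the true value across repeated studies.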
A statistically significant result may be too small to matter in practice.
A non-significant result may still be compatible with an important effect.
Multiple testing increases the chance of at least one false positive finding unless the analysis adjusts for it.
Bias and confounding cannot be solved by a small p-value.
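
The multiple testing point can be quantified. The sketch below assumes that every null hypothesis is true and that the tests are independent, which real analyses rarely satisfy exactly; the function name familywise_error_rate is an assumption introduced for this example. The Bonferroni adjustment shown is one simple correction among several.

```python
def familywise_error_rate(n_tests, alpha=0.05):
    """Chance of at least one false positive across n independent tests
    when every null hypothesis is true."""
    return 1 - (1 - alpha) ** n_tests


for k in (1, 5, 20):
    unadjusted = familywise_error_rate(k)
    # Bonferroni: test each hypothesis at alpha / k instead of alpha
    adjusted = familywise_error_rate(k, alpha=0.05 / k)
    print(f"{k:>2} tests: unadjusted {unadjusted:.1%}, "
          f"Bonferroni-adjusted {adjusted:.1%}")
```

Under these assumptions, running 20 tests at the 0.05 level gives roughly a 64% chance of at least one false positive, while testing each at 0.05 / 20 keeps the overall chance at or just below 5%.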

Discussion

Strong statistical interpretation should avoid mechanical phrases such as 'significant' and 'not significant' without explanation. A better discussion describes the estimated effect, uncertainty, statistical evidence and practical meaning. Students should write interpretations that a reader can understand without relying on the p-value alone. This is especially important in dissertations, health research and applied data analysis, where decisions depend on magnitude and uncertainty, not just statistical significance.

Report the estimate and confidence interval before the p-value.
Explain what the effect means in context.
Avoid saying that p > 0.05 proves no effect.
Avoid saying that p < 0.05 proves the hypothesis is true.
Discuss practical or clinical importance separately from statistical significance.
Mention uncertainty clearly.
Connect interpretation back to the research question.

Practical checklist

Before you apply this topic

Have you identified the effect being estimated?
Have you interpreted the direction of the effect?
Have you interpreted the magnitude of the effect?
Have you reported a confidence interval?
Have you explained what the confidence interval means in context?
Have you interpreted the p-value without exaggerating it?
Have you considered whether the effect is practically meaningful?
Have you considered sample size and precision?
Have you avoided saying that non-significant means no effect?
Have you avoided reporting only p-values?
Have you discussed uncertainty and limitations?
Have you linked the result back to the research question?

Common mistakes

What to avoid

Writing only 'p < 0.05, therefore significant' without interpretation.
Treating p-values as proof.
Treating non-significant results as evidence of no effect.
Ignoring confidence intervals.
Reporting confidence intervals without explaining them.
Ignoring effect sizes.
Confusing statistical significance with practical importance.
Using significance stars or thresholds in place of a clear explanation.
Not considering sample size.
Not discussing limitations, bias or assumptions.

How this connects to learning

Use the guide as a bridge between theory and application.

A resource guide should not replace a full course or live teaching session. Instead, it helps you organise your thinking. Use it to identify what you understand, what feels unclear, and what questions you should ask before applying a method to real data.

Before a lesson

Read the intuition and problem sections to prepare.

During analysis

Use the method and checklist to guide decisions.

When writing

Use limitations and discussion to improve interpretation.

Related guides

Continue with related topics.

How to choose the correct statistical test

How to prepare your data before analysis

Choosing between correlation and regression

Linear regression assumptions and diagnostics

How to report regression results in a dissertation

Sample size, power and precision explained
