Statistics, Intermediate, Resource guide

Non-parametric tests: when and how to use them

A detailed guide explaining when non-parametric tests are useful, how they differ from parametric tests, and how to interpret rank-based methods carefully.

Structure

Problem, intuition, method, working, limitations and discussion.

Best for

Students preparing for coursework, analysis, interpretation or revision.

Use with

Learning Hub lessons, tutoring sessions or dissertation planning.

Problem

Students often hear that non-parametric tests should be used whenever data are not normally distributed. This rule is too simple and can lead to poor analysis decisions. Non-parametric tests are not magic replacements for parametric methods. They answer slightly different questions, often use ranks rather than raw values, and may test differences in distributions rather than only differences in means.

Normality is treated as the only reason to choose a method.
Students use non-parametric tests automatically for small samples.
Rank-based tests are interpreted as if they compare means.
Paired and independent data structures are confused.
Non-parametric tests are used without considering the research question.
Effect sizes and confidence intervals are often ignored.
Regression alternatives are forgotten when adjustment is needed.

Intuition

Parametric methods often model means and rely on assumptions about the data or residuals. Non-parametric tests usually make fewer distributional assumptions and often work by ranking observations. This makes them useful when data are skewed, ordinal, affected by outliers or difficult to summarise using means. However, the interpretation changes because ranks do not preserve the original measurement scale in the same way.

Non-parametric tests often compare ranks rather than raw values.
They are useful for ordinal outcomes or heavily skewed numerical outcomes.
They can be less sensitive to extreme outliers.
They may test distributional differences, not only mean differences.
They do not automatically solve confounding.
They do not remove the need to understand the study design.
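To make the idea of ranking concrete, the sketch below converts two small sets of scores into ranks. The scores are hypothetical and the helper average_ranks is an illustrative name, not part of any library. The two data sets differ only in their largest value, so their means differ sharply while their ranks are identical, which is why rank-based tests are less sensitive to an extreme outlier.

```python
from statistics import mean, median

def average_ranks(values):
    """Rank values from 1 to n, giving tied values the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j across the run of values tied with the value at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

# Hypothetical scores: identical except for the final value, an extreme outlier.
scores = [2, 3, 3, 5, 40]
scores_without_outlier = [2, 3, 3, 5, 6]

ranks = average_ranks(scores)
ranks_without_outlier = average_ranks(scores_without_outlier)

# The means differ, but the medians and the ranks do not.
mean_gap = mean(scores) - mean(scores_without_outlier)
same_median = median(scores) == median(scores_without_outlier)
same_ranks = ranks == ranks_without_outlier
```

The same property is the reason the interpretation changes: once values are replaced by ranks, the size of the gap between 5 and 40 is no longer part of the analysis.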

Method

The correct non-parametric test depends on the outcome type, number of groups and whether the observations are independent or paired. Students should first define the research question and data structure. Then they should decide whether a rank-based method is appropriate or whether transformation, robust methods or regression would answer the question better.

Step 1: Define the research question clearly.
Step 2: Identify the outcome variable.
Step 3: Decide whether the outcome is numerical, ordinal or categorical.
Step 4: Decide whether observations are independent or paired.
Step 5: Count the number of groups being compared.
Step 6: Use the Mann-Whitney U test (also known as the Wilcoxon rank-sum test) for two independent groups.
Step 7: Use the Wilcoxon signed-rank test for paired numerical or ordinal data.
Step 8: Use the Kruskal-Wallis test for more than two independent groups.
Step 9: Use the Friedman test for repeated or matched comparisons across more than two conditions.
Step 10: Report the result with an explanation of what was compared.
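Steps 4 to 9 can be sketched as a small lookup. This is only an illustration: the function name suggest_rank_test and its parameters are hypothetical, and it assumes the outcome is numerical or ordinal and that no covariate adjustment is needed. It does not replace Steps 1 to 3 or the judgement about whether a rank-based method answers the research question.

```python
def suggest_rank_test(n_groups, paired):
    """Map a study design to the rank-based test named in Steps 6 to 9.

    n_groups: number of groups or conditions being compared (2 or more).
    paired:   True if the same or matched units appear in every condition.
    """
    if n_groups < 2:
        raise ValueError("need at least two groups or conditions to compare")
    if n_groups == 2:
        return "Wilcoxon signed-rank" if paired else "Mann-Whitney U"
    return "Friedman" if paired else "Kruskal-Wallis"

# Example: three independent treatment groups.
choice = suggest_rank_test(n_groups=3, paired=False)
```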

Working

Suppose a student compares pain scores between two independent treatment groups. Pain scores may be ordinal or skewed. A Mann-Whitney U test may be more suitable than an independent-samples t-test if the goal is to compare the distribution or typical ranking of pain scores. However, if the research question asks about adjusted differences after controlling for age and baseline severity, a regression model may be more appropriate.
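For the unadjusted two-group comparison, a minimal sketch using SciPy's mannwhitneyu function might look like the following. The pain scores are invented for illustration and the variable names are assumptions. Note that which of the two possible U values is reported can depend on the software and its version, so the p-value is the safer quantity to compare across tools.

```python
from scipy.stats import mannwhitneyu

# Hypothetical pain scores (0 to 10) for two independent groups.
treatment = [2, 3, 3, 4, 5, 5, 6]
control = [4, 5, 6, 6, 7, 8, 9]

# Two-sided test: are scores in one group systematically higher or lower?
result = mannwhitneyu(treatment, control, alternative="two-sided")
u_stat = result.statistic
p_value = result.pvalue

print(f"U = {u_stat}, p = {p_value:.3f}")
```

A small p-value here suggests that scores in one group tend to be ranked higher than scores in the other. It does not by itself say that the means differ, or by how much.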

Two independent groups with skewed numerical or ordinal outcome: Mann-Whitney U test.
Two paired measurements: Wilcoxon signed-rank test.
More than two independent groups: Kruskal-Wallis test.
More than two repeated measurements: Friedman test.
Categorical outcome with small cell counts: Fisher's exact test may be needed instead.
Adjusted analysis with covariates: consider regression rather than simple non-parametric tests.
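The paired and multi-group cases in the list above can be sketched with SciPy in the same way. All data below are hypothetical and chosen only to show the shape of each call: wilcoxon takes two paired samples, kruskal takes one sample per independent group, and friedmanchisquare takes one sample per repeated condition, with subjects in the same order in each.

```python
from scipy.stats import wilcoxon, kruskal, friedmanchisquare

# Two paired measurements: the same six patients before and after treatment.
before = [7.1, 6.4, 8.0, 5.5, 9.2, 7.8]
after = [5.0, 5.9, 4.4, 6.2, 6.1, 1.5]
w_stat, w_p = wilcoxon(before, after)  # works on the within-pair differences

# More than two independent groups: three separate sets of patients.
group_a = [2, 3, 4, 5]
group_b = [4, 6, 7, 8]
group_c = [7, 9, 9, 10]
k_stat, k_p = kruskal(group_a, group_b, group_c)

# More than two repeated measurements: the same six patients at three visits.
week_0 = [8, 7, 9, 6, 8, 7]
week_4 = [6, 6, 7, 5, 7, 5]
week_8 = [4, 5, 5, 3, 6, 4]
f_stat, f_p = friedmanchisquare(week_0, week_4, week_8)

for name, p in [("Wilcoxon signed-rank", w_p),
                ("Kruskal-Wallis", k_p),
                ("Friedman", f_p)]:
    print(f"{name}: p = {p:.4f}")
```

Passing paired data to kruskal, or independent groups to wilcoxon, would run without an error but would answer the wrong question, so the data structure has to be decided before the function is chosen.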

Limitations

Non-parametric tests are useful but limited. They may have less power than parametric tests when parametric assumptions are reasonable. They also do not provide adjusted estimates in the same way regression does. Their results can be harder to interpret because they often concern ranks, medians or distributional differences rather than means.

Rank-based tests compare medians only under additional assumptions, for example that the group distributions have the same shape and differ only in location.
They can lose information by converting values into ranks.
They do not adjust for confounders in simple test form.
They may be less powerful when parametric assumptions are reasonable.
They still require the independent or paired structure of the data to be identified correctly.
They can be difficult to translate into practical effect sizes.
They should not be chosen only because a normality test is significant.
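One way to make a two-group rank-based result easier to interpret is the probability of superiority: the proportion of cross-group pairs in which a value from the first group exceeds a value from the second, counting ties as half. It equals the first group's U statistic divided by the number of pairs, and the rank-biserial correlation is a rescaling of it to the range -1 to 1. The sketch below uses hypothetical pain scores, and the function name is illustrative.

```python
def probability_of_superiority(x, y):
    """Estimate P(X > Y) + 0.5 * P(X == Y) over all cross-group pairs."""
    wins = sum(1.0 for a in x for b in y if a > b)
    ties = sum(0.5 for a in x for b in y if a == b)
    return (wins + ties) / (len(x) * len(y))

# Hypothetical pain scores for two independent groups.
treatment = [2, 3, 3, 4, 5, 5, 6]
control = [4, 5, 6, 6, 7, 8, 9]

ps = probability_of_superiority(treatment, control)

# Rank-biserial correlation: 0 means no tendency, -1 or 1 complete separation.
rank_biserial = 2 * ps - 1

print(f"P(treatment score > control score) is about {ps:.2f}")
print(f"rank-biserial correlation is about {rank_biserial:.2f}")
```

A value of 0.5 for the probability of superiority means neither group tends to score higher. Values near 0 or 1 indicate that one group's scores are almost always below or above the other's.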

Discussion

A strong report should explain why a non-parametric method was used and what it tests. Students should avoid saying simply that the data were non-normal. They should mention the outcome type, skewness or ordinal nature of the data, sample size, data structure and the interpretation of the result. Where possible, descriptive summaries such as medians and interquartile ranges should accompany the test.
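As a sketch of the descriptive summaries mentioned here, the median and interquartile range can be computed with Python's standard library. The data are hypothetical and the helper name median_iqr is illustrative. Quartile definitions vary between software packages, so the method used (here the "inclusive" method of statistics.quantiles) is an assumption worth stating in a report.

```python
from statistics import median, quantiles

def median_iqr(values):
    """Return (median, lower quartile, upper quartile) for a sample."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return median(values), q1, q3

# Hypothetical pain scores for one group.
treatment = [2, 3, 3, 4, 5, 5, 6]

med, q1, q3 = median_iqr(treatment)
print(f"median {med} (IQR {q1} to {q3})")
```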

Explain the reason for using a non-parametric method.
Report medians and interquartile ranges where suitable.
Avoid interpreting rank-based tests as simple mean comparisons.
Mention whether observations were independent or paired.
Use post-hoc comparisons carefully after Kruskal-Wallis or Friedman tests.
Consider regression methods when adjustment is needed.
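For post-hoc comparisons after a significant Kruskal-Wallis test, one simple option is pairwise Mann-Whitney U tests with a Bonferroni correction, sketched below. This is an assumption chosen for illustration rather than the only or best procedure, and other approaches such as Dunn's test are also used. The group labels and data are hypothetical.

```python
from itertools import combinations
from scipy.stats import mannwhitneyu

# Hypothetical scores for three independent groups.
groups = {
    "A": [2, 3, 4, 5],
    "B": [4, 6, 7, 8],
    "C": [7, 9, 9, 10],
}

pairs = list(combinations(groups, 2))  # ("A", "B"), ("A", "C"), ("B", "C")

adjusted_p = {}
for first, second in pairs:
    p = mannwhitneyu(groups[first], groups[second],
                     alternative="two-sided").pvalue
    # Bonferroni: multiply by the number of comparisons, capped at 1.
    adjusted_p[(first, second)] = min(1.0, p * len(pairs))

for pair, p in adjusted_p.items():
    print(f"{pair[0]} vs {pair[1]}: adjusted p = {p:.3f}")
```

With small groups the adjusted p-values can be large even when the overall test is significant, which is one reason post-hoc comparisons should be planned in advance and interpreted alongside descriptive summaries.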

Practical checklist

Before you apply this topic

Have you defined the research question?
Is the outcome numerical or ordinal?
Are observations independent or paired?
How many groups are being compared?
Is the method chosen because of the data structure and research question, not only because of a normality test?
Have you considered whether regression is needed?
Have you reported suitable descriptive statistics?
Have you interpreted ranks or medians carefully?
Have you avoided claiming the test compares means?
Have you reported limitations clearly?

Common mistakes

What to avoid

Using non-parametric tests only because a normality test is significant.
Using Mann-Whitney U for paired data.
Using Wilcoxon signed-rank for independent groups.
Interpreting every non-parametric test as a median test.
Ignoring effect size.
Ignoring confidence intervals.
Using simple tests when adjusted regression is required.
Reporting p-values without descriptive summaries.
Choosing Kruskal-Wallis but not planning post-hoc comparisons.
Forgetting that study design matters more than software options.

How this connects to learning

Use the guide as a bridge between theory and application.

A resource guide should not replace a full course or live teaching session. Instead, it helps you organise your thinking. Use it to identify what you understand, what feels unclear, and what questions you should ask before applying a method to real data.

Before a lesson

Read the intuition and problem sections to prepare.

During analysis

Use the method and checklist to guide decisions.

When writing

Use limitations and discussion to improve interpretation.

Related guides

Continue with related topics.

How to choose the correct statistical test

Understanding p-values, confidence intervals and effect sizes

ANOVA, ANCOVA and comparing more than two groups

Common mistakes in dissertation data analysis

How to report regression results in a dissertation
