Social Science Statistics

What Is the Kruskal-Wallis Test?

The Kruskal-Wallis test is a non-parametric method for comparing three or more independent groups. It serves the same purpose as a one-way analysis of variance (ANOVA) — testing whether at least one group differs from the others — but it does not assume that the data within each group follow a normal distribution. Instead of comparing group means directly, it works with the ranks of the observations, making it a natural choice when your data are ordinal (such as survey ratings) or when the distributions are skewed or contain outliers.

Why Do We Need It?

Imagine a researcher studying customer satisfaction across three different coffee shops. She asks 60 customers — 20 from each shop — to rate their experience on a scale from 1 to 10. A one-way ANOVA could compare the average ratings, but satisfaction ratings are ordinal and often skewed (many people cluster at the high end, for example). The ANOVA assumes that the data in each group are normally distributed and that the groups have similar variances. If these assumptions are not met, the ANOVA's p-value may be unreliable.

The Kruskal-Wallis test avoids these problems. Because it ranks all the observations from lowest to highest regardless of group membership, it is robust to non-normal distributions and is appropriate for ordinal data. It asks a slightly different question from the ANOVA: not "are the means different?" but "are the distributions of ranks different across groups?" When the group distributions have similar shapes, the two questions usually lead to the same conclusion, because groups with higher scores tend to hold higher ranks.

How Does It Work?

The procedure begins by pooling all observations from every group into a single list and ranking them from 1 (the smallest value) up to N (the total number of observations). When ties occur, the tied values receive the average of the ranks they would have occupied. Once every observation has a rank, the test calculates the average rank within each group. If there is no real difference between the groups, these average ranks should be roughly equal, each hovering near the overall average rank.
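The ranking step can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the function names (average_ranks, mean_rank_by_group) and the ratings are made up for this example, and in practice a statistics package would handle the ranking for you.

```python
from collections import defaultdict


def average_ranks(values):
    """Rank values from 1 to N; tied values share the mean of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + 1 + j + 1) / 2  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def mean_rank_by_group(groups):
    """groups maps a group label to its list of scores; returns each group's mean rank."""
    labels, pooled = [], []
    for label, scores in groups.items():
        labels.extend([label] * len(scores))
        pooled.extend(scores)
    ranks = average_ranks(pooled)  # ranks are assigned over the pooled data
    totals = defaultdict(float)
    for label, rank in zip(labels, ranks):
        totals[label] += rank
    return {label: totals[label] / len(groups[label]) for label in groups}


# Invented ratings, for illustration only.
ratings = {"A": [7, 8, 8, 9], "B": [5, 6, 7, 7], "C": [3, 4, 5, 6]}
mean_ranks = mean_rank_by_group(ratings)
print(mean_ranks)  # {'A': 10.25, 'B': 6.25, 'C': 3.0}
```

With 12 observations the overall average rank is (12 + 1) / 2 = 6.5, so in this invented data group A sits well above it and group C well below it.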

The test statistic, called H, measures how much the group average ranks deviate from what we would expect if there were no difference between the groups. Mathematically, H is a function of the squared differences between each group's average rank and the overall average rank, weighted by the group sizes. When samples are not too small (a common rule of thumb is at least five observations per group), H approximately follows a chi-square distribution with degrees of freedom equal to the number of groups minus one. This chi-square approximation is what produces the p-value you see in most statistical software.
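To make the description of H concrete, here is a hedged sketch of the usual textbook formula, H = 12 / (N (N + 1)) * sum of n_i * (mean rank of group i - (N + 1) / 2)^2, followed by the standard correction for ties. The function name kruskal_wallis_h is an assumption for this example, and the code is meant to illustrate the arithmetic, not to replace statistical software.

```python
def kruskal_wallis_h(groups):
    """Return (H, degrees of freedom) for a list of lists of scores.

    Illustrative sketch: assumes at least two distinct values in the data,
    otherwise the tie correction below would divide by zero.
    """
    pooled = sorted(x for g in groups for x in g)
    n_total = len(pooled)

    # Map each distinct value to its average rank and accumulate the tie term.
    rank_of = {}
    tie_term = 0
    i = 0
    while i < n_total:
        j = i
        while j + 1 < n_total and pooled[j + 1] == pooled[i]:
            j += 1
        rank_of[pooled[i]] = (i + 1 + j + 1) / 2  # mean of positions i+1 .. j+1
        t = j - i + 1                              # number of values in this tie
        tie_term += t**3 - t
        i = j + 1

    # Squared deviations of group mean ranks from the overall mean rank,
    # weighted by group size.
    overall_mean_rank = (n_total + 1) / 2
    weighted_sq_dev = sum(
        len(g) * (sum(rank_of[x] for x in g) / len(g) - overall_mean_rank) ** 2
        for g in groups
    )
    h = 12 / (n_total * (n_total + 1)) * weighted_sq_dev

    # Tie correction: equals 1 when there are no ties.
    correction = 1 - tie_term / (n_total**3 - n_total)
    return h / correction, len(groups) - 1


# Three perfectly separated groups of three (invented data).
h, df = kruskal_wallis_h([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
```

In this invented data set the group mean ranks are 2, 5 and 8 against an overall mean rank of 5, which gives H = 12 / 90 * 54 = 7.2 with 2 degrees of freedom.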

What Does the Result Mean?

A small p-value (typically below 0.05) tells you that at least one group's distribution of ranks differs significantly from the others. Crucially, it does not tell you which group or groups are different. To pinpoint where the differences lie, you would follow up with post-hoc pairwise comparisons — for example, a series of Mann-Whitney U tests with a correction for multiple comparisons (such as the Bonferroni correction). A non-significant result means the data do not provide enough evidence to conclude that the groups differ.
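The Bonferroni correction mentioned above is simple enough to sketch directly: each raw p-value is multiplied by the number of comparisons and capped at 1. The p-values and group labels below are hypothetical, invented purely to show the adjustment, and the function name bonferroni is likewise an assumption for this example.

```python
def bonferroni(p_values):
    """Multiply each raw p-value by the number of comparisons, capping the result at 1."""
    m = len(p_values)
    return {pair: min(1.0, p * m) for pair, p in p_values.items()}


# Hypothetical raw p-values from three pairwise Mann-Whitney U tests.
raw = {("A", "B"): 0.004, ("A", "C"): 0.030, ("B", "C"): 0.400}

adjusted = bonferroni(raw)
significant = [pair for pair, p in adjusted.items() if p < 0.05]
print(significant)  # [('A', 'B')]
```

Note that the A versus C comparison would have looked significant at 0.030 before the correction but not after it (0.030 x 3 = 0.09), which is exactly the kind of false positive the correction is meant to guard against.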

Key Assumptions

The Kruskal-Wallis test makes fewer assumptions than the one-way ANOVA, but a few conditions should still be met:

Independent groups: The observations in each group must come from different individuals. If the same people are measured across conditions, you need a repeated-measures test such as the Friedman test.
Independent observations: Within each group, one person's score should not influence another's.
Ordinal or continuous data: The outcome variable must be at least ordinal, so that ranking the values is meaningful.
Similar distribution shapes: To interpret a significant result as a difference in location (for example, a difference in medians), the distributions in each group should have the same shape and spread, differing only by a shift. If the shapes differ markedly, a significant result could reflect differences in spread or shape rather than differences in location.

When to Use It

Choose the Kruskal-Wallis test when you want to compare three or more independent groups and your data do not meet the assumptions of a one-way ANOVA. It is especially useful with small samples, ordinal outcome variables, or data that are clearly non-normal. If your data are approximately normal and the group variances are similar, the one-way ANOVA will generally have slightly more statistical power, so there is no need to switch to a non-parametric test in that case. If you have only two independent groups, the Mann-Whitney U test (which is essentially the Kruskal-Wallis test applied to two groups) is the conventional choice.

A Quick Example

A health researcher measures pain relief scores for patients randomly assigned to one of three treatments: a standard drug, an experimental drug, and a placebo. Each group has 15 patients. Because pain scores are ordinal and skewed, she uses the Kruskal-Wallis test. After ranking all 45 scores, she calculates H = 9.47 with 2 degrees of freedom, yielding a p-value of 0.009. Because this is below 0.05, she concludes that at least one treatment group differs significantly from the others. She then runs pairwise Mann-Whitney U tests with a Bonferroni correction and finds that the experimental drug group reports significantly greater pain relief than both the standard drug and the placebo groups.
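The p-value in this example can be checked by hand. For the special case of 2 degrees of freedom, the upper-tail probability of the chi-square distribution has the closed form exp(-H / 2), so no statistical library is needed. The sketch below applies only to that case; the function name chi2_sf_2df is made up for this illustration, and other degrees of freedom require a proper chi-square routine.

```python
import math


def chi2_sf_2df(h):
    """Upper-tail probability of a chi-square variable with exactly 2 degrees of freedom."""
    return math.exp(-h / 2)


p_value = chi2_sf_2df(9.47)
print(round(p_value, 3))  # 0.009
```

The unrounded value is roughly 0.0088, which agrees with the reported p-value of 0.009.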

The Kruskal-Wallis test is a versatile, rank-based alternative to the one-way ANOVA. It is easy to apply, makes few assumptions about your data, and is widely used across the social, behavioural, and health sciences whenever normality cannot be guaranteed.