Kruskal-Wallis test

The Kruskal-Wallis test is a generalization of the Mann-Whitney U-test to more than two groups. It tests the null hypothesis H0 that the k samples come from populations that do not differ.

Requirements:

The data must be at least ordinal (rank-order) scaled. The test is distribution-free, i.e. it makes no assumption about the shape of the population distributions.

Idea:

The test works like the Mann-Whitney U-test. The data from all groups are pooled and brought into one common rank order. For each group the sum of ranks Ti and the mean rank are then computed. The total sum of ranks is:

 

$$\sum_{i=1}^{k} T_i = \frac{N(N+1)}{2}$$

 

with

k = number of groups

N = total number of measurements

 

The test value H is computed as follows:

 

$$H = \frac{12}{N(N+1)} \sum_{i=1}^{k} \frac{T_i^2}{n_i} - 3(N+1)$$

 

where

ni = sample size of group i

H is approximately Chi-Square distributed with k - 1 degrees of freedom.
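
For illustration, H can be computed directly from the pooled ranks. The following Python sketch (assuming NumPy and SciPy are available; the function name is chosen here for clarity) implements the formula above without the tie correction described below:

```python
import numpy as np
from scipy.stats import rankdata

def kruskal_wallis_h(*groups):
    """Uncorrected Kruskal-Wallis H computed from the pooled ranks."""
    pooled = np.concatenate([np.asarray(g, dtype=float) for g in groups])
    n_total = pooled.size                        # N = total number of measurements
    ranks = rankdata(pooled)                     # tied values receive average ranks
    h, start = 0.0, 0
    for g in groups:
        t_i = ranks[start:start + len(g)].sum()  # rank sum T_i of group i
        h += t_i ** 2 / len(g)
        start += len(g)
    return 12.0 / (n_total * (n_total + 1)) * h - 3 * (n_total + 1)
```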

 

If there are tied ranks, H is corrected using the correction coefficient

 

$$C = 1 - \frac{\sum_{i=1}^{p} \left(t_i^3 - t_i\right)}{N^3 - N}$$

 

where

ti = number of observations sharing the i-th tied rank

p = number of groups of tied ranks (distinct values that occur more than once)

and

 

$$H' = \frac{H}{C}$$
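
A minimal sketch of this correction is shown below; note that scipy.stats.kruskal already applies the same tie correction internally and returns the corrected statistic together with its p-value, so in practice the library call is usually sufficient. The helper function and the sample data are illustrative:

```python
import numpy as np
from scipy.stats import kruskal

def tie_correction(pooled_values):
    """Correction coefficient C = 1 - sum(t_i^3 - t_i) / (N^3 - N)."""
    n_total = len(pooled_values)
    _, counts = np.unique(pooled_values, return_counts=True)   # t_i per group of tied values
    return 1.0 - float((counts ** 3 - counts).sum()) / (n_total ** 3 - n_total)

# Hypothetical samples containing ties; H' = H / C
a, b, c = [4.1, 5.0, 5.0, 6.2], [5.0, 6.2, 7.3, 8.0], [3.0, 4.1, 4.1, 9.5]
print(kruskal(a, b, c))        # tie-corrected statistic and Chi-Square p-value
print(tie_correction(a + b + c))
```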

 


Post-hoc analysis

If the Kruskal-Wallis test is significant, one usually wants to know which of the groups differ from each other. BrightStat offers two different methods for post-hoc analysis:

The critical difference of the mean ranks after Conover (1971, 1980, 1999):

 

$$CD_{ij} = t_{1-\alpha/2;\,N-k} \cdot \sqrt{\frac{N(N+1)}{12} \cdot \frac{N-1-H'}{N-k} \cdot \left(\frac{1}{n_i} + \frac{1}{n_j}\right)}$$

 

where

 

CDij = critical difference of the mean ranks of groups i and j

 

t(1-α/2; N-k) = critical t-value with N - k degrees of freedom

 

ni = sample size of group i

nj = sample size of group j
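
A small Python sketch of this critical difference, assuming the variance term N(N+1)/12 from the formula above and using SciPy's t distribution; the function name and argument names are illustrative:

```python
from scipy.stats import t

def conover_critical_difference(n_total, k, h_corrected, n_i, n_j, alpha=0.05):
    """Critical difference of mean ranks after Conover (illustrative sketch)."""
    t_crit = t.ppf(1 - alpha / 2, df=n_total - k)     # critical t-value with N - k df
    variance = n_total * (n_total + 1) / 12.0         # N(N+1)/12
    shrink = (n_total - 1 - h_corrected) / (n_total - k)
    return t_crit * (variance * shrink * (1.0 / n_i + 1.0 / n_j)) ** 0.5
```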

 

The critical difference of the mean ranks after Schaich and Hamerle (1984):

 

$$CD_{ij} = \sqrt{\chi^2_{1-\alpha;\,k-1} \cdot \frac{N(N+1)}{12} \cdot \left(\frac{1}{n_i} + \frac{1}{n_j}\right)}$$

 

where

 

CDij = critical difference of the mean ranks of groups i and j

 

χ²(1-α; k-1) = critical Chi-Square value with k - 1 degrees of freedom

 

ni = sample size of group i

nj = sample size of group j
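
An analogous Python sketch for the critical difference after Schaich and Hamerle, again with an illustrative function name:

```python
from scipy.stats import chi2

def schaich_hamerle_critical_difference(n_total, k, n_i, n_j, alpha=0.05):
    """Critical difference of mean ranks after Schaich and Hamerle (illustrative sketch)."""
    chi2_crit = chi2.ppf(1 - alpha, df=k - 1)         # critical Chi-Square with k - 1 df
    return (chi2_crit * n_total * (n_total + 1) / 12.0 * (1.0 / n_i + 1.0 / n_j)) ** 0.5
```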

 

The method after Schaich and Hamerle is exact but lacks some power, whereas the method after Conover is approximate and more liberal.

 


Example of a Kruskal-Wallis test

A meteorologist has measured the amount of rain in four cities for six months. She wants to know whether the amount of rain differs between the four cities. The following table shows the raw data:

 

City 1   RANK  |  City 2   RANK
    68      8  |     119     22
    93     16  |     116     21
   123     24  |     101     17
    83     14  |     103     18
   108     19  |     113     20
   122     23  |      84     15
SUM       104  |  SUM       113
MEAN    17.33  |  MEAN    18.83


City 3   RANK  |  City 4   RANK
    70   10.5  |      61      5
    68      8  |      54    1.5
    54    1.5  |      59    3.5
    73     12  |      67      6
    81     13  |      59    3.5
    68      8  |      70   10.5
SUM        53  |  SUM        30
MEAN     8.83  |  MEAN     5.00
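
As a quick check on the pooled ranking, the four rank sums add up to the total sum of ranks given by the formula above:

$$T_1 + T_2 + T_3 + T_4 = 104 + 113 + 53 + 30 = 300 = \frac{N(N+1)}{2} = \frac{24 \cdot 25}{2}$$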

 

H is then computed as follows:

 

$$H = \frac{12}{N(N+1)} \sum_{i=1}^{k} \frac{T_i^2}{n_i} - 3(N+1) = \frac{12}{24 \cdot 25} \left(\frac{104^2}{6} + \frac{113^2}{6} + \frac{53^2}{6} + \frac{30^2}{6}\right) - 3 \cdot 25$$

$$H = 90.98 - 75 = 15.98$$

 

Because there are tied ranks, H is corrected:

 

The tied values are 54, 59, 68 and 70, shared by 2, 2, 3 and 2 measurements respectively, so the correction coefficient is

$$C = 1 - \frac{(2^3 - 2) + (2^3 - 2) + (3^3 - 3) + (2^3 - 2)}{24^3 - 24} = 1 - \frac{42}{13800} = 0.997$$

the corrected H' is then

$$H' = \frac{H}{C} = \frac{15.98}{0.997} = 16.03$$

 

The critical 5% Chi-Square value with 3 degrees of freedom is 7.81.

The observed test value of 16.03 is greater than the critical Chi-Square value, so we conclude that there are differences in the amount of rain between the four cities.
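
This result can be reproduced with SciPy, whose kruskal function returns the tie-corrected statistic and the Chi-Square p-value; a minimal sketch using the rain data:

```python
from scipy.stats import kruskal

city_1 = [68, 93, 123, 83, 108, 122]
city_2 = [119, 116, 101, 103, 113, 84]
city_3 = [70, 68, 54, 73, 81, 68]
city_4 = [61, 54, 59, 67, 59, 70]

h_corrected, p_value = kruskal(city_1, city_2, city_3, city_4)
print(f"H' = {h_corrected:.2f}, p = {p_value:.4f}")
# Expected: H' of about 16.03 and a p-value well below 0.05, matching the hand calculation.
```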

We might be interested in the critical difference of the mean ranks so we can check which cities differ from each other. Because each group has 6 measurements, the same critical difference applies to all pairwise comparisons:

 

After Conover we get

 

$$CD_{ij} = t_{0.975;\,20} \cdot \sqrt{\frac{24 \cdot 25}{12} \cdot \frac{24 - 1 - 16.03}{24 - 4} \cdot \left(\frac{1}{6} + \frac{1}{6}\right)} = 2.086 \cdot \sqrt{50 \cdot 0.349 \cdot 0.333}$$

$$CD_{ij} \approx 5.03$$

 

and after Schaich and Hamerle we get

 

$$CD_{ij} = \sqrt{\chi^2_{0.95;\,3} \cdot \frac{24 \cdot 25}{12} \cdot \left(\frac{1}{6} + \frac{1}{6}\right)} = \sqrt{7.81 \cdot 50 \cdot 0.333}$$

$$CD_{ij} \approx 11.41$$

 

We can now compare the differences of the group mean ranks with the two critical differences:

 


            CITY_1        CITY_2        CITY_3
CITY_2      -1.5          -             -
CITY_3       8.5 *        10 *          -
CITY_4      12.33 * °     13.83 * °     3.83

 

* significant difference after Conover

° significant difference after Schaich and Hamerle
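
For completeness, the pairwise mean-rank differences in the table above can be recomputed with a short script; the critical-difference values plugged in below are the approximate values derived earlier:

```python
import itertools
import numpy as np
from scipy.stats import rankdata

cities = {
    "CITY_1": [68, 93, 123, 83, 108, 122],
    "CITY_2": [119, 116, 101, 103, 113, 84],
    "CITY_3": [70, 68, 54, 73, 81, 68],
    "CITY_4": [61, 54, 59, 67, 59, 70],
}

pooled = np.concatenate(list(cities.values()))
ranks = rankdata(pooled)                      # pooled ranks, ties get average ranks

mean_ranks, start = {}, 0
for name, values in cities.items():
    mean_ranks[name] = ranks[start:start + len(values)].mean()
    start += len(values)

cd_conover, cd_schaich_hamerle = 5.03, 11.41  # approximate critical differences from above
for a, b in itertools.combinations(cities, 2):
    diff = abs(mean_ranks[a] - mean_ranks[b])
    flags = ("*" if diff > cd_conover else "") + (" °" if diff > cd_schaich_hamerle else "")
    print(f"{a} vs {b}: {diff:.2f} {flags}")
```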

 

BrightStat output of Kruskal-Wallis test example

 



This is a fictitious example.

How to do this example on BrightStat webapp


Wiki link Kruskal-Wallis test


References

Bortz, J. (2005). Statistik für Human- und Sozialwissenschaftler (6th edition). Heidelberg: Springer Medizin Verlag.

Conover, W.J. (1999). Practical Nonparametric Statistics (3rd edition). Wiley.

Kruskal, W.H. & Wallis, W.A. (1952). Use of ranks in one-criterion variance analysis. Journal of the American Statistical Association, 47(260), 583–621.

Schaich, H.E. & Hamerle, A. (1984). Verteilungsfreie statistische Prüfverfahren. Berlin.




 
