Kruskal-Wallis test
The Kruskal-Wallis test generalizes the Mann-Whitney U-test to more than two independent groups. It tests the null hypothesis H0 that the data from the k populations do not differ.
Requirements:
Data must be at least ordinal (rank-order) scaled. The test is distribution-free.
Idea:
The test works like the Mann-Whitney U-test. The data from all groups are pooled and brought into one common rank order. For each group i, the sum of ranks Ti and the mean rank (Ti divided by the sample size of the group) are then computed. The total sum of ranks is:
$$\sum_{i=1}^{k} T_i = \frac{N(N+1)}{2}$$
with
k = number of groups
N = total number of measurements
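The pooling and ranking step can be sketched in Python as follows. This is a minimal illustration and not BrightStat's implementation; the function names `midranks` and `rank_sums` are made up for this sketch, and the data are the rainfall values from the example further below. Tied values are assumed to receive the mean of the ranks they occupy.

```python
def midranks(values):
    """Rank all values; tied values share the mean of the positions they occupy."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next sorted value is identical (a tie).
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = (pos + end) / 2 + 1  # mean of the 1-based positions pos+1 .. end+1
        for j in range(pos, end + 1):
            ranks[order[j]] = shared
        pos = end + 1
    return ranks


def rank_sums(groups):
    """Pool all groups, rank the pooled data, and return the rank sum T_i per group."""
    pooled = [x for g in groups for x in g]
    ranks = midranks(pooled)
    sums, start = [], 0
    for g in groups:
        sums.append(sum(ranks[start:start + len(g)]))
        start += len(g)
    return sums


# Rainfall data of the four cities from the worked example.
rain = [
    [68, 93, 123, 83, 108, 122],
    [119, 116, 101, 103, 113, 84],
    [70, 68, 54, 73, 81, 68],
    [61, 54, 59, 67, 59, 70],
]
T = rank_sums(rain)
N = sum(len(g) for g in rain)
print(T)                          # rank sum per city
print(sum(T), N * (N + 1) / 2)    # the two totals should agree
```

Comparing the total of the rank sums with N(N+1)/2 is a cheap sanity check that no rank was lost or counted twice.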
The test value H is computed as follows:
$$H = \frac{12}{N(N+1)} \sum_{i=1}^{k} \frac{T_i^2}{n_i} - 3(N+1)$$
where
ni = sample size of group i
Under H0, H is approximately chi-square distributed with k-1 degrees of freedom.
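As a rough sketch, the test value can be computed from the rank sums and sample sizes alone. The code below assumes the standard Kruskal-Wallis formula H = 12 / (N(N+1)) * sum(Ti^2 / ni) - 3(N+1); the function name `kruskal_wallis_h` is hypothetical, and the inputs are the rank sums of the rainfall example.

```python
def kruskal_wallis_h(rank_sums, sizes):
    """Uncorrected Kruskal-Wallis H from per-group rank sums T_i and sizes n_i."""
    n_total = sum(sizes)
    weighted = sum(t * t / n for t, n in zip(rank_sums, sizes))
    return 12.0 / (n_total * (n_total + 1)) * weighted - 3 * (n_total + 1)


# Rank sums and group sizes of the four cities in the worked example.
h = kruskal_wallis_h([104, 113, 53, 30], [6, 6, 6, 6])
print(round(h, 2))
```

Note that this value is not yet corrected for ties.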
If there are tied ranks, H is corrected with the factor
$$C = 1 - \frac{\sum_{i=1}^{p} (t_i^3 - t_i)}{N^3 - N}$$
where
ti = number of subjects sharing rank i
p = number of tied ranks
and
$$H' = \frac{H}{C}$$
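The tie correction can be sketched as follows, assuming the usual correction factor C = 1 - sum(ti^3 - ti) / (N^3 - N) and the corrected value H' = H / C. The name `tie_correction` is made up for this illustration; the pooled data are the 24 rainfall values of the example, and 15.98 is the uncorrected H that follows from its rank sums.

```python
from collections import Counter


def tie_correction(pooled):
    """Correction factor C for tied values in the pooled sample."""
    n = len(pooled)
    # Size of every group of identical values that occurs more than once.
    tie_sizes = [count for count in Counter(pooled).values() if count > 1]
    return 1.0 - sum(t**3 - t for t in tie_sizes) / (n**3 - n)


pooled = [68, 93, 123, 83, 108, 122,
          119, 116, 101, 103, 113, 84,
          70, 68, 54, 73, 81, 68,
          61, 54, 59, 67, 59, 70]
c = tie_correction(pooled)
h_corrected = 15.98 / c   # 15.98 = uncorrected H of the example
print(round(c, 4), round(h_corrected, 2))
```

Because C is never larger than 1, the corrected value is never smaller than the uncorrected one.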
Post-hoc analysis
If the Kruskal-Wallis test is significant, one usually wants to know which of the groups differ. BrightStat offers two different methods for post-hoc analysis:
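The critical difference of the mean ranks after Conover (1971, 1980, 1999):
$$\Delta\bar{R}_{ij}^{crit} = t_{N-k;\,1-\alpha/2} \cdot \sqrt{\frac{N(N+1)}{12} \cdot \frac{N-1-H}{N-k}} \cdot \sqrt{\frac{1}{n_i} + \frac{1}{n_j}}$$
where
$\Delta\bar{R}_{ij}^{crit}$ = critical difference of mean ranks of group i and j
$t_{N-k;\,1-\alpha/2}$ = critical t-value with N-k degrees of freedom
ni = sample size of group i
nj = sample size of group j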
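A minimal sketch of Conover's critical difference, under stated assumptions: it uses the form without an explicit tie adjustment of the rank variance, i.e. t * sqrt(N(N+1)/12 * (N-1-H)/(N-k)) * sqrt(1/ni + 1/nj). The function name is hypothetical, the critical t-value is passed in rather than computed, and 2.086 is taken from a standard t-table (two-sided 5% level, 20 degrees of freedom). The value 16.03 is the tie-corrected H of the rainfall example.

```python
import math


def conover_critical_difference(t_crit, h, n_total, k, n_i, n_j):
    """Critical difference of two mean ranks after Conover (no-ties variance form)."""
    rank_variance = n_total * (n_total + 1) / 12.0
    shrink = (n_total - 1 - h) / (n_total - k)
    return t_crit * math.sqrt(rank_variance * shrink) * math.sqrt(1.0 / n_i + 1.0 / n_j)


# Assumption: t_crit = 2.086 (standard t-table, two-sided 5%, df = N - k = 20).
cd_conover = conover_critical_difference(
    t_crit=2.086, h=16.03, n_total=24, k=4, n_i=6, n_j=6
)
print(round(cd_conover, 2))
```

The larger H is, the smaller the factor (N-1-H)/(N-k) becomes, which is one reason this procedure is comparatively liberal after a clearly significant overall test.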
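The critical difference of the mean ranks after Schaich and Hamerle (1984):
$$\Delta\bar{R}_{ij}^{crit} = \sqrt{\chi^2_{k-1;\,1-\alpha}} \cdot \sqrt{\frac{N(N+1)}{12} \cdot \left(\frac{1}{n_i} + \frac{1}{n_j}\right)}$$
where
$\Delta\bar{R}_{ij}^{crit}$ = critical difference of mean ranks of group i and j
$\chi^2_{k-1;\,1-\alpha}$ = critical chi-square value with k-1 degrees of freedom
ni = sample size of group i
nj = sample size of group j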
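The critical difference after Schaich and Hamerle can be sketched in the same way, assuming the form sqrt(chi-square critical value) * sqrt(N(N+1)/12 * (1/ni + 1/nj)). The function name is made up for this illustration; the critical chi-square value 7.81 (5% level, 3 degrees of freedom) is the one quoted in the worked example.

```python
import math


def schaich_hamerle_critical_difference(chi2_crit, n_total, n_i, n_j):
    """Critical difference of two mean ranks after Schaich and Hamerle."""
    rank_variance = n_total * (n_total + 1) / 12.0
    return math.sqrt(chi2_crit) * math.sqrt(rank_variance * (1.0 / n_i + 1.0 / n_j))


# 7.81 = critical chi-square value at the 5% level with k - 1 = 3 degrees of freedom.
cd_sh = schaich_hamerle_critical_difference(chi2_crit=7.81, n_total=24, n_i=6, n_j=6)
print(round(cd_sh, 2))
```

Unlike Conover's criterion, this one does not depend on the observed H, only on the sample sizes and the chosen significance level.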
The method of Schaich and Hamerle is exact but somewhat less powerful, whereas the method of Conover is approximate and more liberal.
Example of a Kruskal-Wallis test
A meteorologist has measured the amount of rain in four cities for six months. She wants to know if there are different amounts of rain in the four cities. The following table shows the raw data:
| | City 1 | Rank | City 2 | Rank |
|---|---|---|---|---|
| | 68 | 8 | 119 | 22 |
| | 93 | 16 | 116 | 21 |
| | 123 | 24 | 101 | 17 |
| | 83 | 14 | 103 | 18 |
| | 108 | 19 | 113 | 20 |
| | 122 | 23 | 84 | 15 |
| SUM | | 104 | | 113 |
| MEAN | | 17.33 | | 18.83 |
| | City 3 | Rank | City 4 | Rank |
|---|---|---|---|---|
| | 70 | 10.5 | 61 | 5 |
| | 68 | 8 | 54 | 1.5 |
| | 54 | 1.5 | 59 | 3.5 |
| | 73 | 12 | 67 | 6 |
| | 81 | 13 | 59 | 3.5 |
| | 68 | 8 | 70 | 10.5 |
| SUM | | 53 | | 30 |
| MEAN | | 8.83 | | 5 |
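H is then computed as follows:
$$H = \frac{12}{24 \cdot 25} \left( \frac{104^2}{6} + \frac{113^2}{6} + \frac{53^2}{6} + \frac{30^2}{6} \right) - 3 \cdot 25 = 15.98$$
Because there are tied ranks (the values 54, 59 and 70 each occur twice and 68 occurs three times), H is corrected with the factor
$$C = 1 - \frac{3 \cdot (2^3 - 2) + (3^3 - 3)}{24^3 - 24} = 1 - \frac{42}{13800} \approx 0.997$$
The corrected H' is then
$$H' = \frac{15.98}{0.997} \approx 16.03$$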
The critical chi-square value at the 5% level with 3 degrees of freedom is 7.81.
The observed test value is greater than the critical chi-square value, so H0 is rejected: the amount of rain differs between at least some of the four cities.
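The comparison with the critical value can also be expressed as a p-value. The sketch below relies on the closed form of the chi-square upper-tail probability for exactly 3 degrees of freedom, erfc(sqrt(x/2)) + sqrt(2x/pi) * exp(-x/2); it is not valid for other degrees of freedom. The function name is hypothetical, and 16.03 is the tie-corrected H recomputed from the rank sums of this example.

```python
import math


def chi2_upper_tail_3df(x):
    """P(X > x) for a chi-square variable with exactly 3 degrees of freedom."""
    return math.erfc(math.sqrt(x / 2.0)) + math.sqrt(2.0 * x / math.pi) * math.exp(-x / 2.0)


p_at_critical = chi2_upper_tail_3df(7.81)   # should be close to 0.05
p_observed = chi2_upper_tail_3df(16.03)     # tie-corrected H of the example
print(p_at_critical, p_observed)
print("reject H0 at the 5% level:", p_observed < 0.05)
```

Both views lead to the same decision: a test value above the critical value corresponds to a p-value below the significance level.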
We might be interested in the critical difference of the mean ranks so we can check which cities are different from each other. Because each group has 6 measurements we get one critical difference for all comparisons:
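After Conover we get (with the critical t-value 2.086 for 20 degrees of freedom and the tie-corrected test value of about 16.03)
$$\Delta\bar{R}^{crit} = 2.086 \cdot \sqrt{\frac{24 \cdot 25}{12} \cdot \frac{24 - 1 - 16.03}{24 - 4}} \cdot \sqrt{\frac{1}{6} + \frac{1}{6}} \approx 5.03$$
and after Schaich and Hamerle we get
$$\Delta\bar{R}^{crit} = \sqrt{7.81} \cdot \sqrt{\frac{24 \cdot 25}{12} \cdot \left(\frac{1}{6} + \frac{1}{6}\right)} \approx 11.41$$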
We can now compare the differences of the group mean ranks with the two critical differences:

| | CITY_1 | CITY_2 | CITY_3 |
|---|---|---|---|
| CITY_2 | -1.5 | - | - |
| CITY_3 | 8.5 * | 10 * | - |
| CITY_4 | 12.33 * ° | 13.83 * ° | 3.83 |

* significant difference after Conover
° significant difference after Schaich and Hamerle
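The pairwise comparison can be sketched as a small loop over all pairs of cities. The critical differences 5.03 (Conover) and 11.41 (Schaich and Hamerle) used here are assumptions of this sketch: they were recomputed from the standard formulas for this data set and should be replaced by the values reported by the software in use. The function name `flag_pairs` is made up for illustration.

```python
from itertools import combinations

# Mean ranks of the four cities (rank sum divided by 6 measurements).
mean_ranks = {
    "CITY_1": 104 / 6,
    "CITY_2": 113 / 6,
    "CITY_3": 53 / 6,
    "CITY_4": 30 / 6,
}

# Assumed critical differences, recomputed for this example (not from the output).
CD_CONOVER = 5.03
CD_SCHAICH_HAMERLE = 11.41


def flag_pairs(mean_ranks, critical_difference):
    """Return the pairs whose absolute mean-rank difference exceeds the criterion."""
    return {
        (a, b)
        for a, b in combinations(mean_ranks, 2)
        if abs(mean_ranks[a] - mean_ranks[b]) > critical_difference
    }


conover_pairs = flag_pairs(mean_ranks, CD_CONOVER)
sh_pairs = flag_pairs(mean_ranks, CD_SCHAICH_HAMERLE)
for pair in sorted(conover_pairs):
    marker = "* °" if pair in sh_pairs else "*"
    print(pair, marker)
```

Every pair flagged by the stricter Schaich and Hamerle criterion is also flagged by Conover's, which mirrors the pattern of markers in the table.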
BrightStat output of Kruskal-Wallis test example
This is a fictitious example.
References
Bortz, J. (2005). Statistik für Human- und Sozialwissenschaftler (6th Edition). Heidelberg: Springer Medizin Verlag.
Conover, W.J. (1999). Practical Nonparametric Statistics (3rd edition). Wiley.
Kruskal, W.H. & Wallis, W.A. (1952). Use of ranks in one-criterion variance analysis. Journal of the American Statistical Association, 47 (260), 583-621.
Schaich, H.E. & Hamerle, A. (1984). Verteilungsfreie statistische Prüfverfahren. Berlin.