7. Product and Process Comparisons
7.4. Comparisons based on data from more than two processes

7.4.1. How can we compare several populations with unknown distributions (the Kruskal-Wallis test)?

The Kruskal-Wallis (KW) Test for Comparing Populations with Unknown Distributions

A nonparametric test, due to Kruskal and Wallis, for comparing population medians.

The KW routine tests the null hypothesis that k samples from possibly different populations actually originate from similar populations, at least as far as their central tendencies, or medians, are concerned. The test assumes that the variables under consideration have underlying continuous distributions. 

In what follows, assume we have k samples, and the sample size of the i-th sample is n_i, i = 1, 2, ..., k.

In the computation of the KW test, each observation is replaced by its rank in an ordered combination of all the k samples. By this we mean that the data from all k samples are pooled and ranked in a single series. The minimum observation is replaced by a rank of 1, the next smallest by a rank of 2, and the largest observation by a rank of N, where N is the total number of observations in all the samples (N is the sum of the n_i). When several observations are tied, each is replaced by the average of the ranks they jointly occupy.
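The ranking rule above, including the average-rank treatment of ties, can be sketched in a few lines of Python. The function name `average_ranks` and the short input list are illustrative assumptions, not part of the handbook.

```python
def average_ranks(values):
    """Rank values from 1 (smallest) upward; tied values share their average rank."""
    # Indices of the observations, ordered by increasing value.
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        # Sorted positions start..end (0-based) occupy ranks start+1..end+1.
        mean_rank = (start + 1 + end + 1) / 2
        for pos in range(start, end + 1):
            ranks[order[pos]] = mean_rank
        start = end + 1
    return ranks


# Hypothetical pooled observations with one tie (the two 30s).
print(average_ranks([29, 30, 30, 28, 32]))  # [2.0, 3.5, 3.5, 1.0, 5.0]
```

The two tied values would have occupied ranks 3 and 4, so each receives 3.5.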

The next step is to compute the sum of the ranks for each of the original samples. The KW test determines whether these sums of ranks are so different by sample that they are not likely to have all come from the same population. 

It can be shown that if the k samples come from the same population, that is, if the null hypothesis is true, then the test statistic, H, used in the KW procedure is distributed approximately as a chi-square statistic with df = k - 1, provided that the sample sizes of the k samples are not too small (say, n_i > 4 for all i). H is defined as follows:

$$H = \frac{12}{N(N+1)} \sum_{i=1}^{k} \frac{R_i^2}{n_i} - 3(N+1)$$

where
  • k = number of samples (groups)
  • n_i = number of observations for the i-th sample or group
  • N = total number of observations (sum of all the n_i)
  • R_i = sum of ranks for group i
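As a minimal sketch, the definition of H translates directly into code once the rank sums and sample sizes are known. The function name `kruskal_wallis_h` and the toy inputs are assumptions made for illustration; the function applies the formula exactly as written, with no additional adjustment for ties.

```python
def kruskal_wallis_h(rank_sums, sizes):
    """Compute H = 12 / (N (N + 1)) * sum(R_i^2 / n_i) - 3 (N + 1)."""
    if len(rank_sums) != len(sizes):
        raise ValueError("need one rank sum per sample")
    n_total = sum(sizes)  # N, the total number of observations
    weighted = sum(r * r / n for r, n in zip(rank_sums, sizes))
    return 12.0 / (n_total * (n_total + 1)) * weighted - 3 * (n_total + 1)


# Hypothetical toy case: ranks 1..6 split as (1, 2), (3, 4), (5, 6).
h = kruskal_wallis_h([3, 7, 11], [2, 2, 2])
print(h)  # 32/7, about 4.571
```

These toy samples are far smaller than the n_i > 4 guideline, so the value only illustrates the arithmetic, not a valid chi-square comparison.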
An illustrative example

The following data are from a comparison of four investment firms. The observations represent the percentage growth of recommended funds during a three-year period.
   A     B     C     D

  16    41    32    22
  37    39    30    25
  21    35    38    33
  29    46    28    19
  30    40    47    20
   -    43     -    27

(A dash marks a firm with no sixth observation.)

Step 1: Express the data in terms of their ranks 
         A      B      C      D

         1     19     12      5
        15     17   10.5      6
         4     14     16     13
         9     21      8      2
      10.5     18     22      3
         -     20      -      7

SUM   39.5    109   68.5     36

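The corresponding H test statistic, with N = 22, is

$$H = \frac{12}{22 \cdot 23} \left[ \frac{39.5^2}{5} + \frac{109^2}{6} + \frac{68.5^2}{5} + \frac{36^2}{6} \right] - 3 \cdot 23 = 12.739$$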

From the chi-square critical value table in Chapter 1, the critical value for α = 0.05 with df = k - 1 = 3 is 7.815. Since 12.739 > 7.815, we reject the null hypothesis.
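The whole example can be reproduced with a short script. This is a sketch rather than a reference implementation: the variable names are illustrative, the assignment of the sixth observations (43 and 27) to firms B and D is inferred from the rank sums, and the critical value is the 7.815 quoted from the table rather than computed.

```python
from itertools import chain

# Percentage growth for the four firms (B and D have six observations).
samples = {
    "A": [16, 37, 21, 29, 30],
    "B": [41, 39, 35, 46, 40, 43],
    "C": [32, 30, 38, 28, 47],
    "D": [22, 25, 33, 19, 20, 27],
}

# Pool all observations and sort them into a single series.
pooled = sorted(chain.from_iterable(samples.values()))


def rank_of(x):
    """Average of the 1-based positions that x occupies in the pooled series."""
    first = pooled.index(x) + 1
    last = first + pooled.count(x) - 1
    return (first + last) / 2


rank_sums = {name: sum(rank_of(x) for x in obs) for name, obs in samples.items()}

n_total = len(pooled)
h = 12 / (n_total * (n_total + 1)) * sum(
    rank_sums[name] ** 2 / len(obs) for name, obs in samples.items()
) - 3 * (n_total + 1)

CRITICAL_VALUE = 7.815  # chi-square, alpha = 0.05, df = 3 (taken from the text)
reject = h > CRITICAL_VALUE

print(rank_sums)            # {'A': 39.5, 'B': 109.0, 'C': 68.5, 'D': 36.0}
print(round(h, 3), reject)  # 12.739 True
```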

Note that the rejection region for the KW procedure is one-sided, since we only reject the null hypothesis when the H statistic is too large. 
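Because the rejection region is the upper tail only, the decision can equivalently be stated as an upper-tail probability. The sketch below relies on a closed form that holds only for a chi-square variable with exactly 3 degrees of freedom, which is the case in this example; the function name is an assumption, and other values of k would need a table or a statistics library.

```python
import math


def chi2_upper_tail_df3(x):
    """P(X > x) for a chi-square variable with 3 degrees of freedom (x >= 0)."""
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)


print(chi2_upper_tail_df3(7.815))   # close to 0.05, the tabulated critical level
print(chi2_upper_tail_df3(12.739))  # well below 0.05, consistent with rejection
```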
