KRUSKAL procedure

Carries out a Kruskal-Wallis one-way analysis of variance (S.J. Welham, N.M. Maclaren & H.R. Simpson).

Options

PRINT = string tokens
    Output required (test, ranks): test produces the relevant test statistics, ranks produces a vector of ranks for each sample relative to the whole data set; default test
GROUPS = factor
    Defines the sample membership if only one variate is specified by DATA
STATISTIC = scalar
    Scalar to save the Kruskal-Wallis test statistic
MEANRANKS = variate
    Variate to save the mean ranks of the samples
DF = scalar
    Scalar to save the degrees of freedom for the statistic

Parameters

DATA = variates
    List of variates containing the data for each sample, or a single variate containing the data from all the samples (the GROUPS option must then be set to indicate the sample to which each unit belongs)
RANKS = variates
    Variates to save the ranks of the data (relative to the combined data)

Description

KRUSKAL carries out a Kruskal-Wallis one-way analysis of variance on the ranks (relative to the whole data set) of a set of samples. The samples can be stored in different variates and supplied as a list using the DATA parameter. Alternatively, they can all be placed in a single variate, with the GROUPS option set to a factor that indicates the sample to which each unit belongs. Output from the procedure is controlled by the PRINT option: test (the default setting) prints the relevant test statistics, and ranks prints the vector of ranks for each sample.
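The single-variate form can be sketched as follows, using the same Siegel (1956) scores as the Example section, stacked into one variate. The identifiers Score and Group are illustrative names, not part of the procedure.

```
" All 14 scores in one variate: 5 from To, 5 from Ao, 4 from Admin. "
VARIATE [VALUES=96,128,83,61,101, 82,124,132,135,109, 115,149,166,147] Score
" Factor giving the sample to which each unit belongs. "
FACTOR  [LABELS=!t(To,Ao,Admin); VALUES=5(1),5(2),4(3)] Group
KRUSKAL [PRINT=test,ranks; GROUPS=Group] Score
```

This should be equivalent to supplying the three samples as separate variates in DATA, since the ranks are always taken relative to the combined data.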

The test statistic, vector of mean ranks and degrees of freedom can be saved using the STATISTIC, MEANRANKS and DF options, respectively. Parameter RANKS can be set to a variate, or variates, to store the ranks of the data relative to the whole data set.
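A minimal sketch of saving all three results, again with the scores from the Example section. The identifiers H, Meanranks and Ndf are illustrative, and PRINT=* is assumed to suppress the printed output in the usual way for procedures.

```
VARIATE [VALUES=96,128,83,61,101]   To
&       [VALUES=82,124,132,135,109] Ao
&       [VALUES=115,149,166,147]    Admin
" Save the statistic, mean ranks and degrees of freedom without printing. "
KRUSKAL [PRINT=*; STATISTIC=H; MEANRANKS=Meanranks; DF=Ndf] To,Ao,Admin
PRINT   H,Ndf
PRINT   Meanranks; DECIMALS=1
```

The saved structures can then be used in later calculations, for example to compare the statistic with a chosen critical value.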

Options: PRINT, GROUPS, STATISTIC, MEANRANKS, DF.

Parameters: DATA, RANKS.

Method

The Kruskal-Wallis one-way analysis of variance is used to test the hypothesis that several (K) samples come from distributions with the same mean. The test statistic, H, is formed by ranking the combined data set and then summing these ranks within each sample:

H = [ 12 / (N×(N+1)) ] × ∑j=1…K { Rj×Rj / nj } – 3×(N+1)

where Rj is the sum of ranks for the jth sample,

nj is the size of the jth sample, and

N is the size of the combined data set.
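As a check on the formula, it can be worked by hand for the data in the Example section (N = 14, K = 3, no tied values). Ranking the combined data gives rank sums of 22 for To, 37 for Ao and 46 for Admin. The subscripts below are simply the variate names from that example.

```latex
\begin{aligned}
R_{\mathrm{To}} &= 4 + 9 + 3 + 1 + 5 = 22, & n_{\mathrm{To}} &= 5 \\
R_{\mathrm{Ao}} &= 2 + 8 + 10 + 11 + 6 = 37, & n_{\mathrm{Ao}} &= 5 \\
R_{\mathrm{Admin}} &= 7 + 13 + 14 + 12 = 46, & n_{\mathrm{Admin}} &= 4 \\[4pt]
H &= \frac{12}{14 \times 15}
     \left( \frac{22^2}{5} + \frac{37^2}{5} + \frac{46^2}{4} \right) - 3 \times 15 \\
  &= \frac{12}{210} \times 899.6 - 45 \approx 6.41
\end{aligned}
```

The rank sums add up to 105 = 14×15/2, which is a quick way to confirm that the ranking is complete.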

If ties are present in the data, then an adjustment to the statistic H is required:

adjusted H = H / ( 1 – ∑k { tk³ – tk } / (N³ – N) )

where tk is the number of observations with rank k. (See for example Siegel 1956, pages 184-193.)
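To give a feel for the size of the adjustment, consider a hypothetical data set of N = 14 observations in which exactly two values are tied (this is not the case in the Example data, which contain no ties). Untied observations have t = 1 and contribute nothing to the sum, so only the tied pair matters:

```latex
1 - \frac{\sum_k \left( t_k^3 - t_k \right)}{N^3 - N}
  = 1 - \frac{2^3 - 2}{14^3 - 14}
  = 1 - \frac{6}{2730}
  \approx 0.9978,
\qquad
\text{adjusted } H = \frac{H}{0.9978}
```

Because the divisor is never greater than 1, the adjustment can only increase H, and it is small unless many observations are tied.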

When there are at least five cases in each of the samples, H has approximately a Chi-square distribution on K-1 degrees of freedom. When this condition is not satisfied, and there are three samples, KRUSKAL uses a table of calculated values of the distribution of the statistic.
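For three samples (K = 3, so 2 degrees of freedom) the chi-square upper-tail probability has a simple closed form, which makes the large-sample approximation easy to check by hand. The value h = 5.99 below is purely illustrative and is not taken from the Example data, where one sample has fewer than five cases.

```latex
P\left( \chi^2_2 > h \right) = e^{-h/2},
\qquad
\text{e.g. } h = 5.99: \quad e^{-2.995} \approx 0.050
```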

Action with RESTRICT

The variates in DATA can be restricted, and the restrictions can differ from one variate to another. KRUSKAL uses only those units of each variate that are not excluded by their respective restrictions.
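A minimal sketch of this behaviour, using the scores from the Example section. The restriction condition (scores above 70) is an arbitrary illustration.

```
VARIATE  [VALUES=96,128,83,61,101]   To
&        [VALUES=82,124,132,135,109] Ao
&        [VALUES=115,149,166,147]    Admin
" Illustrative restriction: keep only the units of To with scores above 70. "
RESTRICT To; CONDITION=To>70
" Ao and Admin are unrestricted, so all of their units are used. "
KRUSKAL  To,Ao,Admin
" Cancel the restriction afterwards. "
RESTRICT To
```

Here the analysis would use four units from To, five from Ao and four from Admin.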

Reference

Siegel, S. (1956). Nonparametric Statistics for the Behavioral Sciences. McGraw-Hill, New York.

See also

Procedures: AONEWAY, APERMTEST, A2WAY, FRIEDMAN.

Commands for: Basic and nonparametric statistics, Analysis of variance.

Example

CAPTION 'KRUSKAL example',\
        !t('Data from Siegel (1956), Nonparametric Statistics,',\
        'p. 187. Three sets of individuals from different groups are',\
        'selected and receive scores.'); STYLE=meta,plain
VARIATE [VALUES=96,128, 83, 61,101]  To
&       [VALUES=82,124,132,135,109]  Ao
&       [VALUES=115,149,166,147]  Admin
PRINT   To,Ao,Admin; DECIMALS=0; FIELD=7
CAPTION !T('A Kruskal-Wallis Analysis of Variance is performed to test',\
        'for differences in scores between the groups.')
KRUSKAL [STATISTIC=H; MEANRANKS=Meanranks] To,Ao,Admin; RANKS=RTo,RAo,RAdmin
PRINT   H
&       RTo,RAo,RAdmin,Meanranks; DECIMALS=1; FIELD=8
Updated on March 7, 2019
