Performs a Kolmogorov-Smirnov two-sample test (S.J. Welham, N.M. Maclaren & H.R. Simpson).
Options
PRINT = string tokens | Output required (test, differences, ranks): test gives the test statistic, differences gives the signed differences, and ranks gives the ranks for each sample; default test |
---|---|
GROUPS = factor | Defines the groups for a two-sample test if only the Y1 parameter is specified |
Parameters
Y1 = variates | Identifier of the variate holding the first sample |
---|---|
Y2 = variates | Identifier of the variate holding the second sample |
R1 = variates | Saves the ranks of the first sample |
R2 = variates | Saves the ranks of the second sample |
STATISTIC = scalars | Scalar to save the test statistic (the maximum absolute difference between the cumulative distribution functions) |
CHISQUARE = scalars | Scalar to save the chi-square approximation to the test statistic |
DIFFERENCES = variates | Variate to save the signed differences between the cumulative distribution functions |
Description
The Kolmogorov-Smirnov test assesses the similarity between the underlying distributions of two samples by comparing their cumulative distribution functions; the test statistic is the maximum absolute difference between the two functions. The samples can be supplied in two separate variates, using the parameters Y1 and Y2. Alternatively, they can be given in a single variate (Y1), with the GROUPS option set to a factor that identifies the samples. The GROUPS option is ignored when the Y2 parameter is set.
Output from the procedure is controlled by the PRINT option: test prints the relevant test statistic, differences prints the signed differences, and ranks prints a vector of ranks for each of the samples.
The test statistic and its chi-square approximation can be saved using the parameters STATISTIC and CHISQUARE respectively. The parameter DIFFERENCES can be used to save the signed differences between the cumulative distribution functions. The R1 and R2 parameters allow the ranks of the samples to be saved.
Options: PRINT, GROUPS.
Parameters: Y1, Y2, R1, R2, STATISTIC, CHISQUARE, DIFFERENCES.
Method
The Kolmogorov-Smirnov two-sample test is a test of the null hypothesis that the two samples arise from the same distribution, against the alternative that the underlying distributions are different. The test compares the two empirical cumulative distribution functions to detect differences in the shape of the underlying distributions. The cumulative distribution functions S1 and S2 are formed by
Sk(X) = ( number of scores in sample k ≤ X ) / ( size of sample k )
for k = 1, 2 and a suitable set of points X. The procedure uses the set of values taken by one or other of the samples, i.e. {X: X is in DATA}. The maximum absolute difference
MD = max( abs{ S1(X) – S2(X) } )
is used as the basis for significance tests. The chi-square approximation (2 degrees of freedom) to this statistic is CH:
CH = 4 × MD × MD × ( n1 × n2 / (n1 + n2) )
where n1 and n2 are the sizes of the samples (see, for example, Siegel 1956, pages 127-136).
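The calculation described above can be sketched outside Genstat as follows. This is a minimal Python illustration of the formulas for Sk(X), MD and CH only; it is not the code of the KOLMOG2 procedure, and the function name ks_two_sample and its return layout are assumptions made for the sketch. The data are the two samples from the Example section.

```python
from bisect import bisect_right


def ks_two_sample(sample1, sample2):
    """Return (MD, CH, points, signed differences) for two samples.

    Hypothetical helper: it follows the formulas in the Method section,
    not the internals of the Genstat procedure.
    """
    s1, s2 = sorted(sample1), sorted(sample2)
    n1, n2 = len(s1), len(s2)
    # Evaluate the step functions at every value taken by either sample.
    points = sorted(set(s1) | set(s2))
    # bisect_right on a sorted list counts the scores that are <= x.
    diffs = [bisect_right(s1, x) / n1 - bisect_right(s2, x) / n2
             for x in points]
    md = max(abs(d) for d in diffs)
    ch = 4 * md * md * (n1 * n2 / (n1 + n2))
    return md, ch, points, diffs


# Samples N1 and N2 from the Example section (value repeated by its count).
n1_scores = [0] * 11 + [3] * 7 + [6] * 8 + [9] * 3 + [12] * 5 + [15] * 5 + [18] * 5
n2_scores = [0] * 1 + [3] * 3 + [6] * 6 + [9] * 12 + [12] * 12 + [15] * 4 + [18] * 16

md, ch, points, diffs = ks_two_sample(n1_scores, n2_scores)
print(f"MD = {md:.3f}, CH = {ch:.2f}")
```

For these data the largest difference occurs at X = 6, where S1 = 26/44 and S2 = 10/54, so MD is about 0.406 and CH is about 15.96.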
Action with RESTRICT
The variates Y1 and Y2 can each be restricted, and their restrictions need not be the same. KOLMOG2 uses only those units of each variate that are not excluded by their respective restrictions. Restrictions on Y1 and GROUPS are also obeyed, so RESTRICT can be used, for example, to limit the data to only two groups when the GROUPS factor has more than two levels.
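The effect of restricting a single variate and its GROUPS factor to two levels can be pictured with the short Python sketch below. It only illustrates which units end up in each sample; it is not Genstat code, and the function split_by_group, the level labels and the data values are all hypothetical.

```python
def split_by_group(values, groups, keep_levels):
    """Split one data vector into two samples using two retained levels.

    Hypothetical helper: units whose group is not in keep_levels are
    dropped, mimicking a restriction to two levels of the factor.
    """
    if len(keep_levels) != 2:
        raise ValueError("exactly two group levels must be retained")
    first, second = keep_levels
    sample1 = [v for v, g in zip(values, groups) if g == first]
    sample2 = [v for v, g in zip(values, groups) if g == second]
    return sample1, sample2


# Made-up data: one vector of values and a three-level grouping.
values = [4.1, 5.3, 2.2, 6.0, 3.7, 5.9]
groups = ["a", "b", "c", "a", "b", "c"]

# Keep only levels "a" and "b"; the units in level "c" are excluded.
sample_a, sample_b = split_by_group(values, groups, ("a", "b"))
print(sample_a, sample_b)
```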
Reference
Siegel, S. (1956). Nonparametric Statistics for the Behavioural Sciences. McGraw-Hill, New York.
See also
Directive: DISTRIBUTION.
Procedures: DPROBABILITY, EDFTEST.
Commands for: Basic and nonparametric statistics.
Example
CAPTION 'KOLMOG2 example',!t(\
  'Data from Siegel (1956), Nonparametric Statistics,',\
  'p. 133. Two groups are scored by the number of a set of 18',\
  'objects that they can identify.'); STYLE=meta,plain
VARIATE [VALUES=11(0),7(3),8(6),3(9),5(12),5(15),5(18)] N1
& [VALUES=0,3(3),6(6),12(9),12(12),4(15),16(18)] N2
PRINT [ORIENT=across] N1,N2; DECIMALS=0; FIELD=6
CAPTION !T('The Kolmogorov-Smirnov test is used to test whether the two',\
  'sets of results come from the same underlying distribution.')
KOLMOG2 [PRINT=test,differences,ranks] Y1=N1; Y2=N2; R1=RN1; R2=RN2;\
  STATISTIC=M; CHISQUARE=Chi2; DIFFERENCES=Diffs
PRINT M,Chi2
& [ORIENT=across] RN1,RN2,Diffs; DECIMALS=1; FIELD=6