Does hierarchical analysis of variance/covariance for unbalanced data (P.W. Lane).
Options
PRINT = string token | Which analyses to print (all, some, none); default all |
---|---|
INCHANNEL = scalar | Channel from which to read data; default * specifies that the data values are already stored in the factors and variates specified by the parameters of HANOVA |
FORMAT = variate | Format for reading data; default * requests free format |
ANALYSIS = symmetric matrix | For PRINT=some, this indicates which analyses to print |
SSPM = SSPM | Stores the corrected sums of squares and products; default * |
COEFFICIENT = matrix | Stores the estimated variance and covariance components; default * |
Parameters
VARIATES = pointers | Variates to be analysed |
---|---|
FACTORS = pointers | Factors defining the hierarchy, the first factor of the pointer defining the first stratum, and so on |
Description
Procedure HANOVA performs hierarchical analysis of variance and covariance, estimating the components of variance corresponding to each level of a nested classification. It uses the method of Gower (1962), which is based on the method of moments. This method is less efficient than REML, and may produce different results. However, it does not require the assumption of Normal distributions for the random terms.
Data are said to be classified hierarchically if the units have several groupings successively nested within each other. One way of representing such a classification would be to identify the groupings in each stratum of the hierarchy by a single factor; two units with the same value for one of the factors would then be required to have identical values for the factors representing the previous strata. An alternative method is to use not only the factor for the current stratum, but also the factors for previous strata, to indicate the groupings that occur there. For example, the following classifications are effectively equivalent:
Unit | (1) Factor 1 (stratum 1) | (1) Factor 2 (stratum 2) | (2) Factor 1 (stratum 1) | (2) Factor 2 (stratum 2) |
---|---|---|---|---|
1 | 1 | 1 | 1 | 1 |
2 | 1 | 1 | 1 | 1 |
3 | 1 | 2 | 1 | 2 |
4 | 2 | 3 | 2 | 1 |
5 | 2 | 4 | 2 | 2 |
Thus, in the second form of representation, the second factor indicates the sub-divisions within each group in the first stratum, using the same levels each time. This more efficient method is the one required by HANOVA.
The simplest way to use HANOVA is to set the VARIATES parameter to a single variate (or to a pointer if several variates are to be analysed), and set the FACTORS parameter to a pointer of factors. The factors must be in the order of the hierarchy, with the first factor defining the coarsest grouping of the units and each succeeding factor nested within those before it. The units of data stored in the variates and factors can be in any order.
Since hierarchical data can often be extensive, HANOVA can be requested to read the data sequentially, tabulating them with respect to the factors, so that the data need not all be held in memory at the same time. The INCHANNEL option defines the channel number of the file from which the data are to be read; if INCHANNEL is not set, the data are assumed to be present already, in the factors and variates contained in the VARIATES and FACTORS parameters. The FORMAT option allows a variate to be specified for use in the FORMAT option of the READ command within the procedure; if this is not set, the default format of READ is assumed.
If a unit has a missing value for any of the variates or factors, it is omitted from all the analyses. The procedure carries out analyses of variance for specified variates, and of covariance for specified pairs of variates. Variance components are calculated for each stratum: that is, the proportion of the total variance per individual ascribable to the various strata of the classification.
Output is controlled by the PRINT option: by default, the matrix of coefficients of variance components is printed, followed by an analysis of variance of each variate and of covariance of each pair of variates. To obtain only some of the analyses, option PRINT should be set to some, and the ANALYSIS option to a symmetric matrix with numbers of rows and columns equal to the number of variates. A non-zero value in the matrix indicates that the corresponding analysis of variance or covariance is to be displayed. Printed output can be suppressed by setting PRINT=none.
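As a minimal sketch of selective printing, assuming the variates v1 and v2 and the factors f1, f2 and f3 defined in the Example section, the following requests the two analyses of variance and omits the analysis of covariance. The identifier show is hypothetical. The values of a symmetric matrix are supplied row by row for the lower triangle, so 1,0,1 sets both diagonal elements to 1 and the off-diagonal element to 0.

```genstat
"Hypothetical selection matrix: rows and columns correspond to v1 and v2"
SYMMETRICMATRIX [ROWS=2; VALUES=1,0,1] show
HANOVA [PRINT=some; ANALYSIS=show] !p(v1,v2); !p(f1,f2,f3)
```

Setting the off-diagonal value to a non-zero number instead would also display the analysis of covariance of v1 with v2.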
The matrix of coefficients can be saved using the COEFFICIENT option, and the sums of squares and products of the variates using the SSPM option.
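A minimal sketch of saving results without printed output, again assuming the data structures from the Example section. The identifiers coef and sscp are hypothetical, and the sketch assumes that HANOVA defines the saved structures itself; whether they need to be declared beforehand is not stated here.

```genstat
"coef and sscp are hypothetical identifiers chosen for this sketch"
HANOVA [PRINT=none; COEFFICIENT=coef; SSPM=sscp] !p(v1,v2); !p(f1,f2,f3)
"Display the saved matrix with the ordinary PRINT directive"
PRINT coef
```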
Options: PRINT, INCHANNEL, FORMAT, ANALYSIS, COEFFICIENT, SSPM.
Parameters: VARIATES, FACTORS.
Method
HANOVA uses the method described by Gower (1962).
Action with RESTRICT
Account is taken of restriction on any factor, or on the first variate in the VARIATES parameter: subsequent variates must either have the same restriction, or be unrestricted.
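The following sketch shows one way the rule could be applied, assuming the data structures from the Example section. The condition is arbitrary and chosen only for illustration: the restriction is placed on the first variate, v1, while v2 is left unrestricted.

```genstat
"Restrict the first variate only; v2 is left unrestricted"
RESTRICT v1; CONDITION=v1.LT.14
HANOVA !p(v1,v2); !p(f1,f2,f3)
"Cancel the restriction afterwards"
RESTRICT v1
```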
Reference
Gower, J.C. (1962). Variance component estimation for unbalanced hierarchical classifications. Biometrics, 18, 537-542.
See also
Directive: REML.
Commands for: REML analysis of linear mixed models.
Example
CAPTION 'HANOVA example',\
        !t('Analysis of variance and covariance of two',\
        'variables grouped in a four-stratum hierarchy.',\
        'Data from Gower, J.C. (1962, Biometrics 18, 537).');\
        STYLE=meta,plain
FACTOR  [LEVELS=2; VALUES=1,1,1,1,1,1,1,1,2,2,2,2,2,2] f1
&       [LEVELS=2; VALUES=1,1,1,1,2,2,2,2,1,1,1,2,2,2] f2
&       [LEVELS=3; VALUES=1,1,2,3,1,2,2,3,1,2,3,1,1,1] f3
VARIATE [VALUES=3,4,6,3,3,2,1,2,8,12,11,14,12,15] v1
&       [VALUES=5,24,7,25,57,42,18,14,-12,-34,-14.5,-42.2,-21.5,-12] v2
HANOVA  !p(v1,v2); !p(f1,f2,f3)