Performs analysis of variance for unbalanced designs (R.W. Payne).

### Options

| `PRINT` = string tokens | Controls printed output from the analysis (`aovtable`, `effects`, `means`, `residuals`, `screen`, `%cv`); default `aovt`, `mean` |
|---|---|
| `FACTORIAL` = scalar | Limit on number of factors in a treatment term; default 3 |
| `PFACTORIAL` = scalar | Limit on number of factors in printed tables of predicted means; default 3 |
| `NOMESSAGE` = string tokens | Which warning messages to suppress (`dispersion`, `leverage`, `residual`, `aliasing`, `marginality`, `vertical`, `df`, `inflation`); default `*` i.e. none |
| `FPROBABILITY` = string token | Printing of probabilities for variance ratios in the analysis-of-variance table (`yes`, `no`); default `no` |
| `TPROBABILITY` = string token | Printing of probabilities for t-tests of effects (`yes`, `no`); default `no` |
| `PLOT` = string tokens | Which residual plots to provide (`fittedvalues`, `normal`, `halfnormal`, `histogram`); default `*` i.e. none |
| `COMBINATIONS` = string token | Factor combinations for which to form predicted means (`present`, `estimable`); default `esti` |
| `ADJUSTMENT` = string token | Type of adjustment to be made when predicting means (`marginal`, `equal`, `observed`); default `marg` |
| `WEIGHTS` = variate | Weights for each unit; default `*` i.e. all units with weight one |
| `PSE` = string tokens | Types of standard errors to be printed with the predicted means (`differences`, `alldifferences`, `lsd`, `alllsd`, `means`, `ese`); default `diff` |
| `LSDLEVEL` = scalar | Significance level (%) for least significant differences; default 5 |
| `RMETHOD` = string token | Type of residuals to plot (`simple`, `standardized`); default `simp` |

### Parameters

| `Y` = variates | Data values to be analysed |
|---|---|
| `RESIDUALS` = variates | Variate to save the residuals from each analysis |
| `FITTEDVALUES` = variates | Variate to save the fitted values from each analysis |
| `SAVE` = identifiers | Saves details of each analysis for subsequent use with the `AUDISPLAY` procedure |

### Description

This procedure carries out analysis of variance using the regression directives in Genstat. It is particularly useful for designs that are unbalanced and thus cannot be analysed by the `ANOVA` directive.

The method of use is similar to that for `ANOVA`. The treatment terms to be fitted must be specified, before calling the procedure, by the `TREATMENTSTRUCTURE` directive. Similarly, any covariates must be indicated by the `COVARIATE` directive. The procedure also takes account of any blocking structure specified by the `BLOCKSTRUCTURE` directive. However, it cannot produce stratified analyses like those generated by `ANOVA`, and is able to estimate treatments and covariates only in the “bottom stratum”. So, for example, the full analysis can be produced for a randomized block design, where the treatments are all estimated on the plots within blocks, but it cannot produce the whole-plot analysis in a split-plot design.
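As a minimal sketch of this set-up, the structure directives are given before the procedure is called. The sketch assumes the `Block`, `Leachate`, `Dilution` and `Logit%h` structures defined in the Example section; the covariate `Initial` is a hypothetical variate that is not part of the original data.

```genstat
"Sketch only: Initial is a hypothetical covariate, not in the example data"
BLOCKSTRUCTURE Block
TREATMENTSTRUCTURE Leachate*Dilution
COVARIATE Initial
AUNBALANCED Logit%h
```

Because the design has a single block term, all treatments and the covariate are estimated within blocks, which is the case the procedure can handle in full.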

The parameters of the procedure are identical to those of `ANOVA`. The variates to be analysed are specified by the `Y` parameter. Residuals and fitted values can be saved using the `RESIDUALS` and `FITTEDVALUES` parameters respectively. Finally, the `SAVE` parameter allows details of the analysis to be saved so that further output can be obtained using the `AUDISPLAY` procedure, or information can be copied into Genstat data structures using the `AUKEEP` procedure. (Note that this is a regression save structure, not an `ANOVA` structure, so it cannot be used with the directives `ADISPLAY` or `AKEEP`.)
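A sketch of saving results from an analysis, using `Logit%h` from the Example section. The identifiers `Res`, `Fit` and `Ausave` are hypothetical names chosen here, and the `SAVE` option shown for `AUDISPLAY` is an assumption that should be checked against that procedure's own documentation.

```genstat
"Res, Fit and Ausave are illustrative names, not part of the original example"
AUNBALANCED [PRINT=aovtable] Y=Logit%h; RESIDUALS=Res; FITTEDVALUES=Fit;\
            SAVE=Ausave
"Assumed AUDISPLAY interface: a SAVE option naming the saved analysis"
AUDISPLAY [PRINT=means; SAVE=Ausave]
```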

Printed output is controlled by the `PRINT` option, with settings: `aovtable` to print the analysis-of-variance table, `effects` to print the effects (as estimated by Genstat regression), `means` to print tables of predicted means with standard errors, `residuals` to print residuals and fitted values, `screen` to print “screening” tests for treatment terms, and `%cv` to print the coefficient of variation. The default is to print the analysis-of-variance table and tables of means.

The model is fitted sequentially: first any block terms, then any covariates, and then the treatments. Thus, the sum of squares in each line of the analysis-of-variance table is for the term concerned, eliminating the effects of terms in earlier lines and ignoring the effects of terms lower in the table. In particular, the sums of squares for covariates ignore the treatments, rather than being calculated after eliminating the treatments (as with the `ANOVA` directive). Alternatively, the `screen` setting calls the `RSCREEN` procedure to provide screening tests for the treatment terms: marginal tests to assess the effect of adding each term to the simplest possible model (i.e. a model containing any blocks and covariates, and any terms marginal to the term); conditional tests to assess the effect of adding each term to the fullest possible model (i.e. a model containing all terms other than those to which the term is marginal). For example, if we have

`BLOCKSTRUCTURE Blocks`

and

`TREATMENTSTRUCTURE A + B + A.B`

the marginal test for `A` will show the effect of adding `A` to a model containing only `Blocks`, while the conditional test will show the effect of adding `A` to a model containing `Blocks` and `B`. (The terms `A` and `B` are marginal to `A.B`.)
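Because the fit is sequential, the order of the terms matters in an unbalanced design. The following sketch compares the two orderings of the main effects and then requests the screening tests. It assumes factors `Blocks`, `A` and `B` as above, a hypothetical response variate `Yield`, and that the terms are fitted in the order in which they are written.

```genstat
"Yield is a hypothetical response variate"
BLOCKSTRUCTURE Blocks
"A is fitted ignoring B; B is fitted eliminating A"
TREATMENTSTRUCTURE A + B + A.B
AUNBALANCED [PRINT=aovtable; FPROBABILITY=yes] Yield
"B is fitted ignoring A; A is fitted eliminating B"
TREATMENTSTRUCTURE B + A + A.B
AUNBALANCED [PRINT=aovtable; FPROBABILITY=yes] Yield
"Marginal and conditional tests for each treatment term"
AUNBALANCED [PRINT=screen] Yield
```

The screening tests are defined by which other terms are in the model rather than by position in the formula, so they give one summary in place of the two order-dependent tables.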

Tables of means are calculated using the `PREDICT` directive. The first step (A) of the calculation forms the full table of predictions, classified by every factor in the model. The second step (B) averages the full table over the factors that do not occur in the table of means.

The `COMBINATIONS` option specifies which cells of the full table are to be formed in Step A. The default setting, `estimable`, fills in all the cells other than those that involve parameters that cannot be estimated, for example because of aliasing. Alternatively, setting `COMBINATIONS=present` excludes the cells for factor combinations that do not occur in the data. The `ADJUSTMENT` option then defines how the averaging is done in Step B. The default setting, `marginal`, forms a table of marginal weights for each factor, containing the proportion of observations with each of its levels; the full table of weights is then formed from the product of the marginal tables. The setting `equal` weights all the combinations equally. Finally, the setting `observed` uses the `WEIGHTS` option of `PREDICT` to weight each factor combination according to its own individual replication in the data.
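As an illustration of Step B, consider a two-factor model with factors $A$ (levels $i$) and $B$ (levels $j = 1, \dots, J$). Let $\hat{y}_{ij}$ be the prediction in cell $(i,j)$ of the full table, $n_{ij}$ the replication of that cell, and $n$ the total number of observations. This notation is introduced here for illustration and is not taken from the procedure; the formulas are a sketch of how the three `ADJUSTMENT` settings, as described above, would form the mean $\bar{y}_{i}$ for level $i$ of $A$.

```latex
\begin{aligned}
\text{marginal:}\quad \bar{y}_{i} &= \sum_{j} q_{j}\,\hat{y}_{ij},
  \qquad q_{j} = \frac{1}{n}\sum_{i'} n_{i'j},\\
\text{equal:}\quad \bar{y}_{i} &= \frac{1}{J}\sum_{j} \hat{y}_{ij},\\
\text{observed:}\quad \bar{y}_{i} &= \frac{\sum_{j} n_{ij}\,\hat{y}_{ij}}{\sum_{j} n_{ij}}.
\end{aligned}
```

When every $n_{ij}$ is the same, $q_{j} = 1/J$ and all three expressions coincide, so the choice of setting only matters for unbalanced data.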

The `PSE` option controls the types of standard errors that are produced to accompany the tables of means, with settings:

| `differences` | summary of standard errors for differences between pairs of means; |
|---|---|
| `alldifferences` | standard errors for differences between all pairs of means; |
| `lsd` | summary of least significant differences between pairs of means; |
| `alllsd` | least significant differences between all pairs of means; |
| `means` | standard errors of the means (relevant for comparing them with zero); |
| `ese` | approximate effective standard errors; these are formed by procedure `SED2ESE` with the aim of allowing good approximations to the standard errors for differences to be calculated by the usual formula $\mathrm{sed}_{i,j} = \sqrt{\mathrm{ese}_{i}^{2} + \mathrm{ese}_{j}^{2}}$. |

The default is `differences`. The `LSDLEVEL` option sets the significance level (as a percentage) for the least significant differences.
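For instance, a call along the following lines would request a summary of least significant differences at the 1% level together with effective standard errors. It is a sketch that assumes the structures from the Example section have already been defined.

```genstat
"Least significant differences at the 1% level, plus effective standard errors"
AUNBALANCED [PRINT=means; PSE=lsd,ese; LSDLEVEL=1] Logit%h
```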

The `FACTORIAL` option sets a limit on the number of factors that a higher-order term, such as an interaction, can contain; any terms with more factors are deleted from the analysis. Similarly, the `PFACTORIAL` option limits the number of factors in terms for which predicted means are printed. Probabilities can be printed for variance ratios by setting option `FPROBABILITY=yes`, and probabilities for t-tests of effects by setting option `TPROBABILITY=yes`. The `WEIGHTS` option allows a variate of weights to be specified for a weighted analysis of variance. The `NOMESSAGE` option allows various warning messages (produced by the `FIT` directive) to be suppressed, and the `PLOT` option allows various residual plots to be requested: `fittedvalues` for a plot of residuals against fitted values, `normal` for a Normal plot, `halfnormal` for a half-Normal plot, and `histogram` for a histogram of residuals. By default, simple residuals are plotted, but you can set option `RMETHOD=standardized` to plot standardized residuals instead.
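A sketch combining several of these options, again assuming the structures from the Example section. With `TREATMENTSTRUCTURE Leachate*Dilution`, the setting `FACTORIAL=1` would delete the two-factor interaction and leave a main-effects model.

```genstat
"Main effects only, with probabilities and standardized residual plots"
AUNBALANCED [PRINT=aovtable,effects; FACTORIAL=1; FPROBABILITY=yes;\
            TPROBABILITY=yes; NOMESSAGE=leverage,residual;\
            PLOT=fittedvalues,normal; RMETHOD=standardized] Logit%h
```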

Options: `PRINT`, `FACTORIAL`, `PFACTORIAL`, `NOMESSAGE`, `FPROBABILITY`, `TPROBABILITY`, `PLOT`, `COMBINATIONS`, `ADJUSTMENT`, `PSE`, `WEIGHTS`, `LSDLEVEL`, `RMETHOD`.

Parameters: `Y`, `RESIDUALS`, `FITTEDVALUES`, `SAVE`.

### Method

The y-variate is specified using the `MODEL` directive, along with any variates to save residuals and fitted values. The current settings of the `TREATMENTSTRUCTURE` and `COVARIATE` directives are recovered using the `GET` directive, and used to define the terms in the analysis (using the `TERMS` directive). The model is then fitted (using `FIT`), and `AUDISPLAY` is called to print the output and produce any plots of residuals.
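The same kind of fit can be imitated by hand with the regression directives. The following is only a rough sketch of the calls involved, not the procedure's actual code; it uses the structures from the Example section, and `Res` and `Fit` are hypothetical names.

```genstat
"Rough hand-written analogue; not the procedure's own code"
MODEL Logit%h; RESIDUALS=Res; FITTEDVALUES=Fit
TERMS Block + Leachate*Dilution
"Block is listed first so that it is fitted before the treatment terms"
FIT [PRINT=accumulated] Block + Leachate*Dilution
```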

### Action with `RESTRICT`

If the `Y` variate is restricted, only the units not excluded by the restriction will be analysed.
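As a sketch, assuming the structures from the Example section, the analysis could be confined to the first two blocks by restricting the y-variate before the call:

```genstat
"Analyse only the units in blocks 1 and 2"
RESTRICT Logit%h; CONDITION=Block.NE.3
AUNBALANCED Logit%h
"Remove the restriction afterwards"
RESTRICT Logit%h
```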

### See also

Procedures: `AUDISPLAY`, `AUGRAPH`, `AUPREDICT`, `AUMCOMPARISON`, `AUKEEP`.

Commands for: Analysis of variance.

### Example

    CAPTION 'AUNBALANCED example',\
            'Data from Genstat 5 Release 1 Reference Manual, page 340.';\
            STYLE=meta,plain
    FACTOR [NVALUES=36; LEVELS=3; VALUES=12(1...3)] Block
    FACTOR [NVALUES=36; LABELS=!t(baresoil,emerald,emergo)] Leachate
    &      [LABELS=!t('1','1/4','1/16','1/64')] Dilution
    VARIATE [NVALUES=36] Nhatch,Nnohatch
    READ Leachate,Dilution,Nhatch,Nnohatch
    1 2 109 318   3 4 54 350   3 1 * 415   2 2 783 212
    3 3 652 1375   2 4 490 816   1 3 95 1219   2 1 1012 66
    1 4 166 943   3 2 1059 313   1 1 257 1006   2 3 1058 234
    2 4 507 1119   1 2 194 840   1 3 175 1707   1 1 326 609
    3 4 142 980   2 3 286 230   3 2 546 313   2 2 * 301
    2 1 2471 112   3 3 76 489   1 4 208 503   3 1 * 325
    1 1 322 913   1 2 255 2246   3 2 1774 1446   2 2 999 193
    2 4 388 1836   3 4 221 1800   1 3 220 1902   2 1 2821 187
    3 1 1486 463   3 3 717 1473   1 4 143 941   2 3 968 550 :
    CALCULATE Logit%h = LOG(Nhatch/Nnohatch)
    BLOCKSTRUCTURE Block
    TREATMENTSTRUCTURE Leachate*Dilution
    AUNBALANCED [PSE=differences,alldifferences] Logit%h