Forms predictions from an unbalanced analysis of variance, performed by `AUNBALANCED` (R.W. Payne).

### Options

| Option | Description |
|---|---|
| `PRINT` = string tokens | What to print (`description`, `predictions`, `se`, `sed`, `sedsummary`, `ese`, `lsd`, `lsdsummary`, `vcovariance`); default `pred`, `sed` |
| `MODEL` = formula | Model to use to calculate the predictions; default `*` i.e. full model fitted by `AUNBALANCED` |
| `FACTORIAL` = scalar | Limit on number of factors or variates in each term specified by `MODEL`; default 3 |
| `COMBINATIONS` = string token | Factor combinations for which to form predicted means (`present`, `estimable`); default `esti` |
| `ADJUSTMENT` = string token | Type of adjustment to be made when predicting means (`marginal`, `equal`, `observed`); default `marg` |
| `WEIGHTS` = table | Weights classified by some or all of the factors in the model |
| `PREDICTIONS` = tables or scalars | Saves predictions; default `*` |
| `SE` = tables or scalars | Saves standard errors of predictions; default `*` |
| `SED` = symmetric matrices | Saves matrices of standard errors of differences between predictions; default `*` |
| `ESE` = table | Saves effective standard errors; default `*` |
| `LSD` = symmetric matrix | Saves least significant differences between predictions; default `*` |
| `LSDLEVEL` = scalar | Significance level (%) for least significant differences; default 5 |
| `VCOVARIANCE` = symmetric matrices | Saves variance-covariance matrices of predictions; default `*` |
| `SAVE` = identifier | Save structure (from `AUNBALANCED`) containing details of the analysis for which predictions are required; if omitted, output is from the most recent use of `AUNBALANCED` |

### Parameters

| Parameter | Description |
|---|---|
| `CLASSIFY` = vectors | Variates and/or factors to classify table of predictions |
| `LEVELS` = variates or scalars | To specify values of variates, levels of factors |
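As a minimal sketch of how the two parameters fit together, the call below asks for predictions at only some levels of a classifying factor. It borrows the factor `Dilution` from the example at the end of this page and assumes that the factor has the default levels 1...4 and that an `AUNBALANCED` analysis has already been run.

```genstat
" Predictions for the first two levels of Dilution only
  (assumes Dilution has the default levels 1...4) "
AUPREDICT Dilution; LEVELS=!(1,2)
```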

### Description

`AUPREDICT` can produce predicted means following an analysis of variance of an unbalanced design by `AUNBALANCED`. The predictions are calculated using the `PREDICT` directive. The first step (A) of the calculation forms a full table of predictions, classified by every factor in the model. The second step (B) averages the full table over the factors that do not occur in the table of means.
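The two steps can be sketched in symbols for a model with two factors, $A$ and $B$, and a table of predictions classified by $A$ alone. The notation is introduced here purely for illustration and is not part of the procedure: $\hat{\mu}_{ab}$ denotes the Step A prediction for cell $(a, b)$ and $w_{ab}$ the weight given to that cell in Step B.

```latex
\begin{align*}
% Step A: full table of predictions, one cell per factor combination
&\hat{\mu}_{ab}, \qquad a = 1, \dots, n_A, \quad b = 1, \dots, n_B \\
% Step B: weighted average over B, the factor that is not in the table
&\hat{\mu}_{a} = \sum_{b} w_{ab} \, \hat{\mu}_{ab},
  \qquad \sum_{b} w_{ab} = 1 \ \text{for each } a
\end{align*}
```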

The `COMBINATIONS` option specifies which cells of the full table are to be formed in Step A. The default setting, `estimable`, fills in all the cells other than those that involve parameters that cannot be estimated, for example because of aliasing. Alternatively, setting `COMBINATIONS=present` excludes the cells for factor combinations that do not occur in the data. The `ADJUSTMENT` and `WEIGHTS` options then define how the averaging is done in Step B. The `WEIGHTS` option allows you to specify your own table of weights to use in the averaging. Alternatively, if `WEIGHTS` is not set, the weights are formed automatically according to the setting of the `ADJUSTMENT` option. The default setting, `marginal`, of `ADJUSTMENT` forms a table of marginal weights for each factor, containing the proportion of observations with each of its levels; the full table of weights is then formed from the product of the marginal tables. The setting `equal` weights all the combinations equally. Finally, the setting `observed` uses the `WEIGHTS` option of `PREDICT` to weight each factor combination according to its own individual replication in the data.
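The calls below sketch how these options might be set, using the factors `Leachate` and `Dilution` from the example at the end of this page and assuming that the `AUNBALANCED` analysis there has already been run. The table `Wt` and its values are hypothetical, chosen only to illustrate the form of a weights table.

```genstat
" Leachate means, averaged over the Dilution levels with equal weights,
  using only the factor combinations that occur in the data "
AUPREDICT [COMBINATIONS=present; ADJUSTMENT=equal] Leachate

" Leachate means using a user-supplied table of weights for Dilution
  (hypothetical values, assumed here to sum to 1) "
TABLE [CLASSIFICATION=Dilution; VALUES=0.4,0.3,0.2,0.1] Wt
AUPREDICT [WEIGHTS=Wt] Leachate
```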

Printed output, which extends the output available from `PREDICT`, is controlled by settings of the `PRINT` option:

| Setting | Output |
|---|---|
| `description` | standardization policies used when forming the predictions, |
| `predictions` | predictions, |
| `se` | predictions and standard errors, |
| `sed` | standard errors for differences between the predictions, |
| `sedsummary` | summary of the standard errors for differences between the predictions, |
| `lsd` | least significant differences between the predictions, |
| `lsdsummary` | summary of the least significant differences between the predictions, |
| `ese` | approximate effective standard errors; these are formed by procedure `SED2ESE` with the aim of allowing good approximations to the standard errors for differences to be calculated by the usual formula $\mathrm{sed}_{i,j} = \sqrt{\mathrm{ese}_i^2 + \mathrm{ese}_j^2}$, and |
| `vcovariance` | variances and covariances of the predictions. |
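As a small worked instance of the effective standard error formula, suppose two predictions have effective standard errors of 0.3 and 0.4. These values are hypothetical and are chosen only so that the arithmetic is easy to follow.

```latex
\begin{align*}
\mathrm{sed}_{1,2} &\approx \sqrt{\mathrm{ese}_1^2 + \mathrm{ese}_2^2} \\
                   &= \sqrt{0.3^2 + 0.4^2} = \sqrt{0.09 + 0.16} = \sqrt{0.25} = 0.5
\end{align*}
```

The result is an approximation to the corresponding entry of the full matrix of standard errors of differences, not a replacement for it.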

The default is to print predictions and a summary of the standard errors of differences. The standard errors (and sed's) are relevant for the predictions when considered as means of those data that have been analysed, with the means formed according to the averaging policy defined by the options of `PREDICT`. The word *prediction* is used because these are predictions of what the means would have been if the factor levels had been replicated differently in the data; see Lane & Nelder (1982) for more details. The `LSDLEVEL` option specifies the significance level (%) to use in the calculation of least significant differences (default 5%).
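For instance, the following call is a sketch of how the printed output might be chosen explicitly, with least significant differences at the 1% level instead of the default 5%. It reuses the factors from the example at the end of this page and assumes that the `AUNBALANCED` analysis there has already been run.

```genstat
" Predictions with standard errors, plus 1% least significant
  differences, for the Leachate-by-Dilution table "
AUPREDICT [PRINT=predictions,se,lsd; LSDLEVEL=1] Leachate,Dilution
```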

Another extension in `AUPREDICT` is that you can produce predictions using a smaller model than the full model that has been fitted by `AUNBALANCED`. This can be useful if the full model contains many parameters. A substantial amount of time and computer workspace may then be needed to calculate the predictions and standard errors. Very large models may even exceed the capacity of some PCs.

You might choose to omit a term from the full model when forming a particular table of predictions if the term is orthogonal to all the terms involved in the table. For example, you might omit the term `blocks` when forming an `A`-by-`B` table of predictions if each combination of levels of the factors `A` and `B` is replicated the same number of times in every block. The justification is that an orthogonal term cannot affect the size of any of the differences between predictions. Different weighting of the levels of the orthogonal term may affect the overall mean of the predictions, but this is usually unimportant. If you omit the term, it is as though you had included it with weightings based on the observed replication of its levels in the data set, and in any well-designed data set these should provide a satisfactory outcome. You might also omit a term if it is nearly orthogonal to the terms involved in the table, and you are happy to ignore its effect on the predictions.

The model is specified by the `MODEL` option. The `FACTORIAL` option sets a limit on the number of factors or variates in each term specified by `MODEL`; the default is 3.
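The `blocks` example above might be written as follows. This is only a sketch: the identifiers `y`, `blocks`, `A` and `B` are hypothetical, and the earlier analysis is assumed to have been set up as shown in the comment.

```genstat
" Assumed earlier analysis (hypothetical identifiers):
    BLOCKSTRUCTURE blocks
    TREATMENTSTRUCTURE A*B
    AUNBALANCED y "

" A-by-B table of predictions from the smaller model that omits blocks;
  FACTORIAL=2 limits terms to at most two factors, so A*B is kept whole "
AUPREDICT [MODEL=A*B; FACTORIAL=2] A,B
```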

The `PREDICTIONS`, `SE`, `SED`, `ESE`, `LSD` and `VCOVARIANCE` options allow the results of the prediction to be saved in appropriate Genstat data structures.
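A sketch of saving results for later use instead of printing them is given below. The identifiers `Pred`, `SePred` and `SedPred` are hypothetical, and the call assumes that the `AUNBALANCED` analysis from the example at the end of this page has already been run.

```genstat
" Suppress the printed output and save the results instead "
AUPREDICT [PRINT=*; PREDICTIONS=Pred; SE=SePred; SED=SedPred] Leachate

" The saved tables and symmetric matrix can then be printed or reused "
PRINT Pred,SePred
PRINT SedPred
```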

The `SAVE` option allows you to specify the save structure from the analysis for which further output is required. If `SAVE` is not set, output will be produced for the most recent analysis from `AUNBALANCED`; however, none of the Genstat regression directives (`MODEL`, `TERMS`, `FIT`, `ADD`, `DROP` and so on) may then have been used in the interim.
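The sketch below shows how a save structure might be used to return to an earlier analysis after other regression work. The identifier `Asave` is hypothetical, and the way it is stored is an assumption: the code supposes that `AUNBALANCED` provides a `SAVE` parameter, which should be checked against the `AUNBALANCED` documentation.

```genstat
" Store the save structure (assumes AUNBALANCED has a SAVE parameter;
  Asave is a hypothetical identifier) "
AUNBALANCED [PRINT=aovtable] Logit%h; SAVE=Asave

" ... other analyses, possibly using the regression directives ... "

" Form predictions from the stored analysis, not the most recent one "
AUPREDICT [SAVE=Asave] Leachate
```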

Options: `PRINT`, `MODEL`, `FACTORIAL`, `COMBINATIONS`, `ADJUSTMENT`, `WEIGHTS`, `PREDICTIONS`, `SE`, `SED`, `ESE`, `LSD`, `LSDLEVEL`, `VCOVARIANCE`, `SAVE`.

Parameters: `CLASSIFY`, `LEVELS`.

### Method

The predictions are produced using the `PREDICT` directive.

### Reference

Lane, P.W. & Nelder, J.A. (1982). Analysis of covariance and standardization as instances of prediction. *Biometrics*, 38, 613-621.

### See also

Directive: `PREDICT`.

Procedures: `AUNBALANCED`, `AUDISPLAY`, `AUGRAPH`, `AUMCOMPARISON`, `AUKEEP`.

Commands for: Analysis of variance.

### Example

    CAPTION 'AUPREDICT example',\
            'Data from Genstat 5 Release 1 Reference Manual, page 340.';\
            STYLE=meta,plain
    FACTOR [NVALUES=36; LEVELS=3; VALUES=12(1...3)] Block
    FACTOR [NVALUES=36; LABELS=!t(baresoil,emerald,emergo)] Leachate
    & [LABELS=!t('1','1/4','1/16','1/64')] Dilution
    VARIATE [NVALUES=36] Nhatch,Nnohatch
    READ Leachate,Dilution,Nhatch,Nnohatch
    1 2  109  318   3 4   54  350   3 1    *  415
    2 2  783  212   3 3  652 1375   2 4  490  816
    1 3   95 1219   2 1 1012   66   1 4  166  943
    3 2 1059  313   1 1  257 1006   2 3 1058  234
    2 4  507 1119   1 2  194  840   1 3  175 1707
    1 1  326  609   3 4  142  980   2 3  286  230
    3 2  546  313   2 2    *  301   2 1 2471  112
    3 3   76  489   1 4  208  503   3 1    *  325
    1 1  322  913   1 2  255 2246   3 2 1774 1446
    2 2  999  193   2 4  388 1836   3 4  221 1800
    1 3  220 1902   2 1 2821  187   3 1 1486  463
    3 3  717 1473   1 4  143  941   2 3  968  550  :
    CALCULATE Logit%h = LOG(Nhatch/Nnohatch)
    BLOCKSTRUCTURE Block
    TREATMENTSTRUCTURE Leachate*Dilution
    AUNBALANCED [PRINT=aovtable] Logit%h
    AUPREDICT Leachate
    & Dilution
    & Leachate,Dilution