Calculates Wald and F tests for dropping terms from a regression (R.W. Payne).

### Options

`PRINT` = string token | Controls printed output (`waldtests`); default `wald` |
---|---|
`FACTORIAL` = scalar | Limit on number of factors in the model terms generated from the `TERMS` parameter; default 3 |
`Y` = variate | Y-variate from whose analysis to calculate the statistics; default is the last y-variate in `SAVE` |
`RDF` = scalar | Saves the residual d.f. used to calculate F probabilities when the dispersion is not fixed |
`SAVE` = regression save structure | Specifies the save structure (from `MODEL`) containing the analysis for which to calculate the tests; default is the save structure from the most recent regression |

### Parameters

`TERMS` = formula | Model terms for which tests are required |
---|---|
`WALDSTATISTIC` = scalar or pointer to scalars | Saves Wald statistics |
`DF` = scalar or pointer to scalars | Saves d.f. of Wald statistics |
`PROBABILITY` = scalar or pointer to scalars | Saves the probabilities for the Wald statistics if the dispersion is fixed, or the corresponding F statistics if it is estimated |

### Description

`RWALD` provides Wald tests to help you decide whether any terms can be dropped from a regression model. The model must have been fitted already by the regression commands (`MODEL`, `FIT` etc.) in the usual way. The tests are usually produced for the most recent regression analysis, but you can set the `SAVE` and `Y` options to request tests from an earlier analysis.
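As a minimal sketch of requesting tests from an earlier analysis, the statements below name the save structure when the model is defined and pass it to `RWALD` later. The identifier `Rain1` is hypothetical, and the variables are those of the cloud-seeding data in the Example section; the second regression is there only to make the first one no longer the most recent.

```genstat
" Keep the save structure of the first analysis under a chosen name. "
MODEL [SAVE=Rain1] Ly
FIT [PRINT=*] A + S + D + C + Lp + E + S.A

" A later regression now becomes the most recent analysis. "
MODEL Ly
FIT [PRINT=*] A + S

" Request the tests for the earlier analysis rather than the latest one. "
RWALD [SAVE=Rain1]
```

If the earlier `MODEL` statement had listed several y-variates, the `Y` option could be set as well to pick one of them.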

By default, `RWALD` produces tests for all the terms that can be dropped from the model: that is, for every term that is not marginal to another term in the model. For example, in the formula

`A + B + C + D + A.B + A.D + B.D`

the terms `C`, `A.B`, `A.D` and `B.D` can be dropped as there are no other terms in the model that contain all their factors (i.e. none to which they are marginal). However, `A` cannot be dropped until `A.B` and `A.D` have been dropped. You can use the `TERMS` parameter to request Wald tests for a specific set of terms. A missing value is then given for any term that cannot be dropped. The `FACTORIAL` option sets a limit on the number of factors or variates in each term that is formed from the `TERMS` formula (default 3).
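The calls below sketch how `TERMS` and `FACTORIAL` might be used. They assume that the model `A + S + D + C + Lp + E + S.A` from the Example section has already been fitted, so the comments on which terms can be dropped follow from the marginality rule above rather than from output that has been run here.

```genstat
" Tests for two chosen terms; both can be dropped from the fitted model. "
RWALD TERMS=C + S.A

" A is marginal to S.A, so a missing value is expected for it. "
RWALD TERMS=A + C

" FACTORIAL=1 keeps only the single-factor terms generated by the formula. "
RWALD [FACTORIAL=1] TERMS=A*(D+S+C+Lp+E)
```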

With option `PRINT=waldtests` (the default), `RWALD` prints a table with columns containing the Wald statistic, its number of degrees of freedom and a probability value. With an ordinary linear regression, `RWALD` will also print an F statistic, and use this to obtain the probability. Provided there is no aliasing between the parameters of the terms, these F statistics and probabilities will be identical to those that would be printed in the Change lines of the Summary of Analysis if the terms were dropped from the model explicitly by using the `DROP` or `TRY` directives. The advantage of `RWALD` is that the model does not have to be refitted (excluding each term) to calculate the information. It thus provides a much more efficient method of assessing the model.

F statistics are also given with any generalized linear model in which the dispersion is not fixed (e.g. models involving the gamma distribution). However, in generalized linear models with a fixed dispersion (e.g. binomial or Poisson), the probabilities are obtained by treating the Wald statistics as chi-square statistics. The deviances and deviance ratios used by `TRY` and `DROP` are calculated from the likelihoods of the generalized linear models, whereas the Wald and F statistics are essentially based on weighted sums of squares. So probabilities calculated by `RWALD` will no longer be identical to those given by `TRY` and `DROP`. However, both sets of probabilities are based on the asymptotic properties of their statistics, and so they should give similar conclusions.

The `WALDSTATISTIC` parameter can save the statistics, and the `DF` parameter can save their numbers of degrees of freedom. If you are making a Wald test for a single term, you can supply a scalar for each of these parameters. However, if you have several terms, you must supply a pointer which will then be set up to contain as many scalars as there are terms. Similarly the `PROBABILITY` parameter saves the probabilities for the Wald statistics if the dispersion is fixed, or the corresponding F statistics if it is estimated. The number of residual degrees of freedom for the F statistics can be saved, in a scalar, by the `RDF` option. This contains a missing value if the dispersion is fixed.
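A short sketch of saving the results, again assuming the model from the Example section has been fitted. All the identifiers on the right-hand sides (`Wsa`, `Dfsa`, `Prsa`, `Rdf`, `W`, `Df`, `Pr`) are hypothetical names chosen for illustration; `PRINT=*` is used to suppress the printed table.

```genstat
" Single term: each parameter receives a scalar. "
RWALD [PRINT=*; RDF=Rdf] TERMS=S.A; WALDSTATISTIC=Wsa; DF=Dfsa; PROBABILITY=Prsa
PRINT Wsa,Dfsa,Prsa,Rdf

" Several terms: each parameter receives a pointer to scalars, one per term. "
RWALD [PRINT=*] TERMS=D + C + Lp + E + S.A; WALDSTATISTIC=W; DF=Df; PROBABILITY=Pr
PRINT W[]
PRINT Df[]
PRINT Pr[]
```

The null suffix list `[]` refers to all the elements of a pointer, so the sketch does not need to assume how the individual scalars are labelled.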

Options: `PRINT`, `FACTORIAL`, `Y`, `RDF`, `SAVE`.

Parameters: `TERMS`, `WALDSTATISTIC`, `DF`, `PROBABILITY`.

### Method

`RWALD` uses `FCLASSIFICATION` to form the list of terms that can be dropped. It then calculates the statistics using estimates and variances saved using `RKESTIMATES`.
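For orientation, the textbook form of a Wald statistic built from estimates and their variances is given below. The source does not spell out the formula, so the notation is an assumption made here: $\hat{\beta}_T$ is the vector of estimated parameters for term $T$, $V_T$ is the corresponding block of their estimated variance-covariance matrix, $d$ is the number of parameters in the term (the d.f. of the test) and $r$ is the residual d.f. saved by `RDF`.

$$
W = \hat{\beta}_T^{\top} \, V_T^{-1} \, \hat{\beta}_T , \qquad F = \frac{W}{d}
$$

With a fixed dispersion, $W$ is referred to a chi-square distribution on $d$ degrees of freedom. With an estimated dispersion, $F$ is referred to an F distribution on $d$ and $r$ degrees of freedom.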

### See also

Commands for: Regression analysis.

### Example

    CAPTION 'RWALD example',\
            'Cloud seeding example; see Guide to Genstat Part 2, Section 3.3.';\
            STYLE=meta,plain
    " Variables are:
      A  Action (NS not seeded, S seeded)
      D  Days after first day of experiment
      S  Suitability for seeding (from model)
      C  Percent cloud cover
      P  Previous rainfall (in 10**7 cubic m)
      E  Type of cloud (1 or 2)
      Y  Subsequent rainfall (in 10**7 cubic m)"
    FACTOR [LABELS=!t(S,NS)] A
    FACTOR [LEVELS=2] E
    READ A,D,S,C,P,E,Y; FREPRESENTATION=labels,4(*),levels,*
    NS  0 1.75 13.4 0.274 2 12.85
    S   1 2.70 37.9 1.267 1  5.52
    S   3 4.10  3.9 0.198 2  6.29
    NS  4 2.35  5.3 0.526 1  6.11
    S   6 4.25  7.1 0.250 1  2.45
    NS  9 1.60  6.9 0.018 2  3.61
    NS 18 1.30  4.6 0.307 1  0.47
    NS 25 3.35  4.9 0.194 1  4.56
    NS 27 2.85 12.1 0.751 1  6.35
    S  28 2.20  5.2 0.084 1  5.06
    S  29 4.40  4.1 0.236 1  2.76
    S  32 3.10  2.8 0.214 1  4.05
    NS 33 3.95  6.8 0.796 1  5.74
    S  35 2.90  3.0 0.124 1  4.84
    S  38 2.05  7.0 0.144 1 11.86
    NS 39 4.00 11.3 0.398 1  4.45
    NS 53 3.35  4.2 0.237 2  3.66
    S  55 3.70  3.3 0.960 1  4.22
    NS 56 3.80  2.2 0.230 1  1.16
    S  59 3.40  6.5 0.142 2  5.45
    S  65 3.15  3.1 0.073 1  2.02
    NS 68 3.15  2.6 0.136 1  0.82
    S  82 4.01  8.3 0.123 1  1.09
    NS 83 4.65  7.4 0.168 1  0.28 :
    CALCULATE Lp,Ly = LOG10(P,Y)
    MODEL Ly
    TERMS A*(D+S+C+Lp+E)
    FIT [PRINT=model,estimates] A + S + D + C + Lp + E + S.A
    RWALD
    TRY [PRINT=model,summary; NOMESSAGE=residual,leverage; FPROB=yes]\
        D + C + Lp + E + S.A