Analyses non-standard generalized linear models (P.W. Lane).
Options
PRINT = string tokens | What to display (deviance, estimates, correlations, monitoring); default devi, esti |
---|---|
DISTRIBUTION = string token | Distribution of response (Normal, Poisson, binomial, gamma, inversenormal); default * indicates calculations supplied for non-standard distribution via procedure GLMDISTRIBUTION (see the details of the procedures called by GLM) |
LINK = string token | Link function (identity, logarithm, logit, reciprocal, power, squareroot, probit, complementaryloglog); default * indicates calculations supplied for non-standard link via procedure GLMLINK (see Method) |
EXPONENT = scalar | Exponent for power link; default -2 |
TERMS = list or formula | Explanatory variates, factors, and interactions specified as for the standard regression directives; default null model |
CONSTANT = string token | Whether to include constant term (estimate, omit); default esti |
INITIALLINEAR = variate | Initial guess at linear predictor, if specifying own link function and not defining procedure GLMINITIAL |
Parameters
Y = variates | Response variate; this parameter must be set |
---|---|
NBINOMIAL = variates | Totals for use when DISTRIBUTION=binomial; must then be set |
FITTEDVALUES = variates | To store correct fitted values |
Description
A range of standard generalized linear models can be fitted using the regression directives MODEL, FIT and so on. Procedure GLM allows non-standard models to be fitted: you can choose to define your own link function, or the distribution of the response variable, or both. The standard links and distributions can be chosen by setting the options DISTRIBUTION, LINK and EXPONENT as in the MODEL directive; non-standard ones require the definition of auxiliary procedures to carry out the necessary calculations: see the details of the procedures called by GLM. The terms in the fitted model are specified by the TERMS option, which may be set to a list of terms or to a formula, as in the TERMS directive, or may be left unset to fit a null model. The CONSTANT option may be set to estimate or omit a constant term. The Y parameter must be set to specify the response variate, and for a binomial distribution the NBINOMIAL parameter must be set, as in the MODEL directive.
The output from the procedure is controlled by the PRINT option: by default, the residual deviance with d.f. and the parameter estimates with s.e.s are given. Standard errors are based on the residual mean square for all distributions: there is no SCALE option as in the MODEL directive. After using the procedure, the regression directives RDISPLAY and RKEEP may be used to display or save results, as for standard models fitted with the FIT directive. However, some of the output will not be appropriate: the total deviance from the summary setting will be incorrect, but the residual deviance should be correct; also, the response variate, fitted values and residuals will be incorrect in the output for the fittedvalues setting and if RKEEP is used to save results. The correct fitted values may be saved with parameter FITTEDVALUES of GLM.
Options: PRINT, DISTRIBUTION, LINK, EXPONENT, TERMS, CONSTANT, INITIALLINEAR.
Parameters: Y, NBINOMIAL, FITTEDVALUES.
Method
The model is fitted by iteratively-reweighted least-squares, as outlined by Nelder & Wedderburn (1972).
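For orientation, one step of iteratively reweighted least squares can be written as below. The notation is an assumption made here, not taken from this page: \eta is the linear predictor, \mu the fitted value, g the link function, V the variance function, z the working response and w the working weight; prior weights and the dispersion parameter are omitted. It shows where the quantities supplied by GLMDISTRIBUTION (the variance function) and GLMLINK (the fitted values and the derivative of the link) enter the fit.

```latex
\eta_i = g(\mu_i), \qquad
z_i = \eta_i + (y_i - \mu_i)\,\frac{d\eta_i}{d\mu_i}, \qquad
w_i = \frac{1}{V(\mu_i)\,\left(d\eta_i/d\mu_i\right)^{2}}
```

Each iteration regresses z on the explanatory terms with weights w, then updates \eta and \mu from the new estimates, repeating until the deviance settles.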
If a non-standard distribution is required, the option DISTRIBUTION should be left unset and the GLMDISTRIBUTION procedure defined, before using the GLM procedure. The NBINOMIAL parameter must be included in the definition, even if the NBINOMIAL parameter of GLM is not used.
GLMDISTRIBUTION Y=variate; FITTEDVALUES=variate;\
VARIANCE=variate; DEVIANCE=scalar; NBINOMIAL=variate
Forms the variance function and the deviance using the fitted values and the response variate.
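As a minimal sketch of what such a definition might look like, the procedure below uses a gamma-type variance function. This choice is purely illustrative (the gamma distribution is already available through DISTRIBUTION=gamma), and the calculations are an assumption about a typical case rather than code taken from the procedure library. Only the parameter names come from the syntax above.

```genstat
"Hypothetical sketch: gamma-type variance function, for illustration only"
PROCEDURE 'GLMDISTRIBUTION'
  PARAMETER 'Y','FITTEDVALUES','VARIANCE','DEVIANCE','NBINOMIAL'; MODE=p
  "Variance function V(mu) = mu**2"
  CALCULATE VARIANCE = FITTEDVALUES**2
  "Deviance: 2 * sum of (y-mu)/mu - log(y/mu); needs y > 0 and mu > 0"
  CALCULATE DEVIANCE = 2 * SUM((Y-FITTEDVALUES)/FITTEDVALUES\
    - LOG(Y/FITTEDVALUES))
ENDPROC
```

NBINOMIAL appears in the parameter list although it is not used in the calculations, as the definition requires.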
If a non-standard link function is required, the option LINK should be left unset and the procedure GLMLINK defined, before using the GLM procedure, to specify the necessary calculations for the link function. The NBINOMIAL parameter must be included in the definition, even if the NBINOMIAL parameter of GLM is not used. In addition, either the GLMINITIAL procedure must be defined, to specify calculations to form an initial guess at the linear predictor, or the INITIALLINEAR option must be set to a variate that holds this initial guess.
GLMLINK LINEARPREDICTOR=variate; FITTEDVALUES=variate;\
DERIVATIVE=variate; NBINOMIAL=variate
Forms the fitted values and the derivative of the link function – the derivative of the linear predictor with respect to the fitted value – using the linear predictor.
GLMINITIAL Y=variate; LINEARPREDICTOR=variate; NBINOMIAL=variate
Forms initial values for the linear predictor using the response variate, adjusting if necessary to avoid values unsuitable for the link function.
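A GLMINITIAL definition for a probit-type link might be sketched as follows. It simply wraps the same calculation that the example on this page uses to form its INITIALLINEAR variate; packaging it as a procedure is an assumption made here for illustration.

```genstat
"Hypothetical sketch: initial linear predictor for a probit-type link"
PROCEDURE 'GLMINITIAL'
  PARAMETER 'Y','LINEARPREDICTOR','NBINOMIAL'; MODE=p
  "Adding 0.5 and 1 keeps the proportion strictly between 0 and 1,"
  "so that the Normal equivalent deviate is always defined"
  CALCULATE LINEARPREDICTOR = NED((Y+0.5)/(NBINOMIAL+1))
ENDPROC
```

With this procedure defined, the INITIALLINEAR option of GLM would not need to be set.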
Action with RESTRICT
Any restriction on a variate in the Y parameter list is applied to all calculations. No vector in the TERMS list or formula should be restricted, unless with the same restriction as for the Y variate.
Reference
Nelder, J.A. & Wedderburn, R.W.M. (1972). Generalized linear models. Journal of the Royal Statistical Society, Series A, 135, 370-384.
See also
Directive: MODEL.
Commands for: Regression analysis.
Example
CAPTION 'GLM example',\
        !t('The example estimates the toxicity of two derris roots to',\
        'grain beetles using a probit model with control mortality',\
        '(data from Finney, 1971, Probit analysis, p. 132).');\
        STYLE=meta,plain
VARIATE [NVALUES=9] logconc,nspray,ndead
READ logconc,nspray,ndead
 2.17 142 142   2.00 127 126   1.68 128 115   1.08 126  58
 1.79 125 125   1.66 117 115   1.49 127 114   1.17  51  40
 0.57 132  37 :
FACTOR [LEVELS=2; VALUES=4(1),5(2)] root
CAPTION !t('In the control, 129 insects were sprayed with the medium used',\
        'in the spray, but with no derris; 21 died therefore assume 17%',\
        'control mortality. First fit by applying Abbott''s formula.')
CALCULATE adjnspray,adjndead = 0.83 * nspray,ndead
MODEL [DISTRIBUTION=binomial; LINK=probit] adjndead; NBINOMIAL=adjnspray
TERMS logconc,root
FIT logconc,root
CAPTION 'Now define an exact link function.'
PROCEDURE 'GLMLINK'
  PARAMETER 'LINEARPREDICTOR','FITTEDVALUES','DERIVATIVE','NBINOMIAL'; MODE=p
  SCALAR [VALUE=0.17] CONTROLM
  CALCULATE FITTEDVALUES =\
    NBINOMIAL*(CONTROLM+(1-CONTROLM)*NORMAL(LINEARPREDICTOR))
  & DERIVATIVE = SQRT(2*3.14159)*EXP(LINEARPREDICTOR**2/2)\
    / NBINOMIAL / (1-CONTROLM)
ENDPROC
CAPTION 'Calculate initial linear predictor.'
CALCULATE lp = NED((ndead+0.5)/(nspray+1))
GLM [DISTRIBUTION=binomial; TERMS=logconc,root; INITIALLINEAR=lp]\
    Y=ndead; NBINOMIAL=nspray
CAPTION !t('S.e.s for binomial should be derived from the inverse',\
        'matrix without multiplying by the root mean deviance.')
RKEEP INVERSE=imat; ESTIMATES=estimate
DIAGONAL [ROWS=3] dmat
CALCULATE dmat = imat
& dmat = SQRT(dmat)
VARIATE [NVALUES=3] se
EQUATE OLD=dmat; NEW=se
PRINT estimate,se
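The DERIVATIVE expression in the GLMLINK procedure of the example can be checked by differentiating the fitted-value formula. In the sketch below the symbols are assumptions introduced for the derivation: n stands for NBINOMIAL, c for CONTROLM (0.17), \eta for LINEARPREDICTOR, \mu for FITTEDVALUES, and \Phi and \phi for the standard Normal distribution and density functions.

```latex
\mu = n\,\bigl[c + (1-c)\,\Phi(\eta)\bigr]
\quad\Longrightarrow\quad
\frac{d\mu}{d\eta} = n\,(1-c)\,\phi(\eta),
\qquad
\phi(\eta) = \frac{1}{\sqrt{2\pi}}\,e^{-\eta^{2}/2}

\frac{d\eta}{d\mu}
  = \frac{1}{n\,(1-c)\,\phi(\eta)}
  = \frac{\sqrt{2\pi}\;e^{\eta^{2}/2}}{n\,(1-c)}
```

This agrees term by term with SQRT(2*3.14159)*EXP(LINEARPREDICTOR**2/2) / NBINOMIAL / (1-CONTROLM) in the example.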