Analyses non-standard generalized linear models (P.W. Lane).
Options
| PRINT = string tokens | What to display (deviance, estimates, correlations, monitoring); default devi, esti |
|---|---|
| DISTRIBUTION = string token | Distribution of response (Normal, Poisson, binomial, gamma, inversenormal); default * indicates calculations supplied for non-standard distribution via procedure GLMDISTRIBUTION (see the details of the procedures called by GLM) |
| LINK = string token | Link function (identity, logarithm, logit, reciprocal, power, squareroot, probit, complementaryloglog); default * indicates calculations supplied for non-standard link via procedure GLMLINK (see Method) |
| EXPONENT = scalar | Exponent for power link; default -2 |
| TERMS = list or formula | Explanatory variates, factors, and interactions specified as for the standard regression directives; default null model |
| CONSTANT = string token | Whether to include constant term (estimate, omit); default esti |
| INITIALLINEAR = variate | Initial guess at linear predictor, if specifying own link function and not defining procedure GLMINITIAL |
Parameters
| Y = variates | Response variate; this parameter must be set |
|---|---|
| NBINOMIAL = variates | Totals for use when DISTRIBUTION=binomial; must then be set |
| FITTEDVALUES = variates | To store correct fitted values |
Description
A range of standard generalized linear models can be fitted using the regression directives MODEL, FIT and so on. Procedure GLM allows non-standard models to be fitted: you can choose to define your own link function, or the distribution of the response variable, or both. The standard links and distributions can be chosen by setting the options DISTRIBUTION, LINK and EXPONENT as in the MODEL directive; non-standard ones require the definition of auxiliary procedures to carry out the necessary calculations: see the details of the procedures called by GLM. The terms in the fitted model are specified by the TERMS option, which may be set to a list of terms or to a formula, as in the TERMS directive, or may be left unset to fit a null model. The CONSTANT option may be set to estimate or omit a constant term. The Y parameter must be set to specify the response variate, and for a binomial distribution the NBINOMIAL parameter must be set, as in the MODEL directive.
The output from the procedure is controlled by the PRINT option: by default, the residual deviance with d.f. and the parameter estimates with s.e.s are given. Standard errors are based on the residual mean square for all distributions: there is no SCALE option as there is in the MODEL directive. After using the procedure, the regression directives RDISPLAY and RKEEP may be used to display or save results, as for standard models fitted with the FIT directive. However, some of the output will not be appropriate: the total deviance from the summary setting will be incorrect, although the residual deviance should be correct; also, the response variate, fitted values and residuals will be incorrect in the output for the fittedvalues setting and in results saved with RKEEP. The correct fitted values may be saved with the FITTEDVALUES parameter of GLM.
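As a minimal sketch of saving the correct fitted values, the call below reuses the identifiers from the Example (ndead, nspray, logconc, root, lp) and assumes that a GLMLINK procedure has already been defined; fit is a hypothetical identifier introduced here for illustration.

```
"Assumes GLMLINK is already defined; fit is a hypothetical identifier"
GLM [DISTRIBUTION=binomial; TERMS=logconc,root; INITIALLINEAR=lp]\
  Y=ndead; NBINOMIAL=nspray; FITTEDVALUES=fit
"Compare the response with the fitted values stored by GLM"
PRINT ndead,fit
```

Taking the fitted values from this parameter, rather than from RKEEP, avoids the incorrect values described above.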
Options: PRINT, DISTRIBUTION, LINK, EXPONENT, TERMS, CONSTANT, INITIALLINEAR.
Parameters: Y, NBINOMIAL, FITTEDVALUES.
Method
The model is fitted by iteratively-reweighted least-squares, as outlined by Nelder & Wedderburn (1972).
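For orientation, one iteration of iteratively-reweighted least-squares can be written as below. The notation is assumed here rather than taken from the procedure itself: η is the linear predictor, μ the fitted values, V the variance function, z the working response, w the iterative weights and X the design matrix.

```latex
\begin{aligned}
z_i &= \eta_i + (y_i - \mu_i)\,\frac{d\eta_i}{d\mu_i}, \\
w_i &= \left[ V(\mu_i) \left( \frac{d\eta_i}{d\mu_i} \right)^{2} \right]^{-1}, \\
\hat{\beta} &= (X^{\mathsf{T}} W X)^{-1} X^{\mathsf{T}} W z,
\qquad W = \operatorname{diag}(w_i).
\end{aligned}
```

In these terms, a user-defined link has to supply μ and dη/dμ from η, and a user-defined distribution has to supply V(μ) and the deviance, which is the division of labour between the auxiliary procedures described in this section.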
If a non-standard distribution is required, the option DISTRIBUTION should be left unset and the GLMDISTRIBUTION procedure defined, before using the GLM procedure. The NBINOMIAL parameter must be included in the definition, even if the NBINOMIAL parameter of GLM is not used.
GLMDISTRIBUTION Y=variate; FITTEDVALUES=variate;\
VARIANCE=variate; DEVIANCE=scalar; NBINOMIAL=variate
Forms the variance function and the deviance using the fitted values and the response variate.
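A definition might look like the sketch below, written in the style of the GLMLINK procedure in the Example. It assumes a negative binomial response with a known aggregation parameter k, so that the variance function is μ + μ²/k and the deviance is 2Σ[y log(y/μ) − (y+k) log((y+k)/(μ+k))]. The scalar K, its value 2 and the local variate ylogy are illustrative choices, not part of the GLM procedure.

```
PROCEDURE 'GLMDISTRIBUTION'
  PARAMETER 'Y','FITTEDVALUES','VARIANCE','DEVIANCE','NBINOMIAL'; MODE=p
  "Assumed known aggregation parameter; the value 2 is illustrative only"
  SCALAR [VALUE=2] K
  "Variance function: mu + mu**2/k"
  CALCULATE VARIANCE = FITTEDVALUES + FITTEDVALUES**2 / K
  "y*log(y/mu) is taken as 0 when y=0; adding (Y==0) avoids LOG(0)"
  CALCULATE ylogy = Y * LOG((Y + (Y==0)) / FITTEDVALUES)
  "Deviance summed over all units"
  CALCULATE DEVIANCE = 2 * SUM(ylogy - (Y+K) * LOG((Y+K) / (FITTEDVALUES+K)))
ENDPROC
```

NBINOMIAL appears in the PARAMETER list, as required, even though this distribution makes no use of it.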
If a non-standard link function is required, the option LINK should be left unset and the procedure GLMLINK defined, before using the GLM procedure, to specify the necessary calculations for the link function. The NBINOMIAL parameter must be included in the definition, even if the NBINOMIAL parameter of GLM is not used. In addition, either the GLMINITIAL procedure must be defined, to specify calculations to form an initial guess at the linear predictor, or the INITIALLINEAR option must be set to a variate that holds this initial guess.
GLMLINK LINEARPREDICTOR=variate; FITTEDVALUES=variate;\
DERIVATIVE=variate; NBINOMIAL=variate
Forms the fitted values and the derivative of the link function – the derivative of the linear predictor with respect to the fitted value – using the linear predictor.
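As a worked instance, the link defined in the Example (a probit link with control mortality c = 0.17 and binomial totals n) gives the expressions below. Φ and φ denote the standard Normal distribution function and density; this notation is assumed here.

```latex
\begin{aligned}
\mu &= n \left\{ c + (1 - c)\,\Phi(\eta) \right\}, \\
\frac{d\mu}{d\eta} &= n\,(1 - c)\,\phi(\eta)
  = \frac{n\,(1 - c)}{\sqrt{2\pi}}\, e^{-\eta^{2}/2}, \\
\frac{d\eta}{d\mu} &= \frac{\sqrt{2\pi}\; e^{\eta^{2}/2}}{n\,(1 - c)}.
\end{aligned}
```

The first line is the expression assigned to FITTEDVALUES in the Example, and the last line is the expression assigned to DERIVATIVE.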
GLMINITIAL Y=variate; LINEARPREDICTOR=variate; NBINOMIAL=variate
Forms initial values for the linear predictor using the response variate, adjusting if necessary to avoid values unsuitable for the link function.
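A minimal sketch of such a procedure is given below. It assumes binomial data and a probit-type link, and uses the same starting values that the Example calculates by hand for INITIALLINEAR. The adjustment of 0.5 and 1 keeps the empirical proportion away from 0 and 1, where NED is not defined.

```
PROCEDURE 'GLMINITIAL'
  PARAMETER 'Y','LINEARPREDICTOR','NBINOMIAL'; MODE=p
  "Normal equivalent deviate of the adjusted proportion responding"
  CALCULATE LINEARPREDICTOR = NED((Y + 0.5) / (NBINOMIAL + 1))
ENDPROC
```

With a definition of this kind in place, the INITIALLINEAR option of GLM can be left unset.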
Action with RESTRICT
Any restriction on a variate in the Y parameter list is applied to all calculations. No vector in the TERMS list or formula should be restricted, unless with the same restriction as for the Y variate.
Reference
Nelder, J.A. & Wedderburn, R.W.M. (1972). Generalized linear models. Journal of the Royal Statistical Society, Series A, 135, 370-384.
See also
Directive: MODEL.
Commands for: Regression analysis.
Example
CAPTION 'GLM example',\
!t('The example estimates the toxicity of two derris roots to',\
'grain beetles using a probit model with control mortality',\
'(data from Finney, 1971, Probit analysis, p. 132).');\
STYLE=meta,plain
VARIATE [NVALUES=9] logconc,nspray,ndead
READ logconc,nspray,ndead
2.17 142 142 2.00 127 126 1.68 128 115 1.08 126 58 1.79 125 125
1.66 117 115 1.49 127 114 1.17 51 40 0.57 132 37 :
FACTOR [LEVELS=2; VALUES=4(1),5(2)] root
CAPTION !t('In the control, 129 insects were sprayed with the medium used',\
'in the spray, but with no derris; 21 died therefore assume 17%',\
'control mortality. First fit by applying Abbott''s formula.')
CALCULATE adjnspray,adjndead = 0.83 * nspray,ndead
MODEL [DISTRIBUTION=binomial; LINK=probit] adjndead; NBINOMIAL=adjnspray
TERMS logconc,root
FIT logconc,root
CAPTION 'Now define an exact link function.'
PROCEDURE 'GLMLINK'
PARAMETER 'LINEARPREDICTOR','FITTEDVALUES','DERIVATIVE','NBINOMIAL'; MODE=p
SCALAR [VALUE=0.17] CONTROLM
CALCULATE FITTEDVALUES =\
NBINOMIAL*(CONTROLM+(1-CONTROLM)*NORMAL(LINEARPREDICTOR))
& DERIVATIVE = SQRT(2*3.14159)*EXP(LINEARPREDICTOR**2/2)\
/ NBINOMIAL / (1-CONTROLM)
ENDPROC
CAPTION 'Calculate initial linear predictor.'
CALCULATE lp = NED((ndead+0.5)/(nspray+1))
GLM [DISTRIBUTION=binomial; TERMS=logconc,root; INITIALLINEAR=lp]\
Y=ndead; NBINOMIAL=nspray
CAPTION !t('S.e.s for binomial should be derived from the inverse',\
'matrix without multiplying by the root mean deviance.')
RKEEP INVERSE=imat; ESTIMATES=estimate
DIAGONAL [ROWS=3] dmat
CALCULATE dmat = imat
& dmat = SQRT(dmat)
VARIATE [NVALUES=3] se
EQUATE OLD=dmat; NEW=se
PRINT estimate,se