FITINDIVIDUALLY procedure

Fits regression models one term at a time (R.W. Payne).

Options

`PRINT` = string tokens	What to print (`model`, `deviance`, `summary`, `estimates`, `correlations`, `fittedvalues`, `accumulated`, `monitoring`, `confidence`); default `mode`, `summ`, `esti`
`CONSTANT` = string token	How to treat the constant (`estimate`, `omit`); default `esti`
`FACTORIAL` = scalar	Limit for expansion of model terms; default 3
`POOL` = string token	Whether to pool ss in accumulated summary between all terms fitted in a linear model (`yes`, `no`); default `no`
`DENOMINATOR` = string token	Whether to base ratios in accumulated summary on rms from model with smallest residual ss or smallest residual ms (`ss`, `ms`); default `ss`
`NOMESSAGE` = string tokens	Which warning messages to suppress (`dispersion`, `leverage`, `residual`, `aliasing`, `marginality`, `vertical`, `df`, `inflation`); default `*`
`FPROBABILITY` = string token	Printing of probabilities for variance and deviance ratios (`yes`, `no`); default `no`
`TPROBABILITY` = string token	Printing of probabilities for t-statistics (`yes`, `no`); default `no`
`SELECTION` = string tokens	Statistics to be displayed in the summary of analysis produced by `PRINT=summary`; `seobservations` is relevant only for a Normally distributed response, and `%cv` only for a gamma-distributed response (`%variance`, `%ss`, `adjustedr2`, `r2`, `seobservations`, `dispersion`, `%cv`, `%meandeviance`, `%deviance`, `aic`, `bic`, `sic`); default `%var`, `seob` if `DIST=normal`, `%cv` if `DIST=gamma`, and `disp` for other distributions
`PROBABILITY` = scalar	Probability level for confidence intervals for parameter estimates; default 0.95
`DEVIANCE` = scalar	Saves the residual deviance
`DF` = scalar	Saves the residual d.f.
`LACKOFFIT` = string token	Whether to use observations with replicated values of the explanatory variables to split the final residual term into a ‘true’ residual and lack of fit (`estimate`, `omit`); default `omit`

Parameter

`TERMS` = formula	Terms to be fitted

Description

FITINDIVIDUALLY is an alternative to the FIT directive, intended mainly for use with generalized linear models. For efficiency, FIT fits an entire generalized linear model at once, rather than one term at a time as in ordinary regression. As a result, the terms of the generalized linear model are pooled into a single line in the analysis of deviance table. To see the contribution of each individual term in that table, use FITINDIVIDUALLY instead of FIT.
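As a minimal sketch of the difference, the two calls below request the accumulated summary for the same model. It assumes that the structures `Type`, `Construction`, `Operation`, `Damage` and `Logservice` have already been set up as in the Example section of this page; the choice of terms is only illustrative.

```genstat
" Assumes the ship-damage data from the Example section are available."
MODEL     [DISTRIBUTION=poisson; LINK=log; OFFSET=Logservice] Damage
TERMS     Type + Construction + Operation
" FIT: the terms are fitted together, so the accumulated summary
  shows them pooled into a single line."
FIT       [PRINT=accumulated] Type + Construction + Operation
" FITINDIVIDUALLY: the terms are added one at a time, so each
  term has its own line in the accumulated summary."
FITINDIVIDUALLY [PRINT=accumulated] Type + Construction + Operation
```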

FITINDIVIDUALLY is used exactly like FIT. It must be preceded by a MODEL statement, and can be followed by RCHECK, RDISPLAY, RGRAPH, RKEEP, ADD, DROP, SWITCH and so on. It has a TERMS parameter to specify the terms to be fitted, like the parameter of FIT. It also has options PRINT, CONSTANT, FACTORIAL, POOL, DENOMINATOR, NOMESSAGE, FPROBABILITY, TPROBABILITY, SELECTION and PROBABILITY which operate like those of FIT.
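The following sketch illustrates this usage pattern: a MODEL statement first, then FITINDIVIDUALLY with some of the shared options, then further output and saved results from the usual regression directives. It again assumes the ship-damage structures from the Example section; the identifiers `Fitted` and `Est` are hypothetical names chosen for illustration.

```genstat
" Assumes the ship-damage data from the Example section are available."
MODEL     [DISTRIBUTION=poisson; LINK=log; OFFSET=Logservice] Damage
TERMS     Type + Construction + Operation
" Options shared with FIT: request probabilities for deviance ratios."
FITINDIVIDUALLY [PRINT=accumulated; FPROBABILITY=yes] \
          Type + Construction + Operation
" Further output from the fitted model, as after FIT."
RDISPLAY  [PRINT=estimates; TPROBABILITY=yes]
" Save fitted values and parameter estimates (hypothetical identifiers)."
RKEEP     FITTEDVALUES=Fitted; ESTIMATES=Est
```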

If you have observations with replicated values of the explanatory variables, you can set option LACKOFFIT=estimate to split the final residual term into a “true” residual (measured by the variation amongst the replicate observations) and lack of fit. FITINDIVIDUALLY then sets the dispersion parameter and its number of degrees of freedom in the regression save structure to the “true” residual deviance and its degrees of freedom, so that these will be used for standard errors, probabilities and so on in subsequent output. (These are the same values that you can set using the DISPERSION and DFDISPERSION options of MODEL.) The DEVIANCE option allows you to save the residual deviance, and the DF option saves the residual number of degrees of freedom.
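A small sketch of these options is given below. The data are hypothetical values invented purely for illustration (four values of an explanatory variate, each replicated three times), and the identifiers `Dose`, `Response`, `ResDev` and `ResDf` are likewise assumptions, not part of the procedure.

```genstat
" Hypothetical illustrative data: each Dose value occurs three times,
  so there are replicate observations for a lack-of-fit test."
VARIATE   [VALUES=1,1,1,2,2,2,3,3,3,4,4,4] Dose
VARIATE   [VALUES=2.1,1.8,2.4,3.9,4.3,4.0,5.2,5.6,5.1,5.9,6.3,6.0] Response
MODEL     Response
" Split the residual into lack of fit and a residual based on the
  replicates, and save the residual deviance and d.f."
FITINDIVIDUALLY [PRINT=accumulated; LACKOFFIT=estimate; FPROBABILITY=yes;\
          DEVIANCE=ResDev; DF=ResDf] Dose
PRINT     ResDev,ResDf
```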

Options: PRINT, CONSTANT, FACTORIAL, POOL, DENOMINATOR, NOMESSAGE, FPROBABILITY, TPROBABILITY, SELECTION, PROBABILITY, DEVIANCE, DF, LACKOFFIT.

Parameter: TERMS.

Method

FITINDIVIDUALLY uses FCLASSIFICATION to break the TERMS formula up into individual terms. It fits these one at a time using ADD, and then calls RDISPLAY to display the output. It uses procedure FACCOMBINATIONS to identify the observations with replicated values of the explanatory variables, so that it can calculate the lack of fit. It calls an auxiliary procedure _FITIRSET to set the dispersion parameter and its number of degrees of freedom in the regression save structure (this relies on knowledge of the internal layout of the save structure).
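The sequence described above can be approximated by hand for a formula with three main effects. This is only a sketch of the general idea under stated assumptions, not the source of the procedure: it assumes the ship-damage structures from the Example section, assumes that FIT with no terms fits the constant alone, and leaves out the lack-of-fit calculation and the call to _FITIRSET.

```genstat
" Assumes the ship-damage data from the Example section are available."
MODEL     [DISTRIBUTION=poisson; LINK=log; OFFSET=Logservice] Damage
" Declare the full set of terms so that every fit uses the same units."
TERMS     Type + Construction + Operation
" Start from the null model (constant only), with no output."
FIT       [PRINT=*]
" Add the terms one at a time, still with no output."
ADD       [PRINT=*] Type
ADD       [PRINT=*] Construction
ADD       [PRINT=*] Operation
" Display the accumulated summary and estimates for the final model."
RDISPLAY  [PRINT=accumulated,estimates]
```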

Action with `RESTRICT`

As in FIT, the y-variate (specified in an earlier MODEL directive) can be restricted to analyse a subset of the data.
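For instance, the following sketch restricts the response variate before the model is defined, so that only a subset of the units is analysed. It assumes the ship-damage structures from the Example section and that the factor `Type` has the default levels 1 to 5 for labels A to E; the particular condition and terms are chosen only for illustration.

```genstat
" Assumes the ship-damage data from the Example section are available.
  Restrict the response to ship types A to D (levels 1 to 4 of Type)."
RESTRICT  Damage; CONDITION=Type.NE.5
MODEL     [DISTRIBUTION=poisson; LINK=log; OFFSET=Logservice] Damage
TERMS     Construction + Operation
FITINDIVIDUALLY [PRINT=accumulated] Construction + Operation
" Remove the restriction once the analysis is complete."
RESTRICT  Damage
```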

Example

CAPTION   'FITINDIVIDUALLY example',\
          !t('Analysis of the damage caused by waves to forward sections',\
          'of cargo-carrying ships (McCullagh & Nelder 1989, page 204).');\
          STYLE=meta,plain
FACTOR    [NVALUES=40; LABELS=!T(A,B,C,D,E)] Type
&         [LABELS=!T('1960-64','1965-69','1970-74','1975-79')] Construction
&         [LABELS=!T('1960-74','1975-79')] Operation
GENERATE  Type,Construction,Operation
VARIATE   [NVALUES=40] Service,Damage
READ      Service,Damage
  127  0     63  0   1095  3   1095  4  1512  6   3353 18  * *  2244 11
44882 39  17176 29  28609 58  20370 53  7064 12  13099 44  * *  7117 18
 1179  1    552  1    781  0    676  1   783  6   1948  2  * *   274  1
  251  0    105  0    288  0    192  0   349  2   1208 11  * *  2051  4
   45  0      0  0    789  7    437  7  1157  5   2161 12  * *   542  1 :
" Use the log of the number of months of service as an offset in the
  model; CALCULATE turns zeroes into missing values, which will then
  be excluded by TERMS as required for a correct analysis."
CALCULATE Logservice = LOG(Service)
MODEL     [DISTRIBUTION=poisson; LINK=log; OFFSET=Logservice] Damage
TERMS     [FACTORIAL=2] Type * Construction * Operation
" Fit the main effects one at a time."
FITINDIVIDUALLY [PRINT=accumulated,estimates] Type + Construction + Operation

Updated on June 19, 2019
