Fits a generalized linear model with nonnegativity constraints; synonym FITNONNEGATIVE
(P.W. Goedhart & C.J.F. ter Braak).
Options

PRINT = string tokens
Printed output required (model, deviance, summary, estimates, correlations, fittedvalues, accumulated, monitoring); default mode, summ, esti

CONSTANT = string token
How to treat the constant (estimate, omit); default esti

POOL = string token
Whether to pool ss in accumulated summary between all terms fitted in a linear model (yes, no); default no

DENOMINATOR = string token
Whether to base ratios in accumulated summary on rms from model with smallest residual ss or smallest residual ms (ss, ms); default ss

NOMESSAGE = string tokens
Which warning messages to suppress (dispersion, leverage, residual, aliasing, marginality); default *

FPROBABILITY = string token
Printing of probabilities for variance ratios (yes, no); default no

TPROBABILITY = string token
Printing of probabilities for t-statistics (yes, no); default no

MAXCYCLE = scalar
Maximum number of iterations; default 100

TOLERANCE = scalar
Value against which the Kuhn-Tucker values are tested; default 1.0e-8

INITIALMODEL = string token
Initial model from which to start the iterative procedure (null, full, positive, own); default null

OWNINITIAL = variates
Specifies the variates that compose your own initial model; this option must be set when INITIALMODEL=own; default *

FORCED = formula
Model formula which is fitted irrespective of nonnegativity constraints; default *

Parameter

X = variates
List of predictors which are subject to nonnegativity constraints
Description
It is sometimes useful to impose nonnegativity constraints on regression coefficients. For example, the fitting of monotone regression splines (Ramsay 1988) requires nonnegative regression coefficients. Another example is regression of spectral data to determine the amounts of substances in a mixture. If an additive model holds, with the absorbance profiles as regressors, the amounts are estimated by the regression coefficients which should therefore be nonnegative. Note that an ordinary regression problem with general linear inequality constraints may be solved by using the solution to a derived regression problem with nonnegativity restrictions (Kennedy & Gentle 1980).
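The spectral-mixture case can be illustrated with a small numerical sketch (plain Python rather than Genstat; the two absorbance profiles and the amounts are hypothetical). The profiles act as regressors and the fitted coefficients recover the amounts. Here the data are noiseless and the true amounts are positive, so the unconstrained least-squares fit already satisfies the constraints; with noisy data a constrained fit such as RNONNEGATIVE provides would be needed.

```python
# Hypothetical two-substance mixture: absorbance profiles p1, p2 observed at
# four wavelengths, and a spectrum composed as y = 2.0*p1 + 0.5*p2.
p1 = [1.0, 2.0, 1.0, 0.0]
p2 = [0.0, 1.0, 2.0, 1.0]
y  = [2.0 * a + 0.5 * b for a, b in zip(p1, p2)]

# Normal equations for the two-predictor least-squares fit (no constant term).
s11 = sum(a * a for a in p1)
s12 = sum(a * b for a, b in zip(p1, p2))
s22 = sum(b * b for b in p2)
t1  = sum(a * c for a, c in zip(p1, y))
t2  = sum(b * c for b, c in zip(p2, y))

det = s11 * s22 - s12 * s12
amount1 = (s22 * t1 - s12 * t2) / det   # recovers 2.0
amount2 = (s11 * t2 - s12 * t1) / det   # recovers 0.5; both nonnegative here
```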
A call to RNONNEGATIVE must be preceded by a MODEL statement which defines the response variate and, if required, all other aspects of a generalized linear model. Only the first response variate is analysed. The only parameter, X, must be set to a list of explanatory variates which are subject to the nonnegativity constraints. The predictors with nonnegative coefficients are found by an iterative procedure which is explained in the method section. RDISPLAY and RKEEP can be used subsequent to RNONNEGATIVE.
Options PRINT, CONSTANT, POOL, DENOMINATOR, NOMESSAGE, FPROBABILITY and TPROBABILITY are similar to the options of the FIT directive. Setting PRINT=monitoring provides monitoring of the iterative procedure. The MAXCYCLE option can be used to specify the maximum number of iterations; if the iterative procedure has not converged within that number of iterations, a warning message is printed. The INITIALMODEL option provides different starting points for the iterative procedure. Setting null starts with no predictors in the initial model, full starts with all predictors, while the positive setting starts with those predictors that have a strictly positive regression coefficient in the full model. Finally, INITIALMODEL=own enables you to specify your own starting point; option OWNINITIAL must then be set to a subset of the predictors listed by the X parameter. Aliased terms, if any, are dropped after fitting the initial model. The use of the TOLERANCE option is explained in the method section.
It is sometimes desirable to include some predictors irrespective of the sign of their regression coefficient. Such predictors may be specified by means of the FORCED option. FORCED can be set to any model formula, i.e. it may contain factors and interactions as well as variates. The FORCED model formula is fitted first.
Units with one or more missing values in any term of the FORCED formula or the X predictors are excluded from the analysis. This implies that FIT applied to a subset of the predictors may give different results from RNONNEGATIVE.
Options: PRINT, CONSTANT, POOL, DENOMINATOR, NOMESSAGE, FPROBABILITY, TPROBABILITY, MAXCYCLE, TOLERANCE, INITIALMODEL, OWNINITIAL, FORCED.
Parameter: X.
Method
For ordinary regression, the problem is to find the linear least-squares solution subject to nonnegativity constraints, i.e.
min_b ║y – Xb║ subject to b ≥ 0
The Kuhn-Tucker conditions (Kennedy & Gentle 1980) are necessary and sufficient for finding the regression model with minimal sum of squares. These conditions are
KT1_j = [Xᵀ(y – Xb)]_j = 0 if b_j > 0
KT2_j = [Xᵀ(y – Xb)]_j ≤ 0 if b_j = 0
These conditions also hold when only a subset of the regression coefficients is subject to the nonnegativity constraint. In weighted regression, with diagonal matrix of weights W, the Kuhn-Tucker values are given by [XᵀW(y – Xb)].
Lawson & Hanson (1974) use these conditions in an algorithm which begins with b = 0. Next, b_j is allowed to enter the model, where j is selected as the index of the maximum positive element of KT2_j. If at any stage negative regression coefficients are found, the predictor with the most negative b_j is dropped from the model. In this way predictors are added and dropped until the Kuhn-Tucker conditions are satisfied. Lawson & Hanson (1974) proved that this stepwise method always finds the model with minimal sum of squares. Their proof can be generalized to show that the minimum will be found irrespective of the initial model used.
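The stepwise method can be sketched in outline as follows. This is a pure-Python illustration of the active-set idea, not the Genstat implementation; the published Lawson & Hanson algorithm also interpolates towards the boundary when a coefficient turns negative, whereas this sketch simply drops the offending predictor, as described above.

```python
def solve(A, b):
    """Solve the square system A x = b by Gaussian elimination with pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for k in range(n):
        p = max(range(k, n), key=lambda r: abs(M[r][k]))
        M[k], M[p] = M[p], M[k]
        for r in range(k + 1, n):
            f = M[r][k] / M[k][k]
            for c in range(k, n + 1):
                M[r][c] -= f * M[k][c]
    x = [0.0] * n
    for k in range(n - 1, -1, -1):
        x[k] = (M[k][n] - sum(M[k][c] * x[c] for c in range(k + 1, n))) / M[k][k]
    return x

def nnls(X, y, tol=1e-8):
    """Active-set sketch: start from b = 0, add the zero predictor with the
    largest positive Kuhn-Tucker value, drop predictors whose coefficient
    goes negative (simplified: no interpolation step, so pathological
    cycling is possible in principle)."""
    n, p = len(X), len(X[0])
    active = []                       # indices currently allowed to be nonzero
    b = [0.0] * p
    while True:
        resid = [y[i] - sum(X[i][j] * b[j] for j in range(p)) for i in range(n)]
        kt = [sum(X[i][j] * resid[i] for i in range(n)) for j in range(p)]
        candidates = [j for j in range(p) if j not in active and kt[j] > tol]
        if not candidates:
            return b                  # Kuhn-Tucker conditions satisfied
        active.append(max(candidates, key=lambda j: kt[j]))
        while True:
            # unconstrained least squares on the active set (normal equations)
            G = [[sum(X[i][a] * X[i][c] for i in range(n)) for c in active]
                 for a in active]
            h = [sum(X[i][a] * y[i] for i in range(n)) for a in active]
            s = solve(G, h)
            if all(v > 0 for v in s):
                break
            # drop the predictor with the most negative coefficient
            active.pop(min(range(len(s)), key=lambda k: s[k]))
        b = [0.0] * p
        for a, v in zip(active, s):
            b[a] = v
```

For instance, with orthogonal predictors and y = (2, −1), the second coefficient is pinned at zero while the first is fitted freely.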
McDonald & Diamond (1990) show that the Kuhn-Tucker values for generalized linear models are given by
[Xᵀ(y – μ) {V(μ) ∂η/∂μ}⁻¹]
where μ is the mean, V(μ) the variance function and η the linear predictor. These values can be calculated as follows
RKEEP ITERATIVEWEIGHTS=iter; YADJUSTED=yadj; LINEARPREDICTOR=lin
CALCULATE kuhntuck = X * iter * (yadj - lin)
If the log-likelihood is strictly concave, as is usually the case for generalized linear models, the generalized Kuhn-Tucker conditions are necessary and sufficient and the iterative procedure finds the minimum of the constrained optimization problem. To increase numerical precision for generalized linear models, the procedure sets the TOLERANCE option of the RCYCLE directive to 1.0e-6.
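As a concrete check of the McDonald & Diamond expression, the following sketch (plain Python with hypothetical data, not Genstat) evaluates the Kuhn-Tucker values for a Poisson GLM with log link. There V(μ) = μ and ∂η/∂μ = 1/μ, so the weight V(μ)∂η/∂μ equals 1 and the values reduce to Xᵀ(y – μ); at the unconstrained maximum-likelihood estimate they are zero, matching condition KT1.

```python
import math

def kuhn_tucker_poisson(X, y, b):
    """Kuhn-Tucker values [X' (y - mu) {V(mu) d(eta)/d(mu)}^-1]
    for a Poisson GLM with log link, where the weight term equals 1."""
    n, p = len(X), len(X[0])
    eta = [sum(X[i][j] * b[j] for j in range(p)) for i in range(n)]  # linear predictor
    mu = [math.exp(e) for e in eta]                                  # inverse link
    # V(mu) * d(eta)/d(mu) = mu * (1/mu) = 1, so no reweighting is needed
    return [sum(X[i][j] * (y[i] - mu[i]) for i in range(n)) for j in range(p)]

# Intercept-only model: the MLE of the intercept is log(mean(y)),
# at which the Kuhn-Tucker value is exactly zero.
X = [[1.0], [1.0], [1.0]]
y = [1.0, 2.0, 3.0]
kt_at_mle = kuhn_tucker_poisson(X, y, [math.log(2.0)])
```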
Calculation of the Kuhn-Tucker values can be subject to considerable rounding errors. Therefore, before starting the stepwise procedure, the predictors are standardized. Moreover, the response and the fitted values are scaled identically before they are subtracted in the calculation of the Kuhn-Tucker values. Due to rounding errors, an aliased predictor may have a Kuhn-Tucker value slightly larger than 0 and may consequently enter the model. The Kuhn-Tucker values KT2_j are therefore not tested against 0 but against the setting of the TOLERANCE option. Subsequent to the iterative procedure, aliased predictors, identified as having zero estimates and zero standard errors of estimates, are removed from the model. In the final fit the original non-standardized predictors are used.
Action with RESTRICT
Any restriction applied to vectors used in the regression model applies also to the results from RNONNEGATIVE.
References
Kennedy, W.J. & Gentle, J.E. (1980). Statistical Computing. Marcel Dekker, New York.
Lawson, C.L. & Hanson, R.J. (1974). Solving Least Squares Problems. Prentice-Hall, Englewood Cliffs, New Jersey.
McDonald, J.W. & Diamond, I.D. (1990). On the fitting of generalized linear models with nonnegativity parameter constraints. Biometrics, 46, 201-206.
Ramsay, J.O. (1988). Monotone regression splines in action. Statistical Science, 3, 425-461.
See also
Commands for: Regression analysis.
Example
CAPTION 'RNONNEGATIVE example',\
        'Data from Section 8.3 of Kennedy & Gentle (1980).';\
        STYLE=meta,plain
VARIATE [NVALUES=20] x[1...3],y
READ x[1...3],y
 0.4 94 39 378.4    7.5 63 39 270.3    6.7 77 85 310.0    2.7 65 50 284.2
 9.3  8 44  99.0    8.9 17 35 129.3    5.6 41 43 206.0    8.3 31  1 176.0
 8.5 33 20 179.0    8.9 39 16 196.4   10.0 71 60 285.2    1.9 66 78 286.1
 6.3 88  0 351.9    8.9 77 15 311.6    0.6 54 37 256.7    5.1 89 35 352.9
 8.8 63 64 266.9    2.8 57  6 266.1    7.1 66 26 282.6    3.5 22 53 152.9 :
MODEL y
RNONNEGATIVE [PRINT=monitoring,estimates] x[]
CALCULATE x[4] = -x[1]
RNONNEGATIVE [PRINT=monitoring,estimates; INITIALMODEL=full] x[]