Performs lasso using iteratively reweighted least-squares (D.A. Murray & P.H.C. Eilers).
Options
| PRINT = string token | What output to print (estimates, best, crossvalidation, progress, correlation, fitted, monitoring); default best |
|---|---|
| PLOT = string tokens | What graphs to plot (correlation, coefficients); default * i.e. none |
| TERMS = formula | Explanatory model |
| FACTORIAL = scalar | Limit on the number of factors and/or covariates in a model term; default 3 |
| LAMBDA = variate or scalar | Values for the tuning parameter lambda; must be set |
| VALIDATIONMETHOD = string token | Which cross-validation method to use (crossvalidation, gcv); default gcv |
| NCROSSVALIDATIONGROUPS = scalar | Number of groups for k-fold cross-validation; default 10 |
| NBOOT = scalar | Number of times to bootstrap the data to estimate standard errors and confidence limits for fitted values; default 100 |
| SEED = scalar | Seed for the random numbers used in cross-validation and then in bootstrapping; default 0 |
| CIPROBABILITY = scalar | Probability level for the confidence intervals for fitted values; default 0.95 |
| MAXCYCLE = scalar | Maximum number of iterations for the iterative process; default 200 |
| TOLERANCE = variate | Contains two values defining the convergence criterion for the iterative least-squares and the adjustment used to avoid division by zero in the penalty term; default !(0.0001, 1e-08) |
Parameters
| Y = variates | Response variate |
|---|---|
| BESTLAMBDA = scalars | Saves the optimal lambda value from cross-validation |
| CVSTATISTICS = matrices | Saves the cross-validation statistics |
| RESIDUALS = variates | Saves residuals for the optimal LAMBDA |
| FITTEDVALUES = variates | Saves fitted values for the optimal LAMBDA |
| ESTIMATES = variates | Saves parameter estimates for the optimal LAMBDA |
| SE = variates | Saves standard errors of the parameter estimates for the optimal LAMBDA |
| SEFITTED = variates | Saves standard errors of the fitted values, from bootstrapping, for the optimal LAMBDA |
| LOWER = variates | Saves lower confidence limits for the fitted values, from bootstrapping, for the optimal LAMBDA |
| UPPER = variates | Saves upper confidence limits for the fitted values, from bootstrapping, for the optimal LAMBDA |
Description
The RLASSO procedure performs L1-penalized regression (lasso) using iteratively reweighted least squares. The lasso minimizes the residual sum of squares subject to a bound on the sum of the absolute values of the model coefficients; the amount of shrinkage is controlled by the tuning parameter λ.
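As a sketch, in the notation of Tibshirani (1996), with response values y_i and standardized explanatory values x_ij, the constrained and penalized (Lagrangian) forms of this problem are equivalent; the LAMBDA option supplies the values of λ used in the penalized form (see the Method section below):

$$\hat{\beta} = \arg\min_{\beta} \sum_i \Bigl( y_i - \sum_j x_{ij}\beta_j \Bigr)^2 \quad\text{subject to}\quad \sum_j |\beta_j| \le t,$$

$$\hat{\beta} = \arg\min_{\beta} \sum_i \Bigl( y_i - \sum_j x_{ij}\beta_j \Bigr)^2 + \lambda \sum_j |\beta_j| .$$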
The response variate is specified by the Y parameter. The model to be fitted is defined by the TERMS option. The FACTORIAL option sets a limit on the number of variates and/or factors in the model terms generated from the TERMS model formula (as in the FIT directive).
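For example, a minimal call might look like this (a sketch only, using hypothetical identifiers yvar, x1, x2 and x3, and assuming the data have already been loaded):

    " grid of values to try for the tuning parameter lambda "
    CALCULATE lambdas = 10**(!(1,0.9...-2))
    " fit the lasso for the model x1,x2,x3 with response yvar "
    RLASSO [TERMS=x1,x2,x3; LAMBDA=lambdas] Y=yvar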
Printed output is controlled by the PRINT option, with settings:
| estimates | prints, for each value of λ, the lasso coefficients and their standard errors on both the standardized and original scales, |
|---|---|
| best | prints the lasso estimates for the optimal λ, |
| crossvalidation | prints the cross-validation results, together with the optimal lambda value, |
| progress | shows the progress of the k-fold cross-validation, |
| correlation | prints the correlations between the explanatory variables in the TERMS formula, |
| fitted | prints the fitted values for the optimal λ, with their standard errors and confidence limits, and |
| monitoring | prints monitoring information during bootstrapping. |
By default, PRINT=best.
Graphical output is controlled by the PLOT option:
| coefficients | plots the standardized coefficient estimates against the shrinkage factor, and |
|---|---|
| correlation | uses the DCORRELATION procedure to produce a graphical representation of the correlation matrix of the elements in TERMS. |
By default, nothing is plotted.
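Several PRINT and PLOT settings can be combined in a single call, for example (a sketch, again with the hypothetical identifiers lambdas, yvar, x1, x2 and x3 from above):

    " print cross-validation results, best estimates and fitted values,
      and plot the coefficient paths and the correlation matrix "
    RLASSO [PRINT=crossvalidation,best,fitted; PLOT=coefficients,correlation;\
           LAMBDA=lambdas; TERMS=x1,x2,x3] Y=yvar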
The LAMBDA option must be set to a variate (or a scalar, for a single value) defining the values to try for the tuning parameter λ. The MAXCYCLE option specifies the maximum number of iterations (default 200). The TOLERANCE option specifies the convergence criterion for the iterative procedure (default 0.0001), and the adjustment used to avoid division by zero in the penalty term (default 1e-08).
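For example, to allow more iterations and tighten the convergence criterion (a sketch; the values shown are arbitrary and the identifiers hypothetical):

    " larger iteration limit and stricter convergence criterion "
    RLASSO [TERMS=x1,x2,x3; LAMBDA=lambdas;\
           MAXCYCLE=500; TOLERANCE=!(0.00001,1e-08)] Y=yvar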
The VALIDATIONMETHOD option controls how RLASSO estimates the tuning parameter λ:
| crossvalidation | uses k-fold cross-validation, where the prediction error is calculated using the mean squared error, and |
|---|---|
| gcv | uses generalized cross-validation, as described by Tibshirani (1996). |
By default, VALIDATIONMETHOD=gcv.
For k-fold cross-validation the NCROSSVALIDATIONGROUPS option defines the number of subsets to use (default 10). The data are divided into roughly equal-sized subsets, and the model is fitted with each subset omitted in turn. The mean squared error for each omitted subset is calculated from the model fitted to the remaining subsets. The value of λ that minimizes the mean prediction error is taken as the optimal λ, and is used to obtain the lasso estimates. The optimal value of λ can be saved using the BESTLAMBDA parameter, and the prediction-error values can be saved using the CVSTATISTICS parameter.
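For example, a 5-fold cross-validation with a fixed seed, saving the optimal λ and the cross-validation statistics (a sketch with hypothetical identifiers; the seed value is arbitrary):

    RLASSO [TERMS=x1,x2,x3; LAMBDA=lambdas; VALIDATIONMETHOD=crossvalidation;\
           NCROSSVALIDATIONGROUPS=5; SEED=77215]\
           Y=yvar; BESTLAMBDA=optlambda; CVSTATISTICS=cvstats
    PRINT optlambda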
RLASSO can use bootstrapping to provide standard errors and lower and upper confidence limits for the fitted values. The NBOOT option specifies the number of bootstrap samples to take, and the CIPROBABILITY option sets the probability level for the confidence limits.
You can save results from the optimal fit using the RESIDUALS, FITTEDVALUES, ESTIMATES, SE, SEFITTED, LOWER and UPPER parameters. Note that the residuals are simple residuals, rather than standardized residuals.
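For example, to take 500 bootstrap samples and save 90% confidence limits together with the fitted values and their standard errors (a sketch with hypothetical identifiers):

    RLASSO [TERMS=x1,x2,x3; LAMBDA=lambdas; NBOOT=500; CIPROBABILITY=0.90]\
           Y=yvar; FITTEDVALUES=fit; SEFITTED=sefit; LOWER=lowlim; UPPER=upplim
    PRINT fit,sefit,lowlim,upplim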
Options: PRINT, PLOT, TERMS, FACTORIAL, LAMBDA, VALIDATIONMETHOD, NCROSSVALIDATIONGROUPS, NBOOT, SEED, CIPROBABILITY, MAXCYCLE, TOLERANCE.
Parameters: Y, BESTLAMBDA, CVSTATISTICS, RESIDUALS, FITTEDVALUES, ESTIMATES, SE, SEFITTED, LOWER, UPPER.
Method
The lasso is carried out using iteratively reweighted least-squares. RLASSO approximates the absolute sum of the coefficients ∑|β| by ∑(β²/|β|), so that the penalty term λ∑(β²/|β|) can be imposed as a quadratic penalty on the parameter estimates β. The penalty term is applied to the diagonal elements of the sums-of-squares-and-products matrix by setting the RIDGE option of the TERMS directive. For a given value of λ, the algorithm iterates to find the lasso estimates. The shrinkage factor s is estimated by

s = t / ∑|β⁰|

where ∑|β⁰| is the absolute sum of the full least-squares estimates, and t is the absolute sum of the lasso estimates, subject to t ≤ ∑|β⁰|.

The columns of the design matrix defined by TERMS are standardized. However, estimated coefficients are available for both the standardized and unstandardized data.
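As a sketch of the reweighting step implied by this approximation (writing β^(k) for the estimates at cycle k and δ for the small adjustment from the second element of TOLERANCE; the exact computations inside RLASSO may differ in detail), each cycle solves a ridge-type problem whose weights come from the previous cycle's estimates:

$$\beta^{(k+1)} = \arg\min_{\beta}\ \|y - X\beta\|^2 + \lambda \sum_j \frac{\beta_j^2}{|\beta_j^{(k)}| + \delta} = \Bigl( X^{\mathsf T} X + \lambda\,\mathrm{diag}\bigl( 1/(|\beta_j^{(k)}| + \delta) \bigr) \Bigr)^{-1} X^{\mathsf T} y ,$$

and the cycles continue until the convergence criterion (the first element of TOLERANCE) is satisfied or MAXCYCLE cycles have been performed.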
Action with RESTRICT
There must be no restrictions.
References
Hastie, T., Tibshirani, R. & Friedman, J. (2009). The Elements of Statistical Learning: Data Mining, Inference, and Prediction, 2nd Edition. Springer, New York.
Tibshirani, R. (1996). Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society, Series B, 58, 267-288.
See also
Procedure: LRIDGE
.
Commands for: Regression analysis.
Example
CAPTION 'RLASSO example'; STYLE=meta
" Prostate cancer data examining the correlation between the level of
  prostate-specific antigen and some clinical measures.
  See Tibshirani (1996), Regression and Selection by Lasso, JRSS B, 58, 267-288."
SPLOAD '%GENDIR%/Examples/RLAS-1.gsh'
SUBSET [train.eq.2] lcavol,lweight,age,lbph,svi,lcp,gleason,pgg45,lpsa
CALCULATE lambdas = 10**(!(1.8,1.7...-2))
RLASSO [PRINT=correlation,estimates,cross,best;\
       PLOT=coefficients,correlation; LAMBDA=lambdas;\
       TERMS=lcavol,lweight,age,lbph,svi,lcp,gleason,pgg45]\
       Y=lpsa; BEST=optlambda; ESTIMATES=estimates; SE=se
PRINT optlambda
PRINT estimates,se