Makes predictions using a regression tree (R.W. Payne).
Options
| Option | Description |
|---|---|
| PRINT = string tokens | Controls printed output (prediction, transcript); if PRINT is unset in an interactive run, BRPREDICT will ask what you want to print; in a batch run, the default is pred |
| TREE = tree | Specifies the tree |
| PREDICTIONS = variate | Saves the predictions for the observations |
| TERMINALNODES = pointer | Saves the numbers of the terminal nodes from which each prediction was obtained |
| MVINCLUDE = string token | Whether to provide predictions for units with missing or unavailable values of the x-variables (explanatory); default expl |
Parameters
| Parameter | Description |
|---|---|
| X = variates or factors | Explanatory variables |
| VALUES = scalars, variates or texts | Values to use for the explanatory variables; if these are unset for any variable, its existing values are used |
Description
BRPREDICT makes predictions using a regression tree, as constructed by the BREGRESSION procedure. The tree can be saved from BREGRESSION (using the TREE option of BREGRESSION), and specified for BRPREDICT using its own TREE option. Alternatively, BRPREDICT will ask you for the identifier of the tree if you do not specify TREE when running interactively.
The x-values for the predictions can be specified in the variates or factors listed by the X parameter. These must have the same names (and, for factors, the same levels) as those used originally to construct the tree. You can use the VALUES parameter to supply new values if those stored in any of the variates or factors are unsuitable.
If you do not set X when running interactively, BRPREDICT will ask you to supply the relevant x-values in turn, as required by the tree. Otherwise, if an x-variable in the tree is not specified in the X parameter list, its values are assumed to be unavailable (i.e. missing).
By default, when the x-variable required at a node in the tree is unavailable or contains a missing value, BRPREDICT will follow all the branches from that node and average the predictions that they generate. Set option MVINCLUDE=* if you would prefer the prediction to be missing.
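The missing-value rule can be sketched outside Genstat. The Python fragment below is a minimal illustration, not BRPREDICT's implementation. The tree, its node numbers, cut-points and predictions are all hypothetical, as are the function names and the `mvinclude` argument (which only mimics the MVINCLUDE option). It also assumes that values below a cut-point go left and that the average is an unweighted mean over the terminal nodes reached; Genstat may weight the branches differently.

```python
from statistics import mean

# Hypothetical tree: internal nodes split on one x-variable at a cut-point;
# terminal nodes carry a node number and a prediction.
TREE = {
    "var": "product", "cut": 12.0,
    "left": {"node": 1, "pred": 3.0},
    "right": {
        "var": "temp", "cut": 70.0,
        "left": {"node": 2, "pred": 3.2},
        "right": {"node": 3, "pred": 4.0},
    },
}


def reached(node, x):
    """Return the terminal nodes reached for the x-values in the dict x."""
    if "pred" in node:
        return [node]
    value = x.get(node["var"])  # None: missing, or variable not supplied
    if value is None:
        # The variable needed here is unavailable: follow every branch.
        return reached(node["left"], x) + reached(node["right"], x)
    branch = "left" if value < node["cut"] else "right"  # assumed convention
    return reached(node[branch], x)


def predict(tree, x, mvinclude=True):
    """Average over the terminal nodes reached (assumed unweighted mean).

    With mvinclude=False, return None (a missing prediction) whenever a
    missing x-value forced more than one branch to be followed.
    """
    leaves = reached(tree, x)
    if len(leaves) > 1 and not mvinclude:
        return None
    return mean(leaf["pred"] for leaf in leaves)
```

In this sketch a unit with `product` below the first cut-point gets a prediction even if `temp` is absent, because `temp` is never required on that path. Only a variable that is actually needed at a node triggers the averaging.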
The PRINT option controls printed output, with settings:
| Setting | Description |
|---|---|
| prediction | prints the predictions obtained using the tree; |
| transcript | prints the x-values supplied in response to questions in an interactive run. |
If you do not set PRINT in an interactive run, BRPREDICT will ask what you would like to print. In batch, the default is to print the predictions.
You can save the predictions in a variate using the PREDICTIONS option. The TERMINALNODES option saves a pointer, with one element for each prediction, containing the numbers of the terminal nodes reached in the tree to provide that prediction. An element is a scalar if the prediction was derived from a single node, or a variate if it involved more than one (because several branches were followed as the result of a missing x-value).
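The shape of the TERMINALNODES structure can be pictured with a short Python analogy. This is only a sketch: the function name and the input (a list of reached node numbers for each unit) are hypothetical, and a Python list stands in for a Genstat variate.

```python
def terminal_node_elements(reached_per_unit):
    """Build one element per prediction: a single node number when one
    terminal node was reached, otherwise the list of all nodes reached."""
    return [
        nodes[0] if len(nodes) == 1 else list(nodes)
        for nodes in reached_per_unit
    ]


# Hypothetical node numbers: the second unit had a missing x-value,
# so two branches were followed and two terminal nodes were reached.
elements = terminal_node_elements([[3], [1, 3], [2]])
```

Here `elements` holds a single number for the first and third units and a two-element list for the second, mirroring the scalar-or-variate distinction.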
Options: PRINT, TREE, PREDICTIONS, TERMINALNODES, MVINCLUDE.
Parameters: X, VALUES.
Method
BRPREDICT uses BIDENTIFY to find the terminal nodes of the tree that correspond to the values of the explanatory variables.
Action with RESTRICT
Restrictions are ignored.
See also
Procedures: BREGRESSION, BRKEEP, BRDISPLAY.
Commands for: Regression analysis, Multivariate and cluster analysis.
Example
CAPTION 'BRPREDICT example',!t('Water usage data (Draper & Smith 1981,',\
'Applied Regression Analysis, Wiley, New York).'); STYLE=meta,plain
READ temp,product,opdays,employ,water
58.8 7.107 21 129 3.067
65.2 6.373 22 141 2.828
70.9 6.796 22 153 2.891
77.4 9.208 20 166 2.994
79.3 14.792 25 193 3.082
81.0 14.564 23 189 3.898
71.9 11.964 20 175 3.502
63.9 13.526 23 186 3.060
54.5 12.656 20 190 3.211
39.5 14.119 20 187 3.286
44.5 16.691 22 195 3.542
43.6 14.571 19 206 3.125
56.0 13.619 22 198 3.022
64.7 14.575 22 192 2.922
73.0 14.556 21 191 3.950
78.9 18.573 21 200 4.488
79.4 15.618 22 200 3.295 :
"form the regression tree"
BREGRESSION [PRINT=*; Y=water; TREE=tree] employ,opdays,product,temp
"prune the tree"
BPRUNE [PRINT=table] tree; NEWTREES=pruned
"use tree 6 - renumber nodes"
BCUT [RENUMBER=yes] pruned[6]; NEWTREE=tree
"display the tree"
BRDISPLAY [PRINT=labelled] tree
"predict water usage and compare with the original data values"
BRPREDICT [PRINT=*; TREE=tree; PREDICTION=prediction]\
employ,opdays,product,temp
PRINT water,prediction; FIELD=8,12