Select menu: Stats | Regression Analysis | Regression Trees
Use this to form a regression tree, which is a mechanism for predicting a response variable from a set of independent variables. The construction process splits the observations into subsets, according to whether or not they are less than a particular value of one of the independent variates. The aim is to form subsets that have similar values for the response variate.
- After you have imported your data, from the menu select
Stats | Regression Analysis | Regression Trees.
OR
Stats | Multivariate Analysis | Trees | Regression Tree.
- Fill in the fields as required then click Run.
You can set additional options before running by clicking Options.
The predicted value of the response variable for each node of the tree is the mean of its values for the subset of observations at that node. The accuracy of the node is the squared distance of the values of the response variate from their mean for the observations at the node, divided by the total number of observations. The potential splits at the node are assessed by their effect on the accuracy, that is, the difference between the accuracy of the node and the sum of the accuracies of the two potential successor nodes. The node becomes a terminal node if none of the splits provides any improvement in accuracy, or if the mean square of the observations at the node is less than a specified limit.
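The split assessment described above can be illustrated outside Genstat with a minimal Python sketch. This is not Genstat's implementation. It rests on several assumptions: "squared distance" is read as the sum of squared deviations from the node mean, "total number of observations" is read as the size of the full data set, and candidate split values are taken as midpoints between adjacent distinct x values. The names node_accuracy and best_split are hypothetical, and the mean-square limit for terminal nodes is not modelled.

```python
from statistics import fmean


def node_accuracy(y_node, n_total):
    """Sum of squared deviations from the node mean, divided by n_total.

    Assumption: n_total is the number of observations in the full data set.
    """
    if not y_node:
        return 0.0
    mean = fmean(y_node)
    return sum((v - mean) ** 2 for v in y_node) / n_total


def best_split(x, y, n_total):
    """Return (improvement, threshold) for the best 'x < threshold' split.

    Returns None when no split improves on the node's own accuracy,
    in which case the node would be treated as terminal.
    """
    parent = node_accuracy(y, n_total)
    best = None
    values = sorted(set(x))
    # Assumed candidate thresholds: midpoints between adjacent distinct values.
    for lo, hi in zip(values, values[1:]):
        threshold = (lo + hi) / 2
        left = [yi for xi, yi in zip(x, y) if xi < threshold]
        right = [yi for xi, yi in zip(x, y) if xi >= threshold]
        children = node_accuracy(left, n_total) + node_accuracy(right, n_total)
        improvement = parent - children
        if improvement > 0 and (best is None or improvement > best[0]):
            best = (improvement, threshold)
    return best


# Made-up illustrative data: two clearly separated groups.
x = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
y = [5.0, 6.0, 7.0, 20.0, 21.0, 22.0]
result = best_split(x, y, n_total=len(y))
print(result)
```

With this made-up data, the chosen threshold falls between the two groups of x values, because that split leaves each successor node with response values close to its own mean.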
Available data
This lists data structures appropriate to the current input field. The contents will change as you move from one field to the next. Double-click a name to copy it to the current input field; alternatively, you can type the name directly into the input field.
Y-variate
Specifies the response variate for the regression.
X-variables
Specifies the independent (x) variables available for constructing the tree. You can transfer multiple selections from Available data by holding the Ctrl key on your keyboard while selecting items, then click to move them all across in one action.
Save tree in:
Specifies an identifier name to save the resulting tree in. The tree will be saved within a Genstat Tree data structure.
Prune
The construction of a regression tree generally results in overfitting; that is, the process continues to extend the branches of the tree beyond the point that can be justified statistically. One solution is to prune the tree to remove the uninformative sub-branches. Clicking the Prune button opens a menu where you can prune the tree.
See also
- Regression Trees Options
- Trees Further Output
- Classification Trees menu
- Tree Prune menu
- BREGRESSION procedure in command mode