Declares one or more tree data structures and initializes each one to have a single node known as its root.
No options
IDENTIFIER = identifiers | Identifiers of the trees
TREE declares and initializes Genstat tree structures. These can be used to represent hierarchical structures like classification trees, identification keys and regression trees. Trees of these types can be constructed by special-purpose procedures such as BCLASSIFICATION, and displayed by procedures such as BGRAPH. Most users will use only these special-purpose procedures, and will not need to operate on trees directly, nor to be aware of how they are formed, stored or manipulated. The procedures, however, are based on a suite of directives, functions and procedures summarized below, which provide the tool kit not only for the officially-supported tree facilities but also for user enhancements and extensions.
The tree structure is like a real tree, which starts from a root and then splits into branches, except that it is usually viewed as growing downwards instead of upwards. The branch-points in the tree are known as nodes, with the initial node being called the root (as in a real tree). There is also a node at the end of each branch, known as its terminal node. In Genstat a tree is similar to a pointer, with an element for each node. These elements are the identifiers of data structures which can be used to store information about the nodes. Usually the data structures will be pointers, so that several pieces of information can be stored for each node, but the precise contents depend on the type of tree (see, for example, procedure BCLASSIFICATION).
Each node thus has a number, corresponding to the index of its element in the tree. The root is always numbered one, and this is the only node that the tree contains when it is declared by TREE. Further nodes can be added by the BGROW and BJOIN directives, which form branches from a terminal node or join another tree to a terminal node, respectively. The converse process of cutting a tree at a defined node and discarding the nodes and information below it is provided by the BCUT directive.
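The growth step can be sketched in a few statements. This is a minimal illustration that reuses only statement forms found in the example at the end of this section; the identifiers Dtree and Dnew are hypothetical, and no information is stored at the nodes.

```genstat
" Declare a tree; at this point it contains only the root (node 1)."
TREE  Dtree
" Form two branches from the root, saving the numbers of the new nodes."
BGROW Dtree; NODE=1; NBRANCH=2; NEWNODES=Dnew
" Print the numbers that were allocated to the new nodes."
PRINT Dnew; DECIMALS=0
```

Saving NEWNODES is useful because the node numbers are then available for attaching information to the new nodes, as the full example does with POINTER and TEXT.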
The numbers of the subsequent nodes can be obtained from the functions that are provided to navigate around a tree:
BNEXT | provides the numbers of the nodes below a node;
provides the number of the node immediately above a node;
BTERMINAL | finds the next terminal node after a node;
BSCAN | finds the number of the node immediately after a node in a standard branch-by-branch order that visits each node once.
Other useful functions include:
BNBRANCHES | provides the number of branches below a node;
BDEPTH | calculates the depth of a node (taking the root as being at depth 1);
BPATH | provides a variate containing the numbers of the nodes on the branch to a node;
BBRANCHES | provides a variate containing the numbers of the branches taken on the path to a node;
BBELOW | provides a variate containing numbers of all the nodes or all the terminal nodes below a node;
provides the number of nodes in a tree;
provides the maximum node number in a tree.
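As a hedged sketch of how such functions are called, the statements below query a small two-branch tree. The call forms are copied from the example at the end of this section (BTERMINAL, BNBRANCHES, BDEPTH and BPATH used inside CALCULATE); the identifiers Dtree, Dnew, Term, Nbr, Dep and Pth are hypothetical.

```genstat
" Build a small tree: a root with two branches."
TREE      Dtree
BGROW     Dtree; NODE=1; NBRANCH=2; NEWNODES=Dnew
" Find the first terminal node (starting the search from 0)."
CALCULATE Term = BTERMINAL(Dtree; 0)
" Number of branches below the root, and depth of the terminal node."
CALCULATE Nbr = BNBRANCHES(Dtree; 1)
&         Dep = BDEPTH(Dtree; Term)
" Numbers of the nodes on the path to the terminal node."
CALCULATE Pth = BPATH(Dtree; Term)
PRINT     Term,Nbr,Dep; DECIMALS=0
PRINT     Pth; DECIMALS=0
```

Working from the node number returned by BTERMINAL, rather than a hard-coded number, keeps the sketch independent of how the new nodes happen to be numbered.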
There are also several utility procedures, which are used by the special-purpose tree procedures.
BCONSTRUCT | constructs a tree (using subsidiary procedure BSELECT, which is customized according to the type of tree).
BGRAPH | plots a tree.
BPRINT | displays a tree.
prunes a tree using minimal cost complexity (assuming that “accuracy” values have been stored at each node of the tree, which can be done using customized procedure BVALUES).
New tree-based analyses can thus be added by writing a main procedure (like BCLASSIFICATION etc.), and defining appropriate versions of customized procedures such as BSELECT.
Options: none.
See also
Directives: BASSESS
Procedures: BCONSTRUCT
Functions: BBELOW
Commands for: Data structures.
" Examples 1:4.12.3, 1:4.12.4 & 1:4.12.5 "

" Declare the original tree."
TREE T
" Define texts to use as labels for the nodes."
TEXT Lab[1...26]; VALUES=\
  'a','b','c','d','e','f','g','h','i','j','k','l','m',\
  'n','o','p','q','r','s','t','u','v','w','x','y','z'
" Define information at root to be a pointer with a single element called 'label'."
POINTER [NVALUES=!t(label)] T[1]
" Set that element to the first value of Lab, i.e. 'a'."
TEXT T[1]['label']; VALUE=Lab[1]
" Display the tree - first with labels of nodes, then with numbers."
BPRINT [PRINT=labelleddiagram,numbereddiagram] T
" Extend the tree by forming 3 branches from node 1 (root)."
BGROW T; NODE=1; NBRANCH=3; NEWNODES=Gnew
" Define the information for the new nodes."
POINTER [NVALUES=!t(label)] T[#Gnew]
TEXT T[#Gnew]['label']; VALUE=Lab[#Gnew]
" Display the extended tree."
BPRINT [PRINT=labelleddiagram,numbereddiagram] T
" Find the node number of the first terminal node "
CALCULATE N1 = BTERMINAL(T; 0)
" and then the second terminal node."
CALCULATE N2 = BTERMINAL(T; N1)
PRINT N1,N2; DECIMALS=0
" Extend the tree by adding 2 branches at the second and then the first terminal node."
BGROW T; NODE=N2; NBRANCH=2; NEWNODES=Gnew2
" Define the information for the new nodes."
POINTER [NVALUES=!t(label)] T[#Gnew2]
TEXT T[#Gnew2]['label']; VALUE=Lab[#Gnew2]
BGROW T; NODE=N1; NBRANCH=2; NEWNODES=Gnew1
POINTER [NVALUES=!t(label)] T[#Gnew1]
TEXT T[#Gnew1]['label']; VALUE=Lab[#Gnew1]
" Display the extended tree."
BPRINT [PRINT=labelleddiagram,numbereddiagram] T

"4.12.4 Remove the branches below N2, saving these as tree T2; also save and print the node mapping variates."
BCUT T; NODE=N2; CUTTREE=T2; OLDNODES=Oldn;\
  NEWNODES=Newn; CUTNODES=Cutn
PRINT [ORIENT=across] Oldn,Newn,Cutn; FIELD=3; DECIMALS=0
" Display the modified tree, and the cut-tree."
BPRINT [PRINT=labelleddiagram,numbereddiagram] T,T2
" Redefine the root of the cut tree so that it no longer shares the same information pointer as the node where the cut was made in the original tree."
POINTER [NVALUES=!t(label)] T2root
TEXT T2root['label']; VALUE='t2root'
ASSIGN T2root; T2; 1
BPRINT [PRINT=labelleddiagram] T2
" Use BCUT to form T3 as a duplicate of T but with renumbered nodes."
BCUT [RENUMBER=yes] T; NEWTREE=T3
BPRINT [PRINT=labelleddiagram,numbereddiagram] T3

"4.12.5 Join tree T2 onto node 4 of T3; save and print the numbers of the joined nodes in the revised tree."
BJOIN T3; NODE=4; JOINTREE=T2; NEWNODES=Jnew
BPRINT [PRINT=labelleddiagram,numbereddiagram] T3
PRINT Jnew; DECIMALS=0
" Tree functions: all nodes below node 2,"
CALCULATE Below = BBELOW(T3; 0; 0)
PRINT Below; DECIMALS=0
" all terminal nodes below node 2,"
CALCULATE Below0 = BBELOW(T3; 0; 0)
PRINT Below0; DECIMALS=0
" first three terminal nodes,"
CALCULATE N1 = BTERMINAL(T3; 0)
&         N2 = BTERMINAL(T3; N1)
&         N3 = BTERMINAL(T3; N2)
PRINT N1,N2,N3; DECIMALS=0
" nodes and branches on path to N3,"
CALCULATE Pn3 = BPATH(T3; N3)
&         Ln3 = BBRANCHES(T3; N3)
PRINT Pn3,Ln3; DECIMALS=0
" depth and number of branches at node 2,"
CALCULATE Nn2 = BNBRANCHES(T3; 2)
&         Dn2 = BDEPTH(T3; 2)
PRINT Nn2,Dn2; DECIMALS=0
" next nodes on branches 1-3 from node 1, and branch 1 from node 2."
PRINT BNEXT(T3; 1; 1); DECIMALS=0
PRINT BNEXT(T3; 1; 2); DECIMALS=0
PRINT BNEXT(T3; 1; 3); DECIMALS=0
PRINT BNEXT(T3; 2; 1); DECIMALS=0
" Scan the tree, taking the nodes in standard order."
SCALAR Scan[0]; value=0
CALCULATE Scan[1] = BSCAN(T3; Scan[0])
&         Scan[2] = BSCAN(T3; Scan[1])
&         Scan[3] = BSCAN(T3; Scan[2])
&         Scan[4] = BSCAN(T3; Scan[3])
&         Scan[5] = BSCAN(T3; Scan[4])
&         Scan[6] = BSCAN(T3; Scan[5])
&         Scan[7] = BSCAN(T3; Scan[6])
&         Scan[8] = BSCAN(T3; Scan[7])
&         Scan[9] = BSCAN(T3; Scan[8])
PRINT Scan[1...9]; FIELD=8; DECIMALS=0