Forms a set of clusters from an amalgamations matrix (R.W. Payne).
Options
| CLIMIT = scalar | Similarity value below which clusters are not formed; default 0 |
|---|---|
| ORDERING = string token | How to order the clusters (join, lexicographic); default lexi |
| NCLUSTERS = scalar | Saves the number of clusters that have been formed |
Parameters
| AMALGAMATIONS = matrices | Amalgamation matrices |
|---|---|
| CLUSTERS = pointers | Saves the clusters |
| SIMILARITIES = variates | Saves the similarity values at which the clusters are formed |
Description
HFCLUSTERS can form a set of clusters for use in bootstrapping, e.g. by HBOOTSTRAP, or for labelling in a dendrogram by DCLUSTERLABELS.
The information required to form the clusters is supplied, in an amalgamation matrix, by the AMALGAMATIONS parameter. This can be formed by the HCLUSTER directive for all methods except single linkage. With single-linkage cluster analysis, it can be formed by the HFAMALGAMATIONS procedure, using a minimum spanning tree formed by the HDISPLAY directive.
The clusters are saved, in a pointer, by the CLUSTERS parameter. Each one is saved in a variate containing the numbers of the units in that cluster. (These numbers are the row or column positions of those units in the similarity matrix used by HCLUSTER.) By default, the clusters in the pointer are sorted into lexicographic order. This puts the clusters first into increasing order of size. Then, within each size, they are arranged in an order that would correspond to alphabetic order, if the units in the clusters were represented by the letters a-z. For example, clusters containing units (1,2), (4,5) and (1,2,3) would be stored in that order. Alternatively, you can save the clusters in the order in which they are joined (i.e. in decreasing order of the similarity at which they join) by setting option ORDERING=join. Their similarities can be saved, in a variate, by the SIMILARITIES parameter. The clusters can be printed by the HPCLUSTERS procedure.
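As a sketch of the alternative ordering, the call below saves the clusters in the order in which they join and prints them alongside their similarities. It assumes the amalgamation matrix Amalg formed by HCLUSTER in the Example section; the identifiers JoinClusters, JoinSim and JoinExtra are hypothetical names chosen for illustration.

```
" save the clusters in the order in which they join (names are illustrative) "
HFCLUSTERS [ORDERING=join] Amalg; CLUSTERS=JoinClusters;\
           SIMILARITIES=JoinSim
" print each cluster with the similarity at which it was formed "
POINTER    [VALUES=JoinSim] JoinExtra
HPCLUSTERS JoinClusters; EXTRA=JoinExtra
```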
The CLIMIT option can specify a limit on the similarity of the clusters that are saved. Clusters that join at a similarity value less than this are excluded. The NCLUSTERS option saves the number of clusters that have been formed.
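A minimal sketch of these two options used together is given below. It again assumes the amalgamation matrix Amalg from the Example section. The limit 0.9 is an arbitrary illustrative value, to be chosen on whatever scale the similarities in the amalgamation matrix are held, and the identifiers TopClusters, TopSim and NTop are hypothetical.

```
" keep only the clusters that join at or above the chosen limit "
HFCLUSTERS [CLIMIT=0.9; NCLUSTERS=NTop] Amalg; CLUSTERS=TopClusters;\
           SIMILARITIES=TopSim
" report how many clusters were retained "
PRINT      NTop
```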
Options: CLIMIT, ORDERING, NCLUSTERS.
Parameters: AMALGAMATIONS, CLUSTERS, SIMILARITIES.
See also
Directives: HCLUSTER, HDISPLAY.
Procedures: DCLUSTERLABELS, HBOOTSTRAP, HFAMALGAMATIONS, HPCLUSTERS.
Commands for: Multivariate and cluster analysis.
Example
CAPTION     'HFCLUSTERS example',\
            'Data from the Guide to Genstat, Part 2, Section 6.1.2.';\
            STYLE=meta,plain
TEXT        [VALUES=Estate,'Arna1.5','Alfa2.5',Mondialqc,Testarossa,\
            Croma,Panda,Regatta,Regattad,Uno,X19,Contach,Delta,Thema,\
            Y10,Spider] Cars
POINTER     [VALUES=CC,NCyl,Tank,Wt,Length,Width,Ht,WBase,TSpeed,StSt,\
            Carb,Drive] Vars
VARIATE     [NVALUES=Cars] Vars[]
READ        [PRINT=errors] Vars[]
1490  4  50  966 414 161 133 245 177 10.9 1 2
1409  4  50  845 399 162 139 242 174 10.2 1 2
2492  6  49 1160 433 163 140 251 210  8.2 1 1
3185  8  87 1430 458 179 126 265 249  7.4 2 1
4942 12 120 1506 449 198 113 255 291  5.8 2 1
1995  4  70 1180 450 176 143 266 209  7.8 2 2
 965  4  35  761 338 149 146 216 134 16.8 1 2
1585  4  55  970 426 165 141 244 180 10.0 1 2
1714  4  55  980 426 165 141 245 150 18.9 3 2
 999  4  42  720 364 155 143 236 145 16.2 1 2
1498  4  48  912 397 157 118 220 171 11.0 1 1
5167 12 120 1446 414 200 107 245 286  4.9 1 1
1585  4  45 1000 389 162 138 247 195  8.2 1 2
1995  4  70 1150 459 175 143 266 224  7.6 2 2
1049  4  47  790 339 151 143 216 179 11.8 1 2
1995  4  45 1050 414 162 125 228 190  9.0 2 1 :
SYMMETRICMATRIX [ROWS=Cars] CarSim
FSIMILARITY [SIMILARITY=CarSim]\
            Vars[]; TEST=4(cityblock,euclidean),2(cityblock,simplematching)
HCLUSTER    [PRINT=*; METHOD=average] CarSim; AMALGAMATIONS=Amalg;\
            PERMUTATION=Perm
DDENDROGRAM [STYLE=average; ORDERING=given; DSIMILARITY=yes] DATA=Amalg;\
            PERMUTATION=Perm; LABELS=Cars; WINDOW=3
" form the clusters and save their similarities "
HFCLUSTERS  Amalg; CLUSTERS=Clusters; SIMILARITIES=Similarity
" print the clusters with their similarities "
POINTER     [VALUES=Similarity] Extra
HPCLUSTERS  Clusters; EXTRA=Extra
" plot the similarities of the clusters on the dendrogram "
DCLUSTERLABELS [WINDOW=3] #Clusters; LABEL=#Similarity