Performs a generalized Procrustes analysis (G.M. Arnold & R.W. Payne).

### Options

| `PRINT` = string tokens | Printed output required (`analysis`, `centroid`, `column`, `individual`, `monitoring`); default `anal`, `cent` |
|---|---|
| `SCALING` = string token | Type of scaling to use (`none`, `isotropic`, `separate`); default `none` |
| `METHOD` = string token | Method to be used (`Gower`, `TenBerge`); default `Gowe` |
| `NROOTS` = scalar | Number of roots (i.e. dimensions) to print for the output configurations, consensus and rotation matrices, and number of dimensions to save with the `XOUTPUT`, `CONSENSUS` and `ROTATIONS` parameters if their matrices have not already been defined; default is to print and save all the dimensions |
| `PLOT` = string tokens | Controls which graphs to display (`consensus`, `individuals`, `projections`); default `*` i.e. none |
| `NDROOTS` = scalar | Number of dimensions to display in the consensus and individuals plots; default 3 |
| `TOLERANCE` = scalar | The algorithm is assumed to have converged when (last residual sum of squares) – (current residual sum of squares) < `TOLERANCE` × (number of configurations); default 0.00001 |
| `MAXCYCLE` = scalar | Limit on number of iterations; default 50 |

### Parameters

| `XINPUT` = pointers | Each pointer points to a set of matrices holding the original input configurations |
|---|---|
| `XOUTPUT` = pointers | Each pointer points to a set of matrices to store a set of final (output) configurations |
| `CONSENSUS` = matrices | Stores the final consensus configuration from each analysis |
| `ROTATIONS` = pointers | Each pointer points to a set of matrices to store the rotations required to transform each set of `XINPUT` configurations to their final (scaled) `XOUTPUT` configurations |
| `RESIDUALS` = pointers | Each pointer points to a set of matrices to store the distances of a set of scaled `XINPUT` configurations from its consensus |
| `RSS` = scalars | Stores the residual sum of squares from each analysis |
| `ROOTS` = diagonal matrices | Stores the latent roots from referring the centroid configuration to its principal axis form (consensus) for each analysis |
| `WSS` = scalars | Stores the initial within-configuration sum of squares from each analysis |
| `SCALINGFACTOR` = variates | Stores the isotropic scaling factors for configurations from each analysis |
| `PROJECTIONS` = pointers | Each pointer points to a set of matrices to store a set of projection matrices |

### Description

An *N* × *V* matrix represents a configuration of *N* points in *V* dimensions. Given a set of *M* such matrices (`XINPUT`), a generalized Procrustes analysis iteratively matches them to a common centroid configuration by the operations of translation to a common origin, rotation/reflection of axes and possibly also scale changes. This matching seeks to minimise the sum of the squared distances between the centroid and each individual configuration summed over all points (the Procrustes statistic for each configuration and the centroid, summed over all configurations). The final centroid is referred to principal axes to give a unique consensus configuration.

Two methods of scaling are available (controlled by the `SCALING` option). Isotropic scaling, which scales all the dimensions of each configuration by an equal amount, takes place during the Procrustes analysis. The alternative is to scale each configuration prior to the analysis so that the trace of each matrix is one (see Arnold 1992). If this latter method is used, the subsequent residuals represent pure lack-of-fit and the scaling factors given in the results represent differences in relative size/spread of the original (centred) configurations, whereas for overall isotropic scaling the scaling factor contains components of both size and lack-of-fit.

Procedure `GENPROCRUSTES` carries out a generalized Procrustes analysis and has parameters for saving various results for future use (`XOUTPUT`, `CONSENSUS`, `ROTATIONS`, `RESIDUALS`, `RSS`, `ROOTS`, `WSS`, `SCALINGFACTOR`, `PROJECTIONS`). There are options for different methods to use for the matching (`SCALING`, `METHOD`), control of convergence (`TOLERANCE`, `MAXCYCLE`) and printing and plotting of results (`PRINT`, `PLOT`, `NROOTS` and `NDROOTS`).

Note that the special case of *M*=2 corresponds to the classical pairwise Procrustes matching (`ROTATE` directive) except that by fitting each configuration to a common centroid the requirement to regard one of the initial configurations as fixed is obviated.

Options: `PRINT`, `SCALING`, `METHOD`, `NROOTS`, `PLOT`, `NDROOTS`, `TOLERANCE`, `MAXCYCLE`.

Parameters: `XINPUT`, `XOUTPUT`, `CONSENSUS`, `ROTATIONS`, `RESIDUALS`, `RSS`, `ROOTS`, `WSS`, `SCALINGFACTOR`, `PROJECTIONS`.

### Method

The default method used for generalized Procrustes analysis in `GENPROCRUSTES` is that described by Gower (1975). Each input configuration (`XINPUT`, referred to henceforth as $X_i$, $i = 1 \dots M$) is initially column-centred, with the individual column means for each configuration optionally printed (by including the `column` setting with the `PRINT` option). If separate scaling is requested (option `SCALING=separate`), the matrices are also scaled to have trace one (see Arnold 1992). A constraint is required on the overall sum of squares to prevent the trivial solution in which all configurations collapse to the origin. In this procedure the constraint used is

$$\sum_{i=1}^{M} \operatorname{trace}(X_i' X_i) = M.$$
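The column-centring step and the overall constraint can be illustrated with a short Python sketch. This is not the procedure's own code: the helper names (`centre_columns`, `sum_of_squares`, `apply_constraint`) and the small data set are made up for illustration, configurations are held as lists of rows, and the constraint is assumed to be imposed by one common scale factor applied to every configuration.

```python
def centre_columns(x):
    """Subtract each column mean from a configuration given as a list of rows."""
    n = len(x)
    means = [sum(row[j] for row in x) / n for j in range(len(x[0]))]
    return [[v - m for v, m in zip(row, means)] for row in x]


def sum_of_squares(x):
    """trace(X'X), i.e. the sum of all squared elements of X."""
    return sum(v * v for row in x for v in row)


def apply_constraint(configs):
    """Rescale all configurations by one common factor (an assumption of this
    sketch) so that the total of trace(X_i'X_i) equals M.
    Returns the rescaled configurations and the initial total sum of squares."""
    m = len(configs)
    wss = sum(sum_of_squares(x) for x in configs)
    factor = (m / wss) ** 0.5
    scaled = [[[v * factor for v in row] for row in x] for x in configs]
    return scaled, wss


# Made-up data: M = 2 configurations of N = 3 points in V = 2 dimensions.
configs = [
    [[1.0, 2.0], [3.0, 4.0], [5.0, 9.0]],
    [[2.0, 1.0], [4.0, 3.0], [9.0, 5.0]],
]
centred = [centre_columns(x) for x in configs]
scaled, wss = apply_constraint(centred)
total = sum(sum_of_squares(x) for x in scaled)
print(wss, total)
```

After this step the total sum of squares equals the number of configurations, so the constraint holds before any rotation is attempted.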

An initial estimate of the centroid is found from these centred and scaled configurations. First, $X_2$ is rotated to $X_1$, with the rotated $X_2$ saved as the new $X_2$ and the centroid computed as the mean of $X_1$ and the new $X_2$; $X_3$ is then rotated to this centroid, which is recalculated as the mean of the three current configurations; and so on until all configurations $X_i$ ($i = 1 \dots M$) have been included. The centroid thus found is taken as the initial centroid estimate $Y$, with the rotated values as the new $X_i$. The initial residual sum of squares $S_r$ is calculated as

$$S_r = M \times \bigl(1 - \operatorname{trace}(Y'Y)\bigr).$$
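When the overall constraint holds and the centroid is the mean of the configurations, the summed squared distances from the centroid reduce to the trace expression for the residual sum of squares. The Python sketch below checks this numerically. It is an illustration only: the function names and the data are hypothetical, and no rotation step is performed, since the identity holds for any set of configurations that satisfies the constraint.

```python
def inner(a, b):
    """trace(A'B) for two equal-sized matrices stored as lists of rows."""
    return sum(u * v for ra, rb in zip(a, b) for u, v in zip(ra, rb))


def mean_config(configs):
    """Element-wise mean of the configurations (the centroid)."""
    m = len(configs)
    rows, cols = len(configs[0]), len(configs[0][0])
    return [[sum(x[r][c] for x in configs) / m for c in range(cols)]
            for r in range(rows)]


def rss_direct(configs, y):
    """Squared distances from the centroid, summed over points and configurations."""
    return sum((u - v) ** 2
               for x in configs
               for rx, ry in zip(x, y)
               for u, v in zip(rx, ry))


def rss_from_trace(configs, y):
    """M * (1 - trace(Y'Y)); valid only when the total of trace(X_i'X_i) is M."""
    return len(configs) * (1.0 - inner(y, y))


# Made-up, already column-centred configurations (3 points, 2 dimensions).
raw = [
    [[-2.0, -3.0], [0.0, -1.0], [2.0, 4.0]],
    [[-3.0, -2.0], [-1.0, 0.0], [4.0, 2.0]],
]
# Impose the overall constraint with one common factor (assumed here).
total = sum(inner(x, x) for x in raw)
factor = (len(raw) / total) ** 0.5
configs = [[[v * factor for v in row] for row in x] for x in raw]

y = mean_config(configs)
direct = rss_direct(configs, y)
via_trace = rss_from_trace(configs, y)
print(direct, via_trace)
```

The two values agree to rounding error, which is why the residual sum of squares can be obtained from the centroid alone.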

Each of the current configurations $X_i$ is then rotated to $Y$ and the rotated position saved as the new $X_i$. The updated estimate of the centroid $Y_n$ is calculated as the mean of the new $X_i$ ($i = 1 \dots M$) and the new residual sum of squares calculated as

$$S_{rn} = M \times \bigl(1 - \operatorname{trace}(Y_n' Y_n)\bigr).$$

If isotropic scaling has been requested (option `SCALING=isotropic`), new estimates $\mathit{ro}_i'$ of the individual scaling factors $\mathit{ro}_i$ (originally set to 1) are now found by

$$\frac{\mathit{ro}_i'}{\mathit{ro}_i} = \sqrt{\frac{\operatorname{trace}(X_i' Y_n)}{\operatorname{trace}(X_i' X_i) \times \operatorname{trace}(Y_n' Y_n)}}$$

and each $X_i$ is updated by a factor of $\mathit{ro}_i'/\mathit{ro}_i$. The centroid is then recalculated as the mean of the new $X_i$ and the new residual sum of squares calculated in a similar manner to before. If the change in residual $S_r$ is less than a preset tolerance (controlled by option `TOLERANCE`) the algorithm is taken to have converged. If not, the process is repeated until the tolerance is reached, up to a maximum number of iterations set by the option `MAXCYCLE` (default 50), after which a message of non-convergence is printed and the procedure terminated. Monitoring information about convergence can be printed by including the `monitoring` setting with the `PRINT` option.

After convergence a unique consensus configuration is found by referring the final centroid to principal axes; the corresponding latent roots may be saved using the `ROOTS` parameter. Final results for the consensus and individual configurations (referred to the same principal axes) may be printed using the `centroid` and `individual` settings of the `PRINT` option, and/or saved using the parameters `XOUTPUT`, `CONSENSUS` and `ROTATIONS`. By default, results are presented and saved for the maximum available dimensionality but the option `NROOTS` allows a reduced number of dimensions to be set. Analysis of variation for the $M$ configurations (including the individual scaling factors) and for the $N$ points, along with the initial within and between configurations sums of squares (*WSS* and *BSS*), the final residual sum of squares (*RSS*) and number of steps in the iteration process may be printed using the `analysis` setting of the `PRINT` option. The initial within-configuration sum of squares, final residual sum of squares and individual isotropic scaling factors may also be saved using, respectively, the `WSS`, `RSS` and `SCALINGFACTOR` parameters. (Note that the final results are still scaled by the original factor from the initial overall constraint; to return to the original scale all sums of squares need adjustment by a factor of *WSS*/$M$ and configurations by the square root of that factor.)
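The isotropic scaling update and the convergence test described in this section can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the procedure's implementation: the function names (`scaling_ratios`, `converged`) and the data are hypothetical, the rotation step is omitted, and the configurations are assumed to be column-centred and to satisfy the overall constraint before the update.

```python
from math import sqrt


def inner(a, b):
    """trace(A'B) for two equal-sized matrices stored as lists of rows."""
    return sum(u * v for ra, rb in zip(a, b) for u, v in zip(ra, rb))


def mean_config(configs):
    """Element-wise mean of the configurations (the centroid)."""
    m = len(configs)
    rows, cols = len(configs[0]), len(configs[0][0])
    return [[sum(x[r][c] for x in configs) / m for c in range(cols)]
            for r in range(rows)]


def scaling_ratios(configs, y):
    """Ratio of new to old scaling factor for each configuration:
    sqrt(trace(X_i'Y) / (trace(X_i'X_i) * trace(Y'Y)))."""
    yy = inner(y, y)
    return [sqrt(inner(x, y) / (inner(x, x) * yy)) for x in configs]


def converged(last_rss, current_rss, n_configs, tolerance=0.00001):
    """Convergence rule: the drop in RSS is below TOLERANCE times M."""
    return (last_rss - current_rss) < tolerance * n_configs


# Made-up centred configurations of different overall size.
raw = [
    [[-2.0, -3.0], [0.0, -1.0], [2.0, 4.0]],
    [[-1.0, -1.0], [-0.5, 0.0], [1.5, 1.0]],
]
m = len(raw)
factor = sqrt(m / sum(inner(x, x) for x in raw))
configs = [[[v * factor for v in row] for row in x] for x in raw]

y = mean_config(configs)
ratios = scaling_ratios(configs, y)
updated = [[[v * r for v in row] for row in x] for x, r in zip(configs, ratios)]
new_total = sum(inner(x, x) for x in updated)
print(ratios, new_total)
```

In this example the larger configuration is shrunk and the smaller one stretched, and the total sum of squares after the update is still equal to the number of configurations, so the update preserves the overall constraint.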

Independently of the choice of dimensionality for printing and saving, the `NDROOTS` option controls the dimensionality of the graphical output requested using the `PLOT` option (default 3). The `consensus` setting plots the consensus solution in the chosen dimensionality, and the `individuals` setting gives the individual final configurations as well as the consensus. The `projections` setting displays the projections (calculated from the individual rotation matrices scaled by the singular values from the consensus solution in principal axis form) as vectors labelled by configuration number and colour-coded for order of column. This projection plot can be particularly helpful in comparing the use of terms/attributes (columns of the configurations) by individual assessors in sensory analysis, both in conventional and free-choice profiling; see Arnold & Collins (1993) for further details.

Modifications to the method described above are given in TenBerge (1977), and may be invoked by the `TenBerge` setting of the `METHOD` option. This may give considerable savings in the time to reach convergence (Arnold 1988).

### References

Arnold, G.M. (1988). Comparisons of algorithms for generalized Procrustes analyses. *Genstat Newsletter*, 22, 7-11.

Arnold, G.M. (1992). Scaling factors in generalized Procrustes analysis. *Computational Statistics, Volume 1, Proceedings of the 10th Symposium on Computational Statistics, COMPSTAT, Neuchatel, Switzerland, August 1992*, 61-66.

Arnold, G.M. & Collins, A.J. (1993). Interpretation of transformed axes in multivariate analysis. *Applied Statistics*, 42, 381-400.

Gower, J.C. (1975). Generalized Procrustes analysis. *Psychometrika*, 40, 33-51.

TenBerge, J.M.F. (1977). Orthogonal Procrustes rotation for two or more matrices. *Psychometrika*, 42, 267-276.

### See also

Directives: `ROTATE`, `FACROTATE`.

Procedures: `PCOPROCRUSTES`, `SAGRAPES`.

Commands for: Multivariate and cluster analysis.

### Example

    CAPTION 'GENPROCRUSTES example',!t('Data from',\
            'Gower (1975), Psychometrika, 40, pages 33-51.',\
            'Note, however, that in Table 3 the scaling factors printed',\
            'were SQRT(ro[i]) instead of ro[i], and in Table 4 the',\
            'Between and Within Judges sums of squares were transposed.');\
            STYLE=meta,plain
    MATRIX  [ROWS=9; COLUMNS=7] X[1...3]
    READ    [SERIAL=yes] X[]
    47 44 49 38 35 40 40
    72 45 41 77 72 73 35
    61 49 40 58 58 62 30
    66 56 45 55 53 46 30
    37 72 50 27 30 33 25
    76 76 53 81 79 75 45
    64 59 51 72 61 66 40
    21 70 43 27 22 26 20
    71 70 34 72 72 71 35 :
    31 39 33 29 48 38 42
    30 60 36 22 36 34 39
    27 55 30 18 28 22 42
    48 52 53 27 21 30 31
    20 55 28 22 33 27 35
    21 42 31 46 76 33 42
    30 52 53 35 44 30 44
     5 57 53 12 13  6 31
    55 63 53 77 79 57 49 :
    43 46 44 22 53 44 29
    53 79 75 79 73 52 27
    22 85 83 19 27 17 22
    28 89 78 13 29 20 24
    75 86 85 34 75 55 38
    53 79 82 72 78 74 38
    15 85 85 46 75 52 35
     5 95 95  3 20  2 24
    27 78 85 89 92 81 41 :
    GENPROCRUSTES [PRINT=analysis,centroid,column,individual,monitoring;\
            SCALING=isotropic] XINPUT=X