ON THE SYMMETRICAL PROPERTY OF PROCRUSTES MEASURE OF DISTANCE

Measuring the degree of difference between two matrices by Procrustes analysis is preceded by a series of Euclidean similarity transformations, namely translation, rotation, and dilation, performed in that order to obtain the best possible match. It is easy to show by a counterexample that the Procrustes measure does not obey the symmetrical property, which any distance function should possess. In this paper we prove analytically that normalization of the configuration matrices, introduced as an additional transformation, makes the Procrustes measure satisfy the symmetrical property. We also prove that normalization can be undertaken either prior to or after rotation while preserving the symmetrical property. Moreover, we prove that the Procrustes measure can be expressed in terms of the singular values of a matrix. Throughout, we exploit the behavior of the full singular value decomposition under similarity transformations.

AMS Subject Classification: 00A35, 62H20

and scaling effects are filtered out from an object. Two configuration matrices, as representations of objects, have the same shape if they can be translated, rotated, and dilated onto each other so that they match exactly [3]. Procrustes analysis refers to a technique for comparing objects with different shapes and producing a measure of the match. It eliminates possible incommensurability of variables within the individual data sets, and size differences between data sets, by employing data scaling and configuration scaling, respectively, in calculating the distance. Thus, a series of Euclidean similarity transformations, i.e., translation, rotation, and dilation, is carried out to bring the two configurations to their maximal agreement.

Initially, the difference $d$ between matrices $X = (x_{ij})$ and $Y = (y_{ij})$ in $\mathbb{R}^{m \times n}$ can be quantified by

$$d(X, Y) = \sum_{i=1}^{m} \sum_{j=1}^{n} (x_{ij} - y_{ij})^2 = \operatorname{tr}\,(X - Y)'(X - Y). \tag{1}$$

By translation we mean moving all elements of a matrix by a fixed distance and in the same direction with respect to its centroid. The distance between $X$ and $Y$ after translation is minimized by making their centroids coincide, which, without loss of generality, we place at the origin. Thus, the matrices $X$ and $Y$ after optimal translation are given by

$$X_T = X - C_X, \qquad Y_T = Y - C_Y, \tag{2}$$

where $C_X = \frac{1}{m} 1_m 1_m' X$ and $C_Y = \frac{1}{m} 1_m 1_m' Y$, with $1_m$ the $m \times 1$ vector of ones.

Rotation is the process of moving all elements of a matrix through a fixed rotation angle without any change in the point-to-centroid distances. Rotation of $Y_T$ onto $X_T$ is carried out by post-multiplying $Y_T$ by a rotation matrix $Q$. The minimum distance after rotation is obtained by selecting

$$Q = V U',$$

where $U \Sigma V'$ is the full singular value decomposition of $X_T' Y_T$.

Dilation is a data scaling which stretches or shrinks the point-to-centroid distances in a configuration through multiplication by a fixed scaling factor. Dilation of $Y_T Q$ onto $X_T$ is carried out by multiplying $Y_T Q$ by a scalar $c$, and the minimum distance is obtained by selecting

$$c = \frac{\operatorname{tr} X_T' Y_T Q}{\operatorname{tr} Y_T' Y_T}.$$

It is proved in [1] that the series of similarity transformations in the order translation-rotation-dilation provides the lowest possible distance, as stated in the following theorem, where the distance is calculated according to (1).

Theorem 1. Given two matrices $X$ and $Y$ in $\mathbb{R}^{m \times n}$, the Procrustes measure $p$ between $X$ and $Y$ after the optimal translation-rotation-dilation process is given by

$$p(X, Y) = \operatorname{tr} X_T' X_T - \frac{\operatorname{tr}^2 X_T' Y_T Q}{\operatorname{tr} Y_T' Y_T}. \tag{4}$$

Proof. For a complete proof, see [1].
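The translation-rotation-dilation steps above can be sketched numerically. The following is a minimal illustration, not the authors' implementation: it assumes NumPy, treats rows as objects and columns as variables, and uses random test data and a hypothetical function name `procrustes_trd`. It evaluates the closed form of Theorem 1 and, for comparison, the distance (1) computed directly from the transformed matrices.

```python
import numpy as np

def procrustes_trd(X, Y):
    """Fit Y to X by translation, rotation, and dilation (illustrative sketch)."""
    # Translation: move both centroids (column means) to the origin.
    XT = X - X.mean(axis=0)
    YT = Y - Y.mean(axis=0)
    # Rotation: Q = V U', where U diag(s) V' is the SVD of X_T' Y_T.
    U, s, Vt = np.linalg.svd(XT.T @ YT)
    Q = Vt.T @ U.T
    # Dilation: optimal scalar c.
    c = np.trace(XT.T @ YT @ Q) / np.trace(YT.T @ YT)
    # Closed form of the measure, using tr(X_T' Y_T Q) = sum of singular values.
    p_closed = np.trace(XT.T @ XT) - s.sum() ** 2 / np.trace(YT.T @ YT)
    # Distance (1) evaluated directly on the transformed matrices.
    R = XT - c * YT @ Q
    p_direct = np.trace(R.T @ R)
    return p_closed, p_direct

# Hypothetical test data: 6 objects, 3 variables.
rng = np.random.default_rng(0)
X = rng.normal(size=(6, 3))
Y = rng.normal(size=(6, 3))
p_closed, p_direct = procrustes_trd(X, Y)
print(p_closed, p_direct)
```

Under these assumptions the two returned values should agree up to rounding, which is a quick way to check that the chosen $Q$ and $c$ reproduce the closed form.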

Singular Value Decomposition
In this section we provide a short review on the singular value decomposition, which could be described as the most useful tool in matrix algebra because this factorization can be applied to any real or complex matrix, square or rectangular.

Theorem 2 ([4]). If $A$ is an $m \times n$ matrix of rank $r \le \min\{m, n\}$, then there exist an orthogonal (respectively unitary) $m \times m$ matrix $U$, an orthogonal (respectively unitary) $n \times n$ matrix $W$, and an $m \times n$ matrix $S = (s_{ij})$ with $s_{ij} = 0$ for $i \ne j$ such that

$$A = U S W'. \tag{5}$$

Suppose that the eigenvalues of $AA'$ are ordered as $\lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_r > \lambda_{r+1} = \dots = \lambda_m = 0$ with corresponding eigenvectors $u_i$ $(i = 1, \dots, r, r+1, \dots, m)$, and those of $A'A$ as $\lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_r > \lambda_{r+1} = \dots = \lambda_n = 0$ with corresponding eigenvectors $w_i$ $(i = 1, \dots, r, r+1, \dots, n)$. To comply with (5), it is required that either

$$u_i = \frac{1}{s_{ii}} A w_i \quad \text{or} \quad w_i = \frac{1}{s_{ii}} A' u_i, \qquad i = 1, \dots, r, \tag{6}$$

be satisfied, with $s_{ii} = \sqrt{\lambda_i}$. Each $s_{ii}$ is called a singular value of $A$, defined as the positive square root of the corresponding ranked eigenvalue of $AA'$ or $A'A$ (note that $AA'$ and $A'A$ possess the same set of positive eigenvalues). Then it can be written that

$$S = \begin{pmatrix} \Delta_r & 0 \\ 0 & 0 \end{pmatrix}, \tag{8}$$

where $\Delta_r = \operatorname{diag}(s_{11}, \dots, s_{rr})$ is an $r \times r$ diagonal matrix and $0$ is a matrix of zeros of compatible dimension. Note that $S$ has the form (8) when $r < m$ and $r < n$. If $r = m = n$ then it reduces to $S = \Delta_r$. If $r = m < n$ then $S = (\Delta_r \;\; 0)$, and if $r = n < m$ then $S = (\Delta_r \;\; 0)'$.

Since $u_i$ $(i = 1, \dots, r)$ and $w_i$ $(i = 1, \dots, r)$ are, respectively, eigenvectors corresponding to the positive eigenvalues of $AA'$ and $A'A$, the matrices $U$ and $W$ should be constructed from orthonormal columns. If the number of orthonormal columns obtained in this way is less than $m$ or $n$, the set must be expanded to an orthonormal basis. If the zero eigenvalues are included along with their corresponding eigenvectors, then we have a full singular value decomposition (FSVD) of the matrix $A$ as given by Theorem 2.
As an illustrative example of the FSVD, consider a $3 \times 2$ matrix $A$ of rank 2. The matrix $A'A$ possesses eigenvalues $\lambda_1 = 12$ and $\lambda_2 = 10$, and the corresponding orthonormal eigenvectors $w_1$ and $w_2$ form the columns of the matrix $W$. The matrix $AA'$, in turn, has eigenvalues $\lambda_1 = 12$, $\lambda_2 = 10$, and $\lambda_3 = 0$. According to (6), we obtain $u_1$ and $u_2$ from $w_1$ and $w_2$. By expanding $\{u_1, u_2\}$ to an orthonormal basis of $\mathbb{R}^3$ with the Gram-Schmidt orthonormalization procedure [4], we get $u_3$. Thus, the eigenvectors $u_1, u_2, u_3$ of $AA'$ form the columns of the matrix $U$. Obviously, we can avoid the basis expansion and the Gram-Schmidt orthonormalization by starting with $AA'$ instead, but then we have to solve a third-order characteristic equation.

Further, it is easy to verify that both $U$ and $W$ are orthogonal, i.e., $UU' = U'U = I_3$ and $WW' = W'W = I_2$. Since $m = 3$, $n = 2$, and $r = 2$, the matrix $S$ is given by

$$S = \begin{pmatrix} \sqrt{12} & 0 \\ 0 & \sqrt{10} \\ 0 & 0 \end{pmatrix}.$$

Alternatively, the FSVD of economy size, i.e., the reduced-order FSVD, can be obtained by removing the columns and rows related to the zero eigenvalues, i.e., $A = U_r \Delta_r W_r'$, where $U_r$ and $W_r$ consist of the first $r$ columns of $U$ and $W$, respectively.
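The construction described in this example can be traced step by step in code. The sketch below assumes NumPy and uses a hypothetical $3 \times 2$ matrix, chosen only because its $A'A$ has eigenvalues 12 and 10; it is not necessarily the matrix used by the authors. It builds $W$ from $A'A$, obtains $u_1, u_2$ through relation (6), expands them to a basis of $\mathbb{R}^3$ by one Gram-Schmidt step, and assembles both the full and the reduced decomposition.

```python
import numpy as np

# Hypothetical matrix (an assumption, not taken from the text), chosen so that
# A'A = [[11, 1], [1, 11]] has eigenvalues 12 and 10.
A = np.array([[3.0, -1.0],
              [1.0, 3.0],
              [1.0, 1.0]])

# Eigen-decomposition of the symmetric matrix A'A (eigh returns ascending order).
lam, W = np.linalg.eigh(A.T @ A)
order = np.argsort(lam)[::-1]          # reorder to descending eigenvalues
lam, W = lam[order], W[:, order]
s = np.sqrt(lam)                       # singular values s_ii = sqrt(lambda_i)

# Relation (6): u_i = A w_i / s_ii for the positive singular values.
U_r = (A @ W) / s

# Expand {u_1, u_2} to an orthonormal basis of R^3 with one Gram-Schmidt step,
# starting from a vector that is not in the span of u_1 and u_2.
e = np.array([0.0, 0.0, 1.0])
u3 = e - U_r @ (U_r.T @ e)
u3 /= np.linalg.norm(u3)
U = np.column_stack([U_r, u3])

# Full decomposition A = U S W' with S of size 3 x 2, and the reduced form.
S = np.zeros((3, 2))
S[:2, :2] = np.diag(s)
A_full = U @ S @ W.T
A_reduced = U_r @ np.diag(s) @ W.T
```

Both `A_full` and `A_reduced` should reproduce `A` up to rounding; the signs of the columns of `W` returned by the eigen-solver are arbitrary, and relation (6) carries the same sign into `U_r`, so the product is unaffected.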

Symmetrical Property
A distance is a function that associates a nonnegative real number with any pair of vectors. The notion of distance is essential because many statistical techniques, such as principal component analysis, multidimensional scaling analysis, and correspondence analysis, are equivalent to the analysis of a specific distance table. A distance function possesses the following properties [4].
Definition 3. For any vectors $x$, $y$, and $z$ in a vector space $S$, a function $d$ is a distance function on $S$ if the following properties are satisfied:
1. $d(x, y) \ge 0$ (nonnegativity),
2. $d(x, y) = 0$ if and only if $x = y$,
3. $d(x, y) = d(y, x)$ (symmetry),
4. $d(x, z) \le d(x, y) + d(y, z)$ (triangle inequality).

The concept of distance between vectors may be adopted for calculating the degree of difference between matrices of the same dimension. For instance, for any matrices $X = (x_{ij})$ and $Y = (y_{ij})$ in $\mathbb{R}^{m \times n}$, we may define the difference between them as in (1), as adopted by the Procrustes measure within the translation-rotation-dilation process. However, it is easy to show that the Procrustes measure does not comply with the distance properties in Definition 3. For a suitable pair of different matrices $X$ and $Y$, (4) gives $p(X, Y) = 0$, which violates the second property. This fact reveals another idiosyncrasy of matrix algebra, in addition to the well-known fact that $AB = 0$ can happen although neither $A$ nor $B$ is itself a zero matrix. Another pair of matrices produces $p(X, Y) = 34.67$ and $p(Y, X) = 47.13$, which shows that in general the Procrustes measure disobeys the third property, i.e., the symmetrical property.

This paper deals with embedding the symmetrical property into the Procrustes measure. Its primary objective is to establish that an additional transformation, namely normalization, equips the Procrustes measure with the symmetrical property. In fact, normalization is already performed by the Matlab built-in function procrustes(X,Y). Pre-scaling to normalized matrices is also employed by [3] in calculating the Procrustes distance using the shape analysis approach. Here we provide the proof of the symmetrical property and discuss the effect of the position of normalization within the transformation sequence. We make extensive use of the behavior of the FSVD under similarity transformations.
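The two violations described above (a zero measure for two different matrices, and asymmetry) can be reproduced with a short sketch. It assumes NumPy and evaluates the translation-rotation-dilation measure in its closed form, $\operatorname{tr} X_T'X_T - (\sum_i \delta_{ii})^2 / \operatorname{tr} Y_T'Y_T$ with $\delta_{ii}$ the singular values of $X_T'Y_T$. The matrices are hypothetical and are not the pairs used in the text, so the numerical values differ from those quoted there.

```python
import numpy as np

def p_trd(X, Y):
    """Translation-rotation-dilation measure of fitting Y to X (sketch)."""
    XT = X - X.mean(axis=0)
    YT = Y - Y.mean(axis=0)
    delta = np.linalg.svd(XT.T @ YT, compute_uv=False)
    return np.trace(XT.T @ XT) - delta.sum() ** 2 / np.trace(YT.T @ YT)

# Hypothetical configuration: the four corners of a rectangle.
X = np.array([[0.0, 0.0], [2.0, 0.0], [2.0, 1.0], [0.0, 1.0]])

# Y1 is X enlarged by a factor of 3 and shifted: a different matrix, same shape.
Y1 = 3.0 * X + np.array([5.0, -2.0])

# Y2 is a genuinely different configuration.
Y2 = np.array([[0.0, 0.0], [1.0, 0.0], [3.0, 2.0], [0.0, 4.0]])

p_same_shape = p_trd(X, Y1)              # second property fails: X != Y1
p_xy, p_yx = p_trd(X, Y2), p_trd(Y2, X)  # third property fails
print(p_same_shape, p_xy, p_yx)
```

In this sketch the two directions differ by the ratio of the squared centroid-size terms $\operatorname{tr} X_T'X_T$ and $\operatorname{tr} Y_T'Y_T$, which hints at why scaling both configurations to unit norm removes the asymmetry.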
Let two data matrices $X$ and $Y$ with the same number of objects be given. If the numbers of variables of the two matrices differ, we may insert columns of zeros, either at the end or anywhere else, so that the dimensions of the matrices agree [6]. Thus, without loss of generality we assume that $X$ and $Y$ are in the same space, i.e., $X, Y \in \mathbb{R}^{m \times n}$.
In this analysis, by normalization we mean scaling a matrix to unit norm by dividing it by its Frobenius norm. We introduce the normalization step immediately after translation. Hence, the normalization of $X_T$ and $Y_T$ produces

$$\bar{X}_T = a X_T, \qquad \bar{Y}_T = b Y_T, \tag{12}$$

where

$$a = \frac{1}{\|X_T\|_F}, \qquad b = \frac{1}{\|Y_T\|_F}, \tag{13}$$

with $X_T$ and $Y_T$ given by (2). In (13), $\|\cdot\|_F$ denotes the Frobenius norm of a matrix, hence $\|\bar{X}_T\|_F = \|\bar{Y}_T\|_F = 1$.

Theorem 4. Given two matrices $X$ and $Y$ in $\mathbb{R}^{m \times n}$, the Procrustes measure $p$ between $X$ and $Y$ after the optimal translation-normalization-rotation-dilation process obeys the symmetrical property and is given by

$$p(X, Y) = 1 - \Big( \sum_{i=1}^{r} \sigma_{ii} \Big)^2, \tag{14}$$

where $r$ and $\sigma_{ii}$ $(i = 1, \dots, r)$ are, respectively, the rank and the singular values of $\bar{X}_T' \bar{Y}_T$ or $\bar{Y}_T' \bar{X}_T$, with $\bar{X}_T$ and $\bar{Y}_T$ given in (12).
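Before turning to the proof, the statement of Theorem 4 can be checked numerically. The sketch below assumes NumPy, random hypothetical data, and a hypothetical function name `p_tnrd`; it computes one minus the squared sum of the singular values of the product of the translated and normalized matrices, in both directions.

```python
import numpy as np

def p_tnrd(X, Y):
    """Translation-normalization-rotation-dilation measure (illustrative sketch)."""
    # Translation to the centroid.
    XT = X - X.mean(axis=0)
    YT = Y - Y.mean(axis=0)
    # Normalization: np.linalg.norm of a 2-D array defaults to the Frobenius norm.
    Xb = XT / np.linalg.norm(XT)
    Yb = YT / np.linalg.norm(YT)
    # Rotation and dilation enter only through the singular values.
    sigma = np.linalg.svd(Xb.T @ Yb, compute_uv=False)
    return 1.0 - sigma.sum() ** 2

# Hypothetical test data: 5 objects, 3 variables.
rng = np.random.default_rng(1)
X = rng.normal(size=(5, 3))
Y = rng.normal(size=(5, 3))
p_xy, p_yx = p_tnrd(X, Y), p_tnrd(Y, X)
print(p_xy, p_yx)
```

The two directions should coincide up to rounding because a matrix and its transpose share the same singular values, and the value should lie in the interval [0, 1].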
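Proof. By performing the optimal rotation of $\bar{Y}_T$ onto $\bar{X}_T$ and subsequently the optimal dilation, and by considering (1), we obtain the Procrustes measure of $X$ and $Y$ as

$$p(X, Y) = \operatorname{tr}\,(\bar{X}_T - c \bar{Y}_T Q_1)'(\bar{X}_T - c \bar{Y}_T Q_1),$$

where $Q_1 = V_1 U_1'$ and $U_1 \Sigma_1 V_1'$ is the FSVD of $\bar{X}_T' \bar{Y}_T$. We can expand the above equation to get a quadratic form in $c$:

$$p(X, Y) = \operatorname{tr} \bar{X}_T' \bar{X}_T - 2c \operatorname{tr} \bar{X}_T' \bar{Y}_T Q_1 + c^2 \operatorname{tr} \bar{Y}_T' \bar{Y}_T. \tag{15}$$

By optimally selecting

$$c = \frac{\operatorname{tr} \bar{X}_T' \bar{Y}_T Q_1}{\operatorname{tr} \bar{Y}_T' \bar{Y}_T} \tag{16}$$

and substituting it back into the quadratic form (15), together with (12)-(13), we have

$$p(X, Y) = \operatorname{tr} \bar{X}_T' \bar{X}_T - \frac{\operatorname{tr}^2 \bar{X}_T' \bar{Y}_T Q_1}{\operatorname{tr} \bar{Y}_T' \bar{Y}_T} = 1 - \operatorname{tr}^2 \Sigma_1 = 1 - \Big( \sum_{i=1}^{r} \sigma_{ii} \Big)^2.$$

Note that the last expression is obtained by remembering that $\operatorname{tr} \bar{X}_T' \bar{X}_T = \operatorname{tr} \bar{Y}_T' \bar{Y}_T = 1$ and $\operatorname{tr} \bar{X}_T' \bar{Y}_T Q_1 = \operatorname{tr} U_1 \Sigma_1 V_1' V_1 U_1' = \operatorname{tr} \Sigma_1$.

We now consider the opposite direction, where the matrix $X$ is fitted by similarity transformations to obtain the best match to the observed matrix $Y$. In other words, we aim to calculate the Procrustes measure $p(Y, X)$. By following a similar technique, we arrive at

$$p(Y, X) = 1 - \operatorname{tr}^2 \bar{Y}_T' \bar{X}_T Q_2 = 1 - \operatorname{tr}^2 \Sigma_2 = 1 - \Big( \sum_{i=1}^{r} \sigma_{ii} \Big)^2,$$

where $Q_2 = V_2 U_2'$ and $U_2 \Sigma_2 V_2'$ is the FSVD of $\bar{Y}_T' \bar{X}_T$, and where $r$ and $\sigma_{ii}$ $(i = 1, \dots, r)$ are, respectively, the rank and the singular values of $\bar{X}_T' \bar{Y}_T$ or $\bar{Y}_T' \bar{X}_T$; these two matrices are transposes of each other and therefore share the same singular values. Thus, we have proved the symmetrical property of the Procrustes measure, i.e., $p(X, Y) = p(Y, X)$.

Remark 5. The Procrustes measure $p$ in (14) is basically calculated by using (1). Since $\operatorname{tr} A'A \ge 0$ for any matrix $A$, the nonnegativity property of a distance function is guaranteed, i.e., $p(X, Y) \ge 0$. Furthermore, suppose that there exist matrices $X$ and $Y$ such that $p(X, Y) > 1$. Then we have $1 - (\sum_{i=1}^{r} \sigma_{ii})^2 > 1$, or equivalently $(\sum_{i=1}^{r} \sigma_{ii})^2 < 0$, which is a contradiction. It must therefore be the case that $p(X, Y) \le 1$. As a consequence, we conclude that $0 \le p(X, Y) \le 1$ for any matrices $X$ and $Y$ of the same dimension. The result presented in Theorem 4 is in agreement with that of [3].

Remark 6. Note that the optimal scaling parameter $c$ in (16) can further be written as $c = \operatorname{tr} \Sigma_1 = \sum_{i=1}^{r} \sigma_{ii}$.

Remark 7.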
Suppose that we have the following FSVDs:

$$\bar{X}_T' \bar{Y}_T = U_1 \Sigma_1 V_1', \qquad X_T' Y_T = U_3 \Sigma_3 V_3',$$

from which, since $\bar{X}_T' \bar{Y}_T = ab\, X_T' Y_T$, we conclude

$$\Sigma_1 = ab\, \Sigma_3, \tag{19}$$

and furthermore we have the relation between singular values $\sigma_{ii} = ab\, \delta_{ii}$, where $\delta_{ii}$ is a singular value of $X_T' Y_T$. Thus we can alternatively rewrite (14) as

$$p(X, Y) = p(Y, X) = 1 - a^2 b^2 \Big( \sum_{i=1}^{r} \delta_{ii} \Big)^2.$$

Remark 8. By (14), if $p(X, Y) = 0$ then $X$ and $Y$ have exactly the same shape. Conversely, $X$ and $Y$ with $p(X, Y) = 1$ have entirely different shapes. In this regard we call $p$ a dissimilarity measure. If we would like a Procrustes measure $\rho$ with the opposite interpretation, we define $\rho(X, Y) := 1 - p(X, Y) = (\sum_{i=1}^{r} \sigma_{ii})^2$. The bigger the coefficient $\rho$, the closer the similarity of the matrices $X$ and $Y$. Thus we call $\rho$ a similarity measure, which can serve as a goodness-of-fit coefficient of the best matching.

When Should We Normalize?
In the previous section we analyzed the incorporation of normalization as an additional similarity transformation into the procedure. It was shown that by normalizing the matrices after translation, the Procrustes measure acquires the symmetrical property. In this section we examine the situation where normalization is performed at a different position in the sequence. As an illustrative example, consider again the matrices $X$ and $Y$ in (11). The table below provides the Procrustes measure under different sequences of transformations; e.g., by T-N-R-D we mean that the degree of difference between the matrices is calculated after translation (T), normalization (N), rotation (R), and dilation (D), in that order.

Table: Procrustes measure under different sequences of transformations (columns: No, Transformations, $p(X, Y)$, $p(Y, X)$).

It is shown that normalizing prior to rotation (T-N-R-D) or after rotation (T-R-N-D) provides a consistent result and guarantees the symmetrical property. This can be explained in general as follows. After translation and rotation we have $X_T$ and $Y_T Q_3$, where $Q_3 = V_3 U_3'$ and $U_3 \Sigma_3 V_3'$ is the FSVD of $X_T' Y_T$. Normalizing both matrices provides $\bar{X}_T = a X_T$ as in (12) and

$$\bar{Z}_T := \frac{Y_T Q_3}{\|Y_T Q_3\|_F} = b\, Y_T Q_3.$$

The Procrustes measure after optimal dilation is achieved by selecting

$$c = \frac{\operatorname{tr} \bar{X}_T' \bar{Z}_T}{\operatorname{tr} \bar{Z}_T' \bar{Z}_T},$$

with

$$p(X, Y) = 1 - \operatorname{tr}^2 \bar{X}_T' \bar{Z}_T = 1 - \operatorname{tr}^2 (ab\, \Sigma_3) = 1 - \operatorname{tr}^2 \Sigma_1, \tag{20}$$

where the last expression comes from (19). Equation (20) proves that the Procrustes measures under T-N-R-D and T-R-N-D are the same. Obviously, the symmetrical property can be proved similarly.

From the table we can also see that when normalization is applied in the first stage, i.e., under N-T-R-D, the symmetrical property is generally not satisfied, even though this sequence may provide a smaller measure. When we put normalization in the last stage, i.e., under T-R-D-N, we get a measure of different magnitude with the symmetrical property retained. However, this measure does not always stay in the interval $[0, 1]$, which is not what is expected of a goodness-of-fit coefficient. In fact, the Procrustes measure under the T-R-D-N scheme is given by

$$p(X, Y) = \operatorname{tr}\,(a X_T - b Y_T Q)'(a X_T - b Y_T Q) = 2\,(1 - ab \operatorname{tr} \Sigma). \tag{21}$$
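The four orderings discussed in this section can be compared side by side in a short sketch. It assumes NumPy and hypothetical data (not the matrices in (11), whose values are not reproduced here); the helper names `center`, `fit`, and `measures` are illustrative. T-R-N-D is evaluated directly from the normalized matrices with the rotation fixed beforehand, N-T-R-D normalizes the raw matrices before translation, and T-R-D-N uses the closed form (21).

```python
import numpy as np

def center(M):
    """Translation: subtract the column means."""
    return M - M.mean(axis=0)

def fit(X, Y):
    """Optimal rotation-dilation distance for already translated inputs."""
    s = np.linalg.svd(X.T @ Y, compute_uv=False)
    return np.trace(X.T @ X) - s.sum() ** 2 / np.trace(Y.T @ Y)

def measures(X, Y):
    XT, YT = center(X), center(Y)
    a, b = 1 / np.linalg.norm(XT), 1 / np.linalg.norm(YT)
    U, s, Vt = np.linalg.svd(XT.T @ YT)
    Q = Vt.T @ U.T
    # T-N-R-D: normalize after translation, then rotate and dilate.
    tnrd = fit(a * XT, b * YT)
    # T-R-N-D: rotate first, normalize, then dilate with the rotation fixed.
    Xb, Zb = a * XT, b * YT @ Q
    c = np.trace(Xb.T @ Zb) / np.trace(Zb.T @ Zb)
    R = Xb - c * Zb
    trnd = np.trace(R.T @ R)
    # N-T-R-D: normalize the raw matrices before translation.
    ntrd = fit(center(X / np.linalg.norm(X)), center(Y / np.linalg.norm(Y)))
    # T-R-D-N: closed form (21).
    trdn = 2 * (1 - a * b * s.sum())
    return {"T-N-R-D": tnrd, "T-R-N-D": trnd, "N-T-R-D": ntrd, "T-R-D-N": trdn}

# Hypothetical configurations with 4 objects and 2 variables.
X = np.array([[0.0, 0.0], [2.0, 0.0], [2.0, 1.0], [0.0, 1.0]])
Y = np.array([[0.0, 0.0], [1.0, 0.0], [3.0, 2.0], [0.0, 4.0]])
forward, backward = measures(X, Y), measures(Y, X)
for key in forward:
    print(key, forward[key], backward[key])
```

For this pair, the T-N-R-D and T-R-N-D values should coincide and be the same in both directions, the N-T-R-D values should differ between directions, and the T-R-D-N value should be symmetric but is only guaranteed to lie in [0, 2].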