Introduction

Different types of Minkowski metrics, especially the Euclidean and the city-block metric, are frequently used as similarity measures in grids. The Euclidean distance is the square root of the sum of squared differences between the ratings on two elements. These distances are, however, not standardized measures: they depend strongly on the number of constructs and on the rating range. The figure below demonstrates this fact. Note how the distance changes although the rating pattern remains identical.
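The dependence on grid size and rating range can also be checked numerically. The following base R sketch uses two made-up elements and keeps the rating pattern fixed while doubling the number of constructs or widening the scale. The vectors `a` and `b` and the helpers `euclid` and `stretch` are illustrative assumptions and not part of the package.

```r
# Euclidean distance between the ratings of two elements
euclid <- function(x, y) sqrt(sum((x - y)^2))

# Hypothetical ratings of two elements on 5 constructs (scale 1 to 5)
a <- c(1, 2, 3, 4, 5)
b <- c(5, 4, 3, 2, 1)
d_base <- euclid(a, b)

# Identical pattern, but twice as many constructs
d_more <- euclid(rep(a, 2), rep(b, 2))

# Identical pattern, but stretched to a scale from 1 to 9
stretch <- function(x) (x - 1) * 2 + 1
d_wide <- euclid(stretch(a), stretch(b))

round(c(base = d_base, more_constructs = d_more, wider_scale = d_wide), 2)
```

Doubling the number of constructs inflates the distance by a factor of $\sqrt{2}$, and doubling the spread of the scale doubles it, although the pattern of ratings is the same in all three cases.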

In order to compare distances across grids of different size and rating range, a standardization is desirable. The notion of the significance of a distance, i.e. a distance which is unusually big, is also easier to handle with a standard reference measure. Different suggestions have been made in the literature on how to standardize Euclidean inter-element distances (Hartmann, 1992; Heckmann, 2012; Slater, 1977). The three variants are briefly discussed and the corresponding R code is demonstrated.

Slater distances (1977)

Description

The first approach to standardization was suggested by Slater (1977). He calculated the Euclidean distance $U$ that is expected on average if the ratings are randomly distributed. To standardize the distances, he suggested dividing the matrix of Euclidean distances $E$ by this unit of expected distance $U$. Hence, distances bigger than 1 are greater than expected, and distances smaller than 1 are smaller than expected.

R-Code

The function distanceSlater calculates Slater distances for a grid.

distanceSlater(boeker)
# 
# ##########################
# Distances between elements
# ##########################
# 
# Distance method:  Slater (standardized Euclidean)
# Normalized:
#                                1    2    3    4    5    6    7    8    9   10   11   12   13   14   15
# (1) self                  1      1.03 0.75 0.69 0.87 1.19 0.80 1.03 0.99 0.59 1.79 0.58 0.55 0.64 0.54
# (2) ideal self            2           1.11 0.78 1.06 1.31 1.07 0.97 1.14 0.97 1.56 1.22 1.21 1.24 1.25
# (3) mother                3                0.73 0.53 0.95 0.55 0.81 0.64 0.67 1.58 0.69 0.83 0.77 0.69
# (4) father                4                     0.63 1.15 0.65 0.90 0.83 0.69 1.66 0.84 0.91 0.98 0.89
# (5) kurt                  5                          0.89 0.57 0.79 0.57 0.72 1.51 0.79 0.93 0.87 0.83
# (6) karl                  6                               0.94 0.74 0.66 1.09 0.97 1.17 1.22 1.08 1.15
# (7) george                7                                    0.92 0.65 0.81 1.51 0.73 0.91 0.92 0.78
# (8) martin                8                                         0.68 0.80 1.27 1.09 1.10 1.01 1.07
# (9) elizabeth             9                                              0.87 1.31 1.00 1.13 1.03 0.98
# (10) therapist           10                                                   1.74 0.65 0.63 0.69 0.65
# (11) irene               11                                                        1.83 1.86 1.72 1.84
# (12) childhood self      12                                                             0.43 0.50 0.34
# (13) self before illness 13                                                                  0.43 0.41
# (14) self with delusion  14                                                                       0.45
# (15) self as dreamer     15                                                                           
# 
# Note that Slater distances cannot be compared across grids with a different number of constructs (see Hartmann, 1992).

You can save the results and control how they are displayed using the print method. For example, we could display only the distances outside certain boundaries, using the cutoff values .8 and 1.2 suggested by Norris and Makhlouf-Norris (1976) to indicate very small or very big distances.

d <- distanceSlater(boeker)
print(d, cutoffs = c(.8, 1.2))
# 
# ##########################
# Distances between elements
# ##########################
# 
# Distance method:  Slater (standardized Euclidean)
# Normalized:
#                                1    2    3    4    5    6    7    8    9   10   11   12   13   14   15
# (1) self                  1           0.75 0.69           0.80           0.59 1.79 0.58 0.55 0.64 0.54
# (2) ideal self            2                0.78      1.31                     1.56 1.22 1.21 1.24 1.25
# (3) mother                3                0.73 0.53      0.55      0.64 0.67 1.58 0.69      0.77 0.69
# (4) father                4                     0.63      0.65           0.69 1.66                    
# (5) kurt                  5                               0.57 0.79 0.57 0.72 1.51 0.79               
# (6) karl                  6                                    0.74 0.66                1.22          
# (7) george                7                                         0.65      1.51 0.73           0.78
# (8) martin                8                                         0.68 0.80 1.27                    
# (9) elizabeth             9                                                   1.31                    
# (10) therapist           10                                                   1.74 0.65 0.63 0.69 0.65
# (11) irene               11                                                        1.83 1.86 1.72 1.84
# (12) childhood self      12                                                             0.43 0.50 0.34
# (13) self before illness 13                                                                  0.43 0.41
# (14) self with delusion  14                                                                       0.45
# (15) self as dreamer     15                                                                           
# 
# Note that Slater distances cannot be compared across grids with a different number of constructs (see Hartmann, 1992).
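The idea behind the cutoffs can be sketched in base R. The example below is only an illustration of the filtering logic, not the implementation of the print method: it takes six Slater distances from the output above and blanks out every value lying strictly between the two cutoffs. Whether the package treats the boundaries as inclusive is not stated here, so the strict inequalities are an assumption.

```r
# Slater distances between four elements, copied from the output above
labels <- c("self", "ideal self", "irene", "childhood self")
E <- matrix(NA_real_, 4, 4, dimnames = list(labels, labels))
E[upper.tri(E)] <- c(1.03, 1.79, 1.56, 0.58, 1.22, 1.83)

# Keep only unusually small (< .8) or unusually big (> 1.2) distances
cutoffs <- c(.8, 1.2)
E_masked <- E
E_masked[which(E > cutoffs[1] & E < cutoffs[2])] <- NA

print(E_masked, na.print = "")
```

Only the distance between self and ideal self (1.03) falls inside the band and is hidden. The other five distances remain visible.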

Calculation

Let $G$ be the raw grid matrix and $D$ be the grid matrix centered around the construct means, with $d_{ij} = g_{..} - g_{ij}$, where $g_{..}$ is the mean of the construct. Further, let

$$P = D^T D \qquad \text{and} \qquad S = \operatorname{tr} P$$

The Euclidean distance between two elements $j$ and $k$, summing over the constructs $i$, then results in:

$$\left(\sum_i (d_{ij} - d_{ik})^2\right)^{1/2}$$

$$\Leftrightarrow \left(\sum_i (d_{ij}^2 + d_{ik}^2 - 2 d_{ij} d_{ik})\right)^{1/2}$$

$$\Leftrightarrow \left(\sum_i d_{ij}^2 + \sum_i d_{ik}^2 - 2 \sum_i d_{ij} d_{ik}\right)^{1/2}$$

$$\Leftrightarrow (S_j + S_k - 2 P_{jk})^{1/2}$$
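The last line of the derivation can be verified numerically. The sketch below uses a small made-up grid (the matrix `G` is hypothetical) and compares the distances obtained from $S_j + S_k - 2P_{jk}$ with distances computed directly from the element columns using `dist`.

```r
# Hypothetical grid: 4 constructs (rows) rated for 3 elements (columns)
G <- matrix(c(1, 5, 3,
              2, 4, 4,
              5, 1, 2,
              3, 3, 1), nrow = 4, byrow = TRUE)

D <- rowMeans(G) - G          # d_ij = construct mean - g_ij
P <- crossprod(D)             # P = D^T D
S_el <- diag(P)               # S_j: sum of squares of element j

# Distances from the last line of the derivation;
# pmax guards against tiny negative values caused by rounding
E_from_P <- sqrt(pmax(outer(S_el, S_el, "+") - 2 * P, 0))

# Distances computed directly from the element columns
E_direct <- as.matrix(dist(t(D)))

all.equal(E_from_P, E_direct, check.attributes = FALSE)
```

Both routes give the same distance matrix, so the element distances are fully determined by the entries of $P$.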

For the standardization, Slater proposes to use the expected Euclidean distance between a random pair of elements taken from the grid. The average of $S_j$ and $S_k$ is then $S_{avg} = S/m$, where $m$ is the number of elements in the grid. Because each construct in $D$ is centered, the entries of $P$ sum to zero, so the average of the off-diagonal elements of $P$ is $-S/(m(m-1))$ (see Slater, 1951, for a proof). Inserting both averages into the formula above gives $(2S/m + 2S/(m(m-1)))^{1/2}$, which simplifies to the following expected average Euclidean distance $U$. It is reported as the unit of expected distance in Slater’s INGRID program.

$$U = \left(\frac{2S}{m-1}\right)^{1/2}$$

The calculated Euclidean distances are then divided by $U$, the unit of expected distance, to form the matrix of standardized element distances $E_{std}$, with

$$E_{std} = E/U$$
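The whole calculation fits into a few lines of base R. The function `slater_sketch` below is a hypothetical helper written for illustration. It follows the formulas of this section and is not the code of `distanceSlater`. The grid is a made-up matrix of random ratings.

```r
slater_sketch <- function(G) {
  D <- rowMeans(G) - G                 # center each construct (row)
  S <- sum(D^2)                        # S = tr(D^T D)
  m <- ncol(G)                         # number of elements
  U <- sqrt(2 * S / (m - 1))           # unit of expected distance
  E <- as.matrix(dist(t(D)))           # Euclidean element distances
  E / U                                # standardized distances
}

set.seed(42)
# Made-up grid: 10 constructs, 5 elements, ratings from 1 to 6
G <- matrix(sample(1:6, 10 * 5, replace = TRUE), nrow = 10)
E_std <- slater_sketch(G)
round(E_std, 2)

# Squared standardized distances average to 1 over all element pairs
mean(E_std[upper.tri(E_std)]^2)
```

The final line illustrates what the unit of expected distance means: $U^2$ equals the mean squared distance over all pairs of elements in the grid, so the squared standardized distances average to 1.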

Hartmann distances (1992)

Description

Hartmann (1992) showed in a Monte Carlo study that Slater distances (see above) based on random grids, for which Slater coined the expression quasis, have a skewed distribution whose mean and standard deviation depend on the number of constructs elicited. Hence, the distances cannot be compared across grids with a different number of constructs. As a remedy he suggested a linear transformation (z-transformation) of the Slater distances that standardizes them using their estimated (or alternatively expected) mean and standard deviation. Hartmann distances thus represent a more accurate version of Slater distances. Note that Hartmann distances are multiplied by -1 to allow an interpretation similar to correlation coefficients: negative Hartmann values represent an above average dissimilarity (i.e. a big Slater distance) and positive values represent an above average similarity (i.e. a small Slater distance).

The Hartmann distance is calculated as follows (Hartmann, 1992, p. 49).

$$D = -1 \cdot \frac{D_{slater} - M_c}{sd_c}$$

where $D_{slater}$ denotes the Slater distances of the grid, $M_c$ the mean of the sample distribution and $sd_c$ the standard deviation of the sample distribution.
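As a minimal sketch, the transformation can be written as a one-line R function. The helper `hartmann_sketch` and the reference values `M_c = 1.0` and `sd_c = 0.2` are hypothetical and chosen for easy arithmetic. They are not the parameters published by Hartmann (1992).

```r
# Sign-reversed z-transformation of Slater distances
hartmann_sketch <- function(d_slater, M_c, sd_c) {
  -1 * (d_slater - M_c) / sd_c
}

# Hypothetical mean and standard deviation of the reference distribution
M_c  <- 1.0
sd_c <- 0.2

# A small, an average and a big Slater distance
hartmann_sketch(c(0.6, 1.0, 1.4), M_c, sd_c)
```

With these values the small Slater distance of 0.6 becomes a positive Hartmann value (above average similarity), the average distance maps to zero, and the big distance of 1.4 becomes negative (above average dissimilarity).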

R-Code

The function distanceHartmann calculates Hartmann distances. The function can be operated in two ways. The default option (method="paper") uses precalculated means and standard deviations (as given e.g. in Hartmann, 1992) for the standardization.

distanceHartmann(boeker)
# 
# ##########################
# Distances between elements
# ##########################
# 
# Distance method:  Hartmann (standardized Slater distances)
# Normalized:
#                                 1     2     3     4     5     6     7     8     9    10    11    12    13    14    15
# (1) self                  1       -0.28  1.58  1.92  0.80 -1.33  1.20 -0.29 -0.04  2.62 -5.24  2.66  2.87  2.28  2.89
# (2) ideal self            2             -0.78  1.36 -0.47 -2.09 -0.56  0.12 -1.02  0.12 -3.69 -1.50 -1.45 -1.63 -1.71
# (3) mother                3                    1.70  2.99  0.22  2.82  1.15  2.27  2.09 -3.84  1.91  1.06  1.44  1.92
# (4) father                4                          2.31 -1.04  2.23  0.55  1.00  1.92 -4.39  0.96  0.50  0.08  0.63
# (5) kurt                  5                                0.63  2.72  1.27  2.69  1.74 -3.37  1.30  0.35  0.79  1.01
# (6) karl                  6                                      0.29  1.63  2.14 -0.66  0.10 -1.21 -1.53 -0.60 -1.04
# (7) george                7                                            0.45  2.19  1.17 -3.39  1.70  0.54  0.42  1.35
# (8) martin                8                                                  2.03  1.22 -1.85 -0.67 -0.73 -0.13 -0.53
# (9) elizabeth             9                                                        0.76 -2.07 -0.08 -0.91 -0.29  0.05
# (10) therapist           10                                                             -4.91  2.20  2.35  1.97  2.22
# (11) irene               11                                                                   -5.47 -5.65 -4.79 -5.52
# (12) childhood self      12                                                                          3.66  3.16  4.22
# (13) self before illness 13                                                                                3.60  3.79
# (14) self with delusion  14                                                                                      3.52
# (15) self as dreamer     15                                                                                          
# 
# For calculation the parameters from Hartmann (1992) were used. Use 'method=new' or method='simulate' for a more accurate version.

The second option (method="simulate") is to simulate the distribution of distances based on the size and scale range of the grid under investigation. A distribution of Slater distances is derived using quasis and used for the Hartmann standardization instead of the precalculated values. The following simulation is based on reps=1000 quasis.

h <- distanceHartmann(boeker, method = "simulate", reps = 1000)
h
# 
# ##########################
# Distances between elements
# ##########################
# 
# Distance method:  Hartmann (standardized Slater distances)
# Normalized:
#                                 1     2     3     4     5     6     7     8     9    10    11    12    13    14    15
# (1) self                  1       -0.28  1.56  1.90  0.79 -1.32  1.19 -0.29 -0.04  2.59 -5.19  2.63  2.84  2.25  2.86
# (2) ideal self            2             -0.77  1.35 -0.47 -2.07 -0.56  0.12 -1.01  0.12 -3.65 -1.48 -1.43 -1.62 -1.70
# (3) mother                3                    1.68  2.96  0.21  2.79  1.14  2.25  2.07 -3.80  1.89  1.04  1.42  1.90
# (4) father                4                          2.28 -1.04  2.20  0.54  0.99  1.90 -4.35  0.95  0.49  0.08  0.62
# (5) kurt                  5                                0.62  2.69  1.25  2.66  1.72 -3.34  1.28  0.34  0.78  0.99
# (6) karl                  6                                      0.28  1.61  2.11 -0.65  0.09 -1.20 -1.51 -0.59 -1.03
# (7) george                7                                            0.44  2.17  1.15 -3.36  1.68  0.53  0.41  1.33
# (8) martin                8                                                  2.01  1.20 -1.83 -0.67 -0.73 -0.13 -0.52
# (9) elizabeth             9                                                        0.75 -2.05 -0.08 -0.90 -0.28  0.05
# (10) therapist           10                                                             -4.86  2.18  2.32  1.95  2.19
# (11) irene               11                                                                   -5.41 -5.59 -4.74 -5.46
# (12) childhood self      12                                                                          3.62  3.12  4.18
# (13) self before illness 13                                                                                3.56  3.75
# (14) self with delusion  14                                                                                      3.48
# (15) self as dreamer     15
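The logic of the simulation approach can be sketched in base R. The code below is an illustration under stated assumptions, not the package's implementation. Quasis are generated here as grids of independent, uniformly distributed integer ratings, the grid size (10 constructs, 15 elements, scale 1 to 6) is made up, and the helpers `slater_vec` and `quasi` are hypothetical. The way the package generates quasis may differ.

```r
# Slater distances of one grid as a vector (one value per element pair)
slater_vec <- function(G) {
  D <- rowMeans(G) - G
  U <- sqrt(2 * sum(D^2) / (ncol(G) - 1))
  as.vector(dist(t(D))) / U
}

# One quasi: random integer ratings between lo and hi (assumed uniform)
quasi <- function(n_constructs, n_elements, lo, hi) {
  matrix(sample(lo:hi, n_constructs * n_elements, replace = TRUE),
         nrow = n_constructs)
}

set.seed(1)
sims <- replicate(200, slater_vec(quasi(10, 15, lo = 1, hi = 6)))

# Reference mean and standard deviation for the Hartmann standardization
M_c  <- mean(sims)
sd_c <- sd(as.vector(sims))
round(c(M_c = M_c, sd_c = sd_c), 3)
```

The pooled mean and standard deviation of the simulated Slater distances then take the place of the precalculated values in the Hartmann formula.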

If the results are saved, there are a couple of options for printing the object (see ?print.hdistance).

print(h, p = c(.05, .95))

Heckmann’s approach (2012)

Description

Hartmann (1992) suggested a transformation of Slater (1977) distances to make them independent of the size of a grid. Hartmann distances are supposed to yield stable cutoff values used to determine the ‘significance’ of inter-element distances. It can be shown, however, that Hartmann distances are still affected by grid parameters like size and the range of the rating scale used (Heckmann, 2012). The function distanceNormalized applies a Box-Cox (1964) transformation to the Hartmann distances in order to remove the skew of the Hartmann distance distribution. The normalized values have more stable and nearly symmetric cutoffs (quantiles) and better properties for comparison across grids of different size and scale range.

R-Code

The function distanceNormalized will return Slater, Hartmann or power transformed Hartmann distances (Heckmann, 2012) if prompted. It is also possible to return the quantiles of the sample distribution and only the element distances considered ‘significant’ according to the quantiles defined.

n <- distanceNormalized(boeker)
n
# 
# ##########################
# Distances between elements
# ##########################
# 
# Distance method:  Power transformed Hartmann distances
# Normalized:
#                                 1     2     3     4     5     6     7     8     9    10    11    12    13    14    15
# (1) self                  1       -0.28  1.56  1.90  0.79 -1.32  1.19 -0.29 -0.04  2.59 -5.18  2.63  2.84  2.25  2.86
# (2) ideal self            2             -0.77  1.35 -0.47 -2.07 -0.56  0.12 -1.01  0.12 -3.65 -1.48 -1.43 -1.62 -1.70
# (3) mother                3                    1.68  2.96  0.21  2.79  1.14  2.25  2.07 -3.80  1.89  1.04  1.42  1.90
# (4) father                4                          2.28 -1.04  2.20  0.54  0.99  1.90 -4.34  0.94  0.49  0.08  0.62
# (5) kurt                  5                                0.62  2.69  1.25  2.66  1.72 -3.34  1.28  0.34  0.78  0.99
# (6) karl                  6                                      0.28  1.61  2.11 -0.65  0.09 -1.19 -1.51 -0.59 -1.03
# (7) george                7                                            0.44  2.17  1.15 -3.36  1.68  0.53  0.41  1.33
# (8) martin                8                                                  2.01  1.20 -1.83 -0.67 -0.73 -0.13 -0.52
# (9) elizabeth             9                                                        0.75 -2.05 -0.08 -0.90 -0.28  0.05
# (10) therapist           10                                                             -4.86  2.18  2.32  1.95  2.19
# (11) irene               11                                                                   -5.41 -5.59 -4.74 -5.46
# (12) childhood self      12                                                                          3.62  3.12  4.17
# (13) self before illness 13                                                                                3.56  3.75
# (14) self with delusion  14                                                                                      3.48
# (15) self as dreamer     15

Calculation

The form of normalization applied by Hartmann (1992) does not account for skewness or kurtosis. Here, a form of normalization, a power transformation, is explored that takes these higher moments of the distribution into account. For this purpose Hartmann values are transformed using the Box-Cox family of transformations (Box & Cox, 1964). The transformation is defined as

$$Y_i^{\lambda} = \begin{cases} \dfrac{(Y_i + c)^\lambda - 1}{\lambda} & \text{for } \lambda \neq 0 \\ \ln(Y_i + c) & \text{for } \lambda = 0 \end{cases}$$

As the transformation requires values $\ge 0$, a constant $c$ is added to derive positive values only. For the present transformation $c$ is defined as the minimum Hartmann distance from the quasis distribution. In order to arrive at a transformation that resembles the normal distribution as closely as possible, an optimal $\lambda$ is searched for by selecting the $\lambda$ that maximizes the correlation between the quantiles of the transformed values $Y_i^\lambda$ and those of the standard normal distribution. As a last step, the power transformed values $Y_i^\lambda$ are z-transformed to remove the arbitrary scaling resulting from the Box-Cox transformation, yielding $Y_i^P$.

$$Y_i^P = \frac{Y_i^{\lambda} - \overline{Y}^{\lambda}}{\sigma_{Y^{\lambda}}}$$
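The three steps (Box-Cox transformation, search for $\lambda$ via the quantile correlation, final z-transformation) can be sketched in base R. This is an illustration under assumptions, not the code of the package: the sample `y` is a made-up skewed positive variable rather than a distribution of Hartmann distances, so $c = 0$ is used, $\lambda$ is searched on a simple grid, and the helpers `box_cox` and `qq_cor` are hypothetical.

```r
# Box-Cox transformation with shift constant c
box_cox <- function(y, lambda, c = 0) {
  if (abs(lambda) < 1e-8) log(y + c) else ((y + c)^lambda - 1) / lambda
}

# Correlation between sample quantiles and standard normal quantiles
qq_cor <- function(y) cor(sort(y), qnorm(ppoints(length(y))))

set.seed(1)
y <- rexp(500)                        # made-up, right-skewed, positive sample

# Grid search for the lambda with the highest quantile correlation
lambdas <- seq(-2, 2, by = 0.05)
r <- sapply(lambdas, function(l) qq_cor(box_cox(y, l)))
lambda_opt <- lambdas[which.max(r)]

# Power transform, then z-transform to remove the arbitrary scaling
y_p <- as.vector(scale(box_cox(y, lambda_opt)))

c(lambda = lambda_opt, r_before = qq_cor(y), r_after = qq_cor(y_p))
```

The quantile correlation after the transformation is higher than before, and the z-transformed values have mean 0 and standard deviation 1 regardless of the $\lambda$ selected.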

Literature

Box, G. E. P., & Cox, D. R. (1964). An analysis of transformations. Journal of the Royal Statistical Society. Series B (Methodological), 26(2), 211–252. Retrieved from http://www.jstor.org/stable/2984418
Hartmann, A. (1992). Element comparisons in repertory grid technique: Results and consequences of a Monte Carlo study. International Journal of Personal Construct Psychology, 5(1), 41–56. doi:10.1080/08936039208404940
Heckmann, M. (2012, July). Standardizing inter-element distances in grids – a revision of Hartmann’s distances. Talk held at the 11th Biennial Conference of the European Personal Construct Association (EPCA), Dublin, Ireland.
Norris, H., & Makhlouf-Norris, F. (1976). The measurement of self-identity. In P. Slater (Ed.), The measurement of intrapersonal space by grid technique: Explorations of intrapersonal space (Vol. 1, pp. 79–92). London: Wiley & Sons.
Slater, P. (1951). The transformation of a matrix of negative correlations. British Journal of Statistical Psychology, 6, 101–106.
Slater, P. (1977). The measurement of intrapersonal space by grid technique: Dimensions of intrapersonal space (Vol. 2). London: Wiley & Sons.