cmdscale
Classical (Metric) Multidimensional Scaling
Description
Classical multidimensional scaling (MDS) of a data matrix. Also known as principal coordinates analysis (Gower, 1966).
Usage
cmdscale(d, k = 2, eig = FALSE, add = FALSE, x.ret = FALSE, list. = eig || add || x.ret)
Arguments
d | a distance structure such as that returned by |
k | the maximum dimension of the space which the data are to be represented in; must be in {1, 2, …, n-1}. |
eig | indicates whether eigenvalues should be returned. |
add | logical indicating if an additive constant c* should be computed, and added to the non-diagonal dissimilarities such that the modified dissimilarities are Euclidean. |
x.ret | indicates whether the doubly centred symmetric distance matrix should be returned. |
list. | logical indicating if a |
Details
Multidimensional scaling takes a set of dissimilarities and returns a set of points such that the distances between the points are approximately equal to the dissimilarities. (It is a major part of what ecologists call ‘ordination’.)
A set of Euclidean distances on n points can be represented exactly in at most n - 1 dimensions. cmdscale follows the analysis of Mardia (1978) and returns the best-fitting k-dimensional representation, where the k actually used may be less than the argument k.
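The following is a minimal sketch of the double-centring and eigendecomposition steps that classical scaling is usually described with, compared against cmdscale. The data matrix X and the names pts_manual, same_k2 and exact_full are made up for illustration; this is not a copy of the internal implementation.

```r
# Illustrative data (an assumption): 5 random points in 4 dimensions.
set.seed(1)
X <- matrix(rnorm(20), nrow = 5)
d <- dist(X)                          # Euclidean distances
n <- attr(d, "Size")

# Double-centre the squared distances: B = -1/2 * J %*% D^2 %*% J
D2 <- as.matrix(d)^2
J  <- diag(n) - matrix(1 / n, n, n)   # centring matrix
B  <- -0.5 * J %*% D2 %*% J

# Coordinates are the leading eigenvectors scaled by sqrt(eigenvalue)
e <- eigen(B, symmetric = TRUE)
k <- 2
pts_manual <- e$vectors[, 1:k] %*% diag(sqrt(e$values[1:k]), k)

# Compare interpoint distances, which do not depend on the sign of each axis
pts <- cmdscale(d, k = k)
same_k2 <- all.equal(as.vector(dist(pts_manual)), as.vector(dist(pts)))

# With k = n - 1 the Euclidean distances are reproduced (up to rounding)
full <- cmdscale(d, k = n - 1)
exact_full <- all.equal(as.vector(dist(full)), as.vector(d))
```

Comparing distances rather than coordinates avoids depending on the sign chosen for each eigenvector.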
The representation is determined only up to location (cmdscale takes the column means of the configuration to be at the origin), rotations and reflections. The configuration returned is given in principal-component axes, so the reflection chosen may differ between R platforms (see prcomp).
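As a small illustration of this indeterminacy, the sketch below checks that the configuration for eurodist is centred at the origin and that reflecting or rotating it leaves the interpoint distances unchanged. The angle theta and the names flipped, rotated and Rot are arbitrary choices made for this example.

```r
loc <- cmdscale(eurodist)

# Column means of the configuration are (numerically) zero
centre <- colMeans(loc)

# Reflect the second axis
flipped <- loc %*% diag(c(1, -1))

# Rotate by an arbitrary angle (30 degrees, chosen for illustration)
theta <- pi / 6
Rot <- matrix(c(cos(theta), sin(theta), -sin(theta), cos(theta)), 2, 2)
rotated <- loc %*% Rot

# Interpoint distances are unchanged by either transformation
same_flip <- all.equal(as.vector(dist(loc)), as.vector(dist(flipped)))
same_rot  <- all.equal(as.vector(dist(loc)), as.vector(dist(rotated)))
```

Any of these configurations fits the dissimilarities equally well, so comparisons between runs or platforms are best made on distances rather than on raw coordinates.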
When add = TRUE, a minimal additive constant c* is computed such that the dissimilarities d[i,j] + c* are Euclidean and hence can be represented in n - 1 dimensions. Whereas S (Becker et al., 1988) computes this constant using an approximation suggested by Torgerson, R uses the analytical solution of Cailliez (1983); see also Cox and Cox (2001). Note that because of numerical errors the computed eigenvalues need not all be non-negative, and even theoretically the representation could be in fewer than n - 1 dimensions.
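A brief sketch of the effect of add = TRUE, using eurodist (road distances, which are not Euclidean). The names fit0 and fit1 are chosen for this example only; the tolerance in the last line is an arbitrary allowance for the numerical error mentioned above.

```r
fit0 <- cmdscale(eurodist, k = 2, eig = TRUE)              # no additive constant
fit1 <- cmdscale(eurodist, k = 2, eig = TRUE, add = TRUE)  # with c* added

# Without the constant, some eigenvalues are negative
min(fit0$eig)

# The additive constant c* that was computed
fit1$ac

# After adding c*, eigenvalues are non-negative up to rounding error
nonneg <- min(fit1$eig) > -1e-6 * max(fit1$eig)
```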
Value
If list. is false (the default), a matrix with k columns whose rows give the coordinates of the points chosen to represent the dissimilarities.
Otherwise, a list containing the following components.
points | a matrix with up to |
eig | the n eigenvalues computed during the scaling process if |
x | the doubly centered distance matrix if |
ac | the additive constant c*, |
GOF | a numeric vector of length 2, equal to say (g.1,g.2), where g.i = (sum{j=1..k} λ[j]) / (sum{j=1..n} T.i(λ[j])), where λ[j] are the eigenvalues (sorted in decreasing order), T.1(v) = abs(v), and T.2(v) = max(v, 0). |
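The GOF formula above can be checked by hand from the returned eigenvalues. This is a sketch using eurodist with k = 2; the names lambda, g1 and g2 are introduced here for illustration.

```r
k <- 2
fit <- cmdscale(eurodist, k = k, eig = TRUE)
lambda <- fit$eig                       # all n eigenvalues, in decreasing order

# g.1 uses T.1(v) = abs(v); g.2 uses T.2(v) = max(v, 0)
g1 <- sum(lambda[1:k]) / sum(abs(lambda))
g2 <- sum(lambda[1:k]) / sum(pmax(lambda, 0))

gof_matches <- all.equal(c(g1, g2), fit$GOF)
```

The two values differ only when some eigenvalues are negative, that is, when the dissimilarities are not Euclidean.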
References
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988). The New S Language. Wadsworth & Brooks/Cole.
Cailliez, F. (1983). The analytical solution of the additive constant problem. Psychometrika, 48, 343–349. doi: 10.1007/BF02294026.
Cox, T. F. and Cox, M. A. A. (2001). Multidimensional Scaling. Second edition. Chapman and Hall.
Gower, J. C. (1966). Some distance properties of latent root and vector methods used in multivariate analysis. Biometrika, 53, 325–328. doi: 10.2307/2333639.
Krzanowski, W. J. and Marriott, F. H. C. (1994). Multivariate Analysis. Part I. Distributions, Ordination and Inference. London: Edward Arnold. (Especially pp. 108–111.)
Mardia, K. V. (1978). Some properties of classical multidimensional scaling. Communications in Statistics – Theory and Methods, A7, 1233–41. doi: 10.1080/03610927808827707.
Mardia, K. V., Kent, J. T. and Bibby, J. M. (1979). Chapter 14 of Multivariate Analysis, London: Academic Press.
Seber, G. A. F. (1984). Multivariate Observations. New York: Wiley.
Torgerson, W. S. (1958). Theory and Methods of Scaling. New York: Wiley.
See Also
dist.
isoMDS and sammon in package MASS provide alternative methods of multidimensional scaling.
Examples
require(graphics)

loc <- cmdscale(eurodist)
x <- loc[, 1]
y <- -loc[, 2] # reflect so North is at the top
## note asp = 1, to ensure Euclidean distances are represented correctly
plot(x, y, type = "n", xlab = "", ylab = "", asp = 1, axes = FALSE,
     main = "cmdscale(eurodist)")
text(x, y, rownames(loc), cex = 0.6)
Copyright (©) 1999–2012 R Foundation for Statistical Computing.
Licensed under the GNU General Public License.