princomp
Principal Components Analysis
Description
princomp performs a principal components analysis on the given numeric data matrix and returns the results as an object of class princomp.
Usage
princomp(x, ...)

## S3 method for class 'formula'
princomp(formula, data = NULL, subset, na.action, ...)

## Default S3 method:
princomp(x, cor = FALSE, scores = TRUE, covmat = NULL,
         subset = rep_len(TRUE, nrow(as.matrix(x))), fix_sign = TRUE, ...)

## S3 method for class 'princomp'
predict(object, newdata, ...)
Arguments
formula | a formula with no response variable, referring only to numeric variables. |
data | an optional data frame (or similar: see |
subset | an optional vector used to select rows (observations) of the data matrix |
na.action | a function which indicates what should happen when the data contain |
x | a numeric matrix or data frame which provides the data for the principal components analysis. |
cor | a logical value indicating whether the calculation should use the correlation matrix or the covariance matrix. (The correlation matrix can only be used if there are no constant variables.) |
scores | a logical value indicating whether the score on each principal component should be calculated. |
covmat | a covariance matrix, or a covariance list as returned by |
fix_sign | Should the signs of the loadings and scores be chosen so that the first element of each loading is non-negative? |
... | arguments passed to or from other methods. If |
object | Object of class inheriting from |
newdata | An optional data frame or matrix in which to look for variables with which to predict. If omitted, the scores are used. If the original fit used a formula or a data frame or a matrix with column names, |
Details
princomp is a generic function with "formula" and "default" methods.
The calculation is done using eigen on the correlation or covariance matrix, as determined by cor. This is done for compatibility with the S-PLUS result. A preferred method of calculation is to use svd on x, as is done in prcomp.
Note that the default calculation uses divisor N for the covariance matrix.
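The eigen-based calculation and the divisor N can be checked by hand. The following is a minimal sketch, not the internal implementation: it assumes the built-in USArrests data (taken from the datasets namespace so that it is unmodified), and the names x, n, cov_n, sdev_manual and pc are illustrative only.

```r
## Sketch: reproduce the princomp standard deviations by hand.
x <- as.matrix(datasets::USArrests)
n <- nrow(x)

## cov() uses divisor n - 1, so rescale to the divisor-N covariance matrix.
cov_n <- cov(x) * (n - 1) / n

## Eigen-decomposition of the symmetric covariance matrix; the square
## roots of the eigenvalues are the component standard deviations.
sdev_manual <- sqrt(eigen(cov_n, symmetric = TRUE)$values)

pc <- princomp(x)

## Both comparisons should agree up to numerical tolerance.
all.equal(unname(pc$sdev), sdev_manual)

## prcomp() uses divisor n - 1, hence the sqrt((n - 1) / n) factor.
all.equal(unname(pc$sdev), prcomp(x)$sdev * sqrt((n - 1) / n))
```

The second comparison is the same sqrt(49/50) factor mentioned in the Examples, since USArrests has 50 rows.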
The print method for these objects prints the results in a nice format and the plot method produces a scree plot (screeplot). There is also a biplot method.
If x is a formula then the standard NA-handling is applied to the scores (if requested): see napredict.
princomp only handles so-called R-mode PCA, that is feature extraction of variables. If a data matrix is supplied (possibly via a formula) it is required that there are at least as many units as variables. For Q-mode PCA use prcomp.
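The requirement on units versus variables can be illustrated with a small simulated matrix. This is a sketch under stated assumptions: x_wide is a made-up 3 x 5 matrix of random numbers, and msg and pc_wide are illustrative names.

```r
## Sketch: a "wide" matrix with fewer units (rows) than variables (columns).
set.seed(1)
x_wide <- matrix(rnorm(3 * 5), nrow = 3, ncol = 5)

## princomp() is expected to refuse this input; capture the error message
## instead of stopping the script.
msg <- tryCatch(princomp(x_wide), error = function(e) conditionMessage(e))
print(msg)

## prcomp() works from the SVD of the data matrix and accepts it.
pc_wide <- prcomp(x_wide)
length(pc_wide$sdev)
```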
Value
princomp returns a list with class "princomp" containing the following components:
sdev | the standard deviations of the principal components. |
loadings | the matrix of variable loadings (i.e., a matrix whose columns contain the eigenvectors). This is of class |
center | the means that were subtracted. |
scale | the scalings applied to each variable. |
n.obs | the number of observations. |
scores | if |
call | the matched call. |
na.action | If relevant. |
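As an illustration of how these components relate to each other, the scores can be rebuilt from center, scale and loadings, and predict can be applied to new rows. This is a sketch assuming the unmodified built-in USArrests data; manual and new_scores are illustrative names, and the reconstruction reflects the documented meaning of the components rather than a guaranteed internal code path.

```r
## Sketch: relate the returned components to each other.
x <- datasets::USArrests
pc <- princomp(x, cor = TRUE)

## Centre and scale the data with the stored values, then project
## onto the loadings (eigenvectors).
manual <- scale(as.matrix(x), center = pc$center, scale = pc$scale) %*%
  unclass(pc$loadings)
all.equal(unname(manual), unname(pc$scores))

## predict() applies the same transformation to new observations;
## here the "new" data are simply the first three rows of x.
new_scores <- predict(pc, newdata = x[1:3, ])
all.equal(unname(new_scores), unname(pc$scores[1:3, ]))
```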
Note
The signs of the columns of the loadings and scores are arbitrary, and so may differ between different programs for PCA, and even between different builds of R: fix_sign = TRUE alleviates that.
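A short sketch of the sign convention, assuming the unmodified built-in USArrests data (pc, L, flipped and cov_n are illustrative names): with fix_sign = TRUE the first element of each loading column should be non-negative, and flipping the sign of a column still gives a valid eigenvector.

```r
## Sketch: inspect the sign convention applied by fix_sign = TRUE.
pc <- princomp(datasets::USArrests, fix_sign = TRUE)
L <- unclass(pc$loadings)

## First element of every loading column (first row of the matrix).
L[1, ]
all(L[1, ] >= 0)

## A sign-flipped column is still an eigenvector of the divisor-N
## covariance matrix, with the same eigenvalue (the component variance).
n <- pc$n.obs
cov_n <- cov(datasets::USArrests) * (n - 1) / n
flipped <- -L[, 1]
all.equal(as.vector(cov_n %*% flipped), as.vector(pc$sdev[1]^2 * flipped))
```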
References
Mardia, K. V., J. T. Kent and J. M. Bibby (1979). Multivariate Analysis, London: Academic Press.
Venables, W. N. and B. D. Ripley (2002). Modern Applied Statistics with S, Springer-Verlag.
See Also
summary.princomp, screeplot, biplot.princomp, prcomp, cor, cov, eigen.
Examples
require(graphics)

## The variances of the variables in the
## USArrests data vary by orders of magnitude, so scaling is appropriate
(pc.cr <- princomp(USArrests))  # inappropriate
princomp(USArrests, cor = TRUE) # =^= prcomp(USArrests, scale=TRUE)
## Similar, but different:
## The standard deviations differ by a factor of sqrt(49/50)

summary(pc.cr <- princomp(USArrests, cor = TRUE))
loadings(pc.cr)  # note that blank entries are small but not zero
## The signs of the columns of the loadings are arbitrary
plot(pc.cr) # shows a screeplot.
biplot(pc.cr)

## Formula interface
princomp(~ ., data = USArrests, cor = TRUE)

## NA-handling
USArrests[1, 2] <- NA
pc.cr <- princomp(~ Murder + Assault + UrbanPop,
                  data = USArrests, na.action = na.exclude, cor = TRUE)
pc.cr$scores[1:5, ]

## (Simple) Robust PCA:
## Classical:
(pc.cl  <- princomp(stackloss))
## Robust:
(pc.rob <- princomp(stackloss, covmat = MASS::cov.rob(stackloss)))
Copyright (©) 1999–2012 R Foundation for Statistical Computing.
Licensed under the GNU General Public License.