spineplot
Spine Plots and Spinograms
Description
Spine plots are a special case of mosaic plots, and can be seen as a generalization of stacked (or highlighted) bar plots. Analogously, spinograms are an extension of histograms.
Usage
spineplot(x, ...)

## Default S3 method:
spineplot(x, y = NULL,
          breaks = NULL, tol.ylab = 0.05, off = NULL, ylevels = NULL,
          col = NULL, main = "", xlab = NULL, ylab = NULL,
          xaxlabels = NULL, yaxlabels = NULL,
          xlim = NULL, ylim = c(0, 1), axes = TRUE, ...)

## S3 method for class 'formula'
spineplot(formula, data = NULL,
          breaks = NULL, tol.ylab = 0.05, off = NULL, ylevels = NULL,
          col = NULL, main = "", xlab = NULL, ylab = NULL,
          xaxlabels = NULL, yaxlabels = NULL,
          xlim = NULL, ylim = c(0, 1), axes = TRUE, ...,
          subset = NULL, drop.unused.levels = FALSE)
Arguments
x | an object, the default method expects either a single variable (interpreted to be the explanatory variable) or a 2-way table. See details. |
y | a |
formula | a |
data | an optional data frame. |
breaks | if the explanatory variable is numeric, this controls how it is discretized. |
tol.ylab | convenience tolerance parameter for y-axis annotation. If the distance between two labels drops under this threshold, they are plotted equidistantly. |
off | vertical offset between the bars (in per cent). It is fixed to |
ylevels | a character or numeric vector specifying in which order the levels of the dependent variable should be plotted. |
col | a vector of fill colors of the same length as |
main, xlab, ylab | character strings for annotation |
xaxlabels, yaxlabels | character vectors for annotation of x and y axis. Default to |
xlim, ylim | the range of x and y values with sensible defaults. |
axes | logical. If |
... | additional arguments passed to |
subset | an optional vector specifying a subset of observations to be used for plotting. |
drop.unused.levels | should factors have unused levels dropped? Defaults to |
Details
spineplot creates either a spinogram or a spine plot. It can be called via spineplot(x, y) or spineplot(y ~ x) where y is interpreted to be the dependent variable (and has to be categorical) and x the explanatory variable. x can be either categorical (then a spine plot is created) or numerical (then a spinogram is plotted). Additionally, spineplot can also be called with only a single argument which then has to be a 2-way table, interpreted to correspond to table(x, y).
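As a minimal sketch of the three calling conventions described above, the following reuses the arthritis data from the Examples section. The object names tab_xy, tab_formula, and tab_table are illustrative only and not part of the documented interface.

```r
## Arthritis data as used in the Examples section
treatment <- factor(rep(c(1, 2), c(43, 41)), levels = c(1, 2),
                    labels = c("placebo", "treated"))
improved <- factor(rep(c(1, 2, 3, 1, 2, 3), c(29, 7, 7, 13, 7, 21)),
                   levels = c(1, 2, 3),
                   labels = c("none", "some", "marked"))

## Three ways to request the same spine plot
## (the tab_* names are illustrative, not part of the API)
tab_xy      <- spineplot(treatment, improved)         # default method: x, then y
tab_formula <- spineplot(improved ~ treatment)        # formula method: y ~ x
tab_table   <- spineplot(table(treatment, improved))  # single 2-way table

## Each call returns the visualized table invisibly,
## so the cell counts can be compared directly
all(tab_xy == tab_formula)
all(tab_xy == tab_table)
```

Note the argument order: the default method takes the explanatory variable first, whereas the formula puts the dependent variable on the left-hand side.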
Both spine plots and spinograms are essentially mosaic plots with special formatting of spacing and shading. Conceptually, they plot P(y | x) against P(x). For the spine plot (where both x and y are categorical), both quantities are approximated by the corresponding empirical relative frequencies. For the spinogram (where x is numerical), x is first discretized (by calling hist with the breaks argument) and then empirical relative frequencies are taken.
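The quantities described above can be computed by hand. The sketch below follows the steps as described in the text (tabulate, then take relative frequencies; for numeric x, discretize with hist first). It is an illustration under that reading, not necessarily how spineplot is implemented internally, and all object names are made up for this example.

```r
## Spine plot case: both variables categorical (arthritis data from the Examples)
treatment <- factor(rep(c(1, 2), c(43, 41)), levels = c(1, 2),
                    labels = c("placebo", "treated"))
improved <- factor(rep(c(1, 2, 3, 1, 2, 3), c(29, 7, 7, 13, 7, 21)),
                   levels = c(1, 2, 3),
                   labels = c("none", "some", "marked"))
tab   <- table(treatment, improved)
p_x   <- rowSums(tab) / sum(tab)       # empirical P(x): bar widths
p_y_x <- prop.table(tab, margin = 1)   # empirical P(y | x): segment heights

## Spinogram case: numeric x (o-ring data from the Examples)
fail <- factor(c(2, 2, 2, 2, 1, 1, 1, 1, 1, 1, 2, 1, 2, 1,
                 1, 1, 1, 2, 1, 1, 1, 1, 1),
               levels = c(1, 2), labels = c("no", "yes"))
temperature <- c(53, 57, 58, 63, 66, 67, 67, 67, 68, 69, 70, 70,
                 70, 70, 72, 73, 75, 75, 76, 76, 78, 79, 81)
brk     <- hist(temperature, plot = FALSE)$breaks   # break points chosen by hist
tbin    <- cut(temperature, breaks = brk, include.lowest = TRUE)
tab_num <- table(tbin, fail)
p_bin      <- rowSums(tab_num) / sum(tab_num)       # empirical P(x in bin)
p_fail_bin <- prop.table(tab_num, margin = 1)       # empirical P(y | bin)
```

Each row of p_y_x and p_fail_bin sums to one, which is why the stacked segments in every bar fill the full height of the plot.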
Thus, spine plots can also be seen as a generalization of stacked bar plots where not the heights but the widths of the bars correspond to the relative frequencies of x. The heights of the bars then correspond to the conditional relative frequencies of y in every x group. Analogously, spinograms extend stacked histograms.
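To make the relation to stacked bar plots concrete, the following sketch draws an ordinary stacked bar plot next to a rough spine-plot analogue built with barplot. This is only an approximation for illustration (it omits the spacing and axis annotation that spineplot provides), and the variable names widths and heights are chosen for this example.

```r
## Arthritis data from the Examples section
treatment <- factor(rep(c(1, 2), c(43, 41)), levels = c(1, 2),
                    labels = c("placebo", "treated"))
improved <- factor(rep(c(1, 2, 3, 1, 2, 3), c(29, 7, 7, 13, 7, 21)),
                   levels = c(1, 2, 3),
                   labels = c("none", "some", "marked"))
tab <- table(treatment, improved)

op <- par(mfrow = c(1, 2))

## Ordinary stacked bar plot: bar heights carry the counts in each x group
barplot(t(tab), main = "stacked bar plot")

## Rough spine-plot analogue: widths follow the relative frequencies of x,
## heights follow the conditional relative frequencies of y within each x group
widths  <- as.vector(rowSums(tab) / sum(tab))
heights <- t(prop.table(tab, margin = 1))
barplot(heights, width = widths, space = 0, main = "bar widths ~ P(x)")

par(op)
```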
Value
The table visualized is returned invisibly.
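Because the result is returned invisibly, it is only shown when it is assigned and printed, or when the call is wrapped in parentheses. A short sketch using the o-ring data from the Examples section (the name tab is illustrative):

```r
## O-ring data from the Examples section
fail <- factor(c(2, 2, 2, 2, 1, 1, 1, 1, 1, 1, 2, 1, 2, 1,
                 1, 1, 1, 2, 1, 1, 1, 1, 1),
               levels = c(1, 2), labels = c("no", "yes"))
temperature <- c(53, 57, 58, 63, 66, 67, 67, 67, 68, 69, 70, 70,
                 70, 70, 72, 73, 75, 75, 76, 76, 78, 79, 81)

## Assigning the result captures the table without printing it
tab <- spineplot(fail ~ temperature, breaks = 3)
tab        # cell counts per temperature interval and failure status
sum(tab)   # total number of observations in the table

## Wrapping the call in parentheses prints the table immediately
(spineplot(fail ~ temperature, breaks = 3))

## withVisible() reports whether a value was returned visibly
withVisible(spineplot(fail ~ temperature, breaks = 3))$visible
```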
Author(s)
Achim Zeileis
References
Friendly, M. (1994). Mosaic displays for multi-way contingency tables. Journal of the American Statistical Association, 89, 190–200. doi: 10.2307/2291215.
Hartigan, J.A., and Kleiner, B. (1984). A mosaic of television ratings. The American Statistician, 38, 32–35. doi: 10.2307/2683556.
Hofmann, H., and Theus, M. (2005). Interactive graphics for visualizing conditional distributions. Unpublished manuscript.
Hummel, J. (1996). Linked bar charts: Analysing categorical data graphically. Computational Statistics, 11, 23–33.
See Also
Examples
## treatment and improvement of patients with rheumatoid arthritis
treatment <- factor(rep(c(1, 2), c(43, 41)), levels = c(1, 2),
                    labels = c("placebo", "treated"))
improved <- factor(rep(c(1, 2, 3, 1, 2, 3), c(29, 7, 7, 13, 7, 21)),
                   levels = c(1, 2, 3),
                   labels = c("none", "some", "marked"))

## (dependence on a categorical variable)
(spineplot(improved ~ treatment))

## applications and admissions by department at UC Berkeley
## (two-way tables)
(spineplot(marginSums(UCBAdmissions, c(3, 2)),
           main = "Applications at UCB"))
(spineplot(marginSums(UCBAdmissions, c(3, 1)),
           main = "Admissions at UCB"))

## NASA space shuttle o-ring failures
fail <- factor(c(2, 2, 2, 2, 1, 1, 1, 1, 1, 1, 2, 1, 2, 1,
                 1, 1, 1, 2, 1, 1, 1, 1, 1),
               levels = c(1, 2), labels = c("no", "yes"))
temperature <- c(53, 57, 58, 63, 66, 67, 67, 67, 68, 69, 70, 70,
                 70, 70, 72, 73, 75, 75, 76, 76, 78, 79, 81)

## (dependence on a numerical variable)
(spineplot(fail ~ temperature))
(spineplot(fail ~ temperature, breaks = 3))
(spineplot(fail ~ temperature, breaks = quantile(temperature)))

## highlighting for failures
spineplot(fail ~ temperature, ylevels = 2:1)
Copyright (©) 1999–2012 R Foundation for Statistical Computing.
Licensed under the GNU General Public License.