plot.partition {cluster} | R Documentation |
Creates plots for visualizing a partition
object.
plot.partition(x, ask = FALSE, which.plots = NULL, nmax.lab = 40, max.strlen = 5, cor = TRUE, stand = FALSE, lines = 2, shade = FALSE, color = FALSE, labels = 0, plotchar = TRUE, span = TRUE, xlim = NULL, ylim = NULL, ...)
x |
an object of class "partition", typically created by the
functions pam, clara, or fanny. |
ask |
logical; if true and which.plots is NULL,
plot.partition operates in interactive mode, via menu. |
which.plots |
integer vector or NULL (default), the latter
producing both plots. Otherwise, which.plots must contain
integers of 1 for a clusplot or 2 for
silhouette. |
nmax.lab |
integer indicating the number of labels which is considered too large for single-name labeling of the silhouette plot. |
max.strlen |
positive integer giving the length to which strings are truncated in silhouette plot labeling. |
cor,stand,lines,shade,color,labels,plotchar,span,xlim,ylim, ... |
All optional arguments available for the clusplot.default
function (except for the diss one) may also be supplied to
this function. Graphical parameters (see par) may
also be supplied as arguments to this function. |
When ask = TRUE, rather than producing each plot sequentially, plot.partition displays a menu listing all the plots that can be produced. If the menu is not desired but a pause between plots is still wanted, one must set par(ask = TRUE) before invoking the plot command.
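As a minimal sketch of the options described above, assuming the cluster package is installed, the calls below select a single plot through which.plots and, in an interactive session only, request a pause between plots through par(ask = TRUE). The data and the name fit are illustrative choices, not part of this help page.

```r
library(cluster)

# Illustrative data: 25 points in two well-separated groups
set.seed(1)
x <- rbind(cbind(rnorm(10, 0, 0.5), rnorm(10, 0, 0.5)),
           cbind(rnorm(15, 5, 0.5), rnorm(15, 5, 0.5)))
fit <- pam(x, 2)

plot(fit, which.plots = 1)  # clusplot only
plot(fit, which.plots = 2)  # silhouette plot only

# Both plots in sequence, pausing between them instead of showing the menu
if (interactive()) {
  op <- par(ask = TRUE)
  plot(fit)
  par(op)                   # restore the previous setting
}
```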
The clusplot of a cluster partition consists of a two-dimensional representation of the observations, in which the clusters are indicated by ellipses. (See clusplot.partition for more details.)
The silhouette plot of a nonhierarchical clustering is fully described in
Rousseeuw (1987) and in chapter 2 of Kaufman and Rousseeuw (1990).
For each observation i, a bar is drawn, representing the silhouette width s(i)
of the observation. Observations are grouped per cluster, starting with
cluster 1 at the top. Observations with a large s(i) (almost 1) are very well
clustered, a small s(i) (around 0) means that the observation lies between
two clusters, and observations with a negative s(i) are probably placed in
the wrong cluster.
A clustering can be performed for several values of k (the number of clusters). Finally, choose the value of k with the largest overall average silhouette width.
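The selection of k can be sketched as follows, assuming the cluster package is installed and using pam as the clustering function. The candidate range 2:5 and the names ks, avg_width and best_k are illustrative assumptions; the average width is read from the silinfo component of the pam result.

```r
library(cluster)

# Illustrative data: 25 points in two well-separated groups
set.seed(42)
x <- rbind(cbind(rnorm(10, 0, 0.5), rnorm(10, 0, 0.5)),
           cbind(rnorm(15, 5, 0.5), rnorm(15, 5, 0.5)))

ks <- 2:5  # candidate numbers of clusters (an arbitrary range for this sketch)

# Overall average silhouette width for each candidate k
avg_width <- sapply(ks, function(k) pam(x, k)$silinfo$avg.width)
names(avg_width) <- ks

# Keep the k with the largest average width
best_k <- ks[which.max(avg_width)]
print(avg_width)
print(best_k)
```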
The silhouette width is computed as follows: Put a(i) = average dissimilarity between i and all other points of the cluster to which i belongs. For all other clusters C, put d(i,C) = average dissimilarity of i to all observations of C. The smallest of these d(i,C) is denoted as b(i), and can be seen as the dissimilarity between i and its neighbor cluster. Finally, put s(i) = ( b(i) - a(i) ) / max( a(i), b(i) ). The overall average silhouette width is then simply the average of s(i) over all observations i.
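The definition can be traced in a few lines of base R. This is only a sketch: silhouette_widths is a hypothetical helper written for illustration, not a function of the cluster package. It assumes at least two clusters, and it assigns s(i) = 0 to an observation that is alone in its cluster, which is a convention assumed here since a(i) is undefined in that case.

```r
# Hypothetical helper: silhouette widths from dissimilarities d and a
# vector of cluster labels cl (assumes at least two distinct clusters).
silhouette_widths <- function(d, cl) {
  d <- as.matrix(d)
  ids <- unique(cl)
  s <- numeric(nrow(d))
  for (i in seq_len(nrow(d))) {
    own <- cl == cl[i]
    own[i] <- FALSE                 # a(i) excludes i itself
    if (!any(own)) {                # singleton cluster: assumed convention
      s[i] <- 0
      next
    }
    a <- mean(d[i, own])            # a(i)
    # d(i, C) for every other cluster C; b(i) is the smallest of them
    b <- min(vapply(setdiff(ids, cl[i]),
                    function(k) mean(d[i, cl == k]), numeric(1)))
    s[i] <- (b - a) / max(a, b)     # s(i)
  }
  s
}

# Illustrative data with a known grouping of 10 and 15 points
set.seed(1)
x <- rbind(cbind(rnorm(10, 0, 0.5), rnorm(10, 0, 0.5)),
           cbind(rnorm(15, 5, 0.5), rnorm(15, 5, 0.5)))
cl <- rep(1:2, c(10, 15))

s <- silhouette_widths(dist(x), cl)
avg_width <- mean(s)                # overall average silhouette width
```

For well-separated groups such as these, every s(i) should be positive and the average should be close to 1.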
An appropriate plot is produced on the current graphics device. This
can be one or both of the following choices:
Clusplot
Silhouette plot
In the silhouette plot, observation labels are only printed when the number of observations is less than nmax.lab (40, by default), for readability. Moreover, observation labels are truncated to maximally max.strlen (5) characters.
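A brief sketch of the two labeling arguments, assuming the cluster package is installed. The row names and the values 3 and 10 are arbitrary choices made for illustration.

```r
library(cluster)

# Illustrative data: 25 points with row names to serve as labels
set.seed(1)
x <- rbind(cbind(rnorm(10, 0, 0.5), rnorm(10, 0, 0.5)),
           cbind(rnorm(15, 5, 0.5), rnorm(15, 5, 0.5)))
rownames(x) <- sprintf("obs%02d", seq_len(nrow(x)))
fit <- pam(x, 2)

# Labels are kept (25 observations is below the default nmax.lab)
# but truncated to 3 characters
plot(fit, which.plots = 2, max.strlen = 3)

# Lowering nmax.lab below the number of observations suppresses the labels
plot(fit, which.plots = 2, nmax.lab = 10)
```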
Rousseeuw, P.J. (1987) Silhouettes: A graphical aid to the interpretation and validation of cluster analysis. J. Comput. Appl. Math., 20, 53-65.
Further, the references in plot.agnes.
partition.object, clusplot.partition, clusplot.default, pam, pam.object, clara, clara.object, fanny, fanny.object, par.
## generate 25 objects, divided into 2 clusters.
x <- rbind(cbind(rnorm(10,0,0.5), rnorm(10,0,0.5)),
           cbind(rnorm(15,5,0.5), rnorm(15,5,0.5)))
plot(pam(x, 2))