Module "scipy.cluster.hierarchy"
Signature of the cophenet function
def cophenet(Z, Y=None)
Calculate the cophenetic distances between each observation in
the hierarchical clustering defined by the linkage ``Z``.
Suppose ``p`` and ``q`` are original observations in
disjoint clusters ``s`` and ``t``, respectively, and
``s`` and ``t`` are joined by a direct parent cluster
``u``. The cophenetic distance between observations
``p`` and ``q`` is simply the distance between
clusters ``s`` and ``t``.
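The definition above can be sketched by walking the linkage matrix row by row. This is only an illustration under stated assumptions: ``cophenetic_matrix`` is a hypothetical helper, not part of SciPy, and it is far slower than `cophenet`. It assumes the usual linkage convention in which row ``k`` of ``Z`` creates cluster ``n + k``.

```python
import numpy as np
from scipy.cluster.hierarchy import cophenet, single
from scipy.spatial.distance import pdist, squareform


def cophenetic_matrix(Z):
    """Square cophenetic distance matrix built directly from a linkage matrix.

    Hypothetical helper for illustration; not part of SciPy.
    """
    Z = np.asarray(Z, dtype=float)
    n = Z.shape[0] + 1                      # number of original observations
    members = {i: [i] for i in range(n)}    # cluster id -> observations it contains
    D = np.zeros((n, n))
    for k, (s, t, dist, _count) in enumerate(Z):
        left, right = members.pop(int(s)), members.pop(int(t))
        # Every pair split between s and t is first joined by this merge,
        # so its cophenetic distance is the merge height.
        for p in left:
            for q in right:
                D[p, q] = D[q, p] = dist
        members[n + k] = left + right       # the new parent cluster u
    return D


X = [[0, 0], [0, 1], [1, 0], [0, 4], [0, 3], [1, 4]]
Z = single(pdist(X))
manual = cophenetic_matrix(Z)
reference = squareform(cophenet(Z))
print(np.allclose(manual, reference))
```

The comparison against ``squareform(cophenet(Z))`` is there to check the sketch, not to suggest replacing the library routine.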
Parameters
Z : ndarray
The hierarchical clustering encoded as an array
(see `linkage` function).
Y : ndarray (optional)
Calculates the cophenetic correlation coefficient ``c`` of a
hierarchical clustering defined by the linkage matrix `Z`
of a set of :math:`n` observations in :math:`m`
dimensions. `Y` is the condensed distance matrix from which
`Z` was generated.
Returns
c : ndarray
The cophenetic correlation coefficient (returned only if ``Y`` is passed).
d : ndarray
The cophenetic distance matrix in condensed form. The
:math:`ij` th entry is the cophenetic distance between
original observations :math:`i` and :math:`j`.
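A short sketch of the two calling forms, assuming the same toy data as the examples on this page. With ``Y`` supplied, the call returns the pair ``(c, d)``. The comparison with ``numpy.corrcoef`` reflects the reading that ``c`` is the Pearson correlation between the original condensed distances and the cophenetic ones; treat it as a sanity check rather than a specification.

```python
import numpy as np
from scipy.cluster.hierarchy import cophenet, single
from scipy.spatial.distance import pdist

X = [[0, 0], [0, 1], [1, 0],
     [0, 4], [0, 3], [1, 4],
     [4, 0], [3, 0], [4, 1],
     [4, 4], [3, 4], [4, 3]]
Y = pdist(X)          # condensed distance matrix of the observations
Z = single(Y)         # single-linkage clustering built from Y

d = cophenet(Z)               # only the condensed cophenetic distances
c, d_again = cophenet(Z, Y)   # coefficient plus the same distances

# Assumed interpretation: c is the Pearson correlation between Y and d.
pearson = np.corrcoef(Y, d)[0, 1]
print(np.isclose(c, pearson))
```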
See Also
linkage : for a description of what a linkage matrix is.
scipy.spatial.distance.squareform : transforming condensed matrices into square ones.
Examples
>>> from scipy.cluster.hierarchy import single, cophenet
>>> from scipy.spatial.distance import pdist, squareform
Given a dataset ``X`` and a linkage matrix ``Z``, the cophenetic distance
between two points of ``X`` is the distance between the two largest
disjoint clusters that each contain one of the points:
>>> X = [[0, 0], [0, 1], [1, 0],
... [0, 4], [0, 3], [1, 4],
... [4, 0], [3, 0], [4, 1],
... [4, 4], [3, 4], [4, 3]]
``X`` corresponds to this dataset ::

    x x   x x
    x       x

    x       x
    x x   x x

>>> Z = single(pdist(X))
>>> Z
array([[ 0., 1., 1., 2.],
[ 2., 12., 1., 3.],
[ 3., 4., 1., 2.],
[ 5., 14., 1., 3.],
[ 6., 7., 1., 2.],
[ 8., 16., 1., 3.],
[ 9., 10., 1., 2.],
[11., 18., 1., 3.],
[13., 15., 2., 6.],
[17., 20., 2., 9.],
[19., 21., 2., 12.]])
>>> cophenet(Z)
array([1., 1., 2., 2., 2., 2., 2., 2., 2., 2., 2., 1., 2., 2., 2., 2., 2.,
2., 2., 2., 2., 2., 2., 2., 2., 2., 2., 2., 2., 2., 1., 1., 2., 2.,
2., 2., 2., 2., 1., 2., 2., 2., 2., 2., 2., 2., 2., 2., 2., 2., 2.,
1., 1., 2., 2., 2., 1., 2., 2., 2., 2., 2., 2., 1., 1., 1.])
The output of the `scipy.cluster.hierarchy.cophenet` function is
represented in condensed form. We can use
`scipy.spatial.distance.squareform` to see the output as a
regular matrix (where each element ``ij`` denotes the cophenetic distance
between each ``i``, ``j`` pair of points in ``X``):
>>> squareform(cophenet(Z))
array([[0., 1., 1., 2., 2., 2., 2., 2., 2., 2., 2., 2.],
[1., 0., 1., 2., 2., 2., 2., 2., 2., 2., 2., 2.],
[1., 1., 0., 2., 2., 2., 2., 2., 2., 2., 2., 2.],
[2., 2., 2., 0., 1., 1., 2., 2., 2., 2., 2., 2.],
[2., 2., 2., 1., 0., 1., 2., 2., 2., 2., 2., 2.],
[2., 2., 2., 1., 1., 0., 2., 2., 2., 2., 2., 2.],
[2., 2., 2., 2., 2., 2., 0., 1., 1., 2., 2., 2.],
[2., 2., 2., 2., 2., 2., 1., 0., 1., 2., 2., 2.],
[2., 2., 2., 2., 2., 2., 1., 1., 0., 2., 2., 2.],
[2., 2., 2., 2., 2., 2., 2., 2., 2., 0., 1., 1.],
[2., 2., 2., 2., 2., 2., 2., 2., 2., 1., 0., 1.],
[2., 2., 2., 2., 2., 2., 2., 2., 2., 1., 1., 0.]])
In this example, the cophenetic distance between points of ``X`` that are
very close (i.e., in the same corner) is 1. For every other pair of points
it is 2, because the two points lie in clusters at different corners, and
the distance between those clusters is larger.
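To read a single entry from the condensed output without expanding it with `squareform`, the position of a pair can be computed directly. The sketch below assumes the condensed layout used by `scipy.spatial.distance.pdist` (upper triangle, row by row); ``condensed_index`` is a hypothetical helper, not a SciPy function.

```python
import numpy as np
from scipy.cluster.hierarchy import cophenet, single
from scipy.spatial.distance import pdist, squareform


def condensed_index(i, j, n):
    """Position of the pair (i, j), i != j, in a condensed vector for n points.

    Hypothetical helper for illustration; not part of SciPy.
    """
    if i > j:
        i, j = j, i
    # Rows 0..i-1 of the upper triangle hold n*i - i*(i+1)/2 entries in total.
    return n * i - i * (i + 1) // 2 + (j - i - 1)


X = [[0, 0], [0, 1], [1, 0],
     [0, 4], [0, 3], [1, 4],
     [4, 0], [3, 0], [4, 1],
     [4, 4], [3, 4], [4, 3]]
n = len(X)
d = cophenet(single(pdist(X)))
S = squareform(d)

same_corner = d[condensed_index(0, 1, n)]     # points 0 and 1 share a corner
other_corner = d[condensed_index(0, 3, n)]    # points 0 and 3 do not
print(same_corner, other_corner)
```

For occasional lookups this avoids allocating the full square matrix; for many lookups, `squareform` is usually simpler.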