spearmanr function - scipy.stats module

Signature of the spearmanr function

def spearmanr(a, b=None, axis=0, nan_policy='propagate', alternative='two-sided')

Description

help(scipy.stats.spearmanr)

Calculate a Spearman correlation coefficient with associated p-value.

The Spearman rank-order correlation coefficient is a nonparametric measure
of the monotonicity of the relationship between two datasets.
Like other correlation coefficients,
this one varies between -1 and +1 with 0 implying no correlation.
Correlations of -1 or +1 imply an exact monotonic relationship. Positive
correlations imply that as x increases, so does y. Negative correlations
imply that as x increases, y decreases.
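
To make the "rank-order" part concrete, here is a minimal sketch showing that the coefficient matches the Pearson correlation computed on the ranks of the data. The sample values are the ones used in the Examples section; the variable names are illustrative only.

```python
import numpy as np
from scipy import stats

x = [1, 2, 3, 4, 5]
y = [5, 6, 7, 8, 7]

# Replace each observation by its rank; tied values share the average rank.
x_ranks = stats.rankdata(x)
y_ranks = stats.rankdata(y)

# Pearson correlation of the ranks.
rho_from_ranks = np.corrcoef(x_ranks, y_ranks)[0, 1]

# Spearman correlation of the raw data.
rho_spearman = stats.spearmanr(x, y).statistic

print(rho_from_ranks, rho_spearman)

# A strictly increasing but nonlinear relationship is still perfectly monotonic.
rho_monotonic = stats.spearmanr(x, np.exp(x)).statistic
print(rho_monotonic)
```

Because only the ranks matter, any strictly increasing transformation of either input leaves the coefficient unchanged.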

The p-value roughly indicates the probability of an uncorrelated system
producing datasets that have a Spearman correlation at least as extreme
as the one computed from these datasets. Although calculation of the
p-value does not make strong assumptions about the distributions underlying
the samples, it is only accurate for very large samples (>500
observations). For smaller sample sizes, consider a permutation test (see
Examples section below).
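
As a rough illustration of where the asymptotic p-value comes from, the sketch below reproduces it by hand. It assumes the two-sided p-value is derived from a Student's t approximation with n - 2 degrees of freedom; that assumption is not stated in the docstring above, so treat this as a sanity check rather than a description of the implementation.

```python
import numpy as np
from scipy import stats

x = [1, 2, 3, 4, 5]
y = [5, 6, 7, 8, 7]

res = stats.spearmanr(x, y)
r = res.statistic
dof = len(x) - 2  # assumed degrees of freedom: n - 2

# t statistic derived from the correlation coefficient (assumed approximation).
t = r * np.sqrt(dof / ((1.0 + r) * (1.0 - r)))

# Two-sided p-value from the survival function of Student's t distribution.
p_manual = 2 * stats.t.sf(abs(t), dof)

print(p_manual, res.pvalue)
```

With only five observations the approximation is crude, which is why a permutation test is the better choice for small samples.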

Parameters
----------
a, b : 1D or 2D array_like, b is optional
    One or two 1-D or 2-D arrays containing multiple variables and
    observations. When these are 1-D, each represents a vector of
    observations of a single variable. For the behavior in the 2-D case,
    see under ``axis``, below.
    Both arrays need to have the same length in the ``axis`` dimension.
axis : int or None, optional
    If axis=0 (default), then each column represents a variable, with
    observations in the rows. If axis=1, the relationship is transposed:
    each row represents a variable, while the columns contain observations.
    If axis=None, then both arrays will be raveled.
nan_policy : {'propagate', 'raise', 'omit'}, optional
    Defines how to handle input that contains nan.
    The following options are available (default is 'propagate'):

    * 'propagate': returns nan
    * 'raise': throws an error
    * 'omit': performs the calculations ignoring nan values

alternative : {'two-sided', 'less', 'greater'}, optional
    Defines the alternative hypothesis. Default is 'two-sided'.
    The following options are available:

    * 'two-sided': the correlation is nonzero
    * 'less': the correlation is negative (less than zero)
    * 'greater': the correlation is positive (greater than zero)

    .. versionadded:: 1.7.0
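
The ``nan_policy`` and ``alternative`` options described above can be tried out as follows. This is a small sketch on made-up data; the names ``keep``, ``raised``, ``a_clean`` and ``b_clean`` are illustrative and not part of SciPy.

```python
import numpy as np
from scipy import stats

x = np.array([1.0, 2.0, 3.0, 4.0, np.nan, 6.0])
y = np.array([2.0, 1.0, 4.0, 3.0, 6.0, 5.0])

# 'propagate' (default): a nan in the input gives nan results.
res_propagate = stats.spearmanr(x, y)

# 'omit': observations containing nan are ignored.
res_omit = stats.spearmanr(x, y, nan_policy='omit')

# Same computation after dropping the affected pairs by hand.
keep = ~(np.isnan(x) | np.isnan(y))
res_manual = stats.spearmanr(x[keep], y[keep])

# 'raise': a nan in the input raises an error.
try:
    stats.spearmanr(x, y, nan_policy='raise')
    raised = False
except ValueError:
    raised = True

print(res_propagate.statistic, float(res_omit.statistic), raised)

# One-sided alternatives on nan-free data (requires SciPy 1.7.0 or later).
a_clean = [1, 2, 3, 4, 5]
b_clean = [5, 6, 7, 8, 7]
p_two = stats.spearmanr(a_clean, b_clean).pvalue
p_greater = stats.spearmanr(a_clean, b_clean, alternative='greater').pvalue
p_less = stats.spearmanr(a_clean, b_clean, alternative='less').pvalue

print(p_two, p_greater, p_less)
```

Since the coefficient for ``a_clean`` and ``b_clean`` is positive, ``'greater'`` gives the smaller of the two one-sided p-values.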

Returns
-------
res : SignificanceResult
    An object containing attributes:

    statistic : float or ndarray (2-D square)
        Spearman correlation matrix or correlation coefficient (if only 2
        variables are given as parameters). Correlation matrix is square
        with length equal to total number of variables (columns or rows) in
        ``a`` and ``b`` combined.
    pvalue : float or ndarray (2-D square)
        The p-value for a hypothesis test whose null hypothesis
        is that two samples have no ordinal correlation. See
        `alternative` above for alternative hypotheses. `pvalue` has the
        same shape as `statistic`.

Raises
------
ValueError
    If `axis` is not 0, 1 or None, or if the number of dimensions of `a`
    is greater than 2, or if `b` is None and the number of dimensions of
    `a` is less than 2.
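
The three error conditions listed above can be checked with a short sketch. The helper ``raises_value_error`` is a hypothetical convenience function written for this example, not part of SciPy.

```python
import numpy as np
from scipy import stats


def raises_value_error(*args, **kwargs):
    """Return True if stats.spearmanr raises ValueError for these arguments."""
    try:
        stats.spearmanr(*args, **kwargs)
    except ValueError:
        return True
    return False


table = np.arange(8.0).reshape(4, 2)     # 4 observations of 2 variables
cube = np.arange(24.0).reshape(2, 3, 4)  # 3-D input, not supported

one_variable = raises_value_error([1, 2, 3, 4])  # 1-D `a` and no `b`
three_dims = raises_value_error(cube)            # `a` has more than 2 dimensions
bad_axis = raises_value_error(table, axis=2)     # `axis` is not 0, 1 or None
valid_call = raises_value_error(table)           # valid 2-D input, no error expected

print(one_variable, three_dims, bad_axis, valid_call)
```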

Warns
-----
`~scipy.stats.ConstantInputWarning`
    Emitted if an input is a constant array. The correlation coefficient
    is not defined in this case, so ``np.nan`` is returned.
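
A minimal sketch of how this warning can be observed and handled, using the standard ``warnings`` module; the variable names are illustrative.

```python
import warnings

import numpy as np
from scipy import stats

constant = [1, 1, 1, 1]  # every observation is identical
varying = [1, 2, 3, 4]

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")  # make sure the warning is not filtered out
    res = stats.spearmanr(constant, varying)

# Check whether a ConstantInputWarning was recorded.
warned = any(issubclass(w.category, stats.ConstantInputWarning) for w in caught)

print(warned, res.statistic, res.pvalue)
```

Checking for constant inputs before calling the function avoids both the warning and the ``nan`` result.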

See Also
--------
:ref:`hypothesis_spearmanr` : Extended example

Examples
--------

>>> import numpy as np
>>> from scipy import stats
>>> res = stats.spearmanr([1, 2, 3, 4, 5], [5, 6, 7, 8, 7])
>>> res.statistic
0.8207826816681233
>>> res.pvalue
0.08858700531354381

>>> rng = np.random.default_rng()
>>> x2n = rng.standard_normal((100, 2))
>>> y2n = rng.standard_normal((100, 2))
>>> res = stats.spearmanr(x2n)
>>> res.statistic, res.pvalue
(-0.07960396039603959, 0.4311168705769747)

>>> res = stats.spearmanr(x2n[:, 0], x2n[:, 1])
>>> res.statistic, res.pvalue
(-0.07960396039603959, 0.4311168705769747)

>>> res = stats.spearmanr(x2n, y2n)
>>> res.statistic
array([[ 1.        , -0.07960396, -0.08314431,  0.09662166],
       [-0.07960396,  1.        , -0.14448245,  0.16738074],
       [-0.08314431, -0.14448245,  1.        ,  0.03234323],
       [ 0.09662166,  0.16738074,  0.03234323,  1.        ]])
>>> res.pvalue
array([[0.        , 0.43111687, 0.41084066, 0.33891628],
       [0.43111687, 0.        , 0.15151618, 0.09600687],
       [0.41084066, 0.15151618, 0.        , 0.74938561],
       [0.33891628, 0.09600687, 0.74938561, 0.        ]])

>>> res = stats.spearmanr(x2n.T, y2n.T, axis=1)
>>> res.statistic
array([[ 1.        , -0.07960396, -0.08314431,  0.09662166],
       [-0.07960396,  1.        , -0.14448245,  0.16738074],
       [-0.08314431, -0.14448245,  1.        ,  0.03234323],
       [ 0.09662166,  0.16738074,  0.03234323,  1.        ]])

>>> res = stats.spearmanr(x2n, y2n, axis=None)
>>> res.statistic, res.pvalue
(0.044981624540613524, 0.5270803651336189)

>>> res = stats.spearmanr(x2n.ravel(), y2n.ravel())
>>> res.statistic, res.pvalue
(0.044981624540613524, 0.5270803651336189)

>>> rng = np.random.default_rng()
>>> xint = rng.integers(10, size=(100, 2))
>>> res = stats.spearmanr(xint)
>>> res.statistic, res.pvalue
(0.09800224850707953, 0.3320271757932076)

For small samples, consider performing a permutation test instead of
relying on the asymptotic p-value. Note that to calculate the null
distribution of the statistic (for all possible pairings between
observations in sample ``x`` and ``y``), only one of the two inputs needs
to be permuted.

>>> x = [1.76405235, 0.40015721, 0.97873798,
...      2.2408932, 1.86755799, -0.97727788]
>>> y = [2.71414076, 0.2488, 0.87551913,
...      2.6514917, 2.01160156, 0.47699563]

>>> def statistic(x): # permute only `x`
...     return stats.spearmanr(x, y).statistic
>>> res_exact = stats.permutation_test((x,), statistic,
...     permutation_type='pairings')
>>> res_asymptotic = stats.spearmanr(x, y)
>>> res_exact.pvalue, res_asymptotic.pvalue # asymptotic pvalue is too low
(0.10277777777777777, 0.07239650145772594)

For a more detailed example, see :ref:`hypothesis_spearmanr`.
