Vous êtes un professionnel et vous avez besoin d'une formation ?
Sensibilisation àl'Intelligence Artificielle
Voir le programme détaillé
Module « scipy.stats »
Signature de la fonction spearmanr
def spearmanr(a, b=None, axis=0, nan_policy='propagate', alternative='two-sided')
Description
help(scipy.stats.spearmanr)
Calculate a Spearman correlation coefficient with associated p-value.
The Spearman rank-order correlation coefficient is a nonparametric measure
of the monotonicity of the relationship between two datasets.
Like other correlation coefficients,
this one varies between -1 and +1 with 0 implying no correlation.
Correlations of -1 or +1 imply an exact monotonic relationship. Positive
correlations imply that as x increases, so does y. Negative correlations
imply that as x increases, y decreases.
The p-value roughly indicates the probability of an uncorrelated system
producing datasets that have a Spearman correlation at least as extreme
as the one computed from these datasets. Although calculation of the
p-value does not make strong assumptions about the distributions underlying
the samples, it is only accurate for very large samples (>500
observations). For smaller sample sizes, consider a permutation test (see
Examples section below).
Parameters
----------
a, b : 1D or 2D array_like, b is optional
One or two 1-D or 2-D arrays containing multiple variables and
observations. When these are 1-D, each represents a vector of
observations of a single variable. For the behavior in the 2-D case,
see under ``axis``, below.
Both arrays need to have the same length in the ``axis`` dimension.
axis : int or None, optional
If axis=0 (default), then each column represents a variable, with
observations in the rows. If axis=1, the relationship is transposed:
each row represents a variable, while the columns contain observations.
If axis=None, then both arrays will be raveled.
nan_policy : {'propagate', 'raise', 'omit'}, optional
Defines how to handle when input contains nan.
The following options are available (default is 'propagate'):
* 'propagate': returns nan
* 'raise': throws an error
* 'omit': performs the calculations ignoring nan values
alternative : {'two-sided', 'less', 'greater'}, optional
Defines the alternative hypothesis. Default is 'two-sided'.
The following options are available:
* 'two-sided': the correlation is nonzero
* 'less': the correlation is negative (less than zero)
* 'greater': the correlation is positive (greater than zero)
.. versionadded:: 1.7.0
Returns
-------
res : SignificanceResult
An object containing attributes:
statistic : float or ndarray (2-D square)
Spearman correlation matrix or correlation coefficient (if only 2
variables are given as parameters). Correlation matrix is square
with length equal to total number of variables (columns or rows) in
``a`` and ``b`` combined.
pvalue : float
The p-value for a hypothesis test whose null hypothesis
is that two samples have no ordinal correlation. See
`alternative` above for alternative hypotheses. `pvalue` has the
same shape as `statistic`.
Raises
------
ValueError
If `axis` is not 0, 1 or None, or if the number of dimensions of `a`
is greater than 2, or if `b` is None and the number of dimensions of
`a` is less than 2.
Warns
-----
`~scipy.stats.ConstantInputWarning`
Raised if an input is a constant array. The correlation coefficient
is not defined in this case, so ``np.nan`` is returned.
See Also
--------
:ref:`hypothesis_spearmanr` : Extended example
References
----------
.. [1] Zwillinger, D. and Kokoska, S. (2000). CRC Standard
Probability and Statistics Tables and Formulae. Chapman & Hall: New
York. 2000.
Section 14.7
.. [2] Kendall, M. G. and Stuart, A. (1973).
The Advanced Theory of Statistics, Volume 2: Inference and Relationship.
Griffin. 1973.
Section 31.18
Examples
--------
>>> import numpy as np
>>> from scipy import stats
>>> res = stats.spearmanr([1, 2, 3, 4, 5], [5, 6, 7, 8, 7])
>>> res.statistic
0.8207826816681233
>>> res.pvalue
0.08858700531354381
>>> rng = np.random.default_rng()
>>> x2n = rng.standard_normal((100, 2))
>>> y2n = rng.standard_normal((100, 2))
>>> res = stats.spearmanr(x2n)
>>> res.statistic, res.pvalue
(-0.07960396039603959, 0.4311168705769747)
>>> res = stats.spearmanr(x2n[:, 0], x2n[:, 1])
>>> res.statistic, res.pvalue
(-0.07960396039603959, 0.4311168705769747)
>>> res = stats.spearmanr(x2n, y2n)
>>> res.statistic
array([[ 1. , -0.07960396, -0.08314431, 0.09662166],
[-0.07960396, 1. , -0.14448245, 0.16738074],
[-0.08314431, -0.14448245, 1. , 0.03234323],
[ 0.09662166, 0.16738074, 0.03234323, 1. ]])
>>> res.pvalue
array([[0. , 0.43111687, 0.41084066, 0.33891628],
[0.43111687, 0. , 0.15151618, 0.09600687],
[0.41084066, 0.15151618, 0. , 0.74938561],
[0.33891628, 0.09600687, 0.74938561, 0. ]])
>>> res = stats.spearmanr(x2n.T, y2n.T, axis=1)
>>> res.statistic
array([[ 1. , -0.07960396, -0.08314431, 0.09662166],
[-0.07960396, 1. , -0.14448245, 0.16738074],
[-0.08314431, -0.14448245, 1. , 0.03234323],
[ 0.09662166, 0.16738074, 0.03234323, 1. ]])
>>> res = stats.spearmanr(x2n, y2n, axis=None)
>>> res.statistic, res.pvalue
(0.044981624540613524, 0.5270803651336189)
>>> res = stats.spearmanr(x2n.ravel(), y2n.ravel())
>>> res.statistic, res.pvalue
(0.044981624540613524, 0.5270803651336189)
>>> rng = np.random.default_rng()
>>> xint = rng.integers(10, size=(100, 2))
>>> res = stats.spearmanr(xint)
>>> res.statistic, res.pvalue
(0.09800224850707953, 0.3320271757932076)
For small samples, consider performing a permutation test instead of
relying on the asymptotic p-value. Note that to calculate the null
distribution of the statistic (for all possibly pairings between
observations in sample ``x`` and ``y``), only one of the two inputs needs
to be permuted.
>>> x = [1.76405235, 0.40015721, 0.97873798,
... 2.2408932, 1.86755799, -0.97727788]
>>> y = [2.71414076, 0.2488, 0.87551913,
... 2.6514917, 2.01160156, 0.47699563]
>>> def statistic(x): # permute only `x`
... return stats.spearmanr(x, y).statistic
>>> res_exact = stats.permutation_test((x,), statistic,
... permutation_type='pairings')
>>> res_asymptotic = stats.spearmanr(x, y)
>>> res_exact.pvalue, res_asymptotic.pvalue # asymptotic pvalue is too low
(0.10277777777777777, 0.07239650145772594)
For a more detailed example, see :ref:`hypothesis_spearmanr`.
Vous êtes un professionnel et vous avez besoin d'une formation ?
Machine Learning
avec Scikit-Learn
Voir le programme détaillé
Améliorations / Corrections
Vous avez des améliorations (ou des corrections) à proposer pour ce document : je vous remerçie par avance de m'en faire part, cela m'aide à améliorer le site.
Emplacement :
Description des améliorations :