Module "scipy.stats"
Signature of the function cramervonmises_2samp
def cramervonmises_2samp(x, y, method='auto')
Description
cramervonmises_2samp.__doc__
Perform the two-sample Cramér-von Mises test for goodness of fit.
This is the two-sample version of the Cramér-von Mises test ([1]_):
for two independent samples :math:`X_1, ..., X_n` and
:math:`Y_1, ..., Y_m`, the null hypothesis is that the samples
come from the same (unspecified) continuous distribution.
Parameters
----------
x : array_like
A 1-D array of observed values of the random variables :math:`X_i`.
y : array_like
A 1-D array of observed values of the random variables :math:`Y_i`.
method : {'auto', 'asymptotic', 'exact'}, optional
The method used to compute the p-value, see Notes for details.
The default is 'auto'.
Returns
-------
res : object with attributes
statistic : float
Cramér-von Mises statistic.
pvalue : float
The p-value.
See Also
--------
cramervonmises, anderson_ksamp, epps_singleton_2samp, ks_2samp
Notes
-----
.. versionadded:: 1.7.0
The statistic is computed according to equation 9 in [2]_. The
calculation of the p-value depends on the keyword `method`:
- ``asymptotic``: The p-value is approximated by using the limiting
distribution of the test statistic.
- ``exact``: The exact p-value is computed by enumerating all
possible combinations of the test statistic, see [2]_.
The exact calculation will be very slow even for moderate sample
sizes as the number of combinations increases rapidly with the
size of the samples. If ``method=='auto'``, the exact approach
is used if both samples contain fewer than 10 observations,
otherwise the asymptotic distribution is used.
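To make the cost of the exact approach concrete, here is a minimal brute-force sketch. It is not SciPy's implementation: it assumes no ties, uses the rank form of the statistic attributed to [2]_, and the helper names ``u_from_ranks`` and ``exact_pvalue`` are hypothetical. It enumerates every way of assigning the pooled ranks to the first sample and counts the assignments whose statistic is at least as large as the observed one.

```python
from itertools import combinations
from math import comb


def u_from_ranks(x_ranks, n_total):
    """Integer quantity U for the sorted pooled ranks held by the x sample.

    For fixed sample sizes the statistic T is an increasing function of U,
    so comparing U values is equivalent to comparing T values.
    """
    x_set = set(x_ranks)
    y_ranks = [r for r in range(1, n_total + 1) if r not in x_set]
    nx, ny = len(x_ranks), len(y_ranks)
    u = nx * sum((r - i) ** 2 for i, r in enumerate(x_ranks, start=1))
    u += ny * sum((s - j) ** 2 for j, s in enumerate(y_ranks, start=1))
    return u


def exact_pvalue(x, y):
    """Brute-force p-value: share of all splits with U >= observed U (no ties)."""
    pooled = sorted(list(x) + list(y))
    n_total, nx = len(pooled), len(x)
    observed_ranks = sorted(pooled.index(v) + 1 for v in x)
    u_obs = u_from_ranks(observed_ranks, n_total)
    # combinations() yields each candidate rank set as a sorted tuple
    hits = sum(
        1
        for ranks in combinations(range(1, n_total + 1), nx)
        if u_from_ranks(ranks, n_total) >= u_obs
    )
    return hits / comb(n_total, nx)


p = exact_pvalue([1, 2, 3], [4, 5, 6])
print(p)

# Number of splits to enumerate for two samples of 10 and of 20 observations
print(comb(20, 10), comb(40, 20))
```

The number of splits grows combinatorially with the sample sizes, which is why an asymptotic approximation becomes attractive beyond small samples.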
If the underlying distribution is not continuous, the p-value is likely to
be conservative (Section 6.2 in [3]_). When ranking the data to compute
the test statistic, midranks are used if there are ties.
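The role of midranks can be illustrated with a small pure-Python sketch. It assumes the rank form of the statistic attributed to [2]_, and the helpers ``midranks`` and ``cvm_2samp_statistic`` are hypothetical names introduced here, not part of SciPy. Tied observations all receive the average of the ranks they jointly occupy.

```python
from bisect import bisect_left, bisect_right


def midranks(values, pooled_sorted):
    """Midrank of each value within the sorted pooled sample.

    Tied values share the average of the ranks they occupy.
    """
    return [
        (bisect_left(pooled_sorted, v) + bisect_right(pooled_sorted, v) + 1) / 2
        for v in values
    ]


def cvm_2samp_statistic(x, y):
    """Two-sample Cramér-von Mises statistic T in rank form (sketch)."""
    xs, ys = sorted(x), sorted(y)
    nx, ny = len(xs), len(ys)
    pooled = sorted(xs + ys)
    rx = midranks(xs, pooled)
    ry = midranks(ys, pooled)
    # U = nx * sum_i (r_i - i)^2 + ny * sum_j (s_j - j)^2
    u = nx * sum((r - i) ** 2 for i, r in enumerate(rx, start=1))
    u += ny * sum((s - j) ** 2 for j, s in enumerate(ry, start=1))
    k, n = nx * ny, nx + ny
    # T = U / (nx * ny * (nx + ny)) - (4 * nx * ny - 1) / (6 * (nx + ny))
    return u / (k * n) - (4 * k - 1) / (6 * n)


x = [1, 2, 2]
y = [2, 3]
pooled_ranks = midranks(sorted(x + y), sorted(x + y))
t = cvm_2samp_statistic(x, y)
print(pooled_ranks, t)
```

In this toy input the three tied values of 2 occupy ranks 2, 3 and 4, so each of them is assigned the midrank 3.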
References
----------
.. [1] https://en.wikipedia.org/wiki/Cramer-von_Mises_criterion
.. [2] Anderson, T.W. (1962). On the distribution of the two-sample
Cramér-von Mises criterion. The Annals of Mathematical
Statistics, pp. 1148-1159.
.. [3] Conover, W.J., Practical Nonparametric Statistics, 1971.
Examples
--------
Suppose we wish to test whether two samples generated by
``scipy.stats.norm.rvs`` have the same distribution. We choose a
significance level of alpha=0.05.
>>> import numpy as np
>>> from scipy import stats
>>> rng = np.random.default_rng()
>>> x = stats.norm.rvs(size=100, random_state=rng)
>>> y = stats.norm.rvs(size=70, random_state=rng)
>>> res = stats.cramervonmises_2samp(x, y)
>>> res.statistic, res.pvalue
(0.29376470588235293, 0.1412873014573014)
The p-value exceeds our chosen significance level, so we do not
reject the null hypothesis that the observed samples are drawn from the
same distribution.
For small sample sizes, one can compute the exact p-values:
>>> x = stats.norm.rvs(size=7, random_state=rng)
>>> y = stats.t.rvs(df=2, size=6, random_state=rng)
>>> res = stats.cramervonmises_2samp(x, y, method='exact')
>>> res.statistic, res.pvalue
(0.197802197802198, 0.31643356643356646)
The p-value based on the asymptotic distribution is a good approximation
even though the sample size is small.
>>> res = stats.cramervonmises_2samp(x, y, method='asymptotic')
>>> res.statistic, res.pvalue
(0.197802197802198, 0.2966041181527128)
Regardless of the method, one would not reject the null hypothesis at the
chosen significance level in this example.
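For contrast, the sketch below shows a case where the null hypothesis would be rejected. The seed, the shift ``loc=2.0`` and the variable ``reject`` are illustrative assumptions, not part of the original examples. The second sample is drawn from a normal distribution shifted by two standard deviations, so the p-value should fall well below alpha=0.05.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(12345)  # fixed seed, chosen arbitrarily
x = stats.norm.rvs(size=100, random_state=rng)
# Same family, but shifted by two standard deviations (illustrative choice)
y = stats.norm.rvs(loc=2.0, size=70, random_state=rng)

res = stats.cramervonmises_2samp(x, y)  # method='auto' by default
alpha = 0.05
reject = res.pvalue < alpha
print(res.statistic, res.pvalue, reject)
```

With samples this large the default ``method='auto'`` should fall back to the asymptotic distribution, as described in the Notes.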