Participer au site avec un Tip
Rechercher
 

Améliorations / Corrections

Vous avez des améliorations (ou des corrections) à proposer pour ce document : je vous remerçie par avance de m'en faire part, cela m'aide à améliorer le site.

Emplacement :

Description des améliorations :

Module « scipy.stats »

Fonction cramervonmises_2samp - module scipy.stats

Signature de la fonction cramervonmises_2samp

def cramervonmises_2samp(x, y, method='auto') 

Description

cramervonmises_2samp.__doc__

Perform the two-sample Cramér-von Mises test for goodness of fit.

    This is the two-sample version of the Cramér-von Mises test ([1]_):
    for two independent samples :math:`X_1, ..., X_n` and
    :math:`Y_1, ..., Y_m`, the null hypothesis is that the samples
    come from the same (unspecified) continuous distribution.

    Parameters
    ----------
    x : array_like
        A 1-D array of observed values of the random variables :math:`X_i`.
    y : array_like
        A 1-D array of observed values of the random variables :math:`Y_i`.
    method : {'auto', 'asymptotic', 'exact'}, optional
        The method used to compute the p-value, see Notes for details.
        The default is 'auto'.

    Returns
    -------
    res : object with attributes
        statistic : float
            Cramér-von Mises statistic.
        pvalue : float
            The p-value.

    See Also
    --------
    cramervonmises, anderson_ksamp, epps_singleton_2samp, ks_2samp

    Notes
    -----
    .. versionadded:: 1.7.0

    The statistic is computed according to equation 9 in [2]_. The
    calculation of the p-value depends on the keyword `method`:

    - ``asymptotic``: The p-value is approximated by using the limiting
      distribution of the test statistic.
    - ``exact``: The exact p-value is computed by enumerating all
      possible combinations of the test statistic, see [2]_.

    The exact calculation will be very slow even for moderate sample
    sizes as the number of combinations increases rapidly with the
    size of the samples. If ``method=='auto'``, the exact approach
    is used if both samples contain less than 10 observations,
    otherwise the asymptotic distribution is used.

    If the underlying distribution is not continuous, the p-value is likely to
    be conservative (Section 6.2 in [3]_). When ranking the data to compute
    the test statistic, midranks are used if there are ties.

    References
    ----------
    .. [1] https://en.wikipedia.org/wiki/Cramer-von_Mises_criterion
    .. [2] Anderson, T.W. (1962). On the distribution of the two-sample
           Cramer-von-Mises criterion. The Annals of Mathematical
           Statistics, pp. 1148-1159.
    .. [3] Conover, W.J., Practical Nonparametric Statistics, 1971.

    Examples
    --------

    Suppose we wish to test whether two samples generated by
    ``scipy.stats.norm.rvs`` have the same distribution. We choose a
    significance level of alpha=0.05.

    >>> from scipy import stats
    >>> rng = np.random.default_rng()
    >>> x = stats.norm.rvs(size=100, random_state=rng)
    >>> y = stats.norm.rvs(size=70, random_state=rng)
    >>> res = stats.cramervonmises_2samp(x, y)
    >>> res.statistic, res.pvalue
    (0.29376470588235293, 0.1412873014573014)

    The p-value exceeds our chosen significance level, so we do not
    reject the null hypothesis that the observed samples are drawn from the
    same distribution.

    For small sample sizes, one can compute the exact p-values:

    >>> x = stats.norm.rvs(size=7, random_state=rng)
    >>> y = stats.t.rvs(df=2, size=6, random_state=rng)
    >>> res = stats.cramervonmises_2samp(x, y, method='exact')
    >>> res.statistic, res.pvalue
    (0.197802197802198, 0.31643356643356646)

    The p-value based on the asymptotic distribution is a good approximation
    even though the sample size is small.

    >>> res = stats.cramervonmises_2samp(x, y, method='asymptotic')
    >>> res.statistic, res.pvalue
    (0.197802197802198, 0.2966041181527128)

    Independent of the method, one would not reject the null hypothesis at the
    chosen significance level in this example.