Module "scipy.stats"

Function ks_1samp - module scipy.stats

Signature of the ks_1samp function

def ks_1samp(x, cdf, args=(), alternative='two-sided', mode='auto') 

Description

ks_1samp.__doc__

    Performs the one-sample Kolmogorov-Smirnov test for goodness of fit.

    This test compares the underlying distribution F(x) of a sample
    against a given continuous distribution G(x). See Notes for a description
    of the available null and alternative hypotheses.

    Parameters
    ----------
    x : array_like
        A 1-D array of observations of iid random variables.
    cdf : callable
        Callable that evaluates the CDF of the hypothesized distribution.
    args : tuple, sequence, optional
        Distribution parameters, used with `cdf`.
    alternative : {'two-sided', 'less', 'greater'}, optional
        Defines the null and alternative hypotheses. Default is 'two-sided'.
        Please see explanations in the Notes below.
    mode : {'auto', 'exact', 'approx', 'asymp'}, optional
        Defines the distribution used for calculating the p-value.
        The following options are available (default is 'auto'):

          * 'auto' : selects one of the other options.
          * 'exact' : uses the exact distribution of the test statistic.
          * 'approx' : approximates the two-sided probability with twice
            the one-sided probability.
          * 'asymp' : uses the asymptotic distribution of the test statistic.
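
As a minimal sketch of how `cdf` and `args` work together, the example below tests a sample against a normal distribution with a non-default location and scale. It assumes that the values in `args` are forwarded to `cdf` after the data, as in `cdf(x, *args)`. The seed, sample size, and parameter values are illustrative choices, not part of the original documentation.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(12345)  # fixed seed, chosen only for reproducibility
x = stats.norm.rvs(loc=5.0, scale=2.0, size=200, random_state=rng)

# Pass the distribution parameters (loc, scale) through `args`.
res_args = stats.ks_1samp(x, stats.norm.cdf, args=(5.0, 2.0))

# Equivalent formulation: bind the parameters in a frozen distribution
# and hand its `cdf` method to the test.
res_frozen = stats.ks_1samp(x, stats.norm(loc=5.0, scale=2.0).cdf)

print(res_args.statistic, res_args.pvalue)
print(res_frozen.statistic, res_frozen.pvalue)
```

Both calls should report the same statistic and p-value, since they evaluate the same hypothesized CDF on the same data.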

    Returns
    -------
    statistic : float
        KS test statistic, either D, D+ or D- (depending on the value
        of 'alternative')
    pvalue : float
        One-tailed or two-tailed p-value.
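
The returned values can be read either by attribute or by tuple unpacking. The short sketch below shows both, together with a comparison against a significance level. The seed and the level `alpha = 0.05` are arbitrary illustrative assumptions.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)  # fixed seed, for reproducibility only
x = stats.norm.rvs(size=50, random_state=rng)

result = stats.ks_1samp(x, stats.norm.cdf)

# The result behaves like a 2-tuple and also exposes named fields.
statistic, pvalue = result
same_fields = (statistic == result.statistic) and (pvalue == result.pvalue)

alpha = 0.05                      # illustrative significance level
reject_null = pvalue < alpha      # True means "reject the null hypothesis"

print(same_fields, statistic, pvalue, reject_null)
```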

    See Also
    --------
    ks_2samp, kstest

    Notes
    -----
    There are three options for the null and corresponding alternative
    hypothesis that can be selected using the `alternative` parameter.

    - `two-sided`: The null hypothesis is that the two distributions are
      identical, F(x)=G(x) for all x; the alternative is that they are not
      identical.

    - `less`: The null hypothesis is that F(x) >= G(x) for all x; the
      alternative is that F(x) < G(x) for at least one x.

    - `greater`: The null hypothesis is that F(x) <= G(x) for all x; the
      alternative is that F(x) > G(x) for at least one x.

    Note that the alternative hypotheses describe the *CDFs* of the
    underlying distributions, not the observed values. For example,
    suppose x1 ~ F and x2 ~ G. If F(x) > G(x) for all x, the values in
    x1 tend to be less than those in x2.
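
To make the link between the alternatives and the statistics D, D+ and D- concrete, here is a sketch that recomputes them by hand from the empirical CDF and compares them with the values returned by `ks_1samp`. It assumes the conventional definitions (D+ is the largest amount by which the empirical CDF F exceeds G, D- the largest amount by which G exceeds F, and D the larger of the two). The variable names `d_plus`, `d_minus` and `d`, and the seed, are illustrative.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2024)  # fixed seed, for reproducibility only
# Sample shifted to larger values, sorted so that x[i-1] is the i-th order statistic.
x = np.sort(stats.norm.rvs(loc=0.2, size=100, random_state=rng))
n = len(x)
g_vals = stats.norm.cdf(x)  # hypothesized CDF G at the ordered observations

# The empirical CDF F jumps from (i-1)/n to i/n at the i-th ordered observation.
d_plus = np.max(np.arange(1, n + 1) / n - g_vals)   # largest F - G
d_minus = np.max(g_vals - np.arange(0, n) / n)      # largest G - F
d = max(d_plus, d_minus)

res_greater = stats.ks_1samp(x, stats.norm.cdf, alternative='greater')
res_less = stats.ks_1samp(x, stats.norm.cdf, alternative='less')
res_two_sided = stats.ks_1samp(x, stats.norm.cdf)

print(d_plus, res_greater.statistic)
print(d_minus, res_less.statistic)
print(d, res_two_sided.statistic)
```

Because the sample is shifted to larger values, its empirical CDF tends to lie below `norm.cdf`, so D- is typically the larger of the two one-sided statistics here.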

    Examples
    --------
    >>> import numpy as np
    >>> from scipy import stats
    >>> rng = np.random.default_rng()

    >>> x = np.linspace(-15, 15, 9)
    >>> stats.ks_1samp(x, stats.norm.cdf)
    KstestResult(statistic=0.44435602715924361, pvalue=0.038850142705171065)

    >>> stats.ks_1samp(stats.norm.rvs(size=100, random_state=rng),
    ...                stats.norm.cdf)
    KstestResult(statistic=0.165471391799..., pvalue=0.007331283245...)

    *Test against one-sided alternative hypothesis*

    Shift the distribution to larger values, so that ``CDF(x) < norm.cdf(x)``:

    >>> x = stats.norm.rvs(loc=0.2, size=100, random_state=rng)
    >>> stats.ks_1samp(x, stats.norm.cdf, alternative='less')
    KstestResult(statistic=0.100203351482..., pvalue=0.125544644447...)

    The p-value of about 0.126 is above the 10% level, so the null hypothesis
    is not rejected in favor of the alternative hypothesis: less

    >>> stats.ks_1samp(x, stats.norm.cdf, alternative='greater')
    KstestResult(statistic=0.018749806388..., pvalue=0.920581859791...)

    The p-value of about 0.921 is large, so the null hypothesis is not
    rejected in favor of the alternative hypothesis: greater

    >>> stats.ks_1samp(x, stats.norm.cdf)
    KstestResult(statistic=0.100203351482..., pvalue=0.250616879765...)

    The p-value of about 0.251 does not allow rejecting the null hypothesis
    in favor of the alternative hypothesis: two-sided

    *Testing t distributed random variables against normal distribution*

    With 100 degrees of freedom the t distribution looks close to the normal
    distribution, and the K-S test does not reject the hypothesis that the
    sample came from the normal distribution:

    >>> stats.ks_1samp(stats.t.rvs(100, size=100, random_state=rng),
    ...                stats.norm.cdf)
    KstestResult(statistic=0.064273776544..., pvalue=0.778737758305...)

    With 3 degrees of freedom the t distribution differs enough from the
    normal distribution that we can reject the hypothesis that the sample
    came from the normal distribution at the 10% level:

    >>> stats.ks_1samp(stats.t.rvs(3, size=100, random_state=rng),
    ...                stats.norm.cdf)
    KstestResult(statistic=0.128678487493..., pvalue=0.066569081515...)