Module "scipy.special"

Function smirnov - module scipy.special

Signature of the smirnov function

Description

smirnov.__doc__

smirnov(x1, x2, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])

smirnov(n, d)

Kolmogorov-Smirnov complementary cumulative distribution function

Returns the exact Kolmogorov-Smirnov complementary cumulative
distribution function (also known as the survival function) of Dn+ (or Dn-)
for a one-sided test of equality between an empirical and a
theoretical distribution. It is equal to the probability that the
maximum difference between a theoretical distribution and an empirical
one based on `n` samples is greater than `d`.

Parameters
----------
n : int
  Number of samples
d : float array_like
  Deviation between the Empirical CDF (ECDF) and the target CDF.

Returns
-------
scalar or ndarray
    The value(s) of smirnov(n, d), Prob(Dn+ >= d) (also Prob(Dn- >= d))

Notes
-----
`smirnov` is used by `stats.kstest` in the application of the
Kolmogorov-Smirnov Goodness of Fit test. For historical reasons this
function is exposed in `scipy.special`, but the recommended way to achieve
the most accurate CDF/SF/PDF/PPF/ISF computations is to use the
`stats.ksone` distribution.
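
As a minimal sketch of that recommendation (assuming SciPy and NumPy are installed; the values of `n` and `d` below are arbitrary illustrations, not taken from the original text), the survival function and inverse survival function of `scipy.stats.ksone` can be compared with `smirnov` and `smirnovi`:

```python
import numpy as np
from scipy.special import smirnov, smirnovi
from scipy.stats import ksone

n = 5                                # illustrative sample size
d = np.array([0.1, 0.25, 0.5])       # illustrative one-sided deviations

sf_special = smirnov(n, d)           # Prob(Dn+ >= d) from scipy.special
sf_stats = ksone.sf(d, n)            # same quantity via the distribution object
cdf_stats = ksone.cdf(d, n)          # complementary probability

# The inverse survival function should map the probabilities back to d.
d_from_special = smirnovi(n, sf_special)
d_from_stats = ksone.isf(sf_stats, n)

print(np.allclose(sf_special, sf_stats))
print(np.allclose(sf_stats + cdf_stats, 1.0))
print(np.allclose(d_from_special, d), np.allclose(d_from_stats, d))
```

The distribution object takes the deviation first and the sample size as its shape parameter, the reverse of the `smirnov(n, d)` argument order.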

See Also
--------
smirnovi : The Inverse Survival Function for the distribution
scipy.stats.ksone : Provides the functionality as a continuous distribution
kolmogorov, kolmogi : Functions for the two-sided distribution

Examples
--------
>>> from scipy.special import smirnov

Show the probability of a gap at least as big as 0, 0.5 and 1.0 for a sample of size 5.

>>> smirnov(5, [0, 0.5, 1.0])
array([ 1.   ,  0.056,  0.   ])
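
The middle value can be checked by hand. The following is a minimal sketch, assuming the classical closed-form sum for the one-sided statistic (usually attributed to Smirnov and to Birnbaum and Tingey): Prob(Dn+ >= d) = d * sum over j from 0 to floor(n(1-d)) of C(n, j) (d + j/n)^(j-1) (1 - d - j/n)^(n-j). The helper name `smirnov_closed_form` is hypothetical, and the code is not claimed to be SciPy's implementation; it is only suitable for small `n`.

```python
from math import comb, floor

def smirnov_closed_form(n, d):
    """Hypothetical helper: Prob(Dn+ >= d) from the closed-form sum."""
    if d <= 0.0:
        return 1.0   # a gap of at least 0 is certain
    if d >= 1.0:
        return 0.0   # a gap of 1 or more has probability zero
    total = 0.0
    for j in range(floor(n * (1.0 - d)) + 1):
        rest = 1.0 - d - j / n
        if rest < 0.0:
            continue  # guard against rounding at the upper summation limit
        total += comb(n, j) * (d + j / n) ** (j - 1) * rest ** (n - j)
    return d * total

values = [smirnov_closed_form(5, d) for d in (0.0, 0.5, 1.0)]
print([round(v, 3) for v in values])
```

For n = 5 and d = 0.5 the three terms of the sum are 0.0625, 0.0405 and 0.009, so the result is 0.5 * 0.112 = 0.056, in agreement with the array above.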

Compare a sample of size 5 drawn from a source N(0.5, 1) distribution against
a target N(0, 1) CDF.

>>> import numpy as np
>>> from scipy.stats import norm
>>> rng = np.random.default_rng()
>>> n = 5
>>> gendist = norm(0.5, 1)       # Normal distribution, mean 0.5, stddev 1
>>> x = np.sort(gendist.rvs(size=n, random_state=rng))
>>> x
array([-1.3922078 , -0.13526532,  0.1371477 ,  0.18981686,  1.81948167])
>>> target = norm(0, 1)
>>> cdfs = target.cdf(x)
>>> cdfs
array([0.08192974, 0.44620105, 0.55454297, 0.57527368, 0.96558101])
>>> # Construct the Empirical CDF and the one-sided K-S statistics (Dn-, Dn+)
>>> ecdfs = np.arange(n+1, dtype=float)/n
>>> cols = np.column_stack([x, ecdfs[1:], cdfs, cdfs - ecdfs[:n], ecdfs[1:] - cdfs])
>>> np.set_printoptions(precision=3)
>>> cols
array([[-1.392,  0.2  ,  0.082,  0.082,  0.118],
       [-0.135,  0.4  ,  0.446,  0.246, -0.046],
       [ 0.137,  0.6  ,  0.555,  0.155,  0.045],
       [ 0.19 ,  0.8  ,  0.575, -0.025,  0.225],
       [ 1.819,  1.   ,  0.966,  0.166,  0.034]])
>>> gaps = cols[:, -2:]
>>> Dnpm = np.max(gaps, axis=0)
>>> print('Dn-=%f, Dn+=%f' % (Dnpm[0], Dnpm[1]))
Dn-=0.246201, Dn+=0.224726
>>> probs = smirnov(n, Dnpm)
>>> print(chr(10).join(['For a sample of size %d drawn from a N(0, 1) distribution:' % n,
...      ' Smirnov n=%d: Prob(Dn- >= %f) = %.4f' % (n, Dnpm[0], probs[0]),
...      ' Smirnov n=%d: Prob(Dn+ >= %f) = %.4f' % (n, Dnpm[1], probs[1])]))
For a sample of size 5 drawn from a N(0, 1) distribution:
 Smirnov n=5: Prob(Dn- >= 0.246201) = 0.4713
 Smirnov n=5: Prob(Dn+ >= 0.224726) = 0.5243
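
The same statistics and probabilities can be cross-checked with `scipy.stats.kstest`, which the Notes name as a user of `smirnov`. This is a minimal sketch, assuming a SciPy version whose `kstest` accepts `alternative='greater'` and `alternative='less'`; the sample is copied (rounded) from the output above rather than regenerated.

```python
import numpy as np
from scipy.special import smirnov
from scipy.stats import kstest, norm

# Sorted sample copied from the example output above.
x = np.array([-1.3922078, -0.13526532, 0.1371477, 0.18981686, 1.81948167])
n = len(x)
target = norm(0, 1)

# alternative='greater' uses Dn+ as the statistic, alternative='less' uses Dn-.
res_plus = kstest(x, target.cdf, alternative='greater')
res_minus = kstest(x, target.cdf, alternative='less')

print('Dn+=%f, p=%.4f' % (res_plus.statistic, res_plus.pvalue))
print('Dn-=%f, p=%.4f' % (res_minus.statistic, res_minus.pvalue))

# The one-sided p-values should agree with smirnov evaluated at the statistics.
print(np.isclose(res_plus.pvalue, smirnov(n, res_plus.statistic)))
print(np.isclose(res_minus.pvalue, smirnov(n, res_minus.statistic)))
```

Neither one-sided probability is small, so with only 5 samples the shift of 0.5 in the source mean is not detected at conventional significance levels.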

Plot the Empirical CDF against the target N(0, 1) CDF

>>> import matplotlib.pyplot as plt
>>> plt.step(np.concatenate([[-3], x]), ecdfs, where='post', label='Empirical CDF')
>>> x3 = np.linspace(-3, 3, 100)
>>> plt.plot(x3, target.cdf(x3), label='CDF for N(0, 1)')
>>> plt.ylim([0, 1]); plt.grid(True); plt.legend();
>>> # Add vertical lines marking Dn+ and Dn-
>>> iminus, iplus = np.argmax(gaps, axis=0)
>>> plt.vlines([x[iminus]], ecdfs[iminus], cdfs[iminus], color='r', linestyle='dashed', lw=4)
>>> plt.vlines([x[iplus]], cdfs[iplus], ecdfs[iplus+1], color='m', linestyle='dashed', lw=4)
>>> plt.show()