The kde function - statistics module

Signature of the kde function

def kde(data, h, kernel='normal', *, cumulative=False)

Description

help(statistics.kde)

Kernel Density Estimation:  Create a continuous probability density
function or cumulative distribution function from discrete samples.

The basic idea is to smooth the data using a kernel function
to help draw inferences about a population from a sample.

The degree of smoothing is controlled by the scaling parameter h
which is called the bandwidth.  Smaller values emphasize local
features while larger values give smoother results.
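The effect of the bandwidth can be made concrete with a hand-written estimator. The sketch below is an illustration, not the code used by the statistics module: it assumes the standard normal kernel and the textbook formula f_hat(x) = 1/(n*h) * sum(K((x - x_i) / h)). The names `normal_kernel` and `simple_kde` are hypothetical.

```python
from math import exp, pi, sqrt

def normal_kernel(t):
    """Standard normal density, used here as the kernel K."""
    return exp(-t * t / 2) / sqrt(2 * pi)

def simple_kde(data, h):
    """Return f_hat(x) = 1/(n*h) * sum(K((x - x_i) / h)) for a normal kernel."""
    n = len(data)
    def f_hat(x):
        return sum(normal_kernel((x - x_i) / h) for x_i in data) / (n * h)
    return f_hat

sample = [-2.1, -1.3, -0.4, 1.9, 5.1, 6.2]
narrow = simple_kde(sample, h=0.3)   # small h: a sharp bump at each sample
wide = simple_kde(sample, h=1.5)     # larger h: a smoother curve

# At the isolated sample 1.9 the narrow estimate peaks higher;
# in the gap around 3.5 it drops close to zero while the wide one does not.
print(narrow(1.9) > wide(1.9))   # True
print(narrow(3.5) < wide(3.5))   # True
```

With h=0.3 each sample produces its own spike, whereas h=1.5 merges neighbouring samples into broader humps.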

The kernel determines the relative weights of the sample data
points.  Generally, the choice of kernel shape does not matter
as much as the more influential bandwidth smoothing parameter.

Kernels that give some weight to every sample point:

   normal (gauss)
   logistic
   sigmoid

Kernels that only give weight to sample points within
the bandwidth:

   rectangular (uniform)
   triangular
   parabolic (epanechnikov)
   quartic (biweight)
   triweight
   cosine
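
The difference between the two families shows up at points that lie farther than h from every sample. This is a minimal sketch, assuming the usual textbook definitions of the rectangular kernel (K(t) = 1/2 for |t| <= 1) and the normal kernel. The helper names `normal`, `rectangular` and `estimate` are hypothetical and not part of the statistics module.

```python
from math import exp, pi, sqrt

def normal(t):
    # Positive for every t, so every sample contributes something.
    return exp(-t * t / 2) / sqrt(2 * pi)

def rectangular(t):
    # Zero outside [-1, 1], so only samples within one bandwidth count.
    return 0.5 if abs(t) <= 1 else 0.0

def estimate(data, h, kernel, x):
    """Kernel density estimate at x: 1/(n*h) * sum(K((x - x_i) / h))."""
    return sum(kernel((x - x_i) / h) for x_i in data) / (len(data) * h)

sample = [-2.1, -1.3, -0.4, 1.9, 5.1, 6.2]

# x = 3.5 lies 1.6 away from its nearest samples (1.9 and 5.1), beyond h = 1.5.
print(estimate(sample, 1.5, rectangular, 3.5))   # 0.0
print(estimate(sample, 1.5, normal, 3.5) > 0)    # True
```

A bounded kernel therefore reports exactly zero density in gaps wider than twice the bandwidth, while an unbounded kernel never does.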

If *cumulative* is true, a cumulative distribution function is returned
instead of a probability density function.

A StatisticsError will be raised if the data sequence is empty.
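
A caller that may receive an empty sequence can catch the exception. This sketch assumes Python 3.13 or later, where statistics.kde is available; `density_or_none` is a hypothetical helper name.

```python
import statistics

def density_or_none(data, h):
    """Return the estimated density function, or None if kde() rejects the input."""
    try:
        return statistics.kde(data, h)
    except statistics.StatisticsError:
        # kde() raises StatisticsError for an empty data sequence.
        return None
```

Calling `density_or_none([], h=1.5)` then yields None instead of propagating the exception.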

Example
-------

Given a sample of six data points, construct a continuous
function that estimates the underlying probability density:

    >>> sample = [-2.1, -1.3, -0.4, 1.9, 5.1, 6.2]
    >>> f_hat = kde(sample, h=1.5)

Approximate the area under the curve with a unit-step sum over
the integers from -20 to 19:

    >>> area = sum(f_hat(x) for x in range(-20, 20))
    >>> round(area, 4)
    1.0
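
A step of 1 is coarse, although the doctest shows it is accurate enough here. A finer check can apply the trapezoidal rule to any density callable. In this sketch, `trapezoid_area` is a hypothetical helper, and the density is rebuilt by hand as an average of normal densities centered on the samples, an assumption that stands in for f_hat so the block is self-contained.

```python
from statistics import NormalDist, fmean

def trapezoid_area(f, lo, hi, steps):
    """Approximate the integral of f over [lo, hi] with the trapezoidal rule."""
    width = (hi - lo) / steps
    inner = sum(f(lo + i * width) for i in range(1, steps))
    return width * ((f(lo) + f(hi)) / 2 + inner)

sample = [-2.1, -1.3, -0.4, 1.9, 5.1, 6.2]
h = 1.5
kernels = [NormalDist(mu=x_i, sigma=h) for x_i in sample]

def density(x):
    # Average of normal densities, one per sample, each with sigma = h.
    return fmean(k.pdf(x) for k in kernels)

area = trapezoid_area(density, -20, 20, steps=400)
print(round(area, 4))   # 1.0
```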

Plot the estimated probability density function at
evenly spaced points from -6 to 10:

    >>> for x in range(-6, 11):
    ...     density = f_hat(x)
    ...     plot = ' ' * int(density * 400) + 'x'
    ...     print(f'{x:2}: {density:.3f} {plot}')
    ...
    -6: 0.002 x
    -5: 0.009    x
    -4: 0.031             x
    -3: 0.070                             x
    -2: 0.111                                             x
    -1: 0.125                                                   x
     0: 0.110                                            x
     1: 0.086                                   x
     2: 0.068                            x
     3: 0.059                        x
     4: 0.066                           x
     5: 0.082                                 x
     6: 0.082                                 x
     7: 0.058                        x
     8: 0.028            x
     9: 0.009    x
    10: 0.002 x

Estimate P(4.5 < X <= 7.5), the probability that a new sample value
will be between 4.5 and 7.5:

    >>> cdf = kde(sample, h=1.5, cumulative=True)
    >>> round(cdf(7.5) - cdf(4.5), 2)
    0.22
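
For the normal kernel, the cumulative estimate can be pictured as the average of normal CDFs centered on the samples. The sketch below rebuilds it by hand under that assumption; `simple_cdf` is a hypothetical name, and this is an illustration rather than the module's implementation.

```python
from statistics import NormalDist, fmean

def simple_cdf(data, h):
    """Return F_hat(x), the mean of normal CDFs with mu = x_i and sigma = h."""
    kernels = [NormalDist(mu=x_i, sigma=h) for x_i in data]
    def cdf(x):
        return fmean(k.cdf(x) for k in kernels)
    return cdf

sample = [-2.1, -1.3, -0.4, 1.9, 5.1, 6.2]
cdf = simple_cdf(sample, h=1.5)

# P(4.5 < X <= 7.5) is the difference of the CDF at the two endpoints.
print(round(cdf(7.5) - cdf(4.5), 2))   # 0.22
```

Most of that probability comes from the two samples at 5.1 and 6.2, whose kernels are centered inside the interval.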

References
----------

Kernel density estimation and its application:
https://www.itm-conferences.org/articles/itmconf/pdf/2018/08/itmconf_sam2018_00037.pdf

Kernel functions in common use:
https://en.wikipedia.org/wiki/Kernel_(statistics)#kernel_functions_in_common_use

Interactive graphical demonstration and exploration:
https://demonstrations.wolfram.com/KernelDensityEstimation/

Kernel estimation of cumulative distribution function of a random variable with bounded support:
https://www.econstor.eu/bitstream/10419/207829/1/10.21307_stattrans-2016-037.pdf
