Module "scipy.stats"

Class "NumericalInverseHermite"

General information

Inheritance

builtins.object
    NumericalInverseHermite

Definition

class NumericalInverseHermite(builtins.object):

Description [excerpt from NumericalInverseHermite.__doc__]

    A Hermite spline fast numerical inverse of a probability distribution.

    The initializer of `NumericalInverseHermite` accepts `dist`, an object
    representing a continuous distribution, and provides an object with methods
    that approximate `dist.ppf` and `dist.rvs`. For most distributions,
    these methods are faster than those of `dist` itself.

    Parameters
    ----------
    dist : object
        Object representing the distribution for which a fast numerical inverse
        is desired; for instance, a frozen instance of a `scipy.stats`
        continuous distribution. See Notes and Examples for details.
    tol : float, optional
        u-error tolerance (see Notes). The default is 1e-12.
    max_intervals : int, optional
        Maximum number of intervals in the cubic Hermite spline used to
        approximate the percent point function. The default is 100000.

    Attributes
    ----------
    intervals : int
        The number of intervals of the interpolant.
    midpoint_error : float
        The maximum u-error at an interpolant interval midpoint.

    Notes
    -----
    `NumericalInverseHermite` approximates the inverse of a continuous
    statistical distribution's CDF with a cubic Hermite spline.

    As described in [1]_, it begins by evaluating the distribution's PDF and
    CDF at a mesh of quantiles ``x`` within the distribution's support.
    It uses the results to fit a cubic Hermite spline ``H`` such that
    ``H(p) == x``, where ``p`` is the array of percentiles corresponding
    to the quantiles ``x``. Therefore, the spline approximates the inverse
    of the distribution's CDF to machine precision at the percentiles ``p``,
    but typically, the spline will not be as accurate at the midpoints between
    the percentile points::

        p_mid = (p[:-1] + p[1:])/2

    so the mesh of quantiles is refined as needed to reduce the maximum
    "u-error"::

        u_error = np.max(np.abs(dist.cdf(H(p_mid)) - p_mid))

    below the specified tolerance `tol`. Refinement stops when the required
    tolerance is achieved or when the number of mesh intervals after the next
    refinement could exceed the maximum allowed number `max_intervals`.
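
The fit-and-refine loop described above can be sketched in pure Python. This is only an illustration under stated assumptions, not SciPy's implementation: `statistics.NormalDist` stands in for `dist`, the helper names `hermite_inverse`, `max_u_error` and `refine` are hypothetical, and every interval is halved on each pass (a real implementation would more likely split only the intervals that miss the tolerance). The slope of the inverse CDF at a knot is taken as ``1/pdf(x)``.

```python
import bisect
from statistics import NormalDist

dist = NormalDist()  # assumed stand-in for `dist`; only pdf and cdf are used


def hermite_inverse(xs):
    """Build H(u), a piecewise cubic Hermite approximation of the inverse CDF.

    xs is a sorted mesh of quantiles. At each knot p = cdf(x), H matches the
    value x and the slope dx/dp = 1/pdf(x).
    """
    ps = [dist.cdf(x) for x in xs]
    slopes = [1.0 / dist.pdf(x) for x in xs]

    def H(u):
        # Locate the interval [ps[i], ps[i+1]] that contains u.
        i = min(max(bisect.bisect_right(ps, u) - 1, 0), len(ps) - 2)
        h = ps[i + 1] - ps[i]
        t = (u - ps[i]) / h
        # Standard cubic Hermite basis functions on [0, 1].
        h00 = 2 * t**3 - 3 * t**2 + 1
        h10 = t**3 - 2 * t**2 + t
        h01 = -2 * t**3 + 3 * t**2
        h11 = t**3 - t**2
        return (h00 * xs[i] + h10 * h * slopes[i]
                + h01 * xs[i + 1] + h11 * h * slopes[i + 1])

    return H, ps


def max_u_error(xs):
    """Largest |cdf(H(p_mid)) - p_mid| over the interval midpoints."""
    H, ps = hermite_inverse(xs)
    p_mid = [(a + b) / 2 for a, b in zip(ps[:-1], ps[1:])]
    return max(abs(dist.cdf(H(m)) - m) for m in p_mid)


def refine(xs, tol, max_intervals):
    """Halve all intervals until the u-error is below tol, or until the next
    halving would exceed max_intervals."""
    while max_u_error(xs) > tol and 2 * (len(xs) - 1) <= max_intervals:
        x_mid = [(a + b) / 2 for a, b in zip(xs[:-1], xs[1:])]
        xs = sorted(xs + x_mid)
    return xs


coarse = [-3.0 + 0.5 * i for i in range(13)]  # 12 intervals on [-3, 3]
fine = refine(coarse, tol=1e-6, max_intervals=10000)
```

Comparing ``max_u_error(coarse)`` with ``max_u_error(fine)`` shows how much refinement tightens the midpoint error, while the spline still reproduces the mesh quantiles at the knots themselves.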

    The object `dist` must have methods ``pdf``, ``cdf``, and ``ppf`` that
    behave like those of a *frozen* instance of `scipy.stats.rv_continuous`.
    Specifically, it must have methods ``pdf`` and ``cdf`` that accept exactly
    one ndarray argument ``x`` and return the probability density function and
    cumulative distribution function (respectively) at ``x``. The object must
    also have a method ``ppf`` that accepts a float ``p`` and returns the
    percent point function at ``p``. The object may also have a method ``isf``
    that accepts a float ``p`` and returns the inverse survival function at
    ``p``; if it does not, it will be assigned an attribute ``isf`` that
    calculates the inverse survival function using ``ppf``. The ``ppf`` and
    ``isf`` methods will each be evaluated at a small positive float ``p``
    (e.g. ``p = tol/10``), and the domain over which the approximate numerical
    inverse is defined will be ``ppf(p)`` to ``isf(p)``. The approximation will
    not be accurate in the extreme tails beyond this domain.
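
The ``isf`` fallback and the resulting domain can be sketched as follows. This is a minimal illustration, assuming that the fallback is simply ``isf(p) = ppf(1 - p)`` and that ``p = tol/10``; the class `MyDist` and the helpers `ensure_isf` and `approximation_domain` are hypothetical names, and the methods here take plain floats rather than ndarrays.

```python
from statistics import NormalDist


class MyDist:
    """Hypothetical duck-typed distribution exposing pdf, cdf and ppf only."""

    _normal = NormalDist()

    def pdf(self, x):
        return self._normal.pdf(x)

    def cdf(self, x):
        return self._normal.cdf(x)

    def ppf(self, p):
        return self._normal.inv_cdf(p)


def ensure_isf(dist):
    """Attach an isf derived from ppf when the object does not provide one."""
    if not hasattr(dist, "isf"):
        dist.isf = lambda p: dist.ppf(1.0 - p)
    return dist


def approximation_domain(dist, tol=1e-12):
    """Return (ppf(p), isf(p)) for a small positive p, here tol/10."""
    p = tol / 10
    dist = ensure_isf(dist)
    return dist.ppf(p), dist.isf(p)


d = MyDist()
lo, hi = approximation_domain(d)  # roughly symmetric bounds for the normal
```

Outside ``[lo, hi]`` no mesh points exist, which is why the approximation cannot be trusted in the extreme tails.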

    References
    ----------
    .. [1] Hörmann, Wolfgang, and Josef Leydold. "Continuous random variate
           generation by fast numerical inversion." ACM Transactions on
           Modeling and Computer Simulation (TOMACS) 13.4 (2003): 347-362.

    Examples
    --------
    For some distributions, ``dist.ppf`` and ``dist.rvs`` are quite slow.
    For instance, consider `scipy.stats.genexpon`. We freeze the distribution
    by passing all shape parameters into its initializer and time the resulting
    object's ``ppf`` and ``rvs`` functions.

    >>> import numpy as np
    >>> from scipy import stats
    >>> from timeit import timeit
    >>> time_once = lambda f: f"{timeit(f, number=1)*1000:.6} ms"
    >>> dist = stats.genexpon(9, 16, 3)  # freeze the distribution
    >>> p = np.linspace(0.01, 0.99, 99)  # percentiles from 1% to 99%
    >>> time_once(lambda: dist.ppf(p))
    '154.565 ms'  # may vary

    >>> time_once(lambda: dist.rvs(size=100))
    '148.979 ms'  # may vary

    `NumericalInverseHermite` has a method ``ppf`` that approximates ``dist.ppf``.

    >>> from scipy.stats import NumericalInverseHermite
    >>> fni = NumericalInverseHermite(dist)
    >>> np.allclose(fni.ppf(p), dist.ppf(p))
    True

    In some cases, it is faster to both generate the fast numerical inverse
    and use it than to call ``dist.ppf``.

    >>> def time_me():
    ...     fni = NumericalInverseHermite(dist)
    ...     fni.ppf(p)
    >>> time_once(time_me)
    '11.9222 ms'  # may vary

    After generating the fast numerical inverse, subsequent calls to its
    methods are much faster.

    >>> time_once(lambda: fni.ppf(p))
    '0.0819 ms'  # may vary

    The fast numerical inverse can also be used to generate random variates
    using inverse transform sampling.

    >>> time_once(lambda: fni.rvs(size=100))
    '0.0911 ms'  # may vary
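
Inverse transform sampling itself is short enough to sketch in pure Python: draw uniform numbers on (0, 1) and push them through a percent point function. The helper `inverse_transform_sample` is a hypothetical name, and `statistics.NormalDist().inv_cdf` stands in for the fast approximate ``ppf``.

```python
import random
from statistics import NormalDist


def inverse_transform_sample(ppf, size, seed=None):
    """Draw `size` variates by applying `ppf` to uniform random numbers."""
    rng = random.Random(seed)
    samples = []
    while len(samples) < size:
        u = rng.random()  # uniform on [0, 1)
        if u == 0.0:
            continue  # ppf is undefined at exactly 0, so redraw
        samples.append(ppf(u))
    return samples


samples = inverse_transform_sample(NormalDist().inv_cdf, size=1000, seed=0)
```

Because every variate costs one ``ppf`` evaluation, a cheap spline-based ``ppf`` is what makes this approach fast.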

    Depending on the implementation of the distribution's random sampling
    method, the random variates generated may be nearly identical, given
    the same random state.

    >>> # `seed` ensures identical random streams are used by each `rvs` method
    >>> seed = 500072020
    >>> rvs1 = dist.rvs(size=100, random_state=np.random.default_rng(seed))
    >>> rvs2 = fni.rvs(size=100, random_state=np.random.default_rng(seed))
    >>> np.allclose(rvs1, rvs2)
    True

    To use `NumericalInverseHermite` with a custom distribution, users may
    subclass `scipy.stats.rv_continuous` and initialize a frozen instance or
    create an object with equivalent ``pdf``, ``cdf``, and ``ppf`` methods.
    For instance, the following object represents the standard normal
    distribution. For simplicity, we use `scipy.special.ndtr` and
    `scipy.special.ndtri` to compute the ``cdf`` and ``ppf``, respectively.

    >>> from scipy.special import ndtr, ndtri
    >>>
    >>> class MyNormal:
    ...
    ...     def pdf(self, x):
    ...         return 1/np.sqrt(2*np.pi) * np.exp(-x**2 / 2)
    ...
    ...     def cdf(self, x):
    ...         return ndtr(x)
    ...
    ...     def ppf(self, p):
    ...         return ndtri(p)
    ...
    >>> dist1 = MyNormal()
    >>> fni1 = NumericalInverseHermite(dist1)
    >>>
    >>> dist2 = stats.norm()
    >>> fni2 = NumericalInverseHermite(dist2)
    >>>
    >>> print(fni1.rvs(random_state=seed), fni2.rvs(random_state=seed))
    -1.9603810921759424 -1.9603810921747074

    

Constructor(s)

Constructor signature    Description
__init__(self, dist, *, tol=1e-12, max_intervals=100000)

List of operators

Operators inherited from class object

__eq__, __ge__, __gt__, __le__, __lt__, __ne__

List of methods

Method signature    Description
ppf(self, q)
qrvs(self, size=None, d=None, qmc_engine=None)
rvs(self, size=None, random_state=None)
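
The page gives no description for `qrvs`. Assuming it follows the usual quasi-Monte Carlo pattern, in which low-discrepancy uniform points are mapped through the percent point function, the idea can be sketched in pure Python. Everything here is an assumption made for illustration: the van der Corput sequence stands in for whatever `qmc_engine` supplies, and `statistics.NormalDist().inv_cdf` stands in for the approximate ``ppf``.

```python
from statistics import NormalDist


def van_der_corput(n, base=2):
    """First n points of the van der Corput sequence, skipping the initial 0."""
    points = []
    for k in range(1, n + 1):
        x, denom = 0.0, 1.0
        while k:
            # Reverse the base-`base` digits of k around the radix point.
            k, digit = divmod(k, base)
            denom *= base
            x += digit / denom
        points.append(x)
    return points


uniform_points = van_der_corput(7)
quasi_samples = [NormalDist().inv_cdf(u) for u in uniform_points]
```

The uniform points fill (0, 1) more evenly than pseudo-random draws, and the transformed samples inherit that evenness on the scale of the target distribution.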

Methods inherited from class object

__delattr__, __dir__, __format__, __getattribute__, __hash__, __init_subclass__, __reduce__, __reduce_ex__, __repr__, __setattr__, __sizeof__, __str__, __subclasshook__