Module « scipy.stats »
Classe « NumericalInverseHermite »
Informations générales
Héritage
builtins.object
NumericalInverseHermite
Définition
class NumericalInverseHermite(builtins.object):
Description [extrait de NumericalInverseHermite.__doc__]
A Hermite spline fast numerical inverse of a probability distribution.
The initializer of `NumericalInverseHermite` accepts `dist`, an object
representing a continuous distribution, and provides an object with methods
that approximate `dist.ppf` and `dist.rvs`. For most distributions,
these methods are faster than those of `dist` itself.
Parameters
----------
dist : object
Object representing the distribution for which a fast numerical inverse
is desired; for instance, a frozen instance of a `scipy.stats`
continuous distribution. See Notes and Examples for details.
tol : float, optional
u-error tolerance (see Notes). The default is 1e-12.
max_intervals : int, optional
Maximum number of intervals in the cubic Hermite spline used to
approximate the percent point function. The default is 100000.
Attributes
----------
intervals : int
The number of intervals of the interpolant.
midpoint_error : float
The maximum u-error at an interpolant interval midpoint.
Notes
-----
`NumericalInverseHermite` approximates the inverse of a continuous
statistical distribution's CDF with a cubic Hermite spline.
As described in [1]_, it begins by evaluating the distribution's PDF and
CDF at a mesh of quantiles ``x`` within the distribution's support.
It uses the results to fit a cubic Hermite spline ``H`` such that
``H(p) == x``, where ``p`` is the array of percentiles corresponding
with the quantiles ``x``. Therefore, the spline approximates the inverse
of the distribution's CDF to machine precision at the percentiles ``p``,
but typically, the spline will not be as accurate at the midpoints between
the percentile points::
p_mid = (p[:-1] + p[1:])/2
so the mesh of quantiles is refined as needed to reduce the maximum
"u-error"::
u_error = np.max(np.abs(dist.cdf(H(p_mid)) - p_mid))
below the specified tolerance `tol`. Refinement stops when the required
tolerance is achieved or when the number of mesh intervals after the next
refinement could exceed the maximum allowed number `max_intervals`.
The object `dist` must have methods ``pdf``, ``cdf``, and ``ppf`` that
behave like those of a *frozen* instance of `scipy.stats.rv_continuous`.
Specifically, it must have methods ``pdf`` and ``cdf`` that accept exactly
one ndarray argument ``x`` and return the probability density function and
cumulative density function (respectively) at ``x``. The object must also
have a method ``ppf`` that accepts a float ``p`` and returns the percentile
point function at ``p``. The object may also have a method ``isf`` that
accepts a float ``p`` and returns the inverse survival function at ``p``;
if it does not, it will be assigned an attribute ``isf`` that calculates
the inverse survival function using ``ppf``. The ``ppf`` and
``isf` methods will each be evaluated at a small positive float ``p``
(e.g. ``p = utol/10``), and the domain over which the approximate numerical
inverse is defined will be ``ppf(p)`` to ``isf(p)``. The approximation will
not be accurate in the extreme tails beyond this domain.
References
----------
.. [1] Hörmann, Wolfgang, and Josef Leydold. "Continuous random variate
generation by fast numerical inversion." ACM Transactions on
Modeling and Computer Simulation (TOMACS) 13.4 (2003): 347-362.
Examples
--------
For some distributions, ``dist.ppf`` and ``dist.rvs`` are quite slow.
For instance, consider `scipy.stats.genexpon`. We freeze the distribution
by passing all shape parameters into its initializer and time the resulting
object's ``ppf`` and ``rvs`` functions.
>>> import numpy as np
>>> from scipy import stats
>>> from timeit import timeit
>>> time_once = lambda f: f"{timeit(f, number=1)*1000:.6} ms"
>>> dist = stats.genexpon(9, 16, 3) # freeze the distribution
>>> p = np.linspace(0.01, 0.99, 99) # percentiles from 1% to 99%
>>> time_once(lambda: dist.ppf(p))
'154.565 ms' # may vary
>>> time_once(lambda: dist.rvs(size=100))
'148.979 ms' # may vary
The `NumericalInverseHermite` has a method that approximates ``dist.ppf``.
>>> from scipy.stats import NumericalInverseHermite
>>> fni = NumericalInverseHermite(dist)
>>> np.allclose(fni.ppf(p), dist.ppf(p))
True
In some cases, it is faster to both generate the fast numerical inverse
and use it than to call ``dist.ppf``.
>>> def time_me():
... fni = NumericalInverseHermite(dist)
... fni.ppf(p)
>>> time_once(time_me)
'11.9222 ms' # may vary
After generating the fast numerical inverse, subsequent calls to its
methods are much faster.
>>> time_once(lambda: fni.ppf(p))
'0.0819 ms' # may vary
The fast numerical inverse can also be used to generate random variates
using inverse transform sampling.
>>> time_once(lambda: fni.rvs(size=100))
'0.0911 ms' # may vary
Depending on the implementation of the distribution's random sampling
method, the random variates generated may be nearly identical, given
the same random state.
>>> # `seed` ensures identical random streams are used by each `rvs` method
>>> seed = 500072020
>>> rvs1 = dist.rvs(size=100, random_state=np.random.default_rng(seed))
>>> rvs2 = fni.rvs(size=100, random_state=np.random.default_rng(seed))
>>> np.allclose(rvs1, rvs2)
True
To use `NumericalInverseHermite` with a custom distribution, users may
subclass `scipy.stats.rv_continuous` and initialize a frozen instance or
create an object with equivalent ``pdf``, ``cdf``, and ``ppf`` methods.
For instance, the following object represents the standard normal
distribution. For simplicity, we use `scipy.special.ndtr` and
`scipy.special.ndtri` to compute the ``cdf`` and ``ppf``, respectively.
>>> from scipy.special import ndtr, ndtri
>>>
>>> class MyNormal:
...
... def pdf(self, x):
... return 1/np.sqrt(2*np.pi) * np.exp(-x**2 / 2)
...
... def cdf(self, x):
... return ndtr(x)
...
... def ppf(self, x):
... return ndtri(x)
...
>>> dist1 = MyNormal()
>>> fni1 = NumericalInverseHermite(dist1)
>>>
>>> dist2 = stats.norm()
>>> fni2 = NumericalInverseHermite(dist2)
>>>
>>> print(fni1.rvs(random_state=seed), fni2.rvs(random_state=seed))
-1.9603810921759424 -1.9603810921747074
Constructeur(s)
Liste des opérateurs
Opérateurs hérités de la classe object
__eq__,
__ge__,
__gt__,
__le__,
__lt__,
__ne__
Liste des méthodes
Toutes les méthodes
Méthodes d'instance
Méthodes statiques
Méthodes dépréciées
Méthodes héritées de la classe object
__delattr__,
__dir__,
__format__,
__getattribute__,
__hash__,
__init_subclass__,
__reduce__,
__reduce_ex__,
__repr__,
__setattr__,
__sizeof__,
__str__,
__subclasshook__
Améliorations / Corrections
Vous avez des améliorations (ou des corrections) à proposer pour ce document : je vous remerçie par avance de m'en faire part, cela m'aide à améliorer le site.
Emplacement :
Description des améliorations :