Class "Generator"

Method numpy.random.Generator.multivariate_hypergeometric

Signature of the multivariate_hypergeometric method

Description

multivariate_hypergeometric.__doc__

        multivariate_hypergeometric(colors, nsample, size=None,
                                    method='marginals')

        Generate variates from a multivariate hypergeometric distribution.

        The multivariate hypergeometric distribution is a generalization
        of the hypergeometric distribution.

        Choose ``nsample`` items at random without replacement from a
        collection with ``N`` distinct types.  ``N`` is the length of
        ``colors``, and the values in ``colors`` are the number of occurrences
        of that type in the collection.  The total number of items in the
        collection is ``sum(colors)``.  Each random variate generated by this
        function is a vector of length ``N`` holding the counts of the
        different types that occurred in the ``nsample`` items.

        The name ``colors`` comes from a common description of the
        distribution: it is the probability distribution of the number of
        marbles of each color selected without replacement from an urn
        containing marbles of different colors; ``colors[i]`` is the number
        of marbles in the urn with color ``i``.
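
The urn description above implies a few invariants that are easy to check empirically. The following is a minimal sketch, not part of the original docstring: the seed and the sample count of 1000 are arbitrary assumptions, and the urn contents are borrowed from the Examples section.

```python
import numpy as np

# Urn with 16, 8 and 4 marbles of three colors.
colors = np.array([16, 8, 4])
nsample = 6

# Arbitrary seed, chosen only so the run is repeatable.
rng = np.random.default_rng(12345)
variates = rng.multivariate_hypergeometric(colors, nsample, size=1000)

# Every variate accounts for exactly `nsample` drawn items.
sums_ok = bool(np.all(variates.sum(axis=1) == nsample))

# No color can be drawn more often than it occurs in the urn.
bounds_ok = bool(np.all((variates >= 0) & (variates <= colors)))

# The mean count of each color is nsample * colors / sum(colors);
# the empirical mean over many variates should be close to it.
expected_mean = nsample * colors / colors.sum()
empirical_mean = variates.mean(axis=0)

print(sums_ok, bounds_ok)
print(expected_mean, empirical_mean)
```

Both checks should print True, and the two mean vectors should agree to within sampling noise.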

        Parameters
        ----------
        colors : sequence of integers
            The number of each type of item in the collection from which
            a sample is drawn.  The values in ``colors`` must be nonnegative.
            To avoid loss of precision in the algorithm, ``sum(colors)``
            must be less than ``10**9`` when `method` is "marginals".
        nsample : int
            The number of items selected.  ``nsample`` must not be greater
            than ``sum(colors)``.
        size : int or tuple of ints, optional
            The number of variates to generate, either an integer or a tuple
            holding the shape of the array of variates.  If the given size is,
            e.g., ``(k, m)``, then ``k * m`` variates are drawn, where one
            variate is a vector of length ``len(colors)``, and the return value
            has shape ``(k, m, len(colors))``.  If `size` is an integer, the
            output has shape ``(size, len(colors))``.  Default is None, in
            which case a single variate is returned as an array with shape
            ``(len(colors),)``.
        method : string, optional
            Specify the algorithm that is used to generate the variates.
            Must be 'count' or 'marginals' (the default).  See the Notes
            for a description of the methods.
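
The shape rules for `size` can be confirmed directly. This is an illustrative sketch under assumed inputs (the seed and the `(2, 5)` grid are arbitrary choices, not taken from the docstring):

```python
import numpy as np

colors = [16, 8, 4]
rng = np.random.default_rng(0)  # arbitrary seed

# size=None: one variate with shape (len(colors),)
single = rng.multivariate_hypergeometric(colors, 6)

# Integer size: shape (size, len(colors))
batch = rng.multivariate_hypergeometric(colors, 6, size=3)

# Tuple size (k, m): shape (k, m, len(colors))
grid = rng.multivariate_hypergeometric(colors, 6, size=(2, 5))

# The `method` argument changes the algorithm, not the output shape.
counted = rng.multivariate_hypergeometric(colors, 6, size=3, method="count")

print(single.shape, batch.shape, grid.shape, counted.shape)
```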

        Returns
        -------
        variates : ndarray
            Array of variates drawn from the multivariate hypergeometric
            distribution.

        See Also
        --------
        hypergeometric : Draw samples from the (univariate) hypergeometric
            distribution.

        Notes
        -----
        The two methods do not return the same sequence of variates.

        The "count" algorithm is roughly equivalent to the following NumPy
        code::

            choices = np.repeat(np.arange(len(colors)), colors)
            selection = np.random.choice(choices, nsample, replace=False)
            variate = np.bincount(selection, minlength=len(colors))

        The "count" algorithm uses a temporary array of integers with length
        ``sum(colors)``.
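
For experimentation, the "count" idea can be wrapped in a small self-contained function. This is only a sketch of the approach described above, not the library's internal implementation; the name `count_sketch` is hypothetical, and it draws from a `Generator` instead of the legacy `np.random.choice`.

```python
import numpy as np

def count_sketch(rng, colors, nsample):
    """Hypothetical helper: draw one variate by explicit sampling."""
    colors = np.asarray(colors)
    # One entry per item in the urn, labeled with its color index.
    # This temporary array has length sum(colors).
    choices = np.repeat(np.arange(len(colors)), colors)
    # Draw `nsample` items without replacement.
    selection = rng.choice(choices, nsample, replace=False)
    # Count how many items of each color were drawn.
    return np.bincount(selection, minlength=len(colors))

rng = np.random.default_rng(2024)  # arbitrary seed
variate = count_sketch(rng, [16, 8, 4], 6)
print(variate, variate.sum())
```

Because of the temporary array, memory use grows with the total number of items, which is worth keeping in mind for very large urns.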

        The "marginals" algorithm generates a variate by using repeated
        calls to the univariate hypergeometric sampler.  It is roughly
        equivalent to::

            variate = np.zeros(len(colors), dtype=np.int64)
            # `remaining` is the cumulative sum of `colors` from the last
            # element to the first; e.g. if `colors` is [3, 1, 5], then
            # `remaining` is [9, 6, 5].
            remaining = np.cumsum(colors[::-1])[::-1]
            for i in range(len(colors)-1):
                if nsample < 1:
                    break
                variate[i] = hypergeometric(colors[i], remaining[i+1],
                                           nsample)
                nsample -= variate[i]
            variate[-1] = nsample
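
The pseudocode above can be made runnable by supplying the univariate sampler from a `Generator`. This is a sketch under that assumption; the function name `marginals_sketch` is hypothetical and the code is not the library's actual implementation.

```python
import numpy as np

def marginals_sketch(rng, colors, nsample):
    """Hypothetical helper: draw one variate color by color."""
    colors = np.asarray(colors, dtype=np.int64)
    variate = np.zeros(len(colors), dtype=np.int64)
    # remaining[i] is the number of items with color index >= i.
    remaining = np.cumsum(colors[::-1])[::-1]
    for i in range(len(colors) - 1):
        if nsample < 1:
            break
        # "Good" items have color i, "bad" items have any later color.
        variate[i] = rng.hypergeometric(colors[i], remaining[i + 1], nsample)
        nsample -= variate[i]
    # Whatever is left must come from the last color.
    variate[-1] = nsample
    return variate

rng = np.random.default_rng(7)  # arbitrary seed
variate = marginals_sketch(rng, [16, 8, 4], 6)
print(variate, variate.sum())
```

Each step conditions on the draws already made, so the counts always add up to the requested `nsample`.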

        The default method is "marginals".  For some cases (e.g. when
        `colors` contains relatively small integers), the "count" method
        can be significantly faster than the "marginals" method.  If
        performance of the algorithm is important, test the two methods
        with typical inputs to decide which works best.
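
One way to follow this advice is a quick `timeit` comparison. The sketch below uses assumed inputs (small `colors`, 1000 variates per call, 20 repetitions); the timings depend on the machine and the inputs, so no result is claimed here.

```python
import timeit
import numpy as np

colors = [16, 8, 4]
nsample = 6
rng = np.random.default_rng(1)  # arbitrary seed

timings = {}
for method in ("marginals", "count"):
    # Total time in seconds for 20 calls, each producing 1000 variates.
    timings[method] = timeit.timeit(
        lambda: rng.multivariate_hypergeometric(
            colors, nsample, size=1000, method=method
        ),
        number=20,
    )

for method, seconds in timings.items():
    print(f"{method}: {seconds:.4f} s")
```

Replace `colors`, `nsample` and `size` with values typical of the real workload before drawing conclusions.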

        .. versionadded:: 1.18.0

        Examples
        --------
        >>> colors = [16, 8, 4]
        >>> seed = 4861946401452
        >>> gen = np.random.Generator(np.random.PCG64(seed))
        >>> gen.multivariate_hypergeometric(colors, 6)
        array([5, 0, 1])
        >>> gen.multivariate_hypergeometric(colors, 6, size=3)
        array([[5, 0, 1],
               [2, 2, 2],
               [3, 3, 0]])
        >>> gen.multivariate_hypergeometric(colors, 6, size=(2, 2))
        array([[[3, 2, 1],
                [3, 2, 1]],
               [[4, 1, 1],
                [3, 2, 1]]])