Function pointbiserialr - scipy.stats module

Signature of the pointbiserialr function

def pointbiserialr(x, y) 

Description

help(scipy.stats.pointbiserialr)

Calculate a point biserial correlation coefficient and its p-value.

The point biserial correlation is used to measure the relationship
between a binary variable, x, and a continuous variable, y. Like other
correlation coefficients, this one varies between -1 and +1 with 0
implying no correlation. Correlations of -1 or +1 imply a deterministic
relationship.

This function may be computed using a shortcut formula but produces the
same result as `pearsonr`.

Parameters
----------
x : array_like of bools
    Input array.
y : array_like
    Input array.

Returns
-------
res : SignificanceResult
    An object containing attributes:

    statistic : float
        The R value.
    pvalue : float
        The two-sided p-value.
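
As a minimal sketch of how the returned object might be consumed: the result can be unpacked like a 2-tuple, and, assuming a SciPy version recent enough to return the ``SignificanceResult`` described above, its fields can also be read by name. The variable names below are illustrative only.

```python
import numpy as np
from scipy import stats

x = np.array([0, 0, 0, 1, 1, 1, 1])  # binary variable
y = np.arange(7)                     # continuous variable

res = stats.pointbiserialr(x, y)

# Tuple-style unpacking of the result.
r, p = res

# Attribute access; assumes the result exposes `statistic` and `pvalue`,
# as stated in the Returns section (older SciPy releases may differ).
print(res.statistic, res.pvalue)
```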

Notes
-----
`pointbiserialr` uses a t-test with ``n-2`` degrees of freedom.
It is equivalent to `pearsonr`.

The value of the point-biserial correlation can be calculated from:

.. math::

    r_{pb} = \frac{\overline{Y_1} - \overline{Y_0}}
                  {s_y}
             \sqrt{\frac{N_0 N_1}
                        {N (N - 1)}}

Where :math:`\overline{Y_{0}}` and :math:`\overline{Y_{1}}` are the means
of the metric observations coded 0 and 1 respectively; :math:`N_{0}` and
:math:`N_{1}` are the numbers of observations coded 0 and 1 respectively;
:math:`N` is the total number of observations and :math:`s_{y}` is the
sample standard deviation (with :math:`N - 1` in the denominator) of all
the metric observations.
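
The shortcut formula can be checked numerically. The sketch below is not SciPy's internal implementation; it simply evaluates the expression for :math:`r_{pb}` by hand, assuming :math:`s_{y}` is the sample standard deviation (``ddof=1``), and compares the result with `pointbiserialr`.

```python
import numpy as np
from scipy import stats

x = np.array([0, 0, 0, 1, 1, 1, 1])  # binary coding
y = np.arange(7, dtype=float)        # metric observations

y0 = y[x == 0]                       # observations coded 0
y1 = y[x == 1]                       # observations coded 1
n0, n1, n = y0.size, y1.size, y.size

# Assumption: s_y uses N - 1 in the denominator (ddof=1).
s_y = y.std(ddof=1)

# Hand-evaluated shortcut formula for the point biserial coefficient.
r_pb = (y1.mean() - y0.mean()) / s_y * np.sqrt(n0 * n1 / (n * (n - 1)))

r_scipy, _ = stats.pointbiserialr(x, y)
print(r_pb, r_scipy)
```

Using ``ddof=0`` instead would require replacing :math:`N (N - 1)` with :math:`N^2` under the square root to obtain the same value.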

A value of :math:`r_{pb}` that is significantly different from zero is
completely equivalent to a significant difference in means between the two
groups. Thus, an independent-groups t-test with :math:`N-2` degrees of
freedom may be used to test whether :math:`r_{pb}` is nonzero. The
relation between the t-statistic for comparing two independent groups and
:math:`r_{pb}` is given by:

.. math::

    t = \sqrt{N - 2}\frac{r_{pb}}{\sqrt{1 - r^{2}_{pb}}}
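
To illustrate this relation, the following sketch converts :math:`r_{pb}` to a t-statistic and compares it with a pooled-variance two-sample t-test from `scipy.stats.ttest_ind`. The choice of data and variable names is illustrative only.

```python
import numpy as np
from scipy import stats

x = np.array([0, 0, 0, 1, 1, 1, 1])
y = np.arange(7, dtype=float)

r_pb, p_r = stats.pointbiserialr(x, y)
n = y.size

# t-statistic derived from the correlation coefficient.
t_from_r = np.sqrt(n - 2) * r_pb / np.sqrt(1 - r_pb**2)

# Independent two-sample t-test; the default equal_var=True uses the
# pooled variance and N - 2 degrees of freedom.
t_ind, p_t = stats.ttest_ind(y[x == 1], y[x == 0])

print(t_from_r, t_ind)  # the two t-statistics should agree
print(p_r, p_t)         # and so should the two-sided p-values
```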

References
----------
.. [1] J. Lev, "The Point Biserial Coefficient of Correlation", Ann. Math.
       Statist., Vol. 20, no. 1, pp. 125-126, 1949.

.. [2] R.F. Tate, "Correlation Between a Discrete and a Continuous
       Variable. Point-Biserial Correlation.", Ann. Math. Statist., Vol. 25,
       no. 3, pp. 603-607, 1954.

.. [3] D. Kornbrot, "Point Biserial Correlation", In Wiley StatsRef:
       Statistics Reference Online (eds N. Balakrishnan, et al.), 2014.
       :doi:`10.1002/9781118445112.stat06227`

Examples
--------
>>> import numpy as np
>>> from scipy import stats
>>> a = np.array([0, 0, 0, 1, 1, 1, 1])
>>> b = np.arange(7)
>>> stats.pointbiserialr(a, b)
(0.8660254037844386, 0.011724811003954652)
>>> stats.pearsonr(a, b)
(0.86602540378443871, 0.011724811003954626)
>>> np.corrcoef(a, b)
array([[ 1.       ,  0.8660254],
       [ 0.8660254,  1.       ]])
