Participer au site avec un Tip
Rechercher
 

Améliorations / Corrections

Vous avez des améliorations (ou des corrections) à proposer pour ce document : je vous remerçie par avance de m'en faire part, cela m'aide à améliorer le site.

Emplacement :

Description des améliorations :

Vous êtes un professionnel et vous avez besoin d'une formation ? Programmation Python
Les fondamentaux
Voir le programme détaillé
Classe « DataFrame »

Méthode pandas.DataFrame.nlargest

Signature de la méthode nlargest

def nlargest(self, n: 'int', columns: 'IndexLabel', keep: 'NsmallestNlargestKeep' = 'first') -> 'DataFrame' 

Description

help(DataFrame.nlargest)

Return the first `n` rows ordered by `columns` in descending order.

Return the first `n` rows with the largest values in `columns`, in
descending order. The columns that are not specified are returned as
well, but not used for ordering.

This method is equivalent to
``df.sort_values(columns, ascending=False).head(n)``, but more
performant.

Parameters
----------
n : int
    Number of rows to return.
columns : label or list of labels
    Column label(s) to order by.
keep : {'first', 'last', 'all'}, default 'first'
    Where there are duplicate values:

    - ``first`` : prioritize the first occurrence(s)
    - ``last`` : prioritize the last occurrence(s)
    - ``all`` : keep all the ties of the smallest item even if it means
      selecting more than ``n`` items.

Returns
-------
DataFrame
    The first `n` rows ordered by the given columns in descending
    order.

See Also
--------
DataFrame.nsmallest : Return the first `n` rows ordered by `columns` in
    ascending order.
DataFrame.sort_values : Sort DataFrame by the values.
DataFrame.head : Return the first `n` rows without re-ordering.

Notes
-----
This function cannot be used with all column types. For example, when
specifying columns with `object` or `category` dtypes, ``TypeError`` is
raised.

Examples
--------
>>> df = pd.DataFrame({'population': [59000000, 65000000, 434000,
...                                   434000, 434000, 337000, 11300,
...                                   11300, 11300],
...                    'GDP': [1937894, 2583560 , 12011, 4520, 12128,
...                            17036, 182, 38, 311],
...                    'alpha-2': ["IT", "FR", "MT", "MV", "BN",
...                                "IS", "NR", "TV", "AI"]},
...                   index=["Italy", "France", "Malta",
...                          "Maldives", "Brunei", "Iceland",
...                          "Nauru", "Tuvalu", "Anguilla"])
>>> df
          population      GDP alpha-2
Italy       59000000  1937894      IT
France      65000000  2583560      FR
Malta         434000    12011      MT
Maldives      434000     4520      MV
Brunei        434000    12128      BN
Iceland       337000    17036      IS
Nauru          11300      182      NR
Tuvalu         11300       38      TV
Anguilla       11300      311      AI

In the following example, we will use ``nlargest`` to select the three
rows having the largest values in column "population".

>>> df.nlargest(3, 'population')
        population      GDP alpha-2
France    65000000  2583560      FR
Italy     59000000  1937894      IT
Malta       434000    12011      MT

When using ``keep='last'``, ties are resolved in reverse order:

>>> df.nlargest(3, 'population', keep='last')
        population      GDP alpha-2
France    65000000  2583560      FR
Italy     59000000  1937894      IT
Brunei      434000    12128      BN

When using ``keep='all'``, the number of element kept can go beyond ``n``
if there are duplicate values for the smallest element, all the
ties are kept:

>>> df.nlargest(3, 'population', keep='all')
          population      GDP alpha-2
France      65000000  2583560      FR
Italy       59000000  1937894      IT
Malta         434000    12011      MT
Maldives      434000     4520      MV
Brunei        434000    12128      BN

However, ``nlargest`` does not keep ``n`` distinct largest elements:

>>> df.nlargest(5, 'population', keep='all')
          population      GDP alpha-2
France      65000000  2583560      FR
Italy       59000000  1937894      IT
Malta         434000    12011      MT
Maldives      434000     4520      MV
Brunei        434000    12128      BN

To order by the largest values in column "population" and then "GDP",
we can specify multiple columns like in the next example.

>>> df.nlargest(3, ['population', 'GDP'])
        population      GDP alpha-2
France    65000000  2583560      FR
Italy     59000000  1937894      IT
Brunei      434000    12128      BN


Vous êtes un professionnel et vous avez besoin d'une formation ? Coder avec une
Intelligence Artificielle
Voir le programme détaillé