"DataFrame" class

pandas.DataFrame.to_orc method

Signature of the to_orc method

def to_orc(self, path: 'FilePath | WriteBuffer[bytes] | None' = None, *, engine: "Literal['pyarrow']" = 'pyarrow', index: 'bool | None' = None, engine_kwargs: 'dict[str, Any] | None' = None) -> 'bytes | None' 

Description

help(DataFrame.to_orc)

Write a DataFrame to the ORC format.

.. versionadded:: 1.5.0

Parameters
----------
path : str, file-like object or None, default None
    If a string, it will be used as Root Directory path
    when writing a partitioned dataset. By file-like object,
    we refer to objects with a write() method, such as a file handle
    (e.g. via builtin open function). If path is None,
    a bytes object is returned.
engine : {'pyarrow'}, default 'pyarrow'
    ORC library to use.
index : bool, optional
    If ``True``, include the dataframe's index(es) in the file output.
    If ``False``, they will not be written to the file.
    If ``None``, similar to ``infer`` the dataframe's index(es)
    will be saved. However, instead of being saved as values,
    the RangeIndex will be stored as a range in the metadata so it
    doesn't require much space and is faster. Other indexes will
    be included as columns in the file output.
engine_kwargs : dict[str, Any] or None, default None
    Additional keyword arguments passed to :func:`pyarrow.orc.write_table`.

Returns
-------
bytes if no path argument is provided else None

Raises
------
NotImplementedError
    Dtype of one or more columns is category, unsigned integers, interval,
    period or sparse.
ValueError
    engine is not pyarrow.

See Also
--------
read_orc : Read an ORC file.
DataFrame.to_parquet : Write a parquet file.
DataFrame.to_csv : Write a csv file.
DataFrame.to_sql : Write to a sql table.
DataFrame.to_hdf : Write to hdf.

Notes
-----
* Before using this function you should read the :ref:`user guide about
  ORC <io.orc>` and :ref:`install optional dependencies <install.warn_orc>`.
* This function requires `pyarrow <https://arrow.apache.org/docs/python/>`_
  library.
* For supported dtypes please refer to `supported ORC features in Arrow
  <https://arrow.apache.org/docs/cpp/orc.html#data-types>`__.
* Currently timezones in datetime columns are not preserved when a
  dataframe is converted into ORC files.

Examples
--------
>>> df = pd.DataFrame(data={'col1': [1, 2], 'col2': [4, 3]})
>>> df.to_orc('df.orc')  # doctest: +SKIP
>>> pd.read_orc('df.orc')  # doctest: +SKIP
   col1  col2
0     1     4
1     2     3

If you want to get a buffer containing the ORC content, you can write it to ``io.BytesIO``:

>>> import io
>>> b = io.BytesIO(df.to_orc())  # doctest: +SKIP
>>> b.seek(0)  # doctest: +SKIP
0
>>> content = b.read()  # doctest: +SKIP

