id_translation.dio.integration.pandas

Integration for Pandas types.

Module Attributes

PandasT

Supported pandas types.

AsCategory

Valid as_category string values.

Classes

PandasIO(*[, level, missing_as_nan, as_category])

Optional IO implementation for pandas types.

AsCategory

Valid as_category string values.

alias of Literal['exact', 'full']

class PandasIO(*, level=-1, missing_as_nan=None, as_category=False)

Bases: DataStructureIO[PandasT, NameType, SourceType, IdType]

Optional IO implementation for pandas types.

Parameters:
  • level – Column level to use as names when translating a DataFrame with MultiIndex columns. See pandas.MultiIndex.get_level_values() for details. Ignored otherwise.

  • missing_as_nan – If True, unknown IDs will be NaN. Note that grouping operations typically drop NaN values. If False, placeholders such as '<Failed: id=-1>' will be used instead. Defaults to True if as_category=True, and to False otherwise.

  • as_category – Set dtype='category' in the result. See Categorical translation for details.
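The pandas behaviour that the level and missing_as_nan parameters rely on can be sketched with pandas alone. This is an illustration only, not PandasIO's implementation; the column labels and the translated string '1:cat' are made-up sample data.

```python
import pandas as pd

# level: select one level of MultiIndex column labels (-1 is the innermost).
columns = pd.MultiIndex.from_tuples([("left", "animal_id"), ("right", "human_id")])
df = pd.DataFrame([[1, 10], [2, 20]], columns=columns)
innermost = list(df.columns.get_level_values(-1))
print(innermost)  # ['animal_id', 'human_id']

# missing_as_nan: groupby drops NaN keys by default, but keeps placeholder strings.
with_nan = pd.Series(["1:cat", None, "1:cat"])
with_placeholder = pd.Series(["1:cat", "<Failed: id=-1>", "1:cat"])
nan_groups = with_nan.groupby(with_nan).size()
placeholder_groups = with_placeholder.groupby(with_placeholder).size()
print(len(nan_groups))          # 1
print(len(placeholder_groups))  # 2
```

If unknown IDs must survive a grouping step, placeholders (missing_as_nan=False) are the safer choice; passing dropna=False to groupby is the pandas-side alternative.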

Categorical translation

Setting as_category=True converts the resultant translations to a categorical data type. The returned pandas.CategoricalDtype will be ordered, with the categories set to all real translations. If missing_as_nan=False, the categories may also include placeholders.

Certain fetchers, such as the MemoryFetcher(return_all=True), will return more IDs than requested. In this case the categories may also include values not present in the input data. This may also happen if data was prepared with Translator.go_offline(), or if multiple columns were mapped to the same source.
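The shape of such a result can be approximated with pandas alone. The sketch below is not how PandasIO builds the dtype internally; the translations dict and its 'id:name' strings are hypothetical stand-ins for what a fetcher might return.

```python
import pandas as pd

# Hypothetical fetcher output: ID 3 was returned although it is not in the data.
translations = {1: "1:Alice", 2: "2:Bob", 3: "3:Carol"}
ids = pd.Series([2, 1, 99])  # 99 has no translation

# Ordered categorical dtype whose categories are all real translations.
dtype = pd.CategoricalDtype(categories=list(translations.values()), ordered=True)
translated = ids.map(translations).astype(dtype)

print(list(translated.cat.categories))  # ['1:Alice', '2:Bob', '3:Carol']
print(int(translated.isna().sum()))     # 1
```

Here '3:Carol' is a category even though no row uses it, and the unknown ID 99 becomes NaN, which corresponds to the missing_as_nan=True case.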

extract(translatable, names)

Extract IDs from translatable.

Parameters:
  • translatable – Data to extract IDs from.

  • names – List of names in translatable to extract IDs for.

Returns:

A dict {name: ids}.
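For a DataFrame, the returned mapping might look like the output of the sketch below. extract_ids is a hypothetical helper written for illustration, not the library's method; whether the real implementation deduplicates or otherwise processes the IDs is not stated here.

```python
import pandas as pd


def extract_ids(translatable: pd.DataFrame, names: list[str]) -> dict[str, list]:
    """Hypothetical helper: collect the values of each named column."""
    return {name: translatable[name].tolist() for name in names}


df = pd.DataFrame({"animal_id": [1, 2, 2], "weight": [4.2, 30.0, 28.5]})
ids = extract_ids(df, ["animal_id"])
print(ids)  # {'animal_id': [1, 2, 2]}
```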

classmethod handles_type(arg)

Return True if the implementation handles data for the type of arg.

insert(translatable, names, tmap, copy)

Insert translations into translatable.

Parameters:
  • translatable – Data to translate. Modified iff copy=False.

  • names – Names in translatable to translate.

  • tmap – Translations for IDs in translatable.

  • copy – If False, modify the contents of the original translatable in place. Otherwise, return a translated copy.

Returns:

A copy of translatable if copy=True, None otherwise.

Raises:

NotInplaceTranslatableError – If copy=False for a type which is not translatable in-place.
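The copy contract stated under Returns (a copy if copy=True, None with in-place modification if copy=False) can be sketched in plain Python. insert_translations and the nested-dict tmap are hypothetical stand-ins; the real method works on pandas objects and a translation map provided by the library.

```python
from copy import deepcopy
from typing import Optional

Table = dict[str, list]


def insert_translations(
    translatable: Table,
    names: list[str],
    tmap: dict[str, dict[int, str]],
    copy: bool,
) -> Optional[Table]:
    """Hypothetical stand-in showing the return contract only."""
    target = deepcopy(translatable) if copy else translatable
    for name in names:
        lookup = tmap[name]
        # Unknown IDs get a placeholder, as with missing_as_nan=False.
        target[name] = [lookup.get(i, f"<Failed: id={i}>") for i in target[name]]
    return target if copy else None


data = {"animal_id": [1, 2, -1]}
tmap = {"animal_id": {1: "1:cat", 2: "2:dog"}}

result = insert_translations(data, ["animal_id"], tmap, copy=True)
unchanged_after_copy = data == {"animal_id": [1, 2, -1]}  # original untouched

in_place = insert_translations(data, ["animal_id"], tmap, copy=False)
# in_place is None and data now holds the translations.
```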

names(translatable)

Extract names from translatable.

Parameters:

translatable – Data to extract names from.

Returns:

A list of names to translate. Returns None if names cannot be extracted.

priority = 1999

Determines order in which IOs are considered (higher = earlier).

Set priority < 0 to disable.
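One way the priority attribute and handles_type could interact is sketched below. The classes and resolve_io are hypothetical and written for illustration; the library's actual resolution logic may differ in its details.

```python
class BaseIO:
    """Hypothetical base class; only priority and handles_type are modelled."""

    priority = 0

    @classmethod
    def handles_type(cls, arg) -> bool:
        raise NotImplementedError


class SequenceIO(BaseIO):
    priority = 1000

    @classmethod
    def handles_type(cls, arg) -> bool:
        return isinstance(arg, (list, tuple))


class ListIO(BaseIO):
    priority = 1999  # higher than SequenceIO, so it is asked first

    @classmethod
    def handles_type(cls, arg) -> bool:
        return isinstance(arg, list)


class DisabledIO(BaseIO):
    priority = -1  # negative priority: never considered

    @classmethod
    def handles_type(cls, arg) -> bool:
        return True


def resolve_io(arg, candidates):
    """Return the enabled candidate with the highest priority that handles arg."""
    enabled = [io for io in candidates if io.priority >= 0]
    for io in sorted(enabled, key=lambda io: io.priority, reverse=True):
        if io.handles_type(arg):
            return io
    raise TypeError(f"No IO implementation handles {type(arg).__name__}")


candidates = [SequenceIO, DisabledIO, ListIO]
print(resolve_io([1, 2], candidates).__name__)  # ListIO
print(resolve_io((1, 2), candidates).__name__)  # SequenceIO
```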

class PandasT

Supported pandas types.

alias of TypeVar('PandasT', ~pandas.DataFrame, ~pandas.Series, ~pandas.Index, ~pandas.MultiIndex)