id_translation.dio

Integration for insertion and extraction of IDs and translations to and from various data structures.

User-defined integrations

The purpose of creating new integrations is typically to enable translation of a new data type. To get started, inherit from DataStructureIO or copy an existing integration. Don’t forget to register the implementation, or the Translator won’t be able to find it.
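A minimal sketch of what such an integration might look like. So that it runs without the library installed, it uses a stand-in base class (DataStructureIOSketch, hypothetical) that mirrors the method signatures documented on this page; a real integration would inherit from DataStructureIO and call register() instead. The Table type and the TableIO class are hypothetical, and tmap is assumed to behave like a nested mapping {name: {id: translation}}.

```python
import copy as copy_module
from abc import ABC, abstractmethod
from dataclasses import dataclass, field


class DataStructureIOSketch(ABC):
    """Stand-in base class (assumption: mirrors the documented methods)."""

    @classmethod
    @abstractmethod
    def handles_type(cls, arg): ...

    def names(self, translatable):
        return None  # Names cannot be extracted by default.

    @abstractmethod
    def extract(self, translatable, names): ...

    @abstractmethod
    def insert(self, translatable, names, tmap, copy): ...


@dataclass
class Table:
    """Hypothetical user-defined data type: column name -> list of IDs."""

    columns: dict = field(default_factory=dict)


class TableIO(DataStructureIOSketch):
    @classmethod
    def handles_type(cls, arg):
        return isinstance(arg, Table)

    def names(self, translatable):
        return list(translatable.columns)

    def extract(self, translatable, names):
        # Returns a dict {name: ids}.
        return {name: list(translatable.columns[name]) for name in names}

    def insert(self, translatable, names, tmap, copy):
        # Work on a deep copy if copy=True, otherwise modify the original.
        target = copy_module.deepcopy(translatable) if copy else translatable
        for name in names:
            translations = tmap[name]  # assumed shape: {id: translation}
            target.columns[name] = [translations[i] for i in target.columns[name]]
        return target if copy else None


table = Table({"animal_id": [1, 2, 1]})
io = TableIO()
tmap = {"animal_id": {1: "cat", 2: "dog"}}
translated = io.insert(table, io.names(table), tmap, copy=True)
```

The sketch raises KeyError for IDs that are missing from tmap; how unknown IDs should be handled is left out here.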

Automatic integration discovery

You may add an entrypoint in the 'id_translation.dio' entrypoint group to automatically register custom implementations (as opposed to calling DataStructureIO.register() manually). The snippet below shows how the bundled integrations are registered using project entrypoints.

Entrypoints in pyproject.toml in the rsundqvist/id-translation project.
[project.entry-points."id_translation.dio"]
# The name (e.g. 'dask_io') is not important, but should be unique.
dask_io = "id_translation.dio.integration.dask:DaskIO"
polars_io = "id_translation.dio.integration.polars:PolarsIO"

The loader will skip the integration if calling EntryPoint.load() raises an ImportError.

Module Attributes

ENTRYPOINT_GROUP

Group used to discover DataStructureIO integrations.

Functions

get_resolution_order(*[, real])

Returns known DataStructureIO implementations.

is_registered(io)

Return IO implementation registration status.

load_integrations()

Discover, load and register entrypoint integrations.

register_io(io)

Register a new IO implementation.

resolve_io(arg, **kwargs)

Get an IO instance for arg.

Classes

DataStructureIO()

Insertion and extraction of IDs and translations.

class DataStructureIO

Bases: Generic[TranslatableT, NameType, SourceType, IdType]

Insertion and extraction of IDs and translations.

abstract extract(translatable, names)

Extract IDs from translatable.

Parameters:
  • translatable – Data to extract IDs from.

  • names – List of names in translatable to extract IDs for.

Returns:

A dict {name: ids}.

classmethod get_rank()

Return the rank of this implementation.

See dio.get_resolution_order() for details.

Returns:

Implementation rank.

Raises:

ValueError – If the implementation is not registered.

abstract classmethod handles_type(arg)

Return True if the implementation handles data for the type of arg.

abstract insert(translatable, names, tmap, copy)

Insert translations into translatable.

Parameters:
  • translatable – Data to translate. Modified iff copy=False.

  • names – Names in translatable to translate.

  • tmap – Translations for IDs in translatable.

  • copy – If True, return a translated copy and leave the original translatable unchanged. Otherwise, modify the original translatable in place.

Returns:

A copy of translatable if copy=True, None otherwise.

Raises:

NotInplaceTranslatableError – If copy=False for a type which is not translatable in-place.

classmethod is_registered()

Returns registration status for this implementation.

See dio.is_registered() for details.

names(translatable)

Extract names from translatable.

Parameters:

translatable – Data to extract names from.

Returns:

A list of names to translate. Returns None if names cannot be extracted.

classmethod register()

Register this implementation for all Translator instances.

See dio.register_io() for details.

ENTRYPOINT_GROUP = 'id_translation.dio'

Group used to discover DataStructureIO integrations.

See load_integrations() and importlib.metadata.entry_points() for details.

get_resolution_order(*, real=False)

Returns known DataStructureIO implementations.

Parameters:

real – If True, return the actual list instead of a copy.

Returns:

A list of IO implementations.

is_registered(io)

Return IO implementation registration status.

Implementations should register themselves using register_io() or DataStructureIO.register().

Parameters:

io – A DataStructureIO type.

load_integrations()

Discover, load and register entrypoint integrations.

Reset the registry, then load entrypoints in the 'id_translation.dio' entrypoint group (see importlib.metadata.entry_points() for details).

Will skip integrations that raise ImportError when loaded.

Raises:

TypeError – If an integration does not inherit from DataStructureIO.

register_io(io)

Register a new IO implementation.

Classes are polled through DataStructureIO.handles_type() in reverse insertion order (new implementations are polled first). Re-registering an implementation moves it to the first position in the search order.

Parameters:

io – A DataStructureIO type.

resolve_io(arg, **kwargs)

Get an IO instance for arg.

Parameters:
  • arg – An argument to get IO for.

  • **kwargs – Keyword arguments for the IO class.

Returns:

A data structure IO instance for arg.

Raises:

UntranslatableTypeError – If no suitable IO implementation could be found.

See also

The register_io() function.

Modules

exceptions

Data structure IO exceptions.

integration

Integration modules.