Translation IO#
The id_translation.dio module defines how IDs are read from and written to various data structures.
Runtime arguments#
Relevant methods (e.g. Translator.translate()) accept an io_kwargs argument, which may be used to customize
the behavior of the DataStructureIO implementation. Exceptions raised due to invalid io_kwargs arguments are
logged and suppressed.
Arguments are implementation-specific. See PandasIO for an example.
User-defined integrations#
The purpose of creating new integrations is typically to enable translation of a new data type.
To get started, inherit from DataStructureIO or copy an existing integration. Don’t forget to register the
implementation, or the Translator won’t be able to find it.
Integrations may take initialization arguments (see Runtime arguments), but should not require them.
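The shape of a user-defined integration can be sketched as follows. This is a minimal, self-contained illustration and does not import id_translation: DataStructureIOSketch is a hypothetical stand-in for DataStructureIO, and FrozenSetIO, the registry list and the sort argument are invented for the example. Only the names handles_type, register and priority come from this page; their signatures here are assumptions, and the real base class may require additional methods.

```python
from typing import Any, ClassVar


class DataStructureIOSketch:
    """Hypothetical stand-in for DataStructureIO (NOT the real base class)."""

    # Default priority for new implementations, as stated in this document.
    priority: ClassVar[int] = 10_000
    # Assumption: registered implementations are tracked in a simple list.
    registry: ClassVar[list] = []

    @classmethod
    def register(cls) -> None:
        # Assumed behaviour: make the implementation discoverable, once.
        if cls not in DataStructureIOSketch.registry:
            DataStructureIOSketch.registry.append(cls)

    @classmethod
    def handles_type(cls, arg: Any) -> bool:
        raise NotImplementedError


class FrozenSetIO(DataStructureIOSketch):
    """Hypothetical integration for a data type assumed to lack support."""

    def __init__(self, *, sort: bool = False) -> None:
        # Initialization arguments are allowed but must not be required,
        # so every argument has a default.
        self.sort = sort

    @classmethod
    def handles_type(cls, arg: Any) -> bool:
        return isinstance(arg, frozenset)


# Without this call, the sketch registry would never learn about FrozenSetIO.
FrozenSetIO.register()
```

Giving every initialization argument a default keeps the class usable when no io_kwargs are passed.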
Automatic integration discovery#
You may add an entrypoint in the 'id_translation.dio' entrypoint group to
automatically register custom implementations (as opposed to calling DataStructureIO.register() manually). The
snippet below shows how the bundled integrations are registered using project entrypoints.
pyproject.toml in the rsundqvist/id-translation project:
[project.entry-points."id_translation.dio"]
# The name (e.g. 'pandas_io') is not important, but should be unique.
pandas_io = "id_translation.dio.integration.pandas:PandasIO"
dask_io = "id_translation.dio.integration.dask:DaskIO"
polars_io = "id_translation.dio.integration.polars:PolarsIO"
The loader will skip the integration if calling EntryPoint.load() raises an ImportError, or if the priority is
negative.
Selection process#
The Translator will call resolve_io() once per task. The first implementation whose
DataStructureIO.handles_type() method returns True will be used. Implementations are considered in order of
their priority attribute, highest first.
Bundled implementations have priorities in the 1000 - 1999 range (inclusive); see the table below.
| Rank | Weight | Class | Comment |
|---|---|---|---|
| 1 | 1999 | | Optional IO implementation for |
| 2 | 1990 | | Optional IO implementation for |
| 3 | 1980 | | Optional IO implementation for |
| 4 | 1900 | | Optional IO implementation for |
| 5 | 1500 | | IO implementation for |
| 6 | 1100 | | IO implementation for |
| 7 | 1010 | | IO implementation for |
| 8 | 1000 | | IO implementation for |
New implementations default to priority=10_000, and are therefore considered first.
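The selection process can be sketched as a priority-ordered scan. The code below is a self-contained illustration under stated assumptions, not the library's resolve_io(): SketchIO, BundledLikeIO, CustomListIO and resolve_io_sketch are hypothetical names, and raising TypeError when nothing matches is an assumption about the failure mode.

```python
from typing import Any, Iterable


class SketchIO:
    """Hypothetical stand-in base class; new implementations default to 10_000."""

    priority = 10_000

    @classmethod
    def handles_type(cls, arg: Any) -> bool:
        raise NotImplementedError


class BundledLikeIO(SketchIO):
    """Stand-in for a bundled implementation (priority within 1000 - 1999)."""

    priority = 1500

    @classmethod
    def handles_type(cls, arg: Any) -> bool:
        return isinstance(arg, (list, tuple))


class CustomListIO(SketchIO):
    """User-defined implementation; inherits the default priority of 10_000."""

    @classmethod
    def handles_type(cls, arg: Any) -> bool:
        return isinstance(arg, list)


def resolve_io_sketch(arg: Any, candidates: Iterable[type]) -> type:
    """Return the highest-priority candidate whose handles_type() returns True."""
    for io in sorted(candidates, key=lambda c: c.priority, reverse=True):
        if io.handles_type(arg):
            return io
    # Assumption: an unsupported type is reported with a TypeError.
    raise TypeError(f"no implementation handles {type(arg).__name__}")
```

With both classes as candidates, a list resolves to CustomListIO because it is checked first, while a tuple falls through to BundledLikeIO.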