Translation IO
The id_translation.dio module defines how IDs are read from and written to various data structures.
User-defined integrations
The purpose of creating new integrations is typically to enable translation of a new data type.
To get started, inherit from DataStructureIO or copy an
existing integration. Don’t forget to
register the implementation, or the Translator won’t be able to find it.
Automatic integration discovery
You may add an entrypoint in the 'id_translation.dio' entrypoint group to
automatically register custom implementations (as opposed to calling DataStructureIO.register() manually). The
snippet below shows how the bundled integrations are registered using project entrypoints.
pyproject.toml in the rsundqvist/id-translation project:
[project.entry-points."id_translation.dio"]
# The name (e.g. 'pandas_io') is not important, but should be unique.
pandas_io = "id_translation.dio.integration.pandas:PandasIO"
dask_io = "id_translation.dio.integration.dask:DaskIO"
polars_io = "id_translation.dio.integration.polars:PolarsIO"
The loader will skip the integration if calling
EntryPoint.load() raises an ImportError,
or if the priority is negative.
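The skip rules above can be sketched in a few lines of Python. This is a minimal illustration, not the library's actual loader: `discover_integrations`, `FakeEntryPoint`, `EnabledIO` and `DisabledIO` are hypothetical names introduced here, and the sketch assumes that each loaded class exposes a numeric `priority` attribute.

```python
from importlib.metadata import entry_points

GROUP = "id_translation.dio"


def discover_integrations(candidates=None):
    """Load entry points, skipping those the loader is described as skipping.

    Hypothetical helper for illustration; not part of id_translation.
    """
    if candidates is None:
        # Keyword selection requires Python 3.10 or later.
        candidates = entry_points(group=GROUP)

    loaded = []
    for entry_point in candidates:
        try:
            implementation = entry_point.load()
        except ImportError:
            continue  # e.g. an optional dependency is not installed
        if implementation.priority < 0:
            continue  # a negative priority disables the integration
        loaded.append(implementation)
    return loaded


class FakeEntryPoint:
    """Stand-in for importlib.metadata.EntryPoint, used only for this demo."""

    def __init__(self, target):
        self._target = target

    def load(self):
        if isinstance(self._target, Exception):
            raise self._target
        return self._target


class EnabledIO:
    priority = 10_000


class DisabledIO:
    priority = -1


loaded = discover_integrations(
    [
        FakeEntryPoint(EnabledIO),
        FakeEntryPoint(DisabledIO),
        FakeEntryPoint(ImportError("optional dependency missing")),
    ]
)
print([cls.__name__ for cls in loaded])  # ['EnabledIO']
```

Passing the candidates in explicitly keeps the sketch testable without installing any package that declares entry points in this group.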
Selection process
The Translator will call resolve_io() once per task. Implementations are considered in descending order of their
priority attribute, and the first one whose DataStructureIO.handles_type() method returns True is used.
Bundled implementations have priorities in the 1000 - 1999 range (inclusive); see the table below.
| Rank | Weight | Class | Comment |
|---|---|---|---|
| 1 | 1999 | | Optional IO implementation for |
| 2 | 1990 | | Optional IO implementation for |
| 3 | 1980 | | Optional IO implementation for |
| 4 | 1500 | | IO implementation for |
| 5 | 1100 | | IO implementation for |
| 6 | 1010 | | IO implementation for |
| 7 | 1000 | | IO implementation for |
New implementations default to priority=10_000 and are therefore considered before all bundled implementations.
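The selection process can be illustrated with a small, self-contained sketch. It does not use the real library: `SketchIO` is a hypothetical stand-in for DataStructureIO, `resolve_io_sketch` is a hypothetical stand-in for resolve_io(), and the three implementation classes and their priorities are made up for the example. Only the default of 10_000 and the highest-priority-first ordering are taken from the text above.

```python
class SketchIO:
    """Hypothetical stand-in for DataStructureIO; not the library's class."""

    priority = 10_000  # default for new implementations, per the text above

    @classmethod
    def handles_type(cls, arg):
        raise NotImplementedError


class GenericListIO(SketchIO):
    priority = 1000  # made-up value inside the bundled 1000-1999 range

    @classmethod
    def handles_type(cls, arg):
        return isinstance(arg, list)


class TupleIO(SketchIO):
    priority = 1500  # made-up value inside the bundled 1000-1999 range

    @classmethod
    def handles_type(cls, arg):
        return isinstance(arg, tuple)


class CustomListIO(SketchIO):
    # No explicit priority: inherits the default of 10_000.

    @classmethod
    def handles_type(cls, arg):
        return isinstance(arg, list)


def resolve_io_sketch(arg, implementations):
    """Return the highest-priority implementation that accepts `arg`."""
    ordered = sorted(implementations, key=lambda io: io.priority, reverse=True)
    for io in ordered:
        if io.handles_type(arg):
            return io
    raise TypeError(f"No implementation handles {type(arg).__name__!r}")


registry = [GenericListIO, TupleIO, CustomListIO]
print(resolve_io_sketch([1, 2, 3], registry).__name__)  # CustomListIO
print(resolve_io_sketch((1, 2, 3), registry).__name__)  # TupleIO
```

Both list-handling classes accept the first argument, but the custom one wins because its inherited default priority is higher than anything in the bundled range.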