id_translation.dio.integration.dask
Integration for Dask types.
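Module Attributes
DaskT | Supported dask types.
PartitionT | Supported dask partition types.
PartitionIO | A dask partition IO implementation.
Functions
| Translate a single Dask partition.
Classes
DaskIO | Optional IO implementation for dask types.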
- class DaskIO(*, missing_as_nan=None, as_category=False)
Bases: DataStructureIO[DaskT, str, SourceType, IdType]
Optional IO implementation for dask types.
- Parameters:
Notes
Combining missing_as_nan=False with as_category=True can be unpredictable in distributed contexts.
- classmethod extract(translatable, names)
Extract IDs from translatable.
- Parameters:
translatable – Data to extract IDs from.
names – List of names in translatable to extract IDs for.
- Returns:
A dict {name: ids}.
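The {name: ids} return shape can be illustrated with a small sketch. This is not the library's implementation: extract_ids is a hypothetical helper, and a plain dict of lists stands in for a dask collection so the example runs without dask installed.

```python
from typing import Dict, List, Sequence


def extract_ids(
    translatable: Dict[str, Sequence[int]], names: List[str]
) -> Dict[str, List[int]]:
    """Toy stand-in for an extract-style method: return {name: ids}."""
    # Only the requested names are read; other columns are ignored.
    return {name: list(translatable[name]) for name in names}


# A plain dict of columns stands in for the real translatable (assumption).
data = {"customer_id": [1, 2, 2], "product_id": [10, 11, 10], "amount": [5, 7, 9]}
result = extract_ids(data, ["customer_id", "product_id"])
```

Here result has one entry per requested name, and the untranslated "amount" column is left out.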
- classmethod handles_type(arg)
Return True if the implementation handles data for the type of arg.
- insert(translatable, names, tmap, copy)
Insert translations into translatable.
- Parameters:
translatable – Data to translate. Modified iff copy=False.
names – Names in translatable to translate.
tmap – Translations for IDs in translatable.
copy – If False, modify contents of the original translatable. Otherwise, return a copy.
- Returns:
A copy of translatable if copy=True, None otherwise.
- Raises:
NotInplaceTranslatableError – If copy=False for a type which is not translatable in-place.
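The copy contract (a copy is returned when copy=True; the original is modified and None is returned when copy=False) can be sketched as follows. Everything here is illustrative: insert_translations is a hypothetical helper, the exception class is a local stand-in rather than the library's own, and plain mappings replace dask collections.

```python
from collections.abc import MutableMapping
from types import MappingProxyType


class NotInplaceTranslatableError(TypeError):
    """Local stand-in; not the library's actual exception class."""


def insert_translations(translatable, names, tmap, copy):
    """Toy insert: replace IDs with translations, honouring copy."""
    if not copy and not isinstance(translatable, MutableMapping):
        raise NotInplaceTranslatableError(type(translatable).__name__)
    # Work on a fresh dict of fresh lists when copy=True, else on the original.
    target = {k: list(v) for k, v in translatable.items()} if copy else translatable
    for name in names:
        # IDs without a translation are left unchanged in this sketch.
        target[name] = [tmap[name].get(i, i) for i in target[name]]
    return target if copy else None


tmap = {"customer_id": {1: "1:Alice", 2: "2:Bob"}}
data = {"customer_id": [1, 2], "amount": [5, 7]}

# copy=True: data is untouched, the translated copy is returned.
translated = insert_translations(data, ["customer_id"], tmap, copy=True)
data_after_copy = {k: list(v) for k, v in data.items()}

# copy=False: data is modified in place and None is returned.
in_place_result = insert_translations(data, ["customer_id"], tmap, copy=False)

# A read-only mapping cannot be translated in place.
frozen = MappingProxyType({"customer_id": [1, 2]})
try:
    insert_translations(frozen, ["customer_id"], tmap, copy=False)
    raised = False
except NotInplaceTranslatableError:
    raised = True
```

The read-only mapping plays the role of "a type which is not translatable in-place"; which real types fall into that group is not stated in this section.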
- classmethod names(translatable)
Extract names from translatable.
- Parameters:
translatable – Data to extract names from.
- Returns:
A list of names to translate. Returns None if names cannot be extracted.
- property partition_io
The PartitionIO implementation used by this instance.
- priority = 1980
Determines order in which IOs are considered (higher = earlier).
Set priority < 0 to disable.
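One way to read the priority rule (higher is considered earlier, negative disables) is sketched below. The resolver and the three IO classes are hypothetical and only mimic the handles_type and priority attributes described above; the library's actual resolution logic may differ.

```python
class ListIO:
    priority = 10

    @classmethod
    def handles_type(cls, arg):
        return isinstance(arg, list)


class SequenceIO:
    priority = 1980

    @classmethod
    def handles_type(cls, arg):
        return isinstance(arg, (list, tuple))


class DisabledIO:
    priority = -1  # Negative priority: never considered.

    @classmethod
    def handles_type(cls, arg):
        return True


def resolve_io(arg, candidates):
    """Return the enabled candidate with the highest priority that handles arg."""
    enabled = [io for io in candidates if io.priority >= 0]
    for io in sorted(enabled, key=lambda io: io.priority, reverse=True):
        if io.handles_type(arg):
            return io
    raise TypeError(f"No IO implementation handles {type(arg).__name__!r}.")


candidates = [ListIO, SequenceIO, DisabledIO]
chosen_for_list = resolve_io([1, 2], candidates)
chosen_for_tuple = resolve_io((1, 2), candidates)
```

Both implementations can handle a list, so the one with the higher priority is chosen; DisabledIO is skipped even though its handles_type accepts everything.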
- class DaskT
Supported dask types.
alias of TypeVar('DaskT', ~dask.dataframe.dask_expr._collection.DataFrame, ~dask.dataframe.dask_expr._collection.Series)
- PartitionIO
A dask partition IO implementation.
alias of PandasIO[PartitionT, str, SourceType, IdType]
- class PartitionT
Supported dask partition types.
alias of TypeVar('PartitionT', ~pandas.DataFrame, ~pandas.Series)