id_translation.offline#

Offline (in-memory) translation classes.

Classes

Format(fmt)

Format specification for translation strings.

FormatApplier(translations, *[, transformer])

Application of Format specifications.

MagicDict(real_translations[, ...])

Dictionary type for translated IDs.

TranslationMap(source_translations, *[, ...])

Storage class for fetched translations.

class Format(fmt)[source]#

Bases: object

Format specification for translation strings.

Translation formats are similar to regular f-strings, with two important exceptions:

  1. Positional placeholders ('{}') may not be used; correct form is '{placeholder-name}'.

  2. Placeholders surrounded by '[]' denote an optional element. Optional elements are rendered…

    • Only if all of their placeholders are defined.

    • Without delimiting brackets.

    • As literal text (with brackets) if there is no placeholder in the block.

Hint

Double the wanted bracket character to render as a literal, analogous to '{{' and '}}' in plain Python f-strings. See the example below for a demonstration.
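
The rendering rules above can be approximated in plain Python. This is a minimal sketch only: render() is a hypothetical helper written for illustration, not the parser used by id_translation, and it assumes optional elements are not nested.

```python
import re
from string import Formatter

# Sentinels standing in for doubled (literal) brackets while parsing.
_LEFT, _RIGHT = "\x00", "\x01"


def render(fmt, **values):
    """Hypothetical helper; not the parser used by id_translation."""
    text = fmt.replace("[[", _LEFT).replace("]]", _RIGHT)

    def expand(match):
        block = match.group(1)
        names = [name for _, name, _, _ in Formatter().parse(block) if name]
        if not names:
            return match.group(0)  # No placeholder: keep literal text, brackets included.
        if all(name in values for name in names):
            return block  # All placeholders defined: render without brackets.
        return ""  # A placeholder is missing: drop the whole element.

    text = re.sub(r"\[([^\[\]]*)\]", expand, text)
    text = text.replace(_LEFT, "[").replace(_RIGHT, "]")
    return text.format(**values)


print(render("{id}:[[{name}]][, nice={is_nice}]", id=0, name="Tarzan"))
print(render("{id}:[[{name}]][, nice={is_nice}]", id=1, name="Morris", is_nice=True))
```

The first call drops the optional element because is_nice is undefined; the second renders it without the delimiting brackets.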

Parameters:

fmt – A translation fstring.

Examples

Basic usage

Formats are created by passing a single str argument, as described above.

>>> Format(Format.DEFAULT)
Format('{id}:{name}')
>>> Format(Format.DEFAULT).fstring().format(id=1, name="First")
'1:First'

Using Format.fstring() and str.format() is flexible but verbose. Formats can be applied either through Format.format()…

>>> fmt = Format(Format.DEFAULT_FAILED)
>>> fmt.format(id=1, name="First")
'<Failed: id=1>'

…or just Format.__call__().

>>> fmt(id=1, name="First")
'<Failed: id=1>'

Using either convenience method will use as many placeholders as possible.

>>> fmt = Format(Format.DEFAULT)
>>> fmt.placeholders
('id', 'name')
>>> fmt(id=1, name="First")
'1:First'
>>> fmt(id=1, name="First", unknown=20.19)
'1:First'

Unknown placeholders are simply ignored.
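
To illustrate how placeholder names can be discovered and unknown keys discarded, here is a small standard-library sketch. The helpers placeholder_names() and format_known() are hypothetical names used for illustration; the library may work differently internally.

```python
from string import Formatter


def placeholder_names(fstring):
    """Return named placeholders in order of appearance (hypothetical helper)."""
    return tuple(name for _, name, _, _ in Formatter().parse(fstring) if name)


def format_known(fstring, **values):
    """Format using only the keys that the format string actually names."""
    known = {name: values[name] for name in placeholder_names(fstring)}
    return fstring.format(**known)


print(placeholder_names("{id}:{name}"))
print(format_known("{id}:{name}", id=1, name="First", unknown=20.19))
```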

Optional placeholders

A format string using literal square brackets and an optional element.

>>> from id_translation.offline import Format
>>> fmt = Format("{id}:[[{name}]][, nice={is_nice}]")

When used directly, the Format class only includes required placeholders by default…

>>> fmt.fstring()
'{id}:[{name}]'
>>> fmt(id=0, name="Tarzan")
'0:[Tarzan]'

…but the placeholders attribute can be used to retrieve all placeholders, required and optional:

>>> fmt.placeholders
('id', 'name', 'is_nice')
>>> fmt(id=1, name="Morris", is_nice=True)
'1:[Morris], nice=True'

The Translator will automatically add optional placeholders, if they are present in the source.

Note

Python format specifications and conversions are preserved.

This is especially useful for long values such as UUIDs.

>>> from uuid import UUID
>>> uuid = UUID("550e8400-e29b-41d4-a716-446655440000")

Convert to string and truncate to eight characters.

>>> Format("{id!s:.8}:{name!r}").format(id=uuid, name="Sofia")
"550e8400:'Sofia'"

See the official Format Specification Mini-Language documentation for details.
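
Since conversions and format specifications are plain Python features, their effect can be checked with str.format() alone. The snippet below uses only the standard library; the variable names are illustrative.

```python
from uuid import UUID

uuid = UUID("550e8400-e29b-41d4-a716-446655440000")

# "!s" converts the value with str() first; ".8" then truncates to eight characters.
short = "{id!s:.8}:{name!r}".format(id=uuid, name="Sofia")

# "05d" zero-pads an integer to width five; ">8" right-aligns in a field of width eight.
padded = "{id:05d}|{name:>8}".format(id=42, name="Sofia")

print(short)
print(padded)
```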

DEFAULT = '{id}:{name}'#

Default translation format.

DEFAULT_FAILED = '<Failed: id={id!r}>'#

Default format for missing IDs.

format(**placeholders)[source]#

Apply the format.

Parameters:

**placeholders – Placeholder values to use in the final string.

Returns:

A string formatted using placeholders.

fstring(placeholders=None, *, positional=False)[source]#

Create a format string for the given placeholders.

Parameters:
  • placeholders – Keys to keep. Passing None is equivalent to passing required_placeholders.

  • positional – If True, remove names to return a positional fstring.

Returns:

An fstring with optional elements removed unless included in placeholders.

Raises:

KeyError – If required placeholders are missing.
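
As a rough idea of what removing names to obtain a positional fstring involves, the sketch below rebuilds a format string with the field names stripped. to_positional() is a hypothetical helper, not the library's implementation, and it does not handle optional elements.

```python
from string import Formatter


def to_positional(fstring):
    """Strip field names while keeping conversions and format specs (sketch)."""
    parts = []
    for literal, name, spec, conversion in Formatter().parse(fstring):
        # parse() unescapes doubled braces in literal text, so escape them again.
        parts.append(literal.replace("{", "{{").replace("}", "}}"))
        if name is not None:
            field = "{"
            if conversion:
                field += "!" + conversion
            if spec:
                field += ":" + spec
            parts.append(field + "}")
    return "".join(parts)


print(to_positional("{id}:{name}"))
print(to_positional("{id!s:.8}:{name!r}"))
```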

property optional_placeholders#

All optional placeholders in the order in which they appear.

static parse(fmt)[source]#

Parse a format.

Parameters:

fmt – Input to parse.

Returns:

A Format instance.

partial(defaults)[source]#

Get a partially formatted fstring().

Parameters:

defaults – Keys which should be replaced with real values. Keys which are not part of defaults will be left as-is.

Returns:

A partially formatted fstring.
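
Partial formatting can be imitated with str.format_map() and a dict subclass that echoes missing keys back as placeholders. This is a sketch of the general idea only: partial_format() is a hypothetical helper, and it only behaves well for placeholders without conversions or format specifications.

```python
class _KeepMissing(dict):
    """Dict that renders unknown keys back as '{key}' (illustrative only)."""

    def __missing__(self, key):
        return "{" + key + "}"


def partial_format(fstring, defaults):
    # Keys in defaults are substituted; all other placeholders are left as-is.
    return fstring.format_map(_KeepMissing(defaults))


partially = partial_format("{id}:{name}", {"name": "First"})
print(partially)
print(partially.format(id=1))
```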

property placeholders#

All placeholders in the order in which they appear.

property required_placeholders#

All required placeholders in the order in which they appear.

class FormatApplier(translations, *, transformer=None)[source]#

Bases: Generic[NameType, SourceType, IdType]

Application of Format specifications.

Parameters:
  • translations – Matrix of ID translation components returned by fetchers.

  • transformer – Initialized Transformer instance.

Raises:

ValueError – If default is given and any placeholder names are missing.

property placeholders#

Return placeholder names in sorted order.

property records#

Records used by this instance.

property source#

Return translation source.

to_dict()[source]#

Get the underlying data used for translations as a dict.

Returns:

A dict {placeholder: [values...]}.

to_pandas()[source]#

Get the underlying data used for translations as a pandas.DataFrame.

property transformer#

Get the Transformer instance (or None) used by this FormatApplier.

class MagicDict(real_translations, default_value='<Failed: id={!r}>', enable_uuid_heuristics=True, transformer=None)[source]#

Bases: MutableMapping[IdType, str]

Dictionary type for translated IDs.

A dict-like mapping which returns “real” values if present in a backing dict. Values for unknown keys are generated using the default_value.

Parameters:
  • real_translations – A dict holding real translations.

  • default_value – A string with exactly one or zero placeholders.

  • enable_uuid_heuristics – Enabling may improve matching when UUID-like IDs are in use.

  • transformer – Initialized Transformer instance.

Examples

Similarities with the built-in dict

>>> magic = MagicDict({1999: "Sofia", 1991: "Richard"})

Iteration, equality, and length are based on the real values.

>>> magic
{1999: 'Sofia', 1991: 'Richard'}
>>> len(magic)
2
>>> list(magic)
[1999, 1991]
>>> magic.real == magic
True
>>> magic == {1999: "Sofia"}  # Element missing
False

As you’d expect, casting to a regular dict removes all special handling.

Differences from the built-in dict

Methods __getitem__ and __contains__ never fail or return False. Using a default with get will generate a value rather than using the provided default.

>>> magic[1999]
'Sofia'
>>> magic[2019]
'<Failed: id=2019>'
>>> magic.get(2019, "foo")
'<Failed: id=2019>'
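
The fallback behaviour shown above can be reproduced with a small MutableMapping subclass. FallbackDict is a hypothetical stand-in written for illustration; it omits the UUID heuristics and Transformer support of the real MagicDict.

```python
from collections.abc import MutableMapping


class FallbackDict(MutableMapping):
    """Mapping that generates a value for unknown keys (illustrative sketch)."""

    def __init__(self, real, default_value="<Failed: id={!r}>"):
        self._real = dict(real)
        self._default_value = default_value

    def __getitem__(self, key):
        if key in self._real:
            return self._real[key]
        return self._default_value.format(key)  # Generated, never a KeyError.

    def __contains__(self, key):
        return True  # Every key can be "translated".

    def get(self, key, default=None):
        return self[key]  # The provided default is ignored on purpose.

    def __setitem__(self, key, value):
        self._real[key] = value

    def __delitem__(self, key):
        del self._real[key]

    def __iter__(self):
        return iter(self._real)  # Iteration and length use real values only.

    def __len__(self):
        return len(self._real)


demo = FallbackDict({1999: "Sofia", 1991: "Richard"})
print(demo[1999], demo[2019], demo.get(2019, "foo"))
```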

ID translation heuristics

Special handling for uuid.UUID and UUID-like strings improves matching.

>>> string_uuid = "550e8400-e29b-41d4-a716-446655440000"
>>> magic = MagicDict(
...     {string_uuid: "Found!"},
...     enable_uuid_heuristics=True,
... )
>>> magic
{UUID('550e8400-e29b-41d4-a716-446655440000'): 'Found!'}

When enable_uuid_heuristics=True is set, the MagicDict will attempt to cast “promising” keys to uuid.UUID.

>>> from uuid import UUID
>>> magic[string_uuid], magic[UUID(string_uuid)]
('Found!', 'Found!')

Keys that cannot be converted are left as-is.

>>> magic["Hello"] = "World!"
>>> magic["unknown"], magic["Hello"]
("<Failed: id='unknown'>", 'World!')

To further customize ID matching behaviour, refer to the Transformer interface.
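
The UUID heuristic described above amounts to a best-effort cast. A minimal sketch, assuming that any string accepted by uuid.UUID counts as "promising"; try_cast_uuid() is a hypothetical helper and the library's actual criteria may be stricter:

```python
from uuid import UUID


def try_cast_uuid(key):
    """Return key as a UUID when possible, otherwise return it unchanged."""
    if isinstance(key, UUID):
        return key
    if isinstance(key, str):
        try:
            return UUID(key)
        except ValueError:
            return key  # Not UUID-like: leave as-is.
    return key


string_uuid = "550e8400-e29b-41d4-a716-446655440000"
print(try_cast_uuid(string_uuid) == UUID(string_uuid))
print(try_cast_uuid("Hello"))
```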

LOGGER = <Logger id_translation.offline.MagicDict (WARNING)>#

property default_value#

Return the default string value to return for unknown keys, if any.

get(_MagicDict__key, /, _=None)[source]#

Same as __getitem__.

Values for missing keys are generated from default_value.

property real#

Returns the backing dict.

class TranslationMap(source_translations, *, fmt='{id}:{name}', default_fmt='<Failed: id={id!r}>', name_to_source=None, default_fmt_placeholders=None, enable_uuid_heuristics=True, transformers=None)[source]#

Bases: Generic[NameType, SourceType, IdType], HasSources[SourceType], Mapping[Union[NameType, SourceType], MagicDict[IdType]]

Storage class for fetched translations.

Parameters:
  • source_translations – Fetched translations {source: PlaceholderTranslations}.

  • name_to_source – Mappings {name: source}, but may be overridden by the user.

  • fmt – A translation format. Must be given to use as a mapping.

  • default_fmt – Alternative format specification to use instead of fmt for fallback translation.

  • default_fmt_placeholders – Per-source default placeholder values.

  • enable_uuid_heuristics – Enabling may improve matching when UUID-like IDs are in use.

  • transformers – A dict {source: transformer} of initialized Transformer instances.

apply(name_or_source, fmt=None, *, default_fmt=None)[source]#

Create translations for a given name or source.

Parameters:
  • name_or_source – A name or source to translate.

  • fmt – Format to use. If None, fall back to init format.

  • default_fmt – Alternative format for default translation. Resolution order: this argument, the init default_fmt argument, the fmt argument, then the init fmt argument.

Returns:

Translations for name as a dict {id: translation}.

Raises:
  • ValueError – If fmt=None and initialized without fmt.

  • KeyError – If trying to translate name which is not known.

Notes

This method is called by __getitem__.

copy()[source]#

Make a copy of this TranslationMap.

property default_fmt#

Return the format specification to use instead of fmt for fallback translation.

property default_fmt_placeholders#

Return the default translations used for default_fmt placeholders.

property enable_uuid_heuristics#

Return automatic UUID mitigation status.

property fmt#

Return the translation format.

classmethod from_pandas(frames, fmt='{id}:{name}', *, default_fmt='<Failed: id={id!r}>')[source]#

Create a new instance from a pandas.DataFrame dict.

Parameters:
  • frames – A dict {source: DataFrame}.

  • fmt – A translation format. Must be given to use as a mapping.

  • default_fmt – Alternative format specification to use instead of fmt for fallback translation.

Returns:

A new TranslationMap.

property name_to_source#

Return name-to-source mapping.

property names#

Return names that can be translated.

property placeholders#

Placeholders for all known Source names, such as id or name.

These are the (possibly unmapped) placeholders that may be used for translation.

Returns:

A dict {source: [placeholders...]}.

property reverse_mode#

Return reversed mode status flag.

If set, the mappings returned by apply() (and therefore also __getitem__) are reversed.

Returns:

Reversal status flag.
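
Assuming that "reversed" means the returned mapping goes from translation to ID rather than from ID to translation, the effect can be pictured with plain dicts. reverse_mapping() is a hypothetical helper, and raising on duplicate translations is a choice made for this sketch rather than documented library behaviour.

```python
def reverse_mapping(translations):
    """Turn {id: translation} into {translation: id} (illustrative sketch)."""
    reversed_map = {}
    for id_, translation in translations.items():
        if translation in reversed_map:
            # Two IDs share a translation, so the reverse lookup is ambiguous.
            raise ValueError(f"ambiguous translation: {translation!r}")
        reversed_map[translation] = id_
    return reversed_map


forward = {1999: "1999:Sofia", 1991: "1991:Richard"}
print(reverse_mapping(forward))
```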

property sources#

A list of known Source names, such as cities or languages.

to_dicts()[source]#

Get the underlying data used for translations as dicts.

This is equivalent to using to_pandas(), then calling DataFrame.to_dict(orient='list') on each frame.

Returns:

A dict {source: {placeholder: [values...]}}.
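
The {placeholder: [values...]} layout is column-oriented. The sketch below builds it from row-like records without pandas; records_to_lists() and the sample data are hypothetical, and the assumption that records are stored as rows of placeholder values is made for illustration only.

```python
def records_to_lists(placeholders, records):
    """Convert row-wise records into {placeholder: [values...]} (sketch)."""
    return {
        placeholder: [record[index] for record in records]
        for index, placeholder in enumerate(placeholders)
    }


people = records_to_lists(("id", "name"), [(1999, "Sofia"), (1991, "Richard")])
as_dicts = {"people": people}  # Same shape as {source: {placeholder: [values...]}}.
print(as_dicts)
```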

to_pandas()[source]#

Get the underlying data used for translations as pandas.DataFrame.

Returns:

A dict {source: DataFrame}.

to_translations(fmt=None)[source]#

Create translations for all sources.

Returned values are of type MagicDict. To convert to regular built-in dicts, run

translations = translation_map.to_translations()
as_regular_dicts = {
   source: dict(magic)
   for source, magic in translations.items()
}

on the returned dict-of-magic-dicts.

Parameters:

fmt – Format to use. If None, fall back to init format.

Returns:

A dict of translations {source: MagicDict}.

property transformers#

Get a dict {source: transformer} of Transformer instances used by this TranslationMap.

Modules

id_translation.offline.parse_format_string

Utility module for parsing raw Format input strings.

id_translation.offline.types

Types used for offline translation.