Logging#

All messages are emitted by id_translation namespace loggers. The id_translation namespace uses the logging.WARNING log level (similar to SQLAlchemy). To enable logs at the ℹ️ INFO level and below, you must explicitly configure the log level.

Verbose logging#

Set the global ENABLE_VERBOSE_LOGGING flag (or use enable_verbose_debug_messages()) to enable additional 🪲 DEBUG-level messages. Use the ID_TRANSLATION_VERBOSE variable in containers.

Warning

Verbose logging may emit hundreds of messages for a single translation task!

Note that verbose and 🪲 DEBUG logging are different things; verbose logging can emit hundreds of messages in cases where regular 🪲 DEBUG logging would only emit a dozen. Verbose messages are typically related to the mapping process.

Example#

The ℹ️ INFO-level messages emitted for a single Translator.translate() class.

[id_translation.fetching]: Finished initialization of 'MemoryFetcher' in 85 μs:
    MemoryFetcher(sources=['category'], return_all=True)

[id_translation.fetching]: Finished initialization of 'SqlFetcher' in 106 ms:
    SqlFetcher('postgresql+pg8000://postgres:***@localhost:5002/sakila',
    sources=['address', 'film', 'language', 'customer', 'country', 'staff',
    'actor', 'city'])

[id_translation.Translator.map]: Finished mapping of 4/6 names in 'DataFrame' in
    2 ms: {'customer_id': 'customer', 'film_id': 'film', 'category_id':
    'category', 'staff_id': 'staff', 'rental_date': None, 'return_date': None}.

[id_translation.fetching]: Finished fetching from 1 sources in 1 ms: ['category'
    x ('id', 'name') x 16/7 IDs].

[id_translation.fetching]: Finished fetching from 3 sources in 13 ms:
    ['customer' x ('id', 'name', 'last_name') x 10/10 IDs], ['film' x ('id',
    'name') x 10/10 IDs], ['staff' x ('id', 'name', 'last_name') x 2/2 IDs].

[id_translation.Translator]: Finished translation of 29 unique IDs (4 names) in
    'DataFrame' in 18 ms.

Since these are 🔑 key event messages, there are corresponding entry events (the messages above all all 'exit'-records). The 'enter'-records, however, are only emitted on the 🪲 DEBUG level.

Key Event Records#

Key event messages are emitted at the boundaries of the various stages in the translation process (see the Translation primer and Mapping primer pages). Common fields are listed below.

Key Event Record Structure#

Field

Type

Description

task_id

int

Unique task identifier, e.g. for a single translate() call.

event_key

str = class.method:stage

E.g. MultiFetcher.fetch_all:enter (where stage='enter').

seconds

float

Task duration in seconds. Only when stage='exit'.

All key event methods add additional fields that are relevant to the current task and stage. Fields may be added, removed, or change values depending on the stage of the event.

Event-specific fields#

Let’s take closer look at the final message. The LogRecord contains additional information that isn’t included in the message itself. The full Translator.translate:exit-record is shown as JSON below.

Translation exit event. Click here to download.#
10322  {
10323    "name": "id_translation.Translator",
10324    "msg": "Finished translation of 29 unique IDs (4 names) in 'DataFrame' in 18 ms.",
10325    "args": [],
10326    "levelname": "INFO",
10327    "levelno": 20,
10328    "pathname": "<site-packages>/id_translation/_tasks/_translate.py",
10329    "filename": "_translate.py",
10330    "module": "_translate",
10331    "exc_info": null,
10332    "exc_text": null,
10333    "stack_info": null,
10334    "lineno": 199,
10335    "funcName": "log_key_event_exit",
10336    "created": 1557606600.321977,
10337    "msecs": 920.0,
10338    "relativeCreated": 868.658388,
10339    "thread": 123299629801600,
10340    "threadName": "MainThread",
10341    "processName": "MainProcess",
10342    "process": 7743,
10343    "taskName": null,
10344    "task_id": 15214,
10345    "event_key": "Translator.translate:exit",
10346    "seconds": 0.017930595999985144,
10347    "durations_ms": {
10348      "map": 1.92,
10349      "extract": 0.134,
10350      "fetch": 14.205,
10351      "verify": 0.254,
10352      "insert": 1.177,
10353      "overhead": 0.24
10354    },
10355    "num_ids": {
10356      "customer": 10,
10357      "film": 10,
10358      "category": 7,
10359      "staff": 2
10360    },
10361    "online": true,
10362    "translatable_type": "pandas.DataFrame",
10363    "io_type": "id_translation.dio.integration.pandas.PandasIO",
10364    "copy": true,
10365    "reverse": false,
10366    "name_to_source": {
10367      "staff_id": "staff",
10368      "customer_id": "customer",
10369      "category_id": "category",
10370      "film_id": "film"
10371    },
10372    "message": "Finished translation of 29 unique IDs (4 names) in 'DataFrame' in 18 ms.",
10373    "asctime": "2019-05-11T20:30:00"
10374  }

About 350 messages were emitted since ENABLE_VERBOSE_LOGGING=True. If we were using regular 🪲 DEBUG-logging, about 40 messages would have been emitted instead. The vast majority of the verbose messages relate to mapping and fetcher initialization.

Dummy version of the code that produced the the records.#
from logging import basicConfig, DEBUG
from id_translation import Translator
from id_translation.logging import enable_verbose_debug_messages

basicConfig(handlers=[SomeJsonExporter()])

translator = Translator.from_config(main_config, [fetcher_configs..])
with enable_verbose_debug_messages():
    translator.translate(df)

The configs that were used are available here.