Logging#
All messages are emitted by id_translation namespace loggers. The id_translation namespace uses the
logging.WARNING log level (similar to
SQLAlchemy).
To enable logs at the ℹ️ INFO level and below, you must explicitly configure the log level.
Verbose logging#
Set the global ENABLE_VERBOSE_LOGGING flag (or use enable_verbose_debug_messages()) to
enable additional 🪲 DEBUG-level messages. Use the ID_TRANSLATION_VERBOSE variable in containers.
Warning
Verbose logging may emit hundreds of messages for a single translation task!
Hint
Click here for sample output using style="rainbow".
Note that verbose and 🪲 DEBUG logging are different things; verbose logging can emit hundreds of messages in
cases where regular 🪲 DEBUG logging would only emit a dozen. Verbose messages are typically related to the
mapping process.
Example#
The ℹ️ INFO-level messages emitted for a single Translator.translate() class.
[id_translation.fetching]: Finished initialization of 'MemoryFetcher' in 96 μs:
MemoryFetcher(sources=['category'], return_all=True)
[id_translation.fetching]: Finished initialization of 'SqlFetcher' in 115 ms:
SqlFetcher('postgresql+pg8000://postgres:***@localhost:5002/sakila',
sources=['address', 'film', 'language', 'customer', 'country', 'staff',
'actor', 'city'])
[id_translation.Translator.map]: Finished mapping of 4/6 names in 'DataFrame' in
2 ms: {'customer_id': 'customer', 'film_id': 'film', 'category_id':
'category', 'staff_id': 'staff', 'rental_date': None, 'return_date': None}.
[id_translation.fetching]: Finished fetching from 1 sources in 1 ms: ['category'
x ('id', 'name') x 16/7 IDs].
[id_translation.fetching]: Finished fetching from 3 sources in 15 ms:
['customer' x ('id', 'name', 'last_name') x 10/10 IDs], ['film' x ('id',
'name') x 10/10 IDs], ['staff' x ('id', 'name', 'last_name') x 2/2 IDs].
[id_translation.Translator]: Finished translation of 29 unique IDs (4 names) in
'DataFrame' in 20 ms.
Since these are 🔑 key event messages, there are corresponding entry events (the messages
above all all 'exit'-records). The 'enter'-records, however, are only emitted on the 🪲 DEBUG level.
Key Event Records#
Key event messages are emitted at the boundaries of the various stages in the translation process (see the Translation primer and Mapping primer pages). Common fields are listed below.
Field |
Type |
Description |
|---|---|---|
task_id |
|
Unique task identifier, e.g. for a single |
event_key |
|
E.g. MultiFetcher.fetch_all:enter (where |
seconds |
|
Task duration in seconds. Only when |
All key event methods add additional fields that are relevant to the current task and stage. Fields may be added,
removed, or change values depending on the stage of the event.
Event-specific fields#
Let’s take closer look at the final message. The LogRecord contains additional information that isn’t
included in the message itself. The full Translator.translate:exit-record is shown as JSON below.
10432 {
10433 "name": "id_translation.Translator",
10434 "msg": "Finished translation of 29 unique IDs (4 names) in 'DataFrame' in 20 ms.",
10435 "args": [],
10436 "levelname": "INFO",
10437 "levelno": 20,
10438 "pathname": "<site-packages>/id_translation/_tasks/_translate.py",
10439 "filename": "_translate.py",
10440 "module": "_translate",
10441 "exc_info": null,
10442 "exc_text": null,
10443 "stack_info": null,
10444 "lineno": 199,
10445 "funcName": "log_key_event_exit",
10446 "created": 1557606600.350064,
10447 "msecs": 711.0,
10448 "relativeCreated": 952.588912,
10449 "thread": 128314142105728,
10450 "threadName": "MainThread",
10451 "processName": "MainProcess",
10452 "process": 15248,
10453 "taskName": null,
10454 "task_id": 15234,
10455 "event_key": "Translator.translate:exit",
10456 "seconds": 0.020187545999760914,
10457 "durations_ms": {
10458 "map": 2.12,
10459 "extract": 0.136,
10460 "fetch": 16.089,
10461 "verify": 0.294,
10462 "insert": 1.266,
10463 "overhead": 0.283
10464 },
10465 "num_ids": {
10466 "customer": 10,
10467 "film": 10,
10468 "category": 7,
10469 "staff": 2
10470 },
10471 "online": true,
10472 "translatable_type": "pandas.DataFrame",
10473 "io_type": "id_translation.dio.integration.pandas.PandasIO",
10474 "copy": true,
10475 "reverse": false,
10476 "name_to_source": {
10477 "staff_id": "staff",
10478 "customer_id": "customer",
10479 "category_id": "category",
10480 "film_id": "film"
10481 },
10482 "message": "Finished translation of 29 unique IDs (4 names) in 'DataFrame' in 20 ms.",
10483 "asctime": "2019-05-11T20:30:00"
10484 }
About 350 messages were emitted since ENABLE_VERBOSE_LOGGING=True. If we were using regular 🪲 DEBUG-logging,
about 40 messages would have been emitted instead. The vast majority of the verbose messages relate to mapping and
fetcher initialization.
from logging import basicConfig, DEBUG
from id_translation import Translator
from id_translation.logging import enable_verbose_debug_messages
basicConfig(handlers=[SomeJsonExporter()])
translator = Translator.from_config(main_config, [fetcher_configs..])
with enable_verbose_debug_messages():
translator.translate(df)
The configs that were used are available here.