Logging#
The translation and mapping processes can output vast amounts of information that can be difficult to sift through
manually, especially with loggers working on the DEBUG-level with extra verbose flags enabled. The exact flow of
these processes are discussed in the Translation primer and Mapping primer pages. If you haven’t looked at
them already, you may want to do so before continuing.
Key Event Records#
Key event records are emitted on the INFO and DEBUG level, see the table below. These messages are structured
for ingestion and will always contain the event_key-key, as well as some other context-dependent data.
Function / log level |
Event key |
Domain-specific keys |
|---|---|---|
|
translatable_type, names, sources |
|
DEBUG-level event. |
|
translatable_type, value, candidates |
DEBUG-level event. |
|
max_workers, num_fetchers, sources, placeholders |
DEBUG-level event. |
|
max_workers, num_fetchers, placeholders |
DEBUG-level event. |
|
value, candidates, context |
DEBUG-level event. |
|
fetch_all, source, placeholders, num_ids |
Translation issues tend to fall within one of these domains. To modify the levels, update the
KeyEventLogLevel tuples in the id_translation.settings.logging class.
Hint
All event_key-records also have an event_stage value. The delimiting values are 'ENTER' (the event just
started) and 'EXIT' (the event just finished), respectively. This allows selecting other log records that are
part of the same process that initially emitted the event key.
Finding the first event_stage='ENTER'-record without a corresponding 'EXIT'-record is usually a good place to
start when something goes wrong!
Sample record#
A single Translator.translate()-call with every single debug option enabled typically produces a few hundred
records (about 300 in this case). The final TRANSLATOR.TRANSLATE.EXIT-event is the only ℹ️INFO-level message,
shown in its entirety as a JSON-record at the bottom.
Translator.translate()-method.#Finished translation of 4 names in 'DataFrame'-type data in 38ms, using name-to-
source mapping: {'customer_id': 'customer', 'film_id': 'film', 'category_id':
'category', 'staff_id': 'staff'}.
The event keys aren’t included in any user-facing messages, but are useful since they may be used for indexing by log
ingestion frameworks. The event keys associated with the message above are shown in the next snippet. Metadata such as
the names the user wanted to translate is included as well. Here, that value is null, indicating that the user
wanted automatic name selection based on the configuration. This field is populated by the names that were extracted by
the Translator in the EXIT-record.
Translator.translate()-method.# "event_key": "TRANSLATOR.TRANSLATE",
"event_stage": "ENTER",
"event_title": "TRANSLATOR.TRANSLATE.ENTER",
"sources": [
"category",
"address",
"film",
"language",
"customer",
"country",
"staff",
"actor",
"city"
],
"online": true,
"translatable_type": "pandas.core.frame.DataFrame",
"names": null,
The exact data that is included varies naturally depending on the message. Any record with event_stage='EXIT' will
include a execution_time-value in seconds. Messages related to mapping will contain
(values, candidates, context), and so on.
TRANSLATOR.TRANSLATE.EXIT-record emitted on the INFO-level.#5789 {
5790 "name": "id_translation.Translator.translate",
5791 "msg": "Finished translation of 4 names in 'DataFrame'-type data in 38ms, using name-to-source mapping: {'customer_id': 'customer', 'film_id': 'film', 'category_id': 'category', 'staff_id': 'staff'}.",
5792 "args": [],
5793 "levelname": "INFO",
5794 "levelno": 20,
5795 "pathname": "<site-packages>/id_translation/_tasks/_translate.py",
5796 "filename": "_translate.py",
5797 "module": "_translate",
5798 "exc_info": null,
5799 "exc_text": null,
5800 "stack_info": null,
5801 "lineno": 177,
5802 "funcName": "log_key_event_exit",
5803 "created": 1557606600.42005,
5804 "msecs": 512.0,
5805 "relativeCreated": 1091.9110774993896,
5806 "thread": 123868956586112,
5807 "threadName": "MainThread",
5808 "processName": "MainProcess",
5809 "process": 34525,
5810 "task_id": 8362323,
5811 "event_key": "TRANSLATOR.TRANSLATE",
5812 "event_stage": "EXIT",
5813 "event_title": "TRANSLATOR.TRANSLATE.EXIT",
5814 "execution_time": 0.038065849999838974,
5815 "duration_ms": {
5816 "map": 5.5,
5817 "extract": 0.1,
5818 "fetch": 30.5,
5819 "verify": 0.4,
5820 "insert": 1.6,
5821 "total": 38.1
5822 },
5823 "translatable_type": "pandas.core.frame.DataFrame",
5824 "copy": false,
5825 "reverse": false,
5826 "name_to_source": {
5827 "customer_id": "customer",
5828 "film_id": "film",
5829 "category_id": "category",
5830 "staff_id": "staff"
5831 },
5832 "message": "Finished translation of 4 names in 'DataFrame'-type data in 38ms, using name-to-source mapping: {'customer_id': 'customer', 'film_id': 'film', 'category_id': 'category', 'staff_id': 'staff'}."
5833 }
A few of the more interesting parts of the record have been highlighted. Click here
to download. The line numbers shown above are the actual line numbers of this file.
from id_translation import Translator
import logging
from id_translation.mapping.support import enable_verbose_debug_messages
logging.basicConfig(level=logging.DEBUG, handlers=[SomeJsonExporter()])
translator = Translator.from_config(main_config, [fetcher_configs..])
with enable_verbose_debug_messages():
translator.translate(df)
The configs that were used are available here.