Logging#

The translation and mapping processes can output vast amounts of information that can be difficult to sift through manually, especially with loggers working on the DEBUG-level with extra verbose flags enabled. The exact flow of these processes are discussed in the Translation primer and Mapping primer pages. If you haven’t looked at them already, you may want to do so before continuing.

Key Event Records#

Key event records are emitted on the INFO and DEBUG level, see the table below. These messages are structured for ingestion and will always contain the event_key-key, as well as some other context-dependent data.

Key Events#

Function / log level

Event key

Domain-specific keys

TRANSLATOR.TRANSLATE

translatable_type, names, sources

DEBUG-level event.

TRANSLATOR.MAP

translatable_type, value, candidates

DEBUG-level event.

MULTIFETCHER.FETCH

max_workers, num_fetchers, sources, placeholders

DEBUG-level event.

MULTIFETCHER.FETCH_ALL

max_workers, num_fetchers, placeholders

<CLASSNAME>.MAP_PLACEHOLDERS

value, candidates, context

<CLASSNAME>.FETCH_TRANSLATIONS

fetch_all, source, placeholders, num_ids

Translation issues tend to fall within one of these domains. To modify the levels, update the KeyEventLogLevel tuples in the id_translation.settings.logging class.

Hint

All event_key-records also have an event_stage value. The delimiting values are 'ENTER' (the event just started) and 'EXIT' (the event just finished), respectively. This allows selecting other log records that are part of the same process that initially emitted the event key.

Finding the first event_stage='ENTER'-record without a corresponding 'EXIT'-record is usually a good place to start when something goes wrong!

Sample record#

A single Translator.translate()-call with every single debug option enabled typically produces a few hundred records (about 300 in this case). The final TRANSLATOR.TRANSLATE.EXIT-event is the only ℹ️INFO-level message, shown in its entirety as a JSON-record at the bottom.

Exit message of the Translator.translate()-method.#
Finished translation of 4 names in 'DataFrame'-type data in 38ms, using name-to-
  source mapping: {'customer_id': 'customer', 'film_id': 'film', 'category_id':
  'category', 'staff_id': 'staff'}.

The event keys aren’t included in any user-facing messages, but are useful since they may be used for indexing by log ingestion frameworks. The event keys associated with the message above are shown in the next snippet. Metadata such as the names the user wanted to translate is included as well. Here, that value is null, indicating that the user wanted automatic name selection based on the configuration. This field is populated by the names that were extracted by the Translator in the EXIT-record.

Event keys emitted when entering the Translator.translate()-method.#
    "event_key": "TRANSLATOR.TRANSLATE",
    "event_stage": "ENTER",
    "event_title": "TRANSLATOR.TRANSLATE.ENTER",
    "sources": [
      "category",
      "address",
      "film",
      "language",
      "customer",
      "country",
      "staff",
      "actor",
      "city"
    ],
    "online": true,
    "translatable_type": "pandas.core.frame.DataFrame",
    "names": null,

The exact data that is included varies naturally depending on the message. Any record with event_stage='EXIT' will include a execution_time-value in seconds. Messages related to mapping will contain (values, candidates, context), and so on.

A TRANSLATOR.TRANSLATE.EXIT-record emitted on the INFO-level.#
5789  {
5790    "name": "id_translation.Translator.translate",
5791    "msg": "Finished translation of 4 names in 'DataFrame'-type data in 38ms, using name-to-source mapping: {'customer_id': 'customer', 'film_id': 'film', 'category_id': 'category', 'staff_id': 'staff'}.",
5792    "args": [],
5793    "levelname": "INFO",
5794    "levelno": 20,
5795    "pathname": "<site-packages>/id_translation/_tasks/_translate.py",
5796    "filename": "_translate.py",
5797    "module": "_translate",
5798    "exc_info": null,
5799    "exc_text": null,
5800    "stack_info": null,
5801    "lineno": 177,
5802    "funcName": "log_key_event_exit",
5803    "created": 1557606600.42005,
5804    "msecs": 512.0,
5805    "relativeCreated": 1091.9110774993896,
5806    "thread": 123868956586112,
5807    "threadName": "MainThread",
5808    "processName": "MainProcess",
5809    "process": 34525,
5810    "task_id": 8362323,
5811    "event_key": "TRANSLATOR.TRANSLATE",
5812    "event_stage": "EXIT",
5813    "event_title": "TRANSLATOR.TRANSLATE.EXIT",
5814    "execution_time": 0.038065849999838974,
5815    "duration_ms": {
5816      "map": 5.5,
5817      "extract": 0.1,
5818      "fetch": 30.5,
5819      "verify": 0.4,
5820      "insert": 1.6,
5821      "total": 38.1
5822    },
5823    "translatable_type": "pandas.core.frame.DataFrame",
5824    "copy": false,
5825    "reverse": false,
5826    "name_to_source": {
5827      "customer_id": "customer",
5828      "film_id": "film",
5829      "category_id": "category",
5830      "staff_id": "staff"
5831    },
5832    "message": "Finished translation of 4 names in 'DataFrame'-type data in 38ms, using name-to-source mapping: {'customer_id': 'customer', 'film_id': 'film', 'category_id': 'category', 'staff_id': 'staff'}."
5833  }

A few of the more interesting parts of the record have been highlighted. Click here to download. The line numbers shown above are the actual line numbers of this file.

Dummy version of the code that produced the the records.#
from id_translation import Translator
import logging
from id_translation.mapping.support import enable_verbose_debug_messages

logging.basicConfig(level=logging.DEBUG, handlers=[SomeJsonExporter()])

translator = Translator.from_config(main_config, [fetcher_configs..])
with enable_verbose_debug_messages():
    translator.translate(df)

The configs that were used are available here.