id_translation.mapping.matrix

Functions and classes used by the Mapper for handling score matrices.

Warning

This module is considered an implementation detail and may change without notice.

Classes

Record(value, candidate, score)

Data concerning a match.

Reject(record[, superseding_value, ...])

Data concerning the rejection of a match.

ScoreHelper(matrix, min_score[, logger, task_id])

High-level selection operations.

ScoreMatrix(values, candidates, *[, grid])

A matrix of match scores.

class Record(value, candidate, score)

Bases: Generic[ValueType, CandidateType]

Data concerning a match.

candidate

A hashable candidate.

score

Likeness score computed by some scoring function.

value

A hashable value.
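As an illustration of the shape of this data, a Record can be pictured as a small immutable triple. The class below is a hypothetical stand-in (RecordSketch is not part of id_translation), written only from the three attributes documented above; the real class may differ.

```python
from dataclasses import dataclass
from typing import Generic, Hashable, TypeVar

ValueType = TypeVar("ValueType", bound=Hashable)
CandidateType = TypeVar("CandidateType", bound=Hashable)


@dataclass(frozen=True)
class RecordSketch(Generic[ValueType, CandidateType]):
    """Hypothetical stand-in for Record; not the library class."""

    value: ValueType  # a hashable value
    candidate: CandidateType  # a hashable candidate
    score: float  # likeness score from some scoring function


record = RecordSketch(value="name", candidate="first_name", score=0.75)
```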

class Reject(record, superseding_value=None, superseding_candidate=None)

Bases: Generic[ValueType, CandidateType]

Data concerning the rejection of a match.

explain(min_score, full=False)

Create a string which explains the rejection.

Parameters:
  • min_score – Minimum score to accept a match.

  • full – If True, show full information about superseding matches.

Returns:

An explanatory string.

record

A Record to describe.

superseding_candidate = None

A Record that prevents matching of the current candidate.

superseding_value = None

A Record that prevents matching of the current value.

class ScoreHelper(matrix, min_score, logger=None, *, task_id=None)

Bases: Generic[ValueType, CandidateType]

High-level selection operations.

Parameters:
  • matrix – A ScoreMatrix instance.

  • min_score – Minimum score to make a value -> candidate match.

  • logger – Explicit Logger instance to use.

  • task_id – Used for logging.

above()

Records with scores above the threshold.

below()

Records with scores below the threshold.
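The split performed by above() and below() can be sketched with plain tuples. This is a minimal illustration, not the library code: split_by_threshold is a hypothetical helper, the descending sort order is an assumption, and scores equal to min_score are assumed to count as accepted (based on the "at or above the minimum" wording used for to_directional_mapping).

```python
from math import inf


def split_by_threshold(scores, min_score):
    """Split {(value, candidate): score} into accepted and rejected triples.

    Hypothetical helper. Assumes best-first ordering and that a score equal
    to min_score is accepted.
    """
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    above = [(v, c, s) for (v, c), s in ranked if s >= min_score]
    below = [(v, c, s) for (v, c), s in ranked if s < min_score]
    return above, below


scores = {("a", "x"): 0.9, ("a", "y"): 0.2, ("b", "x"): 0.5, ("b", "y"): -inf}
above, below = split_by_threshold(scores, min_score=0.5)
```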

property logger

Return the Logger that is used by this instance.

to_directional_mapping(cardinality=None)

Create a DirectionalMapping with a given target Cardinality.

Parameters:

cardinality – Explicit cardinality to set; see Cardinality. If None, use the actual cardinality obtained when selecting all matches with scores at or above the minimum.

Returns:

A DirectionalMapping.
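To make the idea of an "actual cardinality" concrete, the sketch below infers one from a set of accepted (value, candidate) pairs. It is an assumption-laden illustration: infer_cardinality is a hypothetical helper, it returns plain strings rather than the library's Cardinality type, and it assumes the left side refers to values and the right side to candidates.

```python
from collections import defaultdict


def infer_cardinality(pairs):
    """Describe the cardinality of accepted (value, candidate) pairs.

    Hypothetical helper; returns a plain string, not a Cardinality member.
    """
    by_value = defaultdict(set)
    by_candidate = defaultdict(set)
    for value, candidate in pairs:
        by_value[value].add(candidate)
        by_candidate[candidate].add(value)

    # A value matched to several candidates makes the right side "many".
    many_right = any(len(cands) > 1 for cands in by_value.values())
    # A candidate matched by several values makes the left side "many".
    many_left = any(len(vals) > 1 for vals in by_candidate.values())

    if many_left and many_right:
        return "many-to-many"
    if many_right:
        return "one-to-many"
    if many_left:
        return "many-to-one"
    return "one-to-one"
```

For example, infer_cardinality([("a", "x"), ("a", "y")]) reports "one-to-many" under these assumptions, because the single value "a" is matched to two candidates.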

class ScoreMatrix(values, candidates, *, grid=None)

Bases: Generic[ValueType, CandidateType]

A matrix of match scores.

Parameters:
  • values – Iterable of elements to match to candidates.

  • candidates – Iterable of candidates to match with values. Duplicate elements will be discarded.

  • grid – Initial score matrix. Default is to fill with -inf.

Raises:

ValueError – If a bad grid is given.
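The construction rules above can be sketched without any dependencies. This is not the library implementation: make_grid is a hypothetical helper, duplicates are assumed to be dropped from both axes while preserving order, and a "bad grid" is assumed to mean one whose shape does not match the axes.

```python
from math import inf


def make_grid(values, candidates, grid=None):
    """Build (values, candidates, grid) for a score matrix sketch.

    Hypothetical helper. Assumes a "bad grid" is one with the wrong shape.
    """
    # dict.fromkeys drops duplicates while keeping first-seen order.
    values = list(dict.fromkeys(values))
    candidates = list(dict.fromkeys(candidates))

    if grid is None:
        # Default: every score starts at -inf, i.e. "no match".
        return values, candidates, [[-inf] * len(candidates) for _ in values]

    rows = [list(row) for row in grid]
    if len(rows) != len(values) or any(len(row) != len(candidates) for row in rows):
        raise ValueError(
            f"grid must have shape ({len(values)}, {len(candidates)})"
        )
    return values, candidates, rows
```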

property candidates

Unique candidates in order.

get_finite_values()

Compute all finite values.

property size

Total number of elements.

to_dict()

Convert to dict {(value, candidate): score}.
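The {(value, candidate): score} layout can be illustrated with a nested list standing in for the matrix. grid_to_dict is a hypothetical helper, and the row-per-value, column-per-candidate orientation is an assumption.

```python
def grid_to_dict(values, candidates, grid):
    """Flatten a nested-list grid into {(value, candidate): score}.

    Hypothetical helper; assumes rows follow values and columns follow
    candidates.
    """
    return {
        (value, candidate): grid[i][j]
        for i, value in enumerate(values)
        for j, candidate in enumerate(candidates)
    }


as_dict = grid_to_dict(["a", "b"], ["x", "y"], [[0.9, 0.2], [0.5, 0.1]])
```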

to_native_string(*, decimals=2, lines=True)

Format score table without pandas.
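A pandas-free table formatter can be quite small. The sketch below is only an illustration of the idea: format_scores is a hypothetical helper, its layout is not the library's actual output, and the lines option of the real method is not modelled.

```python
def format_scores(values, candidates, grid, decimals=2):
    """Render a score grid as right-aligned plain text.

    Hypothetical helper; the real output format may differ.
    """
    header = [""] + [str(candidate) for candidate in candidates]
    body = [
        [str(value)] + [f"{score:.{decimals}f}" for score in row]
        for value, row in zip(values, grid)
    ]
    table = [header] + body
    # Each column is as wide as its longest cell.
    widths = [max(len(row[i]) for row in table) for i in range(len(header))]
    return "\n".join(
        "  ".join(cell.rjust(width) for cell, width in zip(row, widths))
        for row in table
    )


text = format_scores(["a", "b"], ["x", "y"], [[0.9, 0.25], [float("-inf"), 1.0]])
```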

to_pandas()

Convert to pandas.DataFrame.

to_string(*, decimals=2)

Format score table.

property values

Unique values in order.