Similarity Map

Similarity mappings module.

The module provides functionality to store and manage similarity mappings between the records of two datasets.

class neer_match.similarity_map.SimilarityMap(instructions)

Similarity map class.

The class stores a collection of associations between the records of two datasets.

instructions

The similarity map instructions.

Type:

dict

lcols

The left columns.

Type:

list[str]

rcols

The right columns.

Type:

list[str]

sims

The similarity functions.

Type:

list[str]

__getitem__(index)

Return the item at the given index.

Return type:

Tuple[str, str, str]

__init__(instructions)

Initialize a similarity map object.

Parameters:

instructions – The similarity map instructions.
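
The layout of `instructions` is not described on this page. The following is a minimal sketch, not the library's implementation, of how such a dict could be expanded into the parallel `lcols`, `rcols`, and `sims` lists. It assumes that each key is a column specification mapped to a list of similarity names, and that a `~` separates a left column name from a differing right column name. Both conventions and the helper name `flatten_instructions` are assumptions made for illustration.

```python
def flatten_instructions(instructions):
    """Expand a hypothetical instructions dict into three parallel lists."""
    lcols, rcols, sims = [], [], []
    for spec, similarities in instructions.items():
        # Assumed convention: "left~right" names two different columns,
        # while a bare name is used for both sides.
        left, _, right = spec.partition("~")
        right = right or left
        for sim in similarities:
            lcols.append(left.strip())
            rcols.append(right.strip())
            sims.append(sim)
    return lcols, rcols, sims


# Hypothetical instructions: one association with one similarity,
# and one association with two similarities.
instructions = {
    "title": ["discrete"],
    "year~released": ["euclidean", "gaussian"],
}
lcols, rcols, sims = flatten_instructions(instructions)

# Each entry has the (left column, right column, similarity) shape
# that __getitem__ is documented to return.
entries = list(zip(lcols, rcols, sims))
```

Under these assumptions, every (association, similarity) pair becomes one entry, so an association with two similarities contributes two entries.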

__iter__()

Iterate over the similarity map.

Return type:

Iterator

__len__()

Return the number of items in the similarity map.

Return type:

int

__str__()

Return a string representation of the similarity map.

Return type:

str

association_names()

Return a unique name for each association in the similarity map.

Return type:

List[str]

association_offsets()

Return association offsets.

Return the starting column offset of each association in the similarity matrix.

Return type:

List[int]
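
The relationship between association sizes and offsets can be sketched as a running sum. The example below is illustrative only. It assumes that the similarity matrix holds one column per similarity and that the columns are grouped by association in order. The helper name `offsets_from_sizes` is hypothetical.

```python
def offsets_from_sizes(sizes):
    """Return the exclusive running sum of `sizes` (hypothetical helper)."""
    offsets, total = [], 0
    for size in sizes:
        offsets.append(total)  # this association starts after all earlier ones
        total += size
    return offsets


# Three associations that use 1, 2, and 2 similarities respectively.
sizes = [1, 2, 2]
offsets = offsets_from_sizes(sizes)

# Half-open column ranges that each association would occupy in one row
# of the similarity matrix, under the layout assumed above.
column_ranges = [(start, start + size) for start, size in zip(offsets, sizes)]
```

With this layout, the columns belonging to association `i` are those from `offsets[i]` up to, but not including, `offsets[i] + sizes[i]`.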

association_sizes()

Return the number of similarities used by each association.

Return type:

List[int]

keys()

Return a unique key for each similarity map entry.

Combine association with similarity names and return them.

Return type:

List[str]
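
The exact naming scheme is not given on this page. As a rough sketch, assuming that an association name joins the left and right column names with an underscore and that a key appends the similarity name in the same way, the two lists could be built as follows. The separator and both function names are assumptions, not the library's API.

```python
def association_names_sketch(entries, sep="_"):
    """Return one name per distinct (left, right) column pair."""
    names = [f"{left}{sep}{right}" for left, right, _ in entries]
    # dict.fromkeys removes duplicates while preserving first-seen order.
    return list(dict.fromkeys(names))


def keys_sketch(entries, sep="_"):
    """Return one key per entry by appending the similarity name."""
    return [f"{left}{sep}{right}{sep}{sim}" for left, right, sim in entries]


# Hypothetical (left column, right column, similarity) entries.
entries = [
    ("title", "title", "discrete"),
    ("year", "released", "euclidean"),
    ("year", "released", "gaussian"),
]
names = association_names_sketch(entries)
keys = keys_sketch(entries)
```

In this sketch there are as many keys as entries, while associations that use several similarities share a single association name.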

no_associations()

Return the number of associations of the map.

Return type:

int

neer_match.similarity_map.available_similarities()

Return the available similarities as a mapping from similarity name to similarity function.

Return type:

Dict[str, Callable]

neer_match.similarity_map.discrete(x, y)

Discrete similarity function.

Return type:

float

neer_match.similarity_map.euclidean(x, y)

Euclidean similarity function.

Return type:

float

neer_match.similarity_map.gaussian(x, y)

Gaussian similarity function.

Return type:

float
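
The formulas behind the discrete, euclidean, and gaussian similarity functions are not stated on this page. The sketch below shows one plausible set of definitions that map two numeric values to a similarity in the range 0 to 1: an equality indicator, 1 / (1 + |x - y|), and exp(-(x - y)^2). These formulas, the `_sketch` function names, and the `registry` dict are assumptions made for illustration and may differ from what the library implements.

```python
import math


def discrete_sketch(x, y):
    """Return 1.0 when the values are equal and 0.0 otherwise."""
    return 1.0 if x == y else 0.0


def euclidean_sketch(x, y):
    """Map the absolute difference d to 1 / (1 + d)."""
    return 1.0 / (1.0 + abs(x - y))


def gaussian_sketch(x, y):
    """Map the difference d to exp(-d^2)."""
    return math.exp(-((x - y) ** 2))


# A name-to-function mapping with the same Dict[str, Callable] shape that
# available_similarities() is documented to return (contents hypothetical).
registry = {
    "discrete": discrete_sketch,
    "euclidean": euclidean_sketch,
    "gaussian": gaussian_sketch,
}

# Evaluate every similarity on the same pair of values.
scores = {name: func(3.0, 1.0) for name, func in registry.items()}
```

All three sketches return 1.0 for identical inputs. The euclidean and gaussian variants decay toward 0 as the inputs move apart, with the gaussian variant decaying much faster.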