Similarity Map
Similarity mappings module.
The module provides functionality to store and manage similarity mappings between the records of two datasets.
- class neer_match.similarity_map.SimilarityMap(instructions)
Similarity map class.
The class stores a collection of associations between the records of two datasets.
- instructions
The similarity map instructions.
- Type:
dict
- lcols
The left columns.
- Type:
list[str]
- rcols
The right columns.
- Type:
list[str]
- sims
The similarity functions.
- Type:
list[str]
- __getitem__(index)
Return the item at the given index.
- Return type:
Tuple[str, str, str]
- __init__(instructions)
Initialize a similarity map object.
- Parameters:
instructions – The similarity map instructions.
- __iter__()
Iterate over the similarity map.
- Return type:
Iterator
- __len__()
Return the number of items in the similarity map.
- Return type:
int
- __str__()
Return a string representation of the similarity map.
- Return type:
str
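As a rough illustration of how the attributes and container methods above relate, the sketch below flattens an instructions dictionary into parallel lcols, rcols, and sims lists, so that each index yields one (left column, right column, similarity) triple, the same shape as the Tuple[str, str, str] returned by __getitem__. The instruction format (association keys mapped to lists of similarity names, with a "~" separator for columns that are named differently in the two datasets), the helper flatten_instructions, and the column names are assumptions made for illustration, not confirmed library behavior.

```python
from typing import Dict, List, Tuple


def flatten_instructions(
    instructions: Dict[str, List[str]], sep: str = "~"
) -> Tuple[List[str], List[str], List[str]]:
    """Flatten instructions into parallel left, right, and similarity lists."""
    lcols: List[str] = []
    rcols: List[str] = []
    sims: List[str] = []
    for association, names in instructions.items():
        # Assumed convention: "left~right" names two different columns,
        # while a bare key means the column has the same name on both sides.
        left, _, right = association.partition(sep)
        right = right or left
        for name in names:
            lcols.append(left.strip())
            rcols.append(right.strip())
            sims.append(name)
    return lcols, rcols, sims


# Hypothetical instructions with two associations and three similarities.
instructions = {
    "title": ["discrete"],
    "year~release_year": ["euclidean", "gaussian"],
}
lcols, rcols, sims = flatten_instructions(instructions)

# One triple per (association, similarity) pair, addressable by index.
triples = list(zip(lcols, rcols, sims))
```

Under these assumptions, the length of the map would be the number of triples (three here), not the number of associations (two here).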
- association_names()
Return a unique name for each association in the similarity map.
- Return type:
List[str]
- association_offsets()
Return association offsets.
Return the starting column offset of each association in the similarity matrix.
- Return type:
List[int]
- association_sizes()
Return the number of similarities used by each association.
- Return type:
List[int]
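The relationship between association sizes and offsets can be sketched as follows, assuming that the similarity matrix has one column per (association, similarity) pair and that the columns of each association are contiguous. Under that assumption, the offset of an association is the running total of the sizes before it. The helper names sketch_sizes and sketch_offsets and the example instructions are hypothetical and do not call the library.

```python
from typing import Dict, List


def sketch_sizes(instructions: Dict[str, List[str]]) -> List[int]:
    """Count the similarities attached to each association."""
    return [len(names) for names in instructions.values()]


def sketch_offsets(instructions: Dict[str, List[str]]) -> List[int]:
    """Compute the starting column of each association as a running sum."""
    offsets: List[int] = []
    total = 0
    for size in sketch_sizes(instructions):
        offsets.append(total)  # this association's columns start here
        total += size          # the next association starts after them
    return offsets


# Hypothetical instructions: three associations using 1, 2, and 1 similarities.
instructions = {
    "title": ["discrete"],
    "year~release_year": ["euclidean", "gaussian"],
    "price": ["euclidean"],
}
sizes = sketch_sizes(instructions)
offsets = sketch_offsets(instructions)
```

With these inputs, the columns of the second association would span offsets[1] up to, but not including, offsets[1] + sizes[1].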
- keys()
Return a unique key for each similarity map entry.
Each key combines an association name with a similarity name.
- Return type:
List[str]
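One way such keys could be formed is sketched below. The "_" separator, the helper name sketch_keys, and the example instructions are assumptions for illustration only; the exact key format produced by the library is not stated here.

```python
from typing import Dict, List


def sketch_keys(instructions: Dict[str, List[str]], sep: str = "_") -> List[str]:
    """Join each association name with each of its similarity names."""
    return [
        f"{association}{sep}{name}"
        for association, names in instructions.items()
        for name in names
    ]


# Hypothetical instructions with two associations.
instructions = {
    "title": ["discrete"],
    "year": ["euclidean", "gaussian"],
}
keys = sketch_keys(instructions)
```

Because an association name is paired with each of its similarity names exactly once, the resulting keys stay unique as long as no similarity is listed twice for the same association.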
- no_associations()
Return the number of associations of the map.
- Return type:
int
- neer_match.similarity_map.available_similarities()
Return the available similarities as a dictionary that maps each similarity name to its function.
- Return type:
Dict[str, Callable]
- neer_match.similarity_map.discrete(x, y)
Discrete similarity function.
- Return type:
float
- neer_match.similarity_map.euclidean(x, y)
Euclidean similarity function.
- Return type:
float
- neer_match.similarity_map.gaussian(x, y)
Gaussian similarity function.
- Return type:
float
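The three numeric similarities above are documented only by name, so the sketch below gives plausible definitions rather than the library's own formulas: an exact-match indicator, an inverse-distance score, and a Gaussian kernel with an assumed unit scale. It also shows a name-to-function registry of the kind that available_similarities() is documented to return. All function names ending in _sketch are hypothetical.

```python
import math
from typing import Callable, Dict


def discrete_sketch(x, y) -> float:
    """Return 1.0 for identical values and 0.0 otherwise."""
    return 1.0 if x == y else 0.0


def euclidean_sketch(x: float, y: float) -> float:
    """Map the absolute distance into (0, 1]; identical values score 1.0."""
    return 1.0 / (1.0 + abs(x - y))


def gaussian_sketch(x: float, y: float) -> float:
    """Apply a Gaussian kernel to the difference (unit scale is assumed)."""
    return math.exp(-((x - y) ** 2))


def available_sketch() -> Dict[str, Callable]:
    """Return a registry that maps similarity names to functions."""
    return {
        "discrete": discrete_sketch,
        "euclidean": euclidean_sketch,
        "gaussian": gaussian_sketch,
    }


# Look a similarity up by name, then apply it to two values.
similarity = available_sketch()["euclidean"]
score = similarity(2010, 2012)  # 1 / (1 + 2) under the assumed formula
```

All three sketches return 1.0 for identical inputs and decrease toward 0.0 as the inputs diverge, which keeps their outputs on a common scale when they are used side by side.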