Record Pair Network

Record pair network module.

This module contains functionality for instantiating, training, and using record pair networks.

class neer_match.record_pair_network.RecordPairNetwork(similarity_map, initial_feature_width_scales=10, feature_depths=2, initial_record_width_scale=10, record_depth=4, **kwargs)

Record pair network class.

The class creates networks for matching records from two datasets. The networks consist of field pair networks, constructed according to the passed similarity map, that are concatenated and passed through a series of hidden dense layers. The output layer has a sigmoid activation function.
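As a rough illustration of the data flow described above, the following standard-library sketch concatenates the outputs of several field pair networks, passes them through hidden dense layers, and applies a sigmoid at the output. It is not the library's implementation: the helper names (`dense`, `random_layer`, `record_pair_forward`), the ReLU hidden activation, the random weights, and the assumption that each field pair network emits a single value are all hypothetical choices made for the example.

```python
import math
import random


def relu(x):
    """Hidden activation used in this sketch (an assumption, not from the docs)."""
    return max(0.0, x)


def sigmoid(x):
    """Output activation, as stated in the class description."""
    return 1.0 / (1.0 + math.exp(-x))


def dense(inputs, weights, biases, activation):
    """Apply one fully connected layer to a flat list of floats."""
    return [
        activation(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def random_layer(rng, n_in, n_out):
    """Create hypothetical weights and zero biases for a dense layer."""
    weights = [[rng.uniform(-1.0, 1.0) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def record_pair_forward(field_pair_outputs, hidden_layers, output_layer):
    """Concatenate field pair outputs, run hidden layers, return a probability."""
    # Concatenate the outputs of all field pair networks into one vector.
    x = [value for output in field_pair_outputs for value in output]
    # Pass the concatenated vector through the hidden dense layers.
    for weights, biases in hidden_layers:
        x = dense(x, weights, biases, relu)
    # The single output unit uses a sigmoid activation.
    weights, biases = output_layer
    return dense(x, weights, biases, sigmoid)[0]


rng = random.Random(0)
# Hypothetical outputs of three field pair networks, one value each.
field_pair_outputs = [[0.9], [0.2], [0.7]]

hidden_layers = []
n_in = len(field_pair_outputs)
for width in [6, 3]:  # illustrative hidden widths
    hidden_layers.append(random_layer(rng, n_in, width))
    n_in = width
output_layer = random_layer(rng, n_in, 1)

probability = record_pair_forward(field_pair_outputs, hidden_layers, output_layer)
```

Because of the sigmoid output, `probability` always lies strictly between 0 and 1 and can be read as a matching score for the record pair.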

The depth and width of the record pair hidden layers are specified by the initial_record_width_scale and record_depth parameters. The width of the first hidden layer is calculated by multiplying initial_record_width_scale by the number of field pairs (i.e., the number of associations in the similarity map, see no_associations()). The widths of the subsequent hidden layers are calculated by dividing the initial width by the layer's depth index plus one.
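The width rule can be worked through with a small sketch. The function name `record_layer_widths` is hypothetical, and two details are assumptions rather than documented behavior: the depth index is taken to start at zero, and fractional widths are truncated with integer division. The library may round differently or enforce a minimum width.

```python
def record_layer_widths(no_associations, initial_record_width_scale=10, record_depth=4):
    """Return illustrative hidden layer widths for a record pair network."""
    # Initial width: the width scale times the number of field pairs.
    initial_width = initial_record_width_scale * no_associations
    # Layer i (zero-based, an assumption) gets initial_width / (i + 1).
    # Integer division is also an assumption about how fractions are handled.
    return [initial_width // (i + 1) for i in range(record_depth)]


# Three field pairs with the default scale (10) and depth (4).
widths = record_layer_widths(3)
print(widths)  # [30, 15, 10, 7]
```

Under these assumptions the widths shrink harmonically with depth, so deeper record pair networks narrow gradually towards the output layer.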

The depth and width of the field pair networks are specified by the initial_feature_width_scales and feature_depths parameters (see FieldPairNetwork for more details).

similarity_map

The similarity map object.

Type:

SimilarityMap

initial_feature_width_scales

The initial width scales of the hidden layers for each field pair network.

Type:

list[int]

feature_depths

The depth of each field pair network.

Type:

list[int]

initial_record_width_scale

The initial width scale of the hidden layers for the record pair network.

Type:

int

record_depth

The depth of the record pair network.

Type:

int

__init__(similarity_map, initial_feature_width_scales=10, feature_depths=2, initial_record_width_scale=10, record_depth=4, **kwargs)

Initialize a record pair network object.

Parameters:
  • similarity_map (SimilarityMap) – The similarity map.

  • initial_feature_width_scales (Union[int, list[int]]) – The initial width scales of the hidden layers for each field pair network. If an integer is passed, the same scale is used for all networks.

  • feature_depths (Union[int, list[int]]) – The depth of each field pair network. If an integer is passed, the same depth is used for all networks.

  • initial_record_width_scale (int) – The initial width scale of the hidden layers for the record pair network.

  • record_depth (int) – The depth of the record pair network.

  • **kwargs – Additional keyword arguments passed to the parent class (tensorflow.keras.Model).
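The integer-or-list convention for initial_feature_width_scales and feature_depths can be sketched as follows. The helper `expand_per_field` is hypothetical and not part of the library; in particular, the length check on list inputs is an assumption about reasonable validation, not documented behavior.

```python
def expand_per_field(value, no_associations, name="value"):
    """Expand an int to one entry per field pair, or check a list's length."""
    if isinstance(value, int):
        # A single integer is reused for every field pair network.
        return [value] * no_associations
    values = list(value)
    # Assumed validation: one entry is expected per field pair.
    if len(values) != no_associations:
        raise ValueError(
            f"{name} has {len(values)} entries, expected {no_associations}"
        )
    return values


# The defaults from the signature, expanded for three field pairs.
scales = expand_per_field(10, 3, name="initial_feature_width_scales")
depths = expand_per_field(2, 3, name="feature_depths")

# A list gives each field pair network its own depth.
custom_depths = expand_per_field([1, 2, 3], 3, name="feature_depths")
```

This is consistent with the attribute types listed above, where both initial_feature_width_scales and feature_depths are stored as list[int] regardless of how they were passed.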

build(input_shapes)

Build the network.

Return type:

None

call(inputs)

Run the network on input.

Return type:

Tensor

get_config()

Return the configuration of the network.

Return type:

dict