Reasoning

Reasoning module.

This module provides reasoning functionality for entity matching tasks using logic tensor networks.

class neer_match.reasoning.RefutationModel(similarity_map, initial_feature_width_scales=10, feature_depths=2, initial_record_width_scale=10, record_depth=4, **kwargs)

A neural-symbolic refutation model for entity matching tasks.

Inherits from neer_match.matching_model.NSMatchingModel and adds refutation logic. The built-in refutation logic lets one try to refute the claim that one or more similarities of a conjectured association are significant for detecting entity matches.

fit(left, right, matches, epochs, refutation, penalty_threshold=0.95, penalty_scale=1.0, penalty_decay=0.1, satisfiability_weight=1.0, verbose=1, log_mod_n=1, **kwargs)

Fit the refutation model.

Construct a data generator and an axiom generator from the input data and use the model’s similarity map to fit the model while trying to refute the refutation claim.

In the default case, where the satisfiability weight equals 1, the function minimizes the satisfiability of the refutation claim while penalizing matching-axiom satisfiability that falls below the penalty threshold. If the satisfiability weight is less than 1, the model is trained to optimize the satisfiability of the refutation claim while penalizing a weighted sum of the matching-axiom satisfiability and the binary cross entropy loss whenever that sum falls below the penalty threshold.

The penalty threshold sets the tolerance for the matching axioms (and/or the binary cross entropy loss) below which the penalty is applied. The penalty scale sets the linear scale of the penalty while the threshold is not crossed. The penalty decay sets the exponential decay of the penalty once the threshold is crossed. The linear and exponential parts are combined using the tensorflow.keras.activations.elu() function.
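The interplay of the three penalty parameters can be illustrated with a small, self-contained sketch. It assumes that the penalty is an ELU applied to the scaled gap between the threshold and the current satisfiability, with the decay used as the ELU alpha. The exact expression used inside fit() is not stated here, so the helper name elu_penalty and the way its arguments are combined are illustrative assumptions. ELU is re-implemented with the standard library so that the example does not depend on TensorFlow.

```python
import math


def elu(x: float, alpha: float = 1.0) -> float:
    """ELU activation: x for x > 0, alpha * (exp(x) - 1) otherwise."""
    return x if x > 0 else alpha * (math.exp(x) - 1.0)


def elu_penalty(
    satisfiability: float,
    penalty_threshold: float = 0.95,
    penalty_scale: float = 1.0,
    penalty_decay: float = 0.1,
) -> float:
    """Hypothetical penalty; an assumption, not the library's actual code."""
    # Positive gap: satisfiability is still below the threshold.
    gap = penalty_threshold - satisfiability
    # Linear in the gap while it is positive, saturating exponentially
    # (bounded below by -penalty_decay) once the threshold is crossed.
    return elu(penalty_scale * gap, alpha=penalty_decay)


below = elu_penalty(0.45)      # threshold not reached: linear branch
at_threshold = elu_penalty(0.95)
above = elu_penalty(1.0)       # threshold crossed: exponential branch
```

Under these assumptions, the penalty grows linearly with the shortfall below the threshold and flattens out quickly above it, so the penalty term contributes little to the loss once the matching axioms are satisfied well enough.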

Parameters:

left – The left dataset.
right – The right dataset.
matches – The matches dataset.
epochs – The number of epochs to train the model.
refutation – The refutation claim. It can be a string or a dictionary. If it is a string, it is assumed to be an association name. If it is a dictionary, the keys are association names and the values are similarity names. If the value is None, all similarities in the association are used.
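penalty_threshold – The penalty threshold: the satisfiability level below which the penalty is applied (default 0.95).
penalty_scale – The linear scale of the penalty while the threshold is not crossed (default 1.0).
penalty_decay – The exponential decay of the penalty once the threshold is crossed (default 0.1).
satisfiability_weight – The satisfiability weight. For values less than 1, the penalized quantity is a weighted sum of the matching-axiom satisfiability and the binary cross entropy loss (default 1.0).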
verbose – The verbosity level.
log_mod_n – The logging frequency.
**kwargs – Additional arguments to the data generator.
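The accepted forms of the refutation argument can be sketched as follows. The helper normalize_refutation is hypothetical and is not part of neer_match; it only illustrates how a string and a dictionary describe the same kind of claim. The association name "surname" and the similarity name "jaro_winkler" are placeholders, and passing a single similarity name as a dictionary value is an assumption about the expected value type.

```python
def normalize_refutation(refutation):
    """Return a dict mapping association names to similarity names or None.

    Hypothetical helper for illustration; not the library's internal code.
    """
    if isinstance(refutation, str):
        # A bare string names an association; None selects all of its
        # similarities.
        return {refutation: None}
    if isinstance(refutation, dict):
        # Copy so the caller's dictionary is not shared.
        return dict(refutation)
    raise TypeError("refutation must be a string or a dictionary")


claims = [
    "surname",                      # association name only
    {"surname": None},              # same claim, written as a dictionary
    {"surname": "jaro_winkler"},    # restrict the claim to one similarity
]
normalized = [normalize_refutation(claim) for claim in claims]
```

Any of the three forms in claims would be passed as the refutation argument of fit(); the first two describe the same claim.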