Global Explainability

Global explainability module.

This module provides customized global explainability functionality for entity matching tasks.

neer_match.global_explainability.accumulated_local_effect(model, left, right, xkey, xvalue, centered=True, k=50)

Calculate the accumulated local effect of a key over a domain grid.

Creates a grid of k interpolation points from 0 to xvalue, calculates the local differences in the model’s predictions over each segment of the grid, and averages the differences. If centered is True, the partial dependence of the model on the key at xvalue is subtracted from the accumulated local effect.

Parameters:

model (Union[DLMatchingModel, NSMatchingModel]) – The matching model to use.
left (DataFrame) – The left DataFrame.
right (DataFrame) – The right DataFrame.
xkey (str) – The key for which to calculate the accumulated local effect.
xvalue (float) – The value at which the accumulated local effect is calculated.
centered (bool) – Whether to center the accumulated local effect.
k (int) – The number of interpolation points.
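
To make the accumulation step described above more concrete, here is a minimal, library-free sketch of an uncentered first-order accumulated local effect. It is not the neer_match implementation: toy_predict and ale_sketch are hypothetical names, rows are plain dictionaries of per-key values rather than the left and right DataFrames, and the sketch assumes the textbook convention that each grid segment only uses the rows whose observed value falls inside it. Centering is left out because its exact form in the library is not shown here.

```python
import math


def toy_predict(row):
    # Hypothetical stand-in for a trained matching model:
    # maps per-key values in [0, 1] to a match probability.
    score = 4.0 * row["name"] + 2.0 * row["city"] - 3.0
    return 1.0 / (1.0 + math.exp(-score))


def ale_sketch(predict, rows, key, xvalue, k=50):
    """Uncentered accumulated local effect of `key`, from 0 up to `xvalue`."""
    # k grid points give k - 1 segments, so k must be at least 2.
    edges = [xvalue * i / (k - 1) for i in range(k)]
    total = 0.0
    for j in range(1, k):
        lower, upper = edges[j - 1], edges[j]
        # Assumption: a segment uses only the rows observed inside it;
        # the first segment also includes its lower edge.
        members = [
            r for r in rows
            if lower < r[key] <= upper or (j == 1 and r[key] == lower)
        ]
        if not members:
            continue  # no data in this segment, so it contributes nothing
        # Local effect: average change in prediction across the segment.
        diffs = [
            predict({**r, key: upper}) - predict({**r, key: lower})
            for r in members
        ]
        total += sum(diffs) / len(members)
    return total


rows = [
    {"name": 0.9, "city": 1.0},
    {"name": 0.2, "city": 0.0},
    {"name": 0.6, "city": 0.5},
]
effect = ale_sketch(toy_predict, rows, "name", xvalue=1.0, k=5)
```

Because only local differences are accumulated, segments with no observations add nothing to the total, so the result depends on how the data cover the grid.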

neer_match.global_explainability.accumulated_local_effect_plot(model, left, right, key, centered=True, n=50, k=50)

Plot the accumulated local effect of a key.

Parameters:

model (Union[DLMatchingModel, NSMatchingModel]) – The matching model to use.
left (DataFrame) – The left DataFrame.
right (DataFrame) – The right DataFrame.
key (str) – The key for which to calculate the accumulated local effect.
centered (bool) – Whether to center the accumulated local effect.
n (int) – The number of interpolation points for the figure.
k (int) – The number of interpolation points for the local effect.

neer_match.global_explainability.partial_dependence(model, left, right, key, n=50)

Calculate the partial dependence of a key over a domain grid.

Creates a [0, 1] grid of n interpolation points and calculates the partial dependence of the model on the key at each point.

Parameters:

model (Union[DLMatchingModel, NSMatchingModel]) – The matching model to use.
left (DataFrame) – The left DataFrame.
right (DataFrame) – The right DataFrame.
key (str) – The key for which to calculate the partial dependence.
n (int) – The number of interpolation points to use.

Return type:

ndarray
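
The grid evaluation described above can be sketched without the library as follows. This is an illustration under stated assumptions, not the neer_match code: toy_predict, unit_grid, and partial_dependence_curve are hypothetical names, the grid is assumed to include both endpoints, rows are plain dictionaries of per-key values instead of the left and right DataFrames, and the result is a plain list rather than an ndarray.

```python
import math


def toy_predict(row):
    # Hypothetical stand-in for a trained matching model.
    score = 4.0 * row["name"] + 2.0 * row["city"] - 3.0
    return 1.0 / (1.0 + math.exp(-score))


def unit_grid(n):
    """n evenly spaced points on [0, 1], endpoints included (assumed)."""
    return [i / (n - 1) for i in range(n)]


def partial_dependence_curve(predict, rows, key, n=50):
    """Average prediction with `key` forced to each grid value in turn."""
    curve = []
    for value in unit_grid(n):
        # Overwrite the key on every row, keep the other keys as observed.
        predictions = [predict({**row, key: value}) for row in rows]
        curve.append(sum(predictions) / len(predictions))
    return curve


rows = [
    {"name": 0.9, "city": 1.0},
    {"name": 0.2, "city": 0.0},
    {"name": 0.6, "city": 0.5},
]
curve = partial_dependence_curve(toy_predict, rows, "name", n=5)
```

Each entry of curve corresponds to one grid point, so its length equals n.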

neer_match.global_explainability.partial_dependence_feature_importance(model, left, right, key, n=50)

Calculate the feature importance of a key using partial dependence.

Calculates the standard deviation of the partial dependence of the model on the key over a [0, 1] domain grid.

Parameters:

model (Union[DLMatchingModel, NSMatchingModel]) – The matching model to use.
left (DataFrame) – The left DataFrame.
right (DataFrame) – The right DataFrame.
key (str) – The key for which to calculate the feature importance.
n (int) – The number of interpolation points to use.
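
As a rough illustration of the idea, the sketch below computes the standard deviation of a partial dependence curve for each key of a toy model. The names toy_predict and pd_importance_sketch are hypothetical, rows are plain dictionaries of per-key values rather than the left and right DataFrames, and the use of the population standard deviation is an assumption, since the estimator is not specified here.

```python
import math
import statistics


def toy_predict(row):
    # Hypothetical stand-in for a trained matching model.
    # It deliberately ignores the "zip" key.
    score = 4.0 * row["name"] + 2.0 * row["city"] - 3.0
    return 1.0 / (1.0 + math.exp(-score))


def pd_importance_sketch(predict, rows, key, n=50):
    """Standard deviation of the partial dependence of `key` over [0, 1]."""
    grid = [i / (n - 1) for i in range(n)]
    curve = [
        sum(predict({**row, key: value}) for row in rows) / len(rows)
        for value in grid
    ]
    # Assumption: population standard deviation of the curve values.
    return statistics.pstdev(curve)


rows = [
    {"name": 0.9, "city": 1.0, "zip": 0.3},
    {"name": 0.2, "city": 0.0, "zip": 0.8},
    {"name": 0.6, "city": 0.5, "zip": 0.1},
]
importances = {
    key: pd_importance_sketch(toy_predict, rows, key)
    for key in ("name", "city", "zip")
}
```

A key the model ignores produces a flat curve and therefore an importance of zero, while keys that move the prediction more over the grid score higher.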

neer_match.global_explainability.partial_dependence_function(model, left, right, xfeatures)

Calculate the partial dependence of the model on the given keys.

Replaces the values of the given keys with the specified values and calculates the average prediction of the model.

Parameters:

model (Union[DLMatchingModel, NSMatchingModel]) – The matching model to use.
left (DataFrame) – The left DataFrame.
right (DataFrame) – The right DataFrame.
xfeatures (dict[str, float]) – A dictionary of the features with values where the partial dependence is calculated.

Return type:

float
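
The replace-and-average step described above can be sketched in a few lines. This is a simplified illustration, not the library's code: toy_predict and partial_dependence_sketch are hypothetical names, and rows are plain dictionaries of per-key values rather than the left and right DataFrames the function takes.

```python
import math


def toy_predict(row):
    # Hypothetical stand-in for a trained matching model.
    score = 4.0 * row["name"] + 2.0 * row["city"] - 3.0
    return 1.0 / (1.0 + math.exp(-score))


def partial_dependence_sketch(predict, rows, xfeatures):
    """Average prediction after overwriting the keys in `xfeatures`."""
    total = 0.0
    for row in rows:
        # Keys listed in xfeatures are replaced; all others stay as observed.
        patched = {**row, **xfeatures}
        total += predict(patched)
    return total / len(rows)


rows = [
    {"name": 0.9, "city": 1.0},
    {"name": 0.2, "city": 0.0},
    {"name": 0.6, "city": 0.5},
]
pd_low = partial_dependence_sketch(toy_predict, rows, {"name": 0.0})
pd_high = partial_dependence_sketch(toy_predict, rows, {"name": 1.0})
```

When every key is fixed through xfeatures, the observed data no longer matter and the result is simply the model's prediction at that point.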

neer_match.global_explainability.partial_dependence_plot(model, left, right, key, n=50)

Plot the partial dependence of a key.

Plots the partial dependence of the model on the key over a [0, 1] domain grid.

Parameters:

model (Union[DLMatchingModel, NSMatchingModel]) – The matching model to use.
left (DataFrame) – The left DataFrame.
right (DataFrame) – The right DataFrame.
key (str) – The key for which to calculate the partial dependence.
n (int) – The number of interpolation points.