detoxai.core

Submodules

detoxai.core.evaluation submodule

detoxai.core.evaluation.evaluate_model(model: Module, dataloader: DataLoader, pareto_metrics: list[str] | None = None, device: str | None = None) → dict

Evaluate the model on various metrics

Parameters:
  • model (Module) – Model to evaluate

  • dataloader (DataLoader) – DataLoader for the dataset

  • pareto_metrics (list[str] | None) – List of metrics to include in the Pareto front

  • device (str | None) – Device to use for evaluation (“cpu” or “cuda”)

Returns:

detoxai.core.interface submodule

detoxai.core.interface.parse_methods_config(methods_config: dict) → dict

Compare the passed configuration with the default configuration and overwrite the defaults with the passed values

Parameters:

methods_config – dict:

Returns:

detoxai.core.interface.debias(model: Module, dataloader: DetoxaiDataLoader | DataLoader, methods: list[str] | str = 'all', metrics: list[str] | str = 'all', methods_config: dict = {}, pareto_metrics: list[str] = ['balanced_accuracy', 'equalized_odds'], return_type: str = 'all', device: str = 'cpu', include_vanila_in_results: bool = True, test_dataloader: DetoxaiDataLoader | DataLoader = None, num_of_classes: int | None = None) → CorrectionResult | dict[str, CorrectionResult]

Run a suite of correction methods on the model and return the results

Parameters:
  • model – Model to run the correction methods on

  • dataloader – DetoxaiDataLoader object with the dataset

  • harmful_concept – Concept to debias, i.e. the protected attribute (not supported yet)

  • methods – List of correction methods to run

  • metrics – List of metrics to include in the configuration

  • methods_config – Configuration for each correction method

  • pareto_metrics – List of metrics to use for the pareto front and selection of best method

  • return_type (optional) – Type of results to return. Options are:
      - “pareto-front”: return the CorrectionResult objects only for results on the Pareto front
      - “all”: return the results for all correction methods
      - “best”: return the result for the best correction method, chosen from the Pareto front with the ideal point method

  • device (optional) – Device to run the correction methods on

  • include_vanila_in_results (optional) – Include the vanilla model in the results

  • test_dataloader (optional) – DataLoader for the test dataset. If not provided, the original dataloader is used

  • num_of_classes (optional) – Number of classes in the dataset. Default is None, which means the number of classes will be inferred from the dataloader

detoxai.core.interface.run_correction(method: str, method_kwargs: dict, pareto_metrics: list[str] | None = None) → CorrectionResult

Run the specified correction method

Parameters:
  • method (str) – Correction method to run

  • method_kwargs (dict) – Arguments for the correction method

  • pareto_metrics (list[str] | None) – (Default value = None)

Returns:

detoxai.core.interface.get_supported_methods() → list[str]

Get a list of supported methods

Returns:

List of supported methods

Return type:

list[str]

detoxai.core.interface_helpers submodule

detoxai.core.interface_helpers.load_supported_tags() → dict

Load the dicts from ./datasets/catalog/<dataset_name>/labels_mapping.yaml.

detoxai.core.interface_helpers.construct_metrics_config(metrics: list[str] | str = 'all', types: str = 'GAP') → dict

Construct the metrics configuration for the fairness and performance metrics

Parameters:
  • metrics (list[str] | str) – List of metrics to include in the configuration (Default value = “all”)

  • types (str) – Type of metric to use. Options are “GAP” or “RATIO” (Default value = “GAP”)

Returns:

detoxai.core.interface_helpers.resolve_layer(model, layer) → Module | None

Resolve a layer name to a layer in the model

Parameters:
  • model

  • layer

Returns:

detoxai.core.interface_helpers.infer_layers(corrector, layers: list[str] | str) → list[str]

Infer the layers to use for the correction method

The following wildcards are available:

  • ‘last’: Use the last layer

  • ‘penultimate’: Use the penultimate layer

Otherwise, a list of actual layer names can be passed.

Parameters:
  • corrector – Correction method object

  • layers – Layer specification

Returns:

List of layer names to use for the correction method
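The wildcard handling described above can be sketched in plain Python. This is only an illustration: `resolve_layer_spec` and the `available` list of layer names are hypothetical and not part of detoxai, and how the library actually enumerates a corrector's layers is not shown in this reference.

```python
from __future__ import annotations


def resolve_layer_spec(available: list[str], layers: list[str] | str) -> list[str]:
    """Hypothetical helper: map a layer specification onto concrete layer names.

    `available` is assumed to list the model's layer names in forward order.
    """
    if layers == "last":
        return [available[-1]]
    if layers == "penultimate":
        return [available[-2]]
    # Anything else is treated as one or more explicit layer names.
    names = [layers] if isinstance(layers, str) else list(layers)
    unknown = [name for name in names if name not in available]
    if unknown:
        raise ValueError(f"Unknown layer names: {unknown}")
    return names


layer_names = ["conv1", "conv2", "fc1", "fc2"]
last = resolve_layer_spec(layer_names, "last")
penultimate = resolve_layer_spec(layer_names, "penultimate")
explicit = resolve_layer_spec(layer_names, ["conv2", "fc1"])
```

Raising on unknown names is a design choice of this sketch; the library may handle them differently.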

detoxai.core.mcda_helpers submodule

detoxai.core.mcda_helpers.is_pareto_efficient(costs: ndarray, return_mask: bool = True) → ndarray

Find the pareto-efficient points

Parameters:
  • costs (np.ndarray) – An (n_points, n_costs) array

  • return_mask (bool) – True to return a mask (Default value = True)

Returns:

An array of indices of Pareto-efficient points. If return_mask is True, this will be an (n_points, ) boolean array. Otherwise it will be an (n_efficient_points, ) integer array of indices.

Credit: https://stackoverflow.com/questions/32791911/fast-calculation-of-pareto-front-in-python
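As a rough illustration of what Pareto efficiency means for an (n_points, n_costs) array, the sketch below marks every row that no other row dominates. It assumes lower cost is better and uses a simple quadratic loop rather than the faster approach credited above. `pareto_mask` is a hypothetical name, not the library function.

```python
import numpy as np


def pareto_mask(costs: np.ndarray) -> np.ndarray:
    """Boolean mask of non-dominated rows, assuming lower cost is better."""
    n_points = costs.shape[0]
    mask = np.ones(n_points, dtype=bool)
    for i in range(n_points):
        # Row i is dominated if another row is no worse on every cost
        # and strictly better on at least one.
        no_worse = np.all(costs <= costs[i], axis=1)
        strictly_better = np.any(costs < costs[i], axis=1)
        if np.any(no_worse & strictly_better):
            mask[i] = False
    return mask


costs = np.array([[1.0, 4.0], [2.0, 2.0], [3.0, 3.0], [4.0, 1.0]])
mask = pareto_mask(costs)          # analogous to return_mask=True
indices = np.flatnonzero(mask)     # analogous to return_mask=False
```

Here the point (3, 3) is dominated by (2, 2), so it is the only row excluded.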

detoxai.core.mcda_helpers.filter_pareto_front(results: dict[str, CorrectionResult]) → dict[str, CorrectionResult]

Filter the results to only include those on the pareto front

Parameters:
  • results (dict[str, CorrectionResult]) – Dictionary of CorrectionResult objects to filter

Returns:

Dictionary containing only the results on the Pareto front

detoxai.core.mcda_helpers.select_best_method(results: dict[str, CorrectionResult]) → CorrectionResult

Select the best correction method from the results using the ideal point method

Parameters:
  • results (dict[str, CorrectionResult]) – Dictionary of CorrectionResult objects to choose from

Returns:

The CorrectionResult of the selected method
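The ideal point method can be sketched as follows. The sketch assumes every metric has already been oriented so that higher is better, takes the per-metric maximum as the ideal point, and picks the candidate with the smallest unnormalised Euclidean distance to it. The function name, the input layout and the absence of normalisation are assumptions made for illustration; the library may weight or scale metrics differently.

```python
from __future__ import annotations

import numpy as np


def select_by_ideal_point(scores: dict[str, list[float]]) -> str:
    """Hypothetical helper: name of the candidate closest to the ideal point."""
    names = list(scores)
    points = np.array([scores[name] for name in names], dtype=float)
    # The ideal point takes the best (largest) value of each metric.
    ideal = points.max(axis=0)
    distances = np.linalg.norm(points - ideal, axis=1)
    return names[int(np.argmin(distances))]


candidates = {
    "method_a": [0.90, 0.70],
    "method_b": [0.85, 0.85],
    "method_c": [0.70, 0.90],
}
best = select_by_ideal_point(candidates)
```

With these made-up scores the ideal point is (0.90, 0.90), and `method_b` is closest to it because it trades a little of each metric instead of sacrificing one.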

detoxai.core.model_wrappers submodule

class detoxai.core.model_wrappers.BaseLightningWrapper(model: Module, criterion: Module | None = CrossEntropyLoss(), performance_metrics: MetricCollection | None = None, learning_rate: float | None = 0.001, optimizer: Optimizer | None = torch.optim.Adam)

Bases: LightningModule

training_step(batch, batch_idx)
Parameters:
  • batch

  • batch_idx

Returns:

on_train_batch_end(outputs, batch, batch_idx)
Parameters:
  • outputs

  • batch

  • batch_idx

Returns:

on_train_epoch_end()
test_step(batch, batch_idx)
Parameters:
  • batch

  • batch_idx

Returns:

on_test_batch_end(outputs, batch, batch_idx)
Parameters:
  • outputs

  • batch

  • batch_idx

Returns:

on_test_epoch_end()
configure_optimizers()
forward(x)
Parameters:

x

Returns:

predict_step(batch)
Parameters:

batch

Returns:

class detoxai.core.model_wrappers.FairnessLightningWrapper(model: Module, criterion: Module | None = CrossEntropyLoss(), performance_metrics: MetricCollection | None = None, fairness_metrics: MetricCollection | None = None, learning_rate: float | None = 0.001, optimizer: Optimizer | None = torch.optim.Adam)

Bases: BaseLightningWrapper

training_step(batch, batch_idx)
Parameters:
  • batch

  • batch_idx

Returns:

on_train_batch_end(outputs, batch, batch_idx)
Parameters:
  • outputs

  • batch

  • batch_idx

Returns:

on_train_epoch_end()
test_step(batch, batch_idx)
Parameters:
  • batch

  • batch_idx

Returns:

on_test_batch_end(outputs, batch, batch_idx)
Parameters:
  • outputs

  • batch

  • batch_idx

Returns:

on_test_epoch_end()
predict_step(batch, batch_idx, dataloader_idx=None)
Parameters:
  • batch

  • batch_idx

  • dataloader_idx – (Default value = None)

Returns:

detoxai.core.results_class submodule

class detoxai.core.results_class.CorrectionResult(method: str, model: BaseLightningWrapper, metrics: dict)

Bases: object

get_all_metrics() → dict
get_metric(metric: str) → float
Parameters:

metric – str:

Returns:

get_model() → BaseLightningWrapper
get_method() → str

detoxai.core.xai submodule

class detoxai.core.xai.XAIMetricsCalculator(dataloader: DetoxaiDataLoader, lrphandler: LRPHandler)

Bases: object

calculate_metrics(model: Module, rect_pos: tuple[int, int], rect_size: tuple[int, int], vanilla_model: Module = None, sailmap_metrics: list[str] = ['RRF', 'HRF', 'MRR', 'DET', 'ADR', 'DIF', 'RDDT'], batches: int = 2, condition_on: str = 'proper_label', verbose: bool = False, neutral_point: float = 0.5, abs_on_neutral: bool = True) → dict[str, float]

Calculate the metrics for the given model and its saliency maps (“sailmaps”)

Parameters:
  • model (nn.Module)

  • rect_pos (tuple[int, int])

  • rect_size (tuple[int, int])

  • vanilla_model (nn.Module) – (Default value = None)

  • sailmap_metrics (list[str])

  • batches (int) – (Default value = 2)

  • condition_on (str) – (Default value = ConditionOn.PROPER_LABEL.value)

  • verbose (bool) – (Default value = False)

  • neutral_point (float) – (Default value = 0.5)

  • abs_on_neutral (bool) – (Default value = True)

Returns:

The calculated metrics where the key is the metric name and the value is the calculated metric

Return type:

dict[str, float]

class detoxai.core.xai.SailRectMetric

Bases: ABC

calculate_batch(sailmaps: ndarray, rect_pos: tuple[int, int], rect_size: tuple[int, int], ret_format: tuple[str] = ('mean', 'std')) → dict[str, float]

Calculate the metric for a single batch of sailmaps

Parameters:
  • sailmaps (np.ndarray)

  • rect_pos (tuple[int, int])

  • rect_size (tuple[int, int])

  • ret_format (tuple[str]) – (Default value = (“mean”, “std”))

Returns:

reduce(ret_format: tuple[str] = ('mean', 'std')) → dict[str, float]

Calculate the metric for already aggregated sailmaps

Parameters:
  • ret_format (tuple[str]) – (Default value = (“mean”, “std”))

Returns:

aggregate(sailmaps: ndarray, rect_pos: tuple[int, int], rect_size: tuple[int, int], vanilla_sailmaps: ndarray = None)

Aggregate sailmaps for later calculation

Parameters:
  • sailmaps (np.ndarray)

  • rect_pos (tuple[int, int])

  • rect_size (tuple[int, int])

  • vanilla_sailmaps (np.ndarray) – (Default value = None)

Returns:

structure_output(per_sample: ndarray[float], ret_format: tuple[str] = ('mean', 'std')) → dict[str, float]
Parameters:
  • per_sample (np.ndarray[float])

  • ret_format (tuple[str]) – (Default value = (“mean”, “std”))

Returns:

class detoxai.core.xai.RRF(**kwargs)

Bases: SailRectMetric

Rectangle Relevance Fraction

\begin{equation}
\mathbf{RRF} = \frac{\displaystyle \sum_{(i,j) \in R} p_{ij}}{\displaystyle \sum_{i = 1}^N \sum_{j = 1}^M p_{ij}}
\end{equation}

Here, $\mathbf{RRF}$ measures the fraction of total relevance that falls within the ROI.
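A minimal NumPy sketch of the RRF formula for a single 2D relevance map is given below. It assumes `rect_pos` is the (row, column) of the ROI's top-left corner and `rect_size` is (height, width); the library's actual coordinate convention is not stated in this reference, and the function name is hypothetical.

```python
import numpy as np


def rectangle_relevance_fraction(p, rect_pos, rect_size):
    """Share of total relevance that lies inside the rectangle (sketch)."""
    row, col = rect_pos            # assumed: top-left corner as (row, col)
    height, width = rect_size      # assumed: (height, width)
    roi = p[row:row + height, col:col + width]
    return float(roi.sum() / p.sum())


p = np.ones((4, 4))                # uniform relevance over a 4x4 map
rrf = rectangle_relevance_fraction(p, rect_pos=(1, 1), rect_size=(2, 2))
```

With uniform relevance, the result equals the area ratio of the rectangle: 4 of 16 pixels, so 0.25.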


class detoxai.core.xai.HRF(epsilon: float = 0.05, **kwargs)

Bases: SailRectMetric

High-Relevance Fraction (HRF)

\begin{equation}
\mathbf{HRF} = \displaystyle \frac{1}{\vert R \vert} \sum_{(i,j) \in R} \mathbbm{1}_{\{p_{ij} > \epsilon\}}
\end{equation}

$\mathbf{HRF}$ quantifies the proportion of pixels inside the ROI whose relevance exceeds a predefined threshold $\epsilon$, indicating how many pixels are highly important for prediction.
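The HRF formula can be sketched with NumPy for one 2D relevance map. As an assumption for illustration, `rect_pos` is taken to be the (row, column) of the ROI's top-left corner and `rect_size` to be (height, width). The function name is hypothetical, and the threshold default mirrors the `epsilon` value in the class signature.

```python
import numpy as np


def high_relevance_fraction(p, rect_pos, rect_size, epsilon=0.05):
    """Fraction of ROI pixels whose relevance exceeds epsilon (sketch)."""
    row, col = rect_pos            # assumed: top-left corner as (row, col)
    height, width = rect_size      # assumed: (height, width)
    roi = p[row:row + height, col:col + width]
    # The mean of a boolean array is the share of True entries.
    return float(np.mean(roi > epsilon))


p = np.zeros((4, 4))
p[1, 1] = 0.9
p[1, 2] = 0.2
p[2, 1] = 0.01                     # below the threshold
hrf = high_relevance_fraction(p, rect_pos=(1, 1), rect_size=(2, 2))
```

Two of the four ROI pixels exceed the threshold here, giving 0.5.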


class detoxai.core.xai.MRR(**kwargs)

Bases: SailRectMetric

Mean Relevance Ratio (MRR)

\begin{equation}
\mathbf{MRR} = \frac{\displaystyle \frac{1}{\vert R \vert} \sum_{(i,j) \in R} p_{ij}}{\displaystyle \frac{1}{N M - \vert R \vert} \sum_{(i,j) \notin R} p_{ij}}
\end{equation}

$\mathbf{MRR}$ quantifies the ratio of the mean pixel value inside the ROI to the mean pixel value outside it. $\mathbf{MRR} = 1$ indicates that the mean values are equal, while $\mathbf{MRR} > 1$ indicates that the mean pixel intensity inside the ROI is higher than outside.
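The ratio of means can be sketched with a boolean mask that separates ROI pixels from the rest of the map. The sketch handles one 2D map and assumes `rect_pos` is the (row, column) of the top-left corner and `rect_size` is (height, width); both the convention and the function name are assumptions.

```python
import numpy as np


def mean_relevance_ratio(p, rect_pos, rect_size):
    """Mean relevance inside the ROI divided by the mean outside it (sketch)."""
    row, col = rect_pos            # assumed: top-left corner as (row, col)
    height, width = rect_size      # assumed: (height, width)
    inside = np.zeros(p.shape, dtype=bool)
    inside[row:row + height, col:col + width] = True
    return float(p[inside].mean() / p[~inside].mean())


p = np.ones((4, 4))
p[1:3, 1:3] = 3.0                  # ROI pixels are three times as relevant
mrr = mean_relevance_ratio(p, rect_pos=(1, 1), rect_size=(2, 2))
```

The sketch does not guard against a zero mean outside the ROI, which would need handling in practice.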


class detoxai.core.xai.DET(**kwargs)

Bases: SailRectMetric

Distribution Equivalence Testing (DET)

The goal of the statistical test is to determine whether the pixels inside the rectangle have higher intensity than those outside the rectangle. Since the number of pixels and their intensity distributions inside and outside the ROI can vary, a non-parametric, unpaired Mann-Whitney-Wilcoxon test is used. This permutation test assesses whether the intensity values from one group (inside) tend to be higher than those from the other (outside).

The null hypothesis $H_0$ for the test is that the intensity distributions inside and outside the rectangle are equal:

\begin{equation} \begin{split}
H_0: F_{\text{inside}}(x) &= F_{\text{outside}}(x) \\
H_1: F_{\text{inside}}(x) &> F_{\text{outside}}(x)
\end{split} \end{equation}

To perform the test, all pixel intensities are ranked, and the sum of ranks for each group (inside and outside the ROI) is computed. The test then evaluates the probability that the intensity values inside the rectangle are statistically higher than those outside. The final outcome of the DET is a binary decision: TRUE indicates that the null hypothesis is rejected (i.e., there is statistically significant evidence that the pixels inside the rectangle have higher intensity), while FALSE signifies that we fail to reject the null hypothesis, meaning that the evidence is inconclusive regarding a higher intensity inside the rectangle.
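A one-sided Mann-Whitney U test of this kind can be sketched with `scipy.stats.mannwhitneyu`. The significance level `alpha=0.05`, the function name and the ROI convention (`rect_pos` as the top-left (row, column), `rect_size` as (height, width)) are assumptions for illustration; the library's own threshold and pooling across samples are not documented here.

```python
import numpy as np
from scipy.stats import mannwhitneyu


def distribution_equivalence_test(p, rect_pos, rect_size, alpha=0.05):
    """True if ROI intensities are significantly higher than the rest (sketch)."""
    row, col = rect_pos            # assumed: top-left corner as (row, col)
    height, width = rect_size      # assumed: (height, width)
    inside = np.zeros(p.shape, dtype=bool)
    inside[row:row + height, col:col + width] = True
    # One-sided alternative: values inside tend to be larger than outside.
    result = mannwhitneyu(p[inside], p[~inside], alternative="greater")
    return bool(result.pvalue < alpha)


rng = np.random.default_rng(0)
p = rng.uniform(0.0, 0.1, size=(8, 8))
p[2:6, 2:6] += 1.0                 # make the ROI clearly brighter
rejected = distribution_equivalence_test(p, rect_pos=(2, 2), rect_size=(4, 4))
```

In this synthetic map every ROI pixel is brighter than every outside pixel, so the null hypothesis is rejected.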


reduce(ret_format: tuple[str] = ('mean', 'std')) → dict[str, float]

Calculate the metric for already aggregated sailmaps

Parameters:
  • ret_format (tuple[str]) – (Default value = (“mean”, “std”))

Returns:

class detoxai.core.xai.ADR(**kwargs)

Bases: SailRectMetric

Average Difference in Region (ADR)

ADR measures the mean pixel-wise difference between vanilla and debiased saliency maps within the region of interest (ROI). A positive value indicates that vanilla saliency values are generally higher than debiased ones in the region.
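For a single pair of 2D saliency maps, ADR can be sketched as the mean of `vanilla - debiased` over the ROI, which matches the sign convention described above. The function name and the ROI convention (`rect_pos` as the top-left (row, column), `rect_size` as (height, width)) are assumptions.

```python
import numpy as np


def average_difference_in_region(vanilla, debiased, rect_pos, rect_size):
    """Mean of (vanilla - debiased) over the ROI pixels (sketch)."""
    row, col = rect_pos            # assumed: top-left corner as (row, col)
    height, width = rect_size      # assumed: (height, width)
    diff = vanilla - debiased
    return float(diff[row:row + height, col:col + width].mean())


vanilla = np.full((4, 4), 0.5)
debiased = np.full((4, 4), 0.5)
debiased[1:3, 1:3] = 0.25          # debiasing lowered saliency in the ROI
adr = average_difference_in_region(vanilla, debiased, (1, 1), (2, 2))
```

The positive result (0.25) reflects that the vanilla map is brighter than the debiased one inside the rectangle.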


class detoxai.core.xai.DIF(eps: float = 0.001, **kwargs)

Bases: SailRectMetric

Decreased Intensity Fraction (DIF)

DIF measures the ratio of pixels showing decreased intensity in the debiased model compared to the vanilla model. It represents the fraction of pixels inside a rectangle that significantly flipped their saliency value.
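One plausible reading of DIF is sketched below: the fraction of ROI pixels whose saliency dropped by more than `eps` from the vanilla to the debiased map. Treating `eps` as a minimum decrease is an assumption, as are the function name and the ROI convention (`rect_pos` as the top-left (row, column), `rect_size` as (height, width)).

```python
import numpy as np


def decreased_intensity_fraction(vanilla, debiased, rect_pos, rect_size, eps=0.001):
    """Share of ROI pixels whose saliency fell by more than eps (sketch)."""
    row, col = rect_pos            # assumed: top-left corner as (row, col)
    height, width = rect_size      # assumed: (height, width)
    drop = (vanilla - debiased)[row:row + height, col:col + width]
    return float(np.mean(drop > eps))


vanilla = np.full((4, 4), 0.5)
debiased = vanilla.copy()
debiased[1, 1] = 0.1               # large decrease
debiased[2, 1] = 0.2               # large decrease
debiased[1, 2] = 0.4995            # decrease smaller than eps, not counted
dif = decreased_intensity_fraction(vanilla, debiased, (1, 1), (2, 2))
```

Two of the four ROI pixels decreased by more than the threshold, so the sketch returns 0.5.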


class detoxai.core.xai.RDDT(**kwargs)

Bases: SailRectMetric

Rectangle Difference Distribution Testing (RDDT)

Performs a Wilcoxon signed-rank test to determine whether pixels from the vanilla model have significantly higher intensity than those from the debiased model within the ROI. Returns 1 if the test rejects the null hypothesis (indicating vanilla has higher intensity), 0 otherwise.
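A paired, one-sided test of this kind can be sketched with `scipy.stats.wilcoxon` on the ROI pixels of one vanilla and one debiased map. The significance level `alpha=0.05`, the function name and the ROI convention (`rect_pos` as the top-left (row, column), `rect_size` as (height, width)) are assumptions; how the library pools pixels across samples is not documented here.

```python
import numpy as np
from scipy.stats import wilcoxon


def rectangle_difference_test(vanilla, debiased, rect_pos, rect_size, alpha=0.05):
    """1 if vanilla ROI pixels are significantly brighter than debiased, else 0."""
    row, col = rect_pos            # assumed: top-left corner as (row, col)
    height, width = rect_size      # assumed: (height, width)
    v = vanilla[row:row + height, col:col + width].ravel()
    d = debiased[row:row + height, col:col + width].ravel()
    # Paired, one-sided alternative: vanilla - debiased tends to be positive.
    result = wilcoxon(v, d, alternative="greater")
    return int(result.pvalue < alpha)


rng = np.random.default_rng(0)
vanilla = rng.uniform(0.5, 1.0, size=(8, 8))
debiased = vanilla.copy()
debiased[2:6, 2:6] -= rng.uniform(0.1, 0.3, size=(4, 4))  # lower ROI saliency
flag = rectangle_difference_test(vanilla, debiased, (2, 2), (4, 4))
```

All sixteen paired differences are positive in this synthetic example, so the test rejects the null hypothesis and the sketch returns 1.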


reduce(ret_format: tuple[str] = ('mean', 'std')) → dict[str, float]
Parameters:
  • ret_format (tuple[str]) – (Default value = (“mean”, “std”))

Returns:

Module contents