detoxai.core
Submodules
detoxai.core.evaluation submodule
- detoxai.core.evaluation.evaluate_model(model: Module, dataloader: DataLoader, pareto_metrics: list[str] | None = None, device: str | None = None) dict[source]
Evaluate the model on various metrics
- Parameters:
model – Model to evaluate
dataloader – DataLoader for the dataset
pareto_metrics – List of metrics to include in the Pareto front
device – Device to use for evaluation (“cpu” or “cuda”)
Returns:
detoxai.core.interface submodule
- detoxai.core.interface.parse_methods_config(methods_config: dict) dict[source]
Compare the passed configuration against the defaults and overwrite the default values with those passed in
- Parameters:
methods_config (dict) – Configuration for each correction method
Returns:
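The override behaviour of parse_methods_config can be pictured with a small sketch. The defaults, the section names, and the helper `merge_methods_config` are hypothetical and are not taken from detoxai; the sketch only illustrates passed values taking precedence over defaults, section by section.

```python
import copy

# Hypothetical defaults, for illustration only (not detoxai's real defaults).
DEFAULT_METHODS_CONFIG = {
    "global": {"epsilon": 0.05},
    "method_a": {"steps": 10},
}


def merge_methods_config(user_config: dict, defaults: dict) -> dict:
    """Return a copy of `defaults` with entries from `user_config` overwriting them."""
    merged = copy.deepcopy(defaults)
    for section, overrides in user_config.items():
        # Unknown sections are added; known ones are updated key by key.
        merged.setdefault(section, {}).update(overrides)
    return merged


merged = merge_methods_config({"method_a": {"steps": 3}}, DEFAULT_METHODS_CONFIG)
```

The deep copy keeps the defaults untouched, so repeated calls always start from the same baseline.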
- detoxai.core.interface.debias(model: Module, dataloader: DetoxaiDataLoader | DataLoader, methods: list[str] | str = 'all', metrics: list[str] | str = 'all', methods_config: dict = {}, pareto_metrics: list[str] = ['balanced_accuracy', 'equalized_odds'], return_type: str = 'all', device: str = 'cpu', include_vanila_in_results: bool = True, test_dataloader: DetoxaiDataLoader | DataLoader = None, num_of_classes: int | None = None) CorrectionResult | dict[str, CorrectionResult][source]
Run a suite of correction methods on the model and return the results
- Parameters:
model – Model to run the correction methods on
dataloader – DetoxaiDataLoader object with the dataset
harmful_concept – Concept to debias, i.e. the protected attribute (not supported yet and not part of the signature above)
methods – List of correction methods to run
metrics – List of metrics to include in the configuration
methods_config – Configuration for each correction method
pareto_metrics – List of metrics to use for the pareto front and selection of best method
return_type (optional) – Type of results to return. Options are:
- “pareto-front”: Return the CorrectionResult objects only for results on the Pareto front
- “all”: Return the results for all correction methods
- “best”: Return the result for the best correction method, chosen with the ideal point method from the Pareto front
device (optional) – Device to run the correction methods on
include_vanila_in_results (optional) – Include the vanilla model in the results
test_dataloader (optional) – DataLoader for the test dataset. If not provided, the original dataloader is used
num_of_classes (optional) – Number of classes in the dataset. Default is None, which means the number of classes will be inferred from the dataloader
- detoxai.core.interface.run_correction(method: str, method_kwargs: dict, pareto_metrics: list[str] | None = None) CorrectionResult[source]
Run the specified correction method
- Parameters:
method (str) – Correction method to run
method_kwargs (dict) – Arguments for the correction method
pareto_metrics (list[str] | None) – (Default value = None)
Returns:
detoxai.core.interface_helpers submodule
- detoxai.core.interface_helpers.load_supported_tags() dict[source]
Load the label-mapping dicts from ./datasets/catalog/<dataset_name>/labels_mapping.yaml.
- detoxai.core.interface_helpers.construct_metrics_config(metrics: list[str] | str = 'all', types: str = 'GAP') dict[source]
Construct the metrics configuration for the fairness and performance metrics
- Parameters:
metrics (list[str] | str) – List of metrics to include in the configuration (Default value = “all”)
types (str) – Type of metric to use. Options are “GAP” or “RATIO” (Default value = “GAP”)
Returns:
- detoxai.core.interface_helpers.resolve_layer(model, layer) Module | None[source]
Resolve a layer name to a layer in the model
- Parameters:
model
layer
Returns:
- detoxai.core.interface_helpers.infer_layers(corrector, layers: list[str] | str) list[str][source]
Infer the layers to use for the correction method
Two wildcards are available:
- ‘last’: Use the last layer
- ‘penultimate’: Use the penultimate layer
Otherwise, a list of actual layer names can be passed
- Parameters:
corrector – Correction method object
layers – Layer specification
Returns:
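A minimal sketch of the wildcard handling in infer_layers follows. It assumes the candidate layers are available as an ordered list of names; the helper `resolve_layer_spec` and the layer names are hypothetical, and the real function takes a corrector object rather than a list.

```python
from __future__ import annotations


def resolve_layer_spec(layer_names: list[str], layers: list[str] | str) -> list[str]:
    """Expand the 'last' / 'penultimate' wildcards against an ordered list of layer names."""
    if layers == "last":
        return [layer_names[-1]]
    if layers == "penultimate":
        return [layer_names[-2]]
    # Anything else is treated as one or more explicit layer names.
    requested = [layers] if isinstance(layers, str) else list(layers)
    unknown = [name for name in requested if name not in layer_names]
    if unknown:
        raise ValueError(f"Unknown layer names: {unknown}")
    return requested


names = ["conv1", "conv2", "fc1", "fc2"]  # hypothetical layer names
last = resolve_layer_spec(names, "last")
penultimate = resolve_layer_spec(names, "penultimate")
explicit = resolve_layer_spec(names, ["conv2", "fc1"])
```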
detoxai.core.mcda_helpers submodule
- detoxai.core.mcda_helpers.is_pareto_efficient(costs: ndarray, return_mask: bool = True) ndarray[source]
Find the pareto-efficient points
- Parameters:
costs (np.ndarray) – An (n_points, n_costs) array
return_mask (bool) – True to return a mask (Default value = True)
- Returns:
The Pareto-efficient points. If return_mask is True, this is an (n_points,) boolean array; otherwise it is an (n_efficient_points,) integer array of indices.
Credit: https://stackoverflow.com/questions/32791911/fast-calculation-of-pareto-front-in-python
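The two return shapes of is_pareto_efficient can be illustrated with a short NumPy sketch. It is a straightforward quadratic-time version, not the implementation credited above, and it assumes every column of `costs` is to be minimized; the name `pareto_mask` is hypothetical.

```python
import numpy as np


def pareto_mask(costs: np.ndarray) -> np.ndarray:
    """Boolean mask of rows in `costs` that no other row dominates (lower is better)."""
    n_points = costs.shape[0]
    efficient = np.ones(n_points, dtype=bool)
    for i in range(n_points):
        # A row dominates row i if it is no worse everywhere and strictly better somewhere.
        no_worse = np.all(costs <= costs[i], axis=1)
        better = np.any(costs < costs[i], axis=1)
        efficient[i] = not np.any(no_worse & better)
    return efficient


costs = np.array([[1.0, 4.0], [2.0, 2.0], [3.0, 3.0], [4.0, 1.0]])
mask = pareto_mask(costs)       # boolean array of shape (n_points,)
indices = np.flatnonzero(mask)  # integer array of shape (n_efficient_points,)
```

Here the point (3, 3) is dominated by (2, 2), so it is the only one excluded from the front.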
- detoxai.core.mcda_helpers.filter_pareto_front(results: dict[str, CorrectionResult]) dict[str, CorrectionResult][source]
Filter the results to only include those on the pareto front
- Parameters:
results (dict[str, CorrectionResult]) – CorrectionResult objects to filter
Returns:
- detoxai.core.mcda_helpers.select_best_method(results: dict[str, CorrectionResult]) CorrectionResult[source]
Select the best correction method from the results using the ideal point method
- Parameters:
results (dict[str, CorrectionResult]) – CorrectionResult objects to choose from
Returns:
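One common reading of the ideal point method is sketched below: the ideal point takes the best observed value of each metric, and the candidate closest to it wins. The sketch assumes plain dicts of scores where higher is better for every metric and uses Euclidean distance without normalization; the function name, method names, and metric names are hypothetical, and detoxai may scale or orient its metrics differently.

```python
from __future__ import annotations

import math


def select_by_ideal_point(scores: dict[str, dict[str, float]]) -> str:
    """Return the key whose metric vector lies closest to the ideal point (higher is better)."""
    metric_names = sorted(next(iter(scores.values())))
    # Ideal point: the best value observed for each metric across all candidates.
    ideal = {m: max(s[m] for s in scores.values()) for m in metric_names}

    def distance(values: dict[str, float]) -> float:
        return math.sqrt(sum((ideal[m] - values[m]) ** 2 for m in metric_names))

    return min(scores, key=lambda name: distance(scores[name]))


candidates = {
    "method_a": {"metric_1": 0.90, "metric_2": 0.60},
    "method_b": {"metric_1": 0.85, "metric_2": 0.85},
    "method_c": {"metric_1": 0.70, "metric_2": 0.90},
}
best = select_by_ideal_point(candidates)
```

With these numbers the ideal point is (0.90, 0.90), and `method_b` is nearest to it even though it is not the top scorer on either metric.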
detoxai.core.model_wrappers submodule
- class detoxai.core.model_wrappers.BaseLightningWrapper(model: Module, criterion: Module | None = CrossEntropyLoss(), performance_metrics: MetricCollection | None = None, learning_rate: float | None = 0.001, optimizer: Optimizer | None = <class 'torch.optim.adam.Adam'>)[source]
Bases: LightningModule
- class detoxai.core.model_wrappers.FairnessLightningWrapper(model: Module, criterion: Module | None = CrossEntropyLoss(), performance_metrics: MetricCollection | None = None, fairness_metrics: MetricCollection | None = None, learning_rate: float | None = 0.001, optimizer: Optimizer | None = <class 'torch.optim.adam.Adam'>)[source]
Bases: BaseLightningWrapper
detoxai.core.results_class submodule
- class detoxai.core.results_class.CorrectionResult(method: str, model: BaseLightningWrapper, metrics: dict)[source]
Bases: object
- get_model() BaseLightningWrapper[source]
detoxai.core.xai submodule
- class detoxai.core.xai.XAIMetricsCalculator(dataloader: DetoxaiDataLoader, lrphandler: LRPHandler)[source]
Bases: object
- calculate_metrics(model: Module, rect_pos: tuple[int, int], rect_size: tuple[int, int], vanilla_model: Module = None, sailmap_metrics: list[str] = ['RRF', 'HRF', 'MRR', 'DET', 'ADR', 'DIF', 'RDDT'], batches: int = 2, condition_on: str = 'proper_label', verbose: bool = False, neutral_point: float = 0.5, abs_on_neutral: bool = True) dict[str, float][source]
Calculate the metrics for the given model and sailmaps
- Parameters:
model (nn.Module)
rect_pos (tuple[int, int])
rect_size (tuple[int, int])
vanilla_model (nn.Module) – (Default value = None)
sailmap_metrics (list[str])
batches (int) – (Default value = 2)
condition_on (str) – (Default value = ConditionOn.PROPER_LABEL.value)
verbose (bool) – (Default value = False)
source_range (#) (tuple[float, float]) – (Default value = (0))
neutral_point (float) – (Default value = 0.5)
abs_on_neutral (bool) – (Default value = True)
- Returns:
The calculated metrics where the key is the metric name and the value is the calculated metric
- Return type:
dict[str, float]
- class detoxai.core.xai.SailRectMetric[source]
Bases: ABC
- calculate_batch(sailmaps: ndarray, rect_pos: tuple[int, int], rect_size: tuple[int, int], ret_format: tuple[str] = ('mean', 'std')) dict[str, float][source]
Calculate the metric for a single batch of sailmaps
- Parameters:
sailmaps (np.ndarray)
rect_pos (tuple[int, int])
rect_size (tuple[int, int])
ret_format (tuple[str]) – (Default value = (“mean”, “std”))
Returns:
- reduce(ret_format: tuple[str] = ('mean', 'std')) dict[str, float][source]
Calculate the metric for already aggregated sailmaps
- Parameters:
ret_format (tuple[str]) – (Default value = (“mean”, “std”))
Returns:
- aggregate(sailmaps: ndarray, rect_pos: tuple[int, int], rect_size: tuple[int, int], vanilla_sailmaps: ndarray = None)[source]
Aggregate sailmaps for later calculation
- Parameters:
sailmaps (np.ndarray)
rect_pos (tuple[int, int])
rect_size (tuple[int, int])
vanilla_sailmaps (np.ndarray) – (Default value = None)
Returns:
- class detoxai.core.xai.RRF(**kwargs)[source]
Bases: SailRectMetric
Rectangle Relevance Fraction (RRF)
\begin{equation}
\mathbf{RRF} = \frac{\displaystyle \sum_{(i,j) \in R} p_{ij}}{\displaystyle \sum_{i = 1}^N \sum_{j = 1}^M p_{ij}}
\end{equation}
Here, $\mathbf{RRF}$ measures the fraction of total relevance that falls within the ROI.
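The RRF formula translates into a few lines of NumPy. This is a sketch, not the class's implementation: it assumes a single 2D relevance map, that `rect_pos` is the top-left corner given as (row, column), and that `rect_size` is (height, width). None of these conventions is confirmed by the reference, and the function name is hypothetical.

```python
import numpy as np


def rectangle_relevance_fraction(relevance, rect_pos, rect_size):
    """Share of the total relevance that lies inside the rectangle (assumes a non-zero total)."""
    row, col = rect_pos        # assumed: top-left corner as (row, column)
    height, width = rect_size  # assumed: (height, width)
    inside = relevance[row:row + height, col:col + width]
    return float(inside.sum() / relevance.sum())


relevance = np.ones((4, 4))
relevance[1:3, 1:3] = 3.0  # a 2x2 ROI carrying three times the background relevance
rrf = rectangle_relevance_fraction(relevance, rect_pos=(1, 1), rect_size=(2, 2))
```

The ROI holds 12 of the 24 units of relevance in this map, so half of the total falls inside it.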
- class detoxai.core.xai.HRF(epsilon: float = 0.05, **kwargs)[source]
Bases: SailRectMetric
High-Relevance Fraction (HRF)
\begin{equation}
\mathbf{HRF} = \displaystyle \frac{1}{\vert R \vert} \sum_{(i,j) \in R} \mathbbm{1}_{\{p_{ij} > \epsilon\}}
\end{equation}
$\mathbf{HRF}$ quantifies the proportion of pixels inside the ROI whose relevance exceeds a predefined threshold $\epsilon$, indicating how many pixels are highly important for prediction.
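A minimal NumPy sketch of the HRF formula follows. It assumes a single 2D relevance map, `rect_pos` as the top-left (row, column) corner, and `rect_size` as (height, width); these conventions and the function name are assumptions, not the class's documented behaviour.

```python
import numpy as np


def high_relevance_fraction(relevance, rect_pos, rect_size, epsilon=0.05):
    """Fraction of ROI pixels whose relevance exceeds `epsilon`."""
    row, col = rect_pos        # assumed: top-left corner as (row, column)
    height, width = rect_size  # assumed: (height, width)
    inside = relevance[row:row + height, col:col + width]
    # Mean of the indicator 1{p_ij > epsilon} over the |R| pixels of the ROI.
    return float(np.mean(inside > epsilon))


relevance = np.zeros((4, 4))
relevance[1, 1] = 0.9
relevance[1, 2] = 0.2
relevance[2, 1] = 0.01  # below the threshold, so not counted
hrf = high_relevance_fraction(relevance, rect_pos=(1, 1), rect_size=(2, 2), epsilon=0.05)
```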
- class detoxai.core.xai.MRR(**kwargs)[source]
Bases: SailRectMetric
Mean Relevance Ratio (MRR)
\begin{equation}
\mathbf{MRR} = \frac{\displaystyle \frac{1}{\vert R \vert} \sum_{(i,j) \in R} p_{ij}}{\displaystyle \frac{1}{N M - \vert R \vert} \sum_{(i,j) \notin R} p_{ij}},
\end{equation}
$\mathbf{MRR}$ quantifies the ratio of the mean pixel value inside the ROI to the mean pixel value outside it. $\mathbf{MRR} = 1$ indicates that the two means are equal, while $\mathbf{MRR} > 1$ indicates that the mean pixel intensity inside the ROI is higher than outside.
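The MRR formula can be sketched with a boolean ROI mask, which gives the inside and outside means directly. The sketch assumes a single 2D relevance map, `rect_pos` as the top-left (row, column) corner, `rect_size` as (height, width), and a non-zero mean outside the ROI; the function name is hypothetical.

```python
import numpy as np


def mean_relevance_ratio(relevance, rect_pos, rect_size):
    """Mean relevance inside the ROI divided by mean relevance outside it."""
    row, col = rect_pos        # assumed: top-left corner as (row, column)
    height, width = rect_size  # assumed: (height, width)
    in_roi = np.zeros(relevance.shape, dtype=bool)
    in_roi[row:row + height, col:col + width] = True
    # Mean over the |R| ROI pixels divided by the mean over the NM - |R| remaining pixels.
    return float(relevance[in_roi].mean() / relevance[~in_roi].mean())


relevance = np.ones((4, 4))
relevance[1:3, 1:3] = 3.0
mrr = mean_relevance_ratio(relevance, rect_pos=(1, 1), rect_size=(2, 2))
```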
- class detoxai.core.xai.DET(**kwargs)[source]
Bases: SailRectMetric
Distribution Equivalence Testing (DET)
The goal of the statistical test is to determine whether the pixels \textit{inside} the rectangle have higher intensity than those \textit{outside} the rectangle. Since the number of pixels and their intensity distributions inside and outside the ROI can vary, a non-parametric, unpaired statistical Mann-Whitney-Wilcoxon test is used. This permutation test assesses whether the intensity values from one group (inside) tend to be higher than those from the other (outside).
The null hypothesis $H_0$ for the test is that the intensity distributions inside and outside the rectangle are equal:
\begin{equation}
\begin{split}
H_0: F_{\text{inside}}(x) &= F_{\text{outside}}(x) \\
H_1: F_{\text{inside}}(x) &> F_{\text{outside}}(x)
\end{split}
\end{equation}
To perform the test, all pixel intensities are ranked, and the sum of ranks for each group (inside and outside the ROI) is computed. The test then evaluates the probability that the intensity values inside the rectangle are statistically higher than those outside. The final outcome of the DET is a binary decision: \textbf{TRUE} indicates that the null hypothesis is rejected (i.e., there is statistically significant evidence that the pixels inside the rectangle have higher intensity), while \textbf{FALSE} signifies that we fail to reject the null hypothesis, meaning that the evidence is inconclusive regarding a higher intensity inside the rectangle.
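The test described for DET can be sketched with SciPy's `mannwhitneyu`, using the one-sided alternative. The sketch assumes a single 2D saliency map, `rect_pos` as the top-left (row, column) corner, and `rect_size` as (height, width). The significance level `alpha` and the function name are hypothetical, since the reference does not state which threshold the class uses.

```python
import numpy as np
from scipy.stats import mannwhitneyu


def inside_is_brighter(saliency, rect_pos, rect_size, alpha=0.05):
    """True if inside-ROI intensities are significantly higher than outside-ROI ones."""
    row, col = rect_pos        # assumed: top-left corner as (row, column)
    height, width = rect_size  # assumed: (height, width)
    in_roi = np.zeros(saliency.shape, dtype=bool)
    in_roi[row:row + height, col:col + width] = True
    # Unpaired, one-sided test: H1 says inside values tend to exceed outside values.
    result = mannwhitneyu(saliency[in_roi], saliency[~in_roi], alternative="greater")
    return bool(result.pvalue < alpha)


rng = np.random.default_rng(0)
saliency = rng.random((16, 16)) * 0.2  # background intensities in [0, 0.2)
saliency[4:10, 4:10] += 0.8            # ROI intensities in [0.8, 1.0)
rejected = inside_is_brighter(saliency, rect_pos=(4, 4), rect_size=(6, 6))
```

Because every ROI pixel in this synthetic map is brighter than every background pixel, the null hypothesis is rejected.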
- class detoxai.core.xai.ADR(**kwargs)[source]
Bases: SailRectMetric
Average Difference in Region (ADR)
ADR measures the mean pixel-wise difference between vanilla and debiased saliency maps within the region of interest (ROI). A positive value indicates that vanilla saliency values are generally higher than debiased ones in the region.
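A minimal NumPy sketch of ADR, taken as the mean of (vanilla minus debiased) over the ROI, follows. It assumes two 2D saliency maps of equal shape, `rect_pos` as the top-left (row, column) corner, and `rect_size` as (height, width); the function name is hypothetical.

```python
import numpy as np


def average_difference_in_region(vanilla, debiased, rect_pos, rect_size):
    """Mean pixel-wise difference (vanilla - debiased) inside the ROI."""
    row, col = rect_pos        # assumed: top-left corner as (row, column)
    height, width = rect_size  # assumed: (height, width)
    roi = (slice(row, row + height), slice(col, col + width))
    # Positive when vanilla saliency is higher than debiased saliency on average.
    return float(np.mean(vanilla[roi] - debiased[roi]))


vanilla = np.full((4, 4), 0.8)
debiased = np.full((4, 4), 0.5)
adr = average_difference_in_region(vanilla, debiased, rect_pos=(1, 1), rect_size=(2, 2))
```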
- class detoxai.core.xai.DIF(eps: float = 0.001, **kwargs)[source]
Bases: SailRectMetric
Decreased Intensity Fraction (DIF)
DIF measures the ratio of pixels showing decreased intensity in the debiased model compared to the vanilla model. It represents the fraction of pixels inside a rectangle that significantly flipped their saliency value.
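One plausible reading of DIF is sketched below: a ROI pixel counts as decreased when its vanilla saliency exceeds its debiased saliency by more than `eps`. How the class actually applies `eps` is an assumption here, as are the (row, column) and (height, width) conventions and the function name.

```python
import numpy as np


def decreased_intensity_fraction(vanilla, debiased, rect_pos, rect_size, eps=0.001):
    """Fraction of ROI pixels whose saliency dropped by more than `eps` after debiasing."""
    row, col = rect_pos        # assumed: top-left corner as (row, column)
    height, width = rect_size  # assumed: (height, width)
    roi = (slice(row, row + height), slice(col, col + width))
    # Assumed rule: a drop only counts when it is larger than eps.
    decreased = (vanilla[roi] - debiased[roi]) > eps
    return float(np.mean(decreased))


vanilla = np.full((4, 4), 0.8)
debiased = vanilla.copy()
debiased[1, 1] = 0.2     # clear drop, counted
debiased[1, 2] = 0.7995  # drop of 0.0005, smaller than eps, not counted
dif = decreased_intensity_fraction(vanilla, debiased, rect_pos=(1, 1), rect_size=(2, 2))
```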
- class detoxai.core.xai.RDDT(**kwargs)[source]
Bases: SailRectMetric
Rectangle Difference Distribution Testing (RDDT)
Performs a Wilcoxon signed rank test to determine if pixels from the vanilla model have significantly higher intensity than those from the debiased model within the ROI. Returns 1 if the test rejects the null hypothesis (indicating vanilla has higher intensity), 0 otherwise.
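The paired test described for RDDT can be sketched with SciPy's `wilcoxon`, using the one-sided alternative on per-pixel differences. The sketch assumes two 2D saliency maps of equal shape, `rect_pos` as the top-left (row, column) corner, and `rect_size` as (height, width). The significance level `alpha` and the function name are hypothetical, since the reference does not state the threshold the class uses.

```python
import numpy as np
from scipy.stats import wilcoxon


def vanilla_is_brighter(vanilla, debiased, rect_pos, rect_size, alpha=0.05):
    """Return 1 if vanilla saliency is significantly higher than debiased in the ROI, else 0."""
    row, col = rect_pos        # assumed: top-left corner as (row, column)
    height, width = rect_size  # assumed: (height, width)
    roi = (slice(row, row + height), slice(col, col + width))
    # Paired, one-sided signed-rank test on the differences (vanilla - debiased).
    result = wilcoxon(vanilla[roi].ravel(), debiased[roi].ravel(), alternative="greater")
    return int(result.pvalue < alpha)


rng = np.random.default_rng(0)
vanilla = rng.random((16, 16)) + 1.0
debiased = vanilla - (0.1 + 0.4 * rng.random((16, 16)))  # every pixel drops by 0.1 to 0.5
outcome = vanilla_is_brighter(vanilla, debiased, rect_pos=(4, 4), rect_size=(6, 6))
```

Unlike the unpaired test used for DET, this one compares each pixel with itself across the two models, which matches the "signed rank" wording above.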