detoxai.methods.savani

Submodules

detoxai.methods.savani.adversarial submodule

class detoxai.methods.savani.adversarial.SavaniAFT(model: Module | LightningModule, experiment_name: str, device: str, seed: int = 123, **kwargs)[source]

Bases: SavaniBase

apply_model_correction(dataloader: DataLoader, last_layer_name: str, epsilon: float = 0.1, bias_metric: BiasMetrics | str = BiasMetrics.EO_GAP, iterations: int = 10, critic_iterations: int = 5, model_iterations: int = 5, train_batch_size: int = 128, thresh_optimizer_maxiter: int = 100, tau_init: float = 0.5, lam: float = 1.0, delta: float = 0.01, critic_lr: float = 0.0001, model_lr: float = 0.0001, critic_filters: list[int] = [8, 16, 32], critic_linear: list[int] = [32], outputs_are_logits: bool = True, n_eval_batches: int = 3, soft_thresh_temperature: float = 10.0, **kwargs) None[source]

Do layer-wise optimization to find the best weights for each layer and the best threshold tau

Parameters:
  • dataloader – DataLoader:

  • last_layer_name – str:

  • epsilon – float: (Default value = 0.1)

  • bias_metric – BiasMetrics | str: (Default value = BiasMetrics.EO_GAP)

  • iterations – int: (Default value = 10)

  • critic_iterations – int: (Default value = 5)

  • model_iterations – int: (Default value = 5)

  • train_batch_size – int: (Default value = 128)

  • thresh_optimizer_maxiter – int: (Default value = 100)

  • tau_init – float: (Default value = 0.5)

  • lam – float: (Default value = 1.0)

  • delta – float: (Default value = 0.01)

  • critic_lr – float: (Default value = 1e-4)

  • model_lr – float: (Default value = 1e-4)

  • critic_filters – list[int]: (Default value = [8, 16, 32])

  • critic_linear – list[int]: (Default value = [32])

  • outputs_are_logits – bool: (Default value = True)

  • n_eval_batches – int: (Default value = 3)

  • soft_thresh_temperature – float: (Default value = 10.0)

  • **kwargs

Returns:

fair_loss(y_logits, y_true, input)[source]
Parameters:
  • y_logits

  • y_true

  • input

Returns:

get_critic(channels: int, critic_filters: list[int], critic_linear: list[int], batch_size: int) Module[source]
Parameters:
  • channels – int:

  • critic_filters – list[int]:

  • critic_linear – list[int]:

  • batch_size – int:

Returns:

detoxai.methods.savani.lay_wis_opt submodule

class detoxai.methods.savani.lay_wis_opt.SavaniLWO(model: Module | LightningModule, experiment_name: str, device: str, seed: int = 123, **kwargs)[source]

Bases: SavaniBase

apply_model_correction(dataloader: DataLoader, last_layer_name: str, epsilon: float = 0.1, bias_metric: BiasMetrics | str = BiasMetrics.EO_GAP, n_layers_to_optimize: int | str = 'all', thresh_optimizer_maxiter: int = 100, beta: float = 2.2, params_to_opt: int | float = 0.5, never_more_than: int = 50000, tau_init: float = 0.5, outputs_are_logits: bool = True, n_eval_batches: int = 3, eval_batch_size: int = 128, skopt_verbose: bool = False, skopt_njobs: int = 4, skopt_npoints: int = 1000, skopt_maxiter: int = 10, soft_thresh_temperature: float = 10.0, **kwargs) None[source]

Do layer-wise optimization to find the best weights for each layer and the best threshold tau

Parameters:
  • dataloader – DataLoader:

  • last_layer_name – str:

  • epsilon – float: (Default value = 0.1)

  • bias_metric – BiasMetrics | str: (Default value = BiasMetrics.EO_GAP)

  • n_layers_to_optimize – int | str: (Default value = “all”)

  • thresh_optimizer_maxiter – int: (Default value = 100)

  • beta – float: (Default value = 2.2)

  • params_to_opt – int | float: (Default value = 0.5)

  • never_more_than – int: (Default value = 50_000)

  • tau_init – float: (Default value = 0.5)

  • outputs_are_logits – bool: (Default value = True)

  • n_eval_batches – int: (Default value = 3)

  • eval_batch_size – int: (Default value = 128)

  • skopt_verbose – bool: (Default value = False)

  • skopt_njobs – int: (Default value = 4)

  • skopt_npoints – int: (Default value = 1000)

  • skopt_maxiter – int: (Default value = 10)

  • soft_thresh_temperature – float: (Default value = 10.0)

  • **kwargs

Returns:

objective_LWO(o_params: Tensor, tau: float, indices: list) callable[source]

Objective function for the layer-wise optimization

Parameters:
  • o_params – torch.Tensor: The original parameters

  • tau – float: The threshold value

  • indices – list: The indices of the selected neurons

Returns:

The objective function

flatten_select(params: Tensor, select_cnt: float | int, total_params: int) tuple[Tensor, list][source]

Take an n-dimensional array,

Parameters:
  • params – torch.Tensor:

  • select_cnt – float | int: The number of neurons to select

  • total_params – int: The total number of parameters

Returns:

A 1-dimensional array of selected neurons, and a 1-dimensional array of indices of the selected neurons

unflatten(o_params: Tensor, f_params: Tensor, indices: list) Tensor[source]

Unflatten the parameters

Parameters:
  • o_params – torch.Tensor: The original parameters

  • f_params – torch.Tensor: The flattened parameters

  • indices – list: The indices of the selected neurons

Returns:

The unflattened parameters
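The round trip described by flatten_select and unflatten can be illustrated with a minimal plain-Python sketch. This is not the library's implementation: it works on nested lists instead of torch.Tensor, the helper names (flatten, flatten_select_sketch, unflatten_sketch) are hypothetical, and it assumes that a float select_cnt is a fraction of all entries while an int is an absolute count.

```python
import random


def flatten(nested):
    """Flatten arbitrarily nested lists into one flat list."""
    if not isinstance(nested, list):
        return [nested]
    flat = []
    for item in nested:
        flat.extend(flatten(item))
    return flat


def flatten_select_sketch(params, select_cnt, seed=123):
    """Return (selected values, their flat indices) for a random subset."""
    flat = flatten(params)
    # Assumption: float = fraction of entries, int = absolute count.
    if isinstance(select_cnt, float):
        k = max(1, int(select_cnt * len(flat)))
    else:
        k = min(select_cnt, len(flat))
    rng = random.Random(seed)
    indices = sorted(rng.sample(range(len(flat)), k))
    return [flat[i] for i in indices], indices


def unflatten_sketch(o_params, f_params, indices):
    """Write f_params back at `indices` and restore the original nesting."""
    flat = flatten(o_params)
    for i, value in zip(indices, f_params):
        flat[i] = value
    values = iter(flat)

    def rebuild(template):
        # Walk the original structure and refill it in flattening order.
        if not isinstance(template, list):
            return next(values)
        return [rebuild(t) for t in template]

    return rebuild(o_params)


weights = [[1.0, 2.0], [3.0, 4.0]]
selected, idx = flatten_select_sketch(weights, 0.5)
restored = unflatten_sketch(weights, [0.0] * len(idx), idx)
```

Only the entries at the selected indices change; everything else keeps its original value and position, which is what lets an optimizer work on a small flat vector while the layer keeps its shape.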

detoxai.methods.savani.random_perturbation submodule

class detoxai.methods.savani.random_perturbation.SavaniRP(model: Module | LightningModule, experiment_name: str, device: str, seed: int = 123, **kwargs)[source]

Bases: SavaniBase

apply_model_correction(dataloader: DataLoader, last_layer_name: str, epsilon: float = 0.1, T_iters: int = 15, bias_metric: BiasMetrics | str = BiasMetrics.EO_GAP, optimizer_maxiter: int = 100, tau_init: float = 0.5, outputs_are_logits: bool = True, options: dict = {}, eval_batch_size: int = 128, n_eval_batches: int = 3, soft_thresh_temperature: float = 10.0, **kwargs) None[source]

Apply random weight perturbations to the model, then select the threshold tau that maximizes phi.

To change the perturbation parameters, pass the mean and std of the Gaussian noise in options, e.g. options = {'mean': 1.0, 'std': 0.1}
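The perturb-and-keep-the-best loop can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the SavaniRP code: it assumes the Gaussian noise is applied multiplicatively (suggested by a mean of 1.0), it uses a flat list of floats in place of model weights, and score_fn is a hypothetical stand-in for evaluating phi at the best threshold.

```python
import random


def perturb(weights, mean=1.0, std=0.1, rng=None):
    """Multiply every weight by independent Gaussian noise (assumed form)."""
    rng = rng or random.Random(123)
    return [w * rng.gauss(mean, std) for w in weights]


def random_perturbation_search(weights, score_fn, t_iters=15, options=None, seed=123):
    """Try t_iters perturbed copies and keep the highest-scoring one."""
    options = options or {}
    mean = options.get("mean", 1.0)
    std = options.get("std", 0.1)
    rng = random.Random(seed)
    best_weights, best_score = list(weights), score_fn(weights)
    for _ in range(t_iters):
        candidate = perturb(weights, mean, std, rng)
        score = score_fn(candidate)
        if score > best_score:
            best_weights, best_score = candidate, score
    return best_weights, best_score


# Toy objective (hypothetical): higher is better, maximal when all weights equal 1.
def toy_score(ws):
    return -sum((w - 1.0) ** 2 for w in ws)


start = [0.5, 1.5, 2.0]
best_w, best_s = random_perturbation_search(start, toy_score, options={"mean": 1.0, "std": 0.1})
```

Because the unperturbed weights are the initial "best" candidate, the search in this sketch never returns something that scores worse than the starting point.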

Parameters:
  • dataloader – DataLoader:

  • last_layer_name – str:

  • epsilon – float: (Default value = 0.1)

  • T_iters – int: (Default value = 15)

  • bias_metric – BiasMetrics | str: (Default value = BiasMetrics.EO_GAP)

  • optimizer_maxiter – int: (Default value = 100)

  • tau_init – float: (Default value = 0.5)

  • outputs_are_logits – bool: (Default value = True)

  • options – dict: (Default value = {})

  • eval_batch_size – int: (Default value = 128)

  • n_eval_batches – int: (Default value = 3)

  • soft_thresh_temperature – float: (Default value = 10.0)

  • **kwargs

Returns:

detoxai.methods.savani.savani_base submodule

class detoxai.methods.savani.savani_base.SavaniBase(model: Module | LightningModule, experiment_name: str, device: str, seed: int = 123)[source]

Bases: ModelCorrectionMethod, ABC

abstractmethod apply_model_correction() None[source]
optimize_tau(tau_init: float, thresh_optimizer_maxiter: int) tuple[float, float][source]
Parameters:
  • tau_init – float:

  • thresh_optimizer_maxiter – int:

Returns:

objective_thresh(backend: str, cache_preds: bool = True, direction: str = 'min') callable[source]
Parameters:
  • backend – str:

  • cache_preds – bool: (Default value = True)

  • direction – str: (Default value = “min”)

Returns:

phi_torch(tau: Tensor, cached: tuple | None = None) tuple[Tensor, Tensor][source]

Calculate the phi metric for a given threshold tau
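To make the role of tau concrete, the sketch below thresholds predicted probabilities at tau and searches for the value that maximizes a score. It is a simplified stand-in: it uses an exhaustive grid instead of whatever optimizer the library uses, score_fn is a hypothetical placeholder for phi, and returning the pair (best tau, best score) is an assumption about what a tuple[float, float] result could hold.

```python
def predict(probs, tau):
    """Hard-threshold probabilities: predict 1 where p > tau."""
    return [1 if p > tau else 0 for p in probs]


def optimize_tau_grid(probs, score_fn, tau_init=0.5, steps=101):
    """Grid search over tau in [0, 1]; keeps tau_init unless a tau is strictly better."""
    best_tau = tau_init
    best_score = score_fn(predict(probs, tau_init))
    for i in range(steps):
        tau = i / (steps - 1)
        score = score_fn(predict(probs, tau))
        if score > best_score:
            best_tau, best_score = tau, score
    return best_tau, best_score


# Toy data (hypothetical): plain accuracy stands in for phi.
probs = [0.2, 0.4, 0.6, 0.9]
y_true = [0, 1, 1, 1]


def accuracy(y_pred):
    return sum(int(t == p) for t, p in zip(y_true, y_pred)) / len(y_true)


tau, score = optimize_tau_grid(probs, accuracy)
```

With the default tau of 0.5 the second sample is misclassified; lowering the threshold below 0.4 fixes it, which is the kind of adjustment the threshold search is meant to find.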

Parameters:
  • tau – torch.Tensor:

  • cached – tuple | None: (Default value = None)

Returns:

apply_hook(tau: float, temperature: float = 100) None[source]
Parameters:
  • tau – float:

  • temperature – float: (Default value = 100)

Returns:

get_pred_true_prot() tuple[Tensor, Tensor, Tensor][source]
check_layer_name_exists(layer_name: str) bool[source]
Parameters:
  • layer_name – str:

Returns:

sample_batch() tuple[Tensor, Tensor, Tensor][source]

Sample a single batch from a dataloader

initialize_dataloader(dataloader: DataLoader, batch_size: int) None[source]
Parameters:
  • dataloader – DataLoader:

  • batch_size – int:

Returns:

detoxai.methods.savani.utils submodule

detoxai.methods.savani.utils.phi_torch(Y_true: Tensor, Y_pred: Tensor, ProtAttr: Tensor, epsilon: float = 0.05, bias_metric: BiasMetrics | str = BiasMetrics.TPR_GAP) tuple[Tensor, Tensor][source]

Calculate phi as in the paper

phi = balanced_accuracy(Y_true, Y_pred) if bias < epsilon else 0
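The rule above can be sketched in plain Python for binary labels and a binary protected attribute. This is an illustration only: it uses lists instead of tensors, it takes the TPR gap as the bias measure (matching the default bias_metric), and returning the pair (phi, bias) is an assumption rather than a documented contract. The helper names are hypothetical.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of the true-positive rate and the true-negative rate."""
    pos = [p for t, p in zip(y_true, y_pred) if t == 1]
    neg = [p for t, p in zip(y_true, y_pred) if t == 0]
    tpr = sum(pos) / len(pos) if pos else 0.0
    tnr = sum(1 - p for p in neg) / len(neg) if neg else 0.0
    return 0.5 * (tpr + tnr)


def tpr_gap(y_true, y_pred, prot_attr):
    """Absolute difference in true-positive rate between the two groups."""
    def group_tpr(group):
        hits = [p for t, p, a in zip(y_true, y_pred, prot_attr) if a == group and t == 1]
        return sum(hits) / len(hits) if hits else 0.0
    return abs(group_tpr(1) - group_tpr(0))


def phi_sketch(y_true, y_pred, prot_attr, epsilon=0.05):
    """phi = balanced accuracy if bias < epsilon, else 0."""
    bias = tpr_gap(y_true, y_pred, prot_attr)
    phi = balanced_accuracy(y_true, y_pred) if bias < epsilon else 0.0
    return phi, bias


y_true = [1, 1, 0, 0, 1, 1, 0, 0]
prot = [0, 0, 0, 0, 1, 1, 1, 1]
fair_pred = [1, 0, 0, 0, 1, 0, 0, 1]    # same TPR in both groups
biased_pred = [1, 1, 0, 0, 0, 0, 0, 0]  # positives found only in group 0

phi_fair, bias_fair = phi_sketch(y_true, fair_pred, prot)
phi_biased, bias_biased = phi_sketch(y_true, biased_pred, prot)
```

The hard cutoff means a model that violates the bias constraint scores 0 regardless of its accuracy, so maximizing phi only rewards accuracy among solutions whose bias stays below epsilon.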

Parameters:
  • Y_true – torch.Tensor:

  • Y_pred – torch.Tensor:

  • ProtAttr – torch.Tensor:

  • epsilon – float: (Default value = 0.05)

  • bias_metric – BiasMetrics | str: (Default value = BiasMetrics.TPR_GAP)

Returns:

detoxai.methods.savani.zhang submodule

class detoxai.methods.savani.zhang.ZhangM(model: Module | LightningModule, experiment_name: str, device: str, seed: int = 123, **kwargs)[source]

Bases: SavaniBase

Brian Hu Zhang, Blake Lemoine, Margaret Mitchell - “Mitigating Unwanted Biases with Adversarial Learning”

apply_model_correction(dataloader: DataLoader, last_layer_name: str, epsilon: float = 0.1, bias_metric: BiasMetrics | str = BiasMetrics.EO_GAP, iterations: int = 5, critic_iterations: int = 5, model_iterations: int = 2, train_batch_size: int = 128, thresh_optimizer_maxiter: int = 100, tau_init: float = 0.5, critic_lr: float = 0.0002, model_lr: float = 0.0001, critic_linear: list[int] = [256, 256, 256], outputs_are_logits: bool = True, n_eval_batches: int = 3, soft_thresh_temperature: float = 10.0, **kwargs) None[source]

Do layer-wise optimization to find the best weights for each layer and the best threshold tau

If your model already outputs probabilities, pass outputs_are_logits=False; in that case the softmax function is not applied.

Parameters:
  • dataloader – DataLoader:

  • last_layer_name – str:

  • epsilon – float: (Default value = 0.1)

  • bias_metric – BiasMetrics | str: (Default value = BiasMetrics.EO_GAP)

  • iterations – int: (Default value = 5)

  • critic_iterations – int: (Default value = 5)

  • model_iterations – int: (Default value = 2)

  • train_batch_size – int: (Default value = 128)

  • thresh_optimizer_maxiter – int: (Default value = 100)

  • tau_init – float: (Default value = 0.5)

  • alpha – float: (Default value = 5.0). Marked with # in the docstring and not part of the signature above.

  • critic_lr – float: (Default value = 2e-4)

  • model_lr – float: (Default value = 1e-4)

  • critic_linear – list[int]: (Default value = [256, 256, 256])

  • outputs_are_logits – bool: (Default value = True)

  • n_eval_batches – int: (Default value = 3)

  • soft_thresh_temperature – float: (Default value = 10.0)

  • **kwargs

Returns:

get_critic(input_dim: int, critic_linear: list[int]) Module[source]
Parameters:
  • input_dim – int:

  • critic_linear – list[int]:

Returns:

Module contents