detoxai.methods.leace

Submodules

detoxai.methods.leace.leace submodule

class detoxai.methods.leace.leace.LEACE(model: Module | LightningModule, experiment_name: str, device: str, **kwargs)

Bases: ModelCorrectionMethod

extract_activations(dataloader: DataLoader, intervention_layers: list[str], use_cache: bool = True, save_dir: str = '/home/docs/.detoxai/activations') → None

Parameters:

dataloader (torch.utils.data.DataLoader)
intervention_layers (list[str])
use_cache (bool) – Default value = True.
save_dir (str) – Default value = ACTIVATIONS_DIR.

Returns:

None
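
The general technique behind activation extraction can be sketched with PyTorch forward hooks. This is a minimal illustration, not the DetoxAI implementation: the helper name collect_activations and the toy model are hypothetical, and the caching that use_cache and save_dir control is left out.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset


def collect_activations(model, dataloader, layer_names, device="cpu"):
    """Hypothetical helper: run the model once and record each named layer's output."""
    modules = dict(model.named_modules())
    store = {name: [] for name in layer_names}
    handles = []
    for name in layer_names:
        # Bind `name` as a default argument so each hook writes to its own list.
        def hook(_module, _inputs, output, name=name):
            store[name].append(output.detach().cpu())

        handles.append(modules[name].register_forward_hook(hook))

    model.eval().to(device)
    try:
        with torch.no_grad():
            for batch in dataloader:
                model(batch[0].to(device))
    finally:
        # Always detach the hooks so the model is left unchanged.
        for handle in handles:
            handle.remove()
    return {name: torch.cat(chunks) for name, chunks in store.items()}


# Toy model; nn.Sequential names its children "0", "1", "2".
model = nn.Sequential(nn.Linear(4, 3), nn.ReLU(), nn.Linear(3, 2))
loader = DataLoader(TensorDataset(torch.randn(10, 4)), batch_size=4)
acts = collect_activations(model, loader, ["0", "2"])
```

Layer names here follow model.named_modules(), which is assumed (not confirmed by this page) to be the naming scheme that intervention_layers expects.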

apply_model_correction(intervention_layers: list[str], use_n_examples: int = 15000, **kwargs) → None

Apply the LEACE eraser to the specified layers of the model.

Parameters:

intervention_layers (list[str])
use_n_examples (int) – Default value = 15_000.
**kwargs

Returns:

None
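
For orientation, the closed-form LEACE transform can be sketched in a few lines of PyTorch. This is a standalone illustration under stated assumptions, not the LeaceEraser class this method uses: fit_leace, erase and cross_cov are hypothetical names, and the data is synthetic. The eraser whitens the activations, projects out the subspace correlated with the concept labels, and un-whitens, which drives the cross-covariance between activations and labels to zero.

```python
import torch


def fit_leace(X, Z, eps=1e-8):
    """Hypothetical helper: return (mean, A) with erase(x) = x - A (x - mean)."""
    mu_x = X.mean(dim=0)
    Xc = X - mu_x
    Zc = Z - Z.mean(dim=0)
    n = X.shape[0]
    sigma_xx = Xc.T @ Xc / n  # covariance of the activations
    sigma_xz = Xc.T @ Zc / n  # cross-covariance with the concept labels

    # Whitening matrix W = sigma_xx^(-1/2) and its pseudo-inverse.
    evals, evecs = torch.linalg.eigh(sigma_xx)
    keep = evals > eps
    evals, evecs = evals[keep], evecs[:, keep]
    W = evecs @ torch.diag(evals.rsqrt()) @ evecs.T
    W_pinv = evecs @ torch.diag(evals.sqrt()) @ evecs.T

    # Orthogonal projection onto the column space of W @ sigma_xz.
    U, S, _ = torch.linalg.svd(W @ sigma_xz, full_matrices=False)
    U = U[:, S > eps]
    P = U @ U.T
    return mu_x, W_pinv @ P @ W


def erase(X, mu_x, A):
    # Row-vector form of x - A (x - mu).
    return X - (X - mu_x) @ A.T


def cross_cov(X, Z):
    """Largest absolute entry of the empirical cross-covariance."""
    return ((X - X.mean(0)).T @ (Z - Z.mean(0)) / X.shape[0]).abs().max().item()


torch.manual_seed(0)
X = torch.randn(500, 8, dtype=torch.float64)
Z = (X[:, :1] > 0).to(torch.float64)  # synthetic concept label tied to feature 0
mu_x, A = fit_leace(X, Z)
X_erased = erase(X, mu_x, A)
```

Comparing cross_cov(X, Z) with cross_cov(X_erased, Z) shows the effect: the second value is zero up to floating-point error, so no linear classifier can recover Z from the erased activations better than a constant predictor.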

add_clarc_hook(eraser: LeaceEraser, layer_names: list) → None

Applies debiasing to the specified layers of the PyTorch model by hooking the provided LEACE eraser onto them.

Parameters:

eraser (LeaceEraser) – The LEACE eraser to apply to the layer activations.
layer_names (list) – List of layer names (strings) to apply the hook on.

Returns:

A list of hook handles. Keep them to remove hooks later if needed.

Return type:

list
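
How a hook can rewrite a layer's output may be sketched as follows. This is a simplified illustration, not the library's code: add_erasing_hooks and remove_direction are hypothetical, and a plain projection along one direction stands in for a LEACE eraser. A PyTorch forward hook that returns a value replaces the module's output, and the returned handles allow the hooks to be removed later.

```python
import torch
from torch import nn


def add_erasing_hooks(model, erase_fn, layer_names):
    """Hypothetical helper: apply erase_fn to the output of each named layer."""
    modules = dict(model.named_modules())

    def hook(_module, _inputs, output):
        # Flatten to (batch, features), erase, then restore the original shape.
        flat = output.flatten(start_dim=1)
        return erase_fn(flat).reshape(output.shape)

    return [modules[name].register_forward_hook(hook) for name in layer_names]


torch.manual_seed(0)
model = nn.Sequential(nn.Linear(4, 3), nn.Identity())
v = torch.tensor([1.0, 0.0, 0.0])  # direction to remove (stand-in for an eraser)


def remove_direction(a):
    return a - (a @ v).unsqueeze(1) * v


handles = add_erasing_hooks(model, remove_direction, ["0"])
x = torch.randn(5, 4)
with torch.no_grad():
    hooked = model(x)  # component along v is removed

for handle in handles:
    handle.remove()
with torch.no_grad():
    plain = model(x)  # original behaviour restored
```

Keeping the handles makes the correction reversible: once they are removed, the model produces its original activations again.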
