detoxai.methods.leace
Submodules
detoxai.methods.leace.leace submodule
- class detoxai.methods.leace.leace.LEACE(model: Module | LightningModule, experiment_name: str, device: str, **kwargs)
Bases: ModelCorrectionMethod
- extract_activations(dataloader: DataLoader, intervention_layers: list[str], use_cache: bool = True, save_dir: str = '/home/docs/.detoxai/activations') -> None
- Parameters:
dataloader (torch.utils.data.DataLoader)
intervention_layers (list[str])
use_cache (bool) – Default value = True.
save_dir (str) – Default value = ACTIVATIONS_DIR.
Returns: None
- apply_model_correction(intervention_layers: list[str], use_n_examples: int = 15000, **kwargs) -> None
Apply the LEACE eraser to the specified layers of the model.
- Parameters:
intervention_layers (list[str])
use_n_examples (int) – Default value = 15000.
**kwargs
Returns: None
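To make "apply the LEACE eraser" concrete, the following is a minimal NumPy sketch of the general LEACE idea: fit an affine map from activations X and concept labels Z so that the erased activations have zero cross-covariance with Z. It is an illustration under stated assumptions, not the detoxai implementation. The function names fit_leace and erase are hypothetical, and the closed form used (whiten, project out the column space of the whitened cross-covariance, unwhiten) follows the published LEACE formulation.

```python
import numpy as np


def fit_leace(X, Z, tol=1e-10):
    """Fit a LEACE-style eraser (hypothetical helper, not detoxai API).

    X: (n, d) activations, Z: (n, k) concept labels.
    Returns (mu, P) so that erase(X, mu, P) is linearly uncorrelated with Z.
    """
    n = X.shape[0]
    mu = X.mean(axis=0)
    Xc = X - mu
    Zc = Z - Z.mean(axis=0)
    sigma_xx = Xc.T @ Xc / n  # (d, d) covariance of activations
    sigma_xz = Xc.T @ Zc / n  # (d, k) cross-covariance with the concept

    # Whitening matrix W = sigma_xx^(-1/2) and its pseudo-inverse,
    # both built from the eigendecomposition of sigma_xx.
    evals, evecs = np.linalg.eigh(sigma_xx)
    keep = evals > tol
    safe = np.where(keep, evals, 1.0)
    inv_sqrt = np.where(keep, 1.0 / np.sqrt(safe), 0.0)
    sqrt = np.where(keep, np.sqrt(safe), 0.0)
    W = (evecs * inv_sqrt) @ evecs.T
    W_pinv = (evecs * sqrt) @ evecs.T

    # Orthogonal projection onto the column space of W @ sigma_xz.
    U, s, _ = np.linalg.svd(W @ sigma_xz, full_matrices=False)
    U = U[:, s > tol]
    P = W_pinv @ (U @ U.T) @ W
    return mu, P


def erase(X, mu, P):
    """Apply x -> x - P (x - mu) to every row of X."""
    return X - (X - mu) @ P.T
```

In this sketch the eraser is fitted once on a sample of activations and then applied to new activations of the same layer; how many examples are used for fitting corresponds loosely to the role of use_n_examples above.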
- add_clarc_hook(eraser: LeaceEraser, layer_names: list) -> None
Applies debiasing to the specified layers of the model using the provided LEACE eraser.
- Parameters:
eraser (LeaceEraser) – The LEACE eraser applied by the hooks.
layer_names (list) – List of layer names (strings) to apply the hook on.
- Returns:
A list of hook handles. Keep them to remove hooks later if needed.
- Return type:
list
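The hook mechanism described above can be sketched with PyTorch's standard forward-hook API. This is an illustrative sketch, not the body of add_clarc_hook: the function name add_erasing_hooks and the transform callable are hypothetical stand-ins for whatever the eraser does to a layer's output. It relies only on nn.Module.named_modules and register_forward_hook, where a non-None return value from the hook replaces the layer output.

```python
import torch
from torch import nn


def add_erasing_hooks(model, transform, layer_names):
    """Register forward hooks that rewrite the output of the named layers.

    Hypothetical helper for illustration; `transform` maps an output
    tensor to its corrected version.
    """
    modules = dict(model.named_modules())
    handles = []
    for name in layer_names:
        layer = modules[name]  # raises KeyError for an unknown layer name

        def hook(module, inputs, output, _transform=transform):
            # Returning a tensor from a forward hook replaces the output.
            return _transform(output)

        handles.append(layer.register_forward_hook(hook))
    return handles
```

Keeping the returned handles allows the hooks to be detached later by calling remove() on each one, which restores the model's original behaviour.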