detoxai.methods.posthoc
Submodules
detoxai.methods.posthoc.naive_threshold submodule
- class detoxai.methods.posthoc.naive_threshold.NaiveThresholdOptimizer(model: Module | LightningModule, experiment_name: str, device: str, dataloader: DetoxaiDataLoader, outputs_are_logits: bool = True, **kwargs: Any)
Bases: PosthocBase
Optimizes the classification threshold using forward hooks.
- apply_model_correction(last_layer_name: str, threshold_range: Tuple[float, float] = (0.05, 0.95), objective_function: Callable[[float, float], float] | None = None, threshold_steps: int = 100, metric: str = 'EO_GAP', **kwargs: Any) -> None
Applies a threshold-modification hook to the model.
- Parameters:
last_layer_name – str
threshold_range – Tuple[float, float] (Default value = (0.05, 0.95))
objective_function – Optional[Callable[[float, float], float]] (Default value = None)
threshold_steps – int (Default value = 100)
metric – str (Default value = 'EO_GAP')
**kwargs – Any
Returns:
None
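The parameters above suggest a grid search over candidate thresholds. The following is a minimal sketch of that idea in plain Python, not the library's implementation. It assumes that the objective receives (accuracy, fairness gap) in that order, that a higher objective is better, and that the gap is the absolute difference in true positive rate between two groups; `tpr_gap` and `sweep_threshold` are hypothetical names introduced only for illustration, and the actual definition of 'EO_GAP' in detoxai may differ.

```python
from typing import Callable, Sequence, Tuple


def tpr_gap(preds: Sequence[int], labels: Sequence[int], groups: Sequence[int]) -> float:
    """Absolute difference in true positive rate between groups 0 and 1 (assumed gap metric)."""
    rates = []
    for g in (0, 1):
        # Predictions for samples of group g whose true label is positive.
        hits = [p for p, y, a in zip(preds, labels, groups) if a == g and y == 1]
        rates.append(sum(hits) / len(hits) if hits else 0.0)
    return abs(rates[0] - rates[1])


def sweep_threshold(
    probs: Sequence[float],
    labels: Sequence[int],
    groups: Sequence[int],
    threshold_range: Tuple[float, float] = (0.05, 0.95),
    threshold_steps: int = 100,
    objective: Callable[[float, float], float] = lambda acc, gap: acc - gap,
) -> Tuple[float, float]:
    """Return (best_threshold, best_score) over an evenly spaced grid."""
    lo, hi = threshold_range
    best_t, best_score = lo, float("-inf")
    for i in range(threshold_steps):
        t = lo + (hi - lo) * i / max(threshold_steps - 1, 1)
        preds = [1 if p >= t else 0 for p in probs]
        acc = sum(int(p == y) for p, y in zip(preds, labels)) / len(labels)
        score = objective(acc, tpr_gap(preds, labels, groups))
        if score > best_score:  # strict comparison keeps the first best threshold
            best_t, best_score = t, score
    return best_t, best_score


# Toy data: positive-class probabilities, true labels, and a binary protected attribute.
probs = [0.9, 0.8, 0.3, 0.2, 0.7, 0.6, 0.4, 0.1]
labels = [1, 1, 0, 0, 1, 1, 0, 0]
groups = [0, 0, 0, 0, 1, 1, 1, 1]
best_t, best_score = sweep_threshold(probs, labels, groups)
```

In this toy data any threshold above 0.4 and at most 0.6 separates the classes perfectly with no gap between groups, so the sweep settles on the first grid point in that interval.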
detoxai.methods.posthoc.posthoc_base submodule
- class detoxai.methods.posthoc.posthoc_base.PosthocBase(model: Module | LightningModule, experiment_name: str, device: str, **kwargs)
Bases: ModelCorrectionMethod, ABC
Abstract base class for binary post-hoc debiasing methods.
detoxai.methods.posthoc.reject_option_classification submodule
- class detoxai.methods.posthoc.reject_option_classification.ROCModelWrapper(base_model: Module, theta: float, L_values: Dict[int, int])
Bases: Module
- class detoxai.methods.posthoc.reject_option_classification.RejectOptionClassification(model: Module, experiment_name: str, device: str, dataloader: DetoxaiDataLoader, theta_range: Tuple[float, float] = (0.55, 0.95), theta_steps: int = 20, metric: str = 'EO_GAP', objective_function: Callable[[float, float], float] | None = None, **kwargs: Any)
Bases: PosthocBase
Implements Reject Option Classification (ROC) for fairness optimization.
This class implements a post-hoc fairness optimization method that modifies model predictions based on a confidence threshold (theta). Predictions with confidence below theta are flipped to optimize for both accuracy and fairness.
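The rule described above can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the library's code: confidence is taken to be max(p, 1 - p) for a positive-class probability p, and `l_values` is assumed to map a protected-attribute value to the label assigned inside the low-confidence region, which is a guess at the meaning of the wrapper's `L_values` argument. The function name `roc_predict` is hypothetical.

```python
from typing import Dict, List, Sequence


def roc_predict(
    probs: Sequence[float],
    groups: Sequence[int],
    theta: float,
    l_values: Dict[int, int],
) -> List[int]:
    """Apply a reject-option rule to positive-class probabilities."""
    out = []
    for p, g in zip(probs, groups):
        confidence = max(p, 1.0 - p)
        if confidence < theta:
            # Low-confidence region: use the label assigned to this group (assumed semantics).
            out.append(l_values[g])
        else:
            # Confident prediction: keep the ordinary 0.5 decision.
            out.append(1 if p >= 0.5 else 0)
    return out


# Group 0 receives label 1 and group 1 receives label 0 inside the low-confidence region.
result = roc_predict([0.9, 0.6, 0.45, 0.1], [0, 1, 0, 1], theta=0.7, l_values={0: 1, 1: 0})
# result == [1, 0, 1, 0]
```

Only the two middle samples (confidence 0.6 and 0.55) fall below theta = 0.7, so only their labels are overridden. Searching `theta_range` in `theta_steps` increments would then amount to repeating this rule for each candidate theta and scoring the outcome with the chosen metric.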