
5. Losses

5.1 KLDivergenceLoss

Compute Kullback-Leibler divergence across Bayesian modules. This loss sums the KL divergence from all Bayesian layers in the model. It can be reduced by averaging and scaled by a weight factor.

Notes

Assumes the model contains submodules derived from BayesianModule.
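
A minimal usage sketch is shown below. The import path is assumed from the source file location given below (illia/losses/torch/kl.py); adjust it to match your install. The model is assumed to be a torch.nn.Module that already contains illia Bayesian submodules and is not defined here.

import torch

from illia.losses.torch.kl import KLDivergenceLoss  # import path assumed from the file location below

# `model` is assumed to exist and to contain BayesianModule submodules;
# with only standard torch layers the KL term has nothing to sum.
kl_loss = KLDivergenceLoss(weight=1e-3)
kl_term: torch.Tensor = kl_loss(model)  # scalar: parameter-averaged KL, scaled by `weight`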

Source code in illia/losses/torch/kl.py
class KLDivergenceLoss(torch.nn.Module):
    """
    Compute Kullback-Leibler divergence across Bayesian modules.
    This loss sums the KL divergence from all Bayesian layers in
    the model. It can be reduced by averaging and scaled by a
    weight factor.

    Notes:
        Assumes the model contains submodules derived from
        `BayesianModule`.
    """

    def __init__(
        self,
        reduction: Literal["mean"] = "mean",
        weight: float = 1.0,
        **kwargs: Any,
    ) -> None:
        """
        Initialize the KL divergence loss.

        Args:
            reduction: Method used to reduce the KL loss.
            weight: Scaling factor for the KL divergence.
            **kwargs: Extra arguments passed to the base class.

        Returns:
            None
        """

        # call super class constructor
        super().__init__(**kwargs)

        self.reduction = reduction
        self.weight = weight

    def forward(self, model: torch.nn.Module) -> torch.Tensor:
        """
        Compute KL divergence for all Bayesian modules in a model.

        Args:
            model: Model containing Bayesian submodules.

        Returns:
            torch.Tensor: Weighted KL divergence loss.

        Notes:
            The KL loss is averaged over the number of parameters
            and scaled by the `weight` attribute.
        """

        # Get device and dtype
        parameter: torch.nn.Parameter = next(model.parameters())
        device: torch.device = parameter.device
        dtype = parameter.dtype

        # Init kl cost and params
        kl_global_cost: torch.Tensor = torch.tensor(0, device=device, dtype=dtype)
        num_params_global: int = 0

        # Iter over modules
        for module in model.modules():
            if isinstance(module, BayesianModule):
                kl_cost, num_params = module.kl_cost()
                kl_global_cost += kl_cost
                num_params_global += num_params

        # Average by the number of parameters
        kl_global_cost /= num_params_global
        kl_global_cost *= self.weight

        return kl_global_cost

5.1.1 __init__(reduction='mean', weight=1.0, **kwargs)

Initialize the KL divergence loss.

Parameters:

    Name        Type               Description                                  Default
    reduction   Literal['mean']    Method used to reduce the KL loss.           'mean'
    weight      float              Scaling factor for the KL divergence.        1.0
    **kwargs    Any                Extra arguments passed to the base class.    {}

Returns:

    Type    Description
    None    None
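
As a construction sketch (imports as in the example above), only the reduction mode and weight are configurable:

kl_loss = KLDivergenceLoss()                   # reduction="mean", weight=1.0
kl_loss_scaled = KLDivergenceLoss(weight=0.1)  # scale the KL term down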

Source code in illia/losses/torch/kl.py
def __init__(
    self,
    reduction: Literal["mean"] = "mean",
    weight: float = 1.0,
    **kwargs: Any,
) -> None:
    """
    Initialize the KL divergence loss.

    Args:
        reduction: Method used to reduce the KL loss.
        weight: Scaling factor for the KL divergence.
        **kwargs: Extra arguments passed to the base class.

    Returns:
        None
    """

    # call super class constructor
    super().__init__(**kwargs)

    self.reduction = reduction
    self.weight = weight

5.1.2 forward(model)

Compute KL divergence for all Bayesian modules in a model.

Parameters:

    Name    Type      Description                              Default
    model   Module    Model containing Bayesian submodules.    required

Returns:

    Type      Description
    Tensor    Weighted KL divergence loss.

Notes

The KL loss is averaged over the number of parameters and scaled by the weight attribute.
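
For example, the KL term is typically added to a task loss during a training step. This is a sketch only; `model`, `outputs`, and `targets` are assumed to exist, and imports follow the example above.

kl_loss = KLDivergenceLoss(weight=1e-3)
task_loss = torch.nn.functional.cross_entropy(outputs, targets)
total_loss = task_loss + kl_loss(model)  # forward() only needs the model itself
total_loss.backward()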

Source code in illia/losses/torch/kl.py
def forward(self, model: torch.nn.Module) -> torch.Tensor:
    """
    Compute KL divergence for all Bayesian modules in a model.

    Args:
        model: Model containing Bayesian submodules.

    Returns:
        torch.Tensor: Weighted KL divergence loss.

    Notes:
        The KL loss is averaged over the number of parameters
        and scaled by the `weight` attribute.
    """

    # Get device and dtype
    parameter: torch.nn.Parameter = next(model.parameters())
    device: torch.device = parameter.device
    dtype = parameter.dtype

    # Init kl cost and params
    kl_global_cost: torch.Tensor = torch.tensor(0, device=device, dtype=dtype)
    num_params_global: int = 0

    # Iter over modules
    for module in model.modules():
        if isinstance(module, BayesianModule):
            kl_cost, num_params = module.kl_cost()
            kl_global_cost += kl_cost
            num_params_global += num_params

    # Average by the number of parameters
    kl_global_cost /= num_params_global
    kl_global_cost *= self.weight

    return kl_global_cost

5.2 ELBOLoss

Compute the Evidence Lower Bound (ELBO) loss for Bayesian networks. Combines a reconstruction loss with a KL divergence term. Monte Carlo sampling can estimate the expected reconstruction loss over stochastic layers.

Notes

The KL term is weighted by kl_weight. The model is assumed to contain Bayesian layers compatible with KLDivergenceLoss.
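
A sketch of typical usage, assuming the class is importable from the source file location given below (illia/losses/torch/elbo.py) and that `model`, `inputs`, and `targets` already exist, with the model containing illia Bayesian layers:

import torch

from illia.losses.torch.elbo import ELBOLoss  # import path assumed from the file location below

elbo = ELBOLoss(loss_function=torch.nn.MSELoss(), num_samples=5, kl_weight=1e-3)
outputs = model(inputs)
loss = elbo(outputs, targets, model)  # reconstruction loss plus weighted KL term
loss.backward()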

Source code in illia/losses/torch/elbo.py
class ELBOLoss(torch.nn.Module):
    """
    Compute the Evidence Lower Bound (ELBO) loss for Bayesian
    networks. Combines a reconstruction loss with a KL divergence
    term. Monte Carlo sampling can estimate the expected
    reconstruction loss over stochastic layers.

    Notes:
        The KL term is weighted by `kl_weight`. The model is
        assumed to contain Bayesian layers compatible with
        `KLDivergenceLoss`.
    """

    def __init__(
        self,
        loss_function: torch.nn.Module,
        num_samples: int = 1,
        kl_weight: float = 1e-3,
        **kwargs: Any,
    ) -> None:
        """
        Initialize the ELBO loss with reconstruction and KL
        components.

        Args:
            loss_function: Function or module used to compute
                reconstruction loss.
            num_samples: Number of Monte Carlo samples used for
                estimation.
            kl_weight: Weight applied to the KL divergence term.
            **kwargs: Extra arguments passed to the base class.

        Returns:
            None
        """

        super().__init__(**kwargs)

        self.loss_function = loss_function
        self.num_samples = num_samples
        self.kl_weight = kl_weight
        self.kl_loss = KLDivergenceLoss(weight=kl_weight)

    def forward(
        self, outputs: torch.Tensor, targets: torch.Tensor, model: torch.nn.Module
    ) -> torch.Tensor:
        """
        Compute the ELBO loss with Monte Carlo sampling and KL
        regularization.

        Args:
            outputs: Predictions generated by the model.
            targets: Ground truth values for training.
            model: Model containing Bayesian layers.

        Returns:
            torch.Tensor: Scalar ELBO loss averaged over samples.

        Notes:
            The loss is averaged over `num_samples` Monte Carlo
            draws.
        """

        loss_value = torch.tensor(
            0, device=next(model.parameters()).device, dtype=torch.float32
        )

        for _ in range(self.num_samples):
            loss_value += self.loss_function(outputs, targets) + self.kl_loss(model)

        loss_value /= self.num_samples

        return loss_value

5.2.1 __init__(loss_function, num_samples=1, kl_weight=0.001, **kwargs)

Initialize the ELBO loss with reconstruction and KL components.

Parameters:

    Name            Type      Description                                           Default
    loss_function   Module    Function or module used to compute reconstruction loss.   required
    num_samples     int       Number of Monte Carlo samples used for estimation.   1
    kl_weight       float     Weight applied to the KL divergence term.            0.001
    **kwargs        Any       Extra arguments passed to the base class.            {}

Returns:

    Type    Description
    None    None
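
As a construction sketch, any torch loss module can serve as the reconstruction term (a classification loss is used here purely as an example):

elbo = ELBOLoss(
    loss_function=torch.nn.CrossEntropyLoss(),
    num_samples=3,    # Monte Carlo draws averaged in forward()
    kl_weight=1e-3,   # forwarded to the internal KLDivergenceLoss
)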

Source code in illia/losses/torch/elbo.py
def __init__(
    self,
    loss_function: torch.nn.Module,
    num_samples: int = 1,
    kl_weight: float = 1e-3,
    **kwargs: Any,
) -> None:
    """
    Initialize the ELBO loss with reconstruction and KL
    components.

    Args:
        loss_function: Function or module used to compute
            reconstruction loss.
        num_samples: Number of Monte Carlo samples used for
            estimation.
        kl_weight: Weight applied to the KL divergence term.
        **kwargs: Extra arguments passed to the base class.

    Returns:
        None
    """

    super().__init__(**kwargs)

    self.loss_function = loss_function
    self.num_samples = num_samples
    self.kl_weight = kl_weight
    self.kl_loss = KLDivergenceLoss(weight=kl_weight)

5.2.2 forward(outputs, targets, model)

Compute the ELBO loss with Monte Carlo sampling and KL regularization.

Parameters:

    Name      Type      Description                             Default
    outputs   Tensor    Predictions generated by the model.     required
    targets   Tensor    Ground truth values for training.       required
    model     Module    Model containing Bayesian layers.       required

Returns:

    Type      Description
    Tensor    Scalar ELBO loss averaged over samples.

Notes

The loss is averaged over num_samples Monte Carlo draws.
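
In a training loop this typically looks like the following sketch; `model`, `dataloader`, and `optimizer` are assumed to exist and are not defined here.

elbo = ELBOLoss(loss_function=torch.nn.MSELoss(), num_samples=3, kl_weight=1e-3)

for inputs, targets in dataloader:
    optimizer.zero_grad()
    outputs = model(inputs)               # stochastic forward pass through Bayesian layers
    loss = elbo(outputs, targets, model)  # reconstruction plus weighted KL, averaged over samples
    loss.backward()
    optimizer.step()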

Source code in illia/losses/torch/elbo.py
def forward(
    self, outputs: torch.Tensor, targets: torch.Tensor, model: torch.nn.Module
) -> torch.Tensor:
    """
    Compute the ELBO loss with Monte Carlo sampling and KL
    regularization.

    Args:
        outputs: Predictions generated by the model.
        targets: Ground truth values for training.
        model: Model containing Bayesian layers.

    Returns:
        torch.Tensor: Scalar ELBO loss averaged over samples.

    Notes:
        The loss is averaged over `num_samples` Monte Carlo
        draws.
    """

    loss_value = torch.tensor(
        0, device=next(model.parameters()).device, dtype=torch.float32
    )

    for _ in range(self.num_samples):
        loss_value += self.loss_function(outputs, targets) + self.kl_loss(model)

    loss_value /= self.num_samples

    return loss_value