7. Distributions

7.1 `DistributionModule`

Abstract base for probabilistic distribution modules in TensorFlow. Defines the required interface for sampling, computing log-probabilities, and counting learnable parameters.

Notes

This class is abstract and cannot be instantiated directly. All abstract methods must be implemented by subclasses.
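As a framework-free illustration of the contract described above, the sketch below mirrors the three abstract members with Python's `abc` module. The names `ToyDistribution` and `UnitGaussian` are hypothetical and not part of illia; the real base class additionally inherits from `layers.Layer` and works on `tf.Tensor` values rather than floats.

```python
import math
import random
from abc import ABC, abstractmethod
from typing import Optional


class ToyDistribution(ABC):
    """Hypothetical stand-in for the DistributionModule interface."""

    @abstractmethod
    def sample(self) -> float:
        """Draw a sample."""

    @abstractmethod
    def log_prob(self, x: Optional[float] = None) -> float:
        """Evaluate the log-probability of x, sampling if x is None."""

    @property
    @abstractmethod
    def num_params(self) -> int:
        """Number of learnable parameters."""


class UnitGaussian(ToyDistribution):
    """Hypothetical concrete subclass: a fixed standard normal over one scalar."""

    def sample(self) -> float:
        return random.gauss(0.0, 1.0)

    def log_prob(self, x: Optional[float] = None) -> float:
        if x is None:
            x = self.sample()
        return -0.5 * math.log(2.0 * math.pi) - 0.5 * x * x

    @property
    def num_params(self) -> int:
        return 0  # nothing is learnable in this toy example


dist = UnitGaussian()
value_at_zero = dist.log_prob(0.0)  # log-density of N(0, 1) at 0

try:
    ToyDistribution()  # abstract classes cannot be instantiated
    abstract_instantiable = True
except TypeError:
    abstract_instantiable = False
```

Leaving out any one of the three members in a subclass would make that subclass abstract as well, so instantiating it would raise `TypeError` in the same way.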

Source code in illia/distributions/tf/base.py

@saving.register_keras_serializable(package="illia", name="DistributionModule")
class DistributionModule(layers.Layer, ABC):
    """
    Abstract base for probabilistic distribution modules in TensorFlow.
    Defines the required interface for sampling, computing
    log-probabilities, and counting learnable parameters.

    Notes:
        This class is abstract and cannot be instantiated directly.
        All abstract methods must be implemented by subclasses.
    """

    @abstractmethod
    def sample(self) -> tf.Tensor:
        """
        Draw a sample from the distribution.

        Returns:
            tf.Tensor: A sample drawn from the distribution.
        """

    @abstractmethod
    def log_prob(self, x: Optional[tf.Tensor] = None) -> tf.Tensor:
        """
        Compute the log-probability of a provided sample. If no
        sample is passed, one is drawn internally.

        Args:
            x: Optional sample to evaluate. If None, a new sample is
                drawn from the distribution.

        Returns:
            tf.Tensor: Scalar log-probability value.

        Notes:
            Works with both user-supplied and internally drawn
            samples.
        """

    @property
    @abstractmethod
    def num_params(self) -> int:
        """
        Return the number of learnable parameters in the
        distribution.

        Returns:
            int: Total count of learnable parameters.
        """

7.1.1 `num_params` `abstractmethod` `property`

Return the number of learnable parameters in the distribution.

Returns:

Type	Description
`int`	Total count of learnable parameters.

7.1.2 `log_prob(x=None)` `abstractmethod`

Compute the log-probability of a provided sample. If no sample is passed, one is drawn internally.

Parameters:

Name	Type	Description	Default
`x`	`Optional[Tensor]`	Optional sample to evaluate. If None, a new sample is drawn from the distribution.	`None`

Returns:

Type	Description
`Tensor`	Scalar log-probability value.

Notes

Works with both user-supplied and internally drawn samples.

Source code in illia/distributions/tf/base.py

@abstractmethod
def log_prob(self, x: Optional[tf.Tensor] = None) -> tf.Tensor:
    """
    Compute the log-probability of a provided sample. If no
    sample is passed, one is drawn internally.

    Args:
        x: Optional sample to evaluate. If None, a new sample is
            drawn from the distribution.

    Returns:
        tf.Tensor: Scalar log-probability value.

    Notes:
        Works with both user-supplied and internally drawn
        samples.
    """

7.1.3 `sample()` `abstractmethod`

Draw a sample from the distribution.

Returns:

Type	Description
`Tensor`	A sample drawn from the distribution.

Source code in illia/distributions/tf/base.py

@abstractmethod
def sample(self) -> tf.Tensor:
    """
    Draw a sample from the distribution.

    Returns:
        tf.Tensor: A sample drawn from the distribution.
    """

7.2 `GaussianDistribution`

Learnable Gaussian distribution with diagonal covariance. Represents a Gaussian with trainable mean and standard deviation. The standard deviation is derived from rho using a softplus transformation to ensure positivity.
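To make the softplus step concrete, here is a small plain-Python sketch (no TensorFlow) of the mapping from `rho` to the standard deviation. The helper names `softplus_naive` and `softplus_stable` are illustrative and not part of illia. The naive form matches the `log1p(exp(rho))` expression used in the source, while the stable form is an alternative formulation, shown only for comparison, that avoids overflow for large `rho`.

```python
import math


def softplus_naive(rho: float) -> float:
    # Same expression as in the source: log(1 + exp(rho)).
    return math.log1p(math.exp(rho))


def softplus_stable(rho: float) -> float:
    # Algebraically identical, but exp() only ever sees a non-positive argument.
    return max(rho, 0.0) + math.log1p(math.exp(-abs(rho)))


sigma_default = softplus_naive(-7.0)    # at the default rho_init; roughly 9.1e-4
sigma_zero = softplus_naive(0.0)        # equals log(2)
sigma_large = softplus_stable(1000.0)   # the naive form would overflow here
```

With the default `rho_init=-7.0`, the initial standard deviation is therefore small (on the order of 1e-3), so early samples stay close to `mu`.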

Notes

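Assumes diagonal covariance. `log_prob` returns the posterior log-density minus the prior log-density at a sample, summed over all elements, which serves as a single-sample estimate of the KL divergence between posterior and prior.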

Source code in illia/distributions/tf/gaussian.py

@saving.register_keras_serializable(package="illia", name="GaussianDistribution")
class GaussianDistribution(DistributionModule):
    """
    Learnable Gaussian distribution with diagonal covariance.
    Represents a Gaussian with trainable mean and standard
    deviation. The standard deviation is derived from `rho`
    using a softplus transformation to ensure positivity.

    Notes:
        Assumes diagonal covariance. KL divergence can be
        estimated via log-probability differences from
        `log_prob`.
    """

    def __init__(
        self,
        shape: tuple[int, ...],
        mu_prior: float = 0.0,
        std_prior: float = 0.1,
        mu_init: float = 0.0,
        rho_init: float = -7.0,
        **kwargs: Any,
    ) -> None:
        """
        Initialize a learnable Gaussian distribution layer.

        Args:
            shape: Shape of the learnable parameters.
            mu_prior: Mean of the Gaussian prior.
            std_prior: Standard deviation of the prior.
            mu_init: Initial value for the learnable mean.
            rho_init: Initial value for the learnable rho.
            **kwargs: Extra arguments passed to the base class.

        Returns:
            None
        """

        super().__init__(**kwargs)

        self.shape = shape
        self.mu_prior_value = mu_prior
        self.std_prior_value = std_prior
        self.mu_init = mu_init
        self.rho_init = rho_init

        self.build(self.shape)

    def build(self, input_shape: tf.TensorShape) -> None:
        """
        Build trainable and non-trainable parameters.

        Args:
            input_shape: Input shape used to trigger layer build.

        Returns:
            None
        """

        # Define non-trainable prior variables
        self.mu_prior = self.add_weight(
            shape=(),
            initializer=tf.constant_initializer(self.mu_prior_value),
            trainable=False,
            name="mu_prior",
        )

        self.std_prior = self.add_weight(
            shape=(),
            initializer=tf.constant_initializer(self.std_prior_value),
            trainable=False,
            name="std_prior",
        )

        # Define trainable parameters
        self.mu = self.add_weight(
            shape=self.shape,
            initializer=tf.random_normal_initializer(mean=self.mu_init, stddev=0.1),
            trainable=True,
            name="mu",
        )

        self.rho = self.add_weight(
            shape=self.shape,
            initializer=tf.random_normal_initializer(mean=self.rho_init, stddev=0.1),
            trainable=True,
            name="rho",
        )

        super().build(input_shape)

    def get_config(self) -> dict:
        """
        Return the configuration dictionary for serialization.

        Returns:
            dict: Dictionary with the layer configuration.
        """

        base_config = super().get_config()

        config = {
            "shape": self.shape,
            "mu_prior": self.mu_prior_value,
            "std_prior": self.std_prior_value,
            "mu_init": self.mu_init,
            "rho_init": self.rho_init,
        }

        return {**base_config, **config}

    def sample(self) -> tf.Tensor:
        """
        Draw a sample from the Gaussian distribution.

        Returns:
            tf.Tensor: A sample drawn from the distribution.
        """

        # Sampling with reparametrization trick
        eps: tf.Tensor = tf.random.normal(shape=self.rho.shape)
        sigma: tf.Tensor = tf.math.log1p(tf.math.exp(self.rho))

        return self.mu + sigma * eps

    def log_prob(self, x: Optional[tf.Tensor] = None) -> tf.Tensor:
        """
        Compute the log-probability of a given sample. If no
        sample is provided, one is drawn internally.

        Args:
            x: Optional input sample to evaluate. If None,
                a new sample is drawn from the distribution.

        Returns:
            tf.Tensor: Scalar log-probability value.

        Notes:
            Supports both user-supplied and internally drawn
            samples.
        """

        # Sample if x is None
        if x is None:
            x = self.sample()

        # Define pi variable
        pi: tf.Tensor = tf.convert_to_tensor(math.pi)

        # Compute log priors
        log_prior = (
            -tf.math.log(tf.math.sqrt(2 * pi))
            - tf.math.log(self.std_prior)
            - (((x - self.mu_prior) ** 2) / (2 * self.std_prior**2))
            - 0.5
        )

        # Compute sigma
        sigma: tf.Tensor = tf.math.log1p(tf.math.exp(self.rho))

        # Compute log posteriors
        log_posteriors: tf.Tensor = (
            -tf.math.log(tf.math.sqrt(2 * pi))
            - tf.math.log(sigma)
            - (((x - self.mu) ** 2) / (2 * sigma**2))
            - 0.5
        )

        # Compute final log probs
        log_probs = tf.math.reduce_sum(log_posteriors) - tf.math.reduce_sum(log_prior)

        return log_probs

    @property
    def num_params(self) -> int:
        """
        Return the number of learnable parameters in the
        distribution.

        Returns:
            int: Total count of learnable parameters.
        """

        return int(tf.size(self.mu))

7.2.1 `num_params` `property`

Return the number of learnable parameters, computed as the number of elements in `mu`. `rho` has the same shape and is not counted separately.

Returns:

Type	Description
`int`	Total count of learnable parameters.
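A plain-Python sketch of how that count follows from `shape`. The helper `count_elements` is hypothetical and stands in for the `tf.size(self.mu)` call in the source:

```python
import math


def count_elements(shape: tuple[int, ...]) -> int:
    # Product of the dimensions; an empty shape () describes a scalar, giving 1.
    return math.prod(shape)


n_matrix = count_elements((3, 4))  # 12 entries in mu
n_scalar = count_elements(())      # 1
```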

7.2.2 `__init__(shape, mu_prior=0.0, std_prior=0.1, mu_init=0.0, rho_init=-7.0, **kwargs)`

Initialize a learnable Gaussian distribution layer.

Parameters:

Name	Type	Description	Default
`shape`	`tuple[int, ...]`	Shape of the learnable parameters.	required
`mu_prior`	`float`	Mean of the Gaussian prior.	`0.0`
`std_prior`	`float`	Standard deviation of the prior.	`0.1`
`mu_init`	`float`	Initial value for the learnable mean.	`0.0`
`rho_init`	`float`	Initial value for the learnable rho.	`-7.0`
`**kwargs`	`Any`	Extra arguments passed to the base class.	`{}`

Returns:

Type	Description
`None`	None

Source code in illia/distributions/tf/gaussian.py

def __init__(
    self,
    shape: tuple[int, ...],
    mu_prior: float = 0.0,
    std_prior: float = 0.1,
    mu_init: float = 0.0,
    rho_init: float = -7.0,
    **kwargs: Any,
) -> None:
    """
    Initialize a learnable Gaussian distribution layer.

    Args:
        shape: Shape of the learnable parameters.
        mu_prior: Mean of the Gaussian prior.
        std_prior: Standard deviation of the prior.
        mu_init: Initial value for the learnable mean.
        rho_init: Initial value for the learnable rho.
        **kwargs: Extra arguments passed to the base class.

    Returns:
        None
    """

    super().__init__(**kwargs)

    self.shape = shape
    self.mu_prior_value = mu_prior
    self.std_prior_value = std_prior
    self.mu_init = mu_init
    self.rho_init = rho_init

    self.build(self.shape)

7.2.3 `log_prob(x=None)`

Compute the log-probability term for a given sample: the posterior log-density minus the prior log-density, each summed over all elements. If no sample is provided, one is drawn internally.

Parameters:

Name	Type	Description	Default
`x`	`Optional[Tensor]`	Optional input sample to evaluate. If None, a new sample is drawn from the distribution.	`None`

Returns:

Type	Description
`Tensor`	Scalar log-probability value.

Notes

Supports both user-supplied and internally drawn samples.

Source code in illia/distributions/tf/gaussian.py

def log_prob(self, x: Optional[tf.Tensor] = None) -> tf.Tensor:
    """
    Compute the log-probability of a given sample. If no
    sample is provided, one is drawn internally.

    Args:
        x: Optional input sample to evaluate. If None,
            a new sample is drawn from the distribution.

    Returns:
        tf.Tensor: Scalar log-probability value.

    Notes:
        Supports both user-supplied and internally drawn
        samples.
    """

    # Sample if x is None
    if x is None:
        x = self.sample()

    # Define pi variable
    pi: tf.Tensor = tf.convert_to_tensor(math.pi)

    # Compute log priors
    log_prior = (
        -tf.math.log(tf.math.sqrt(2 * pi))
        - tf.math.log(self.std_prior)
        - (((x - self.mu_prior) ** 2) / (2 * self.std_prior**2))
        - 0.5
    )

    # Compute sigma
    sigma: tf.Tensor = tf.math.log1p(tf.math.exp(self.rho))

    # Compute log posteriors
    log_posteriors: tf.Tensor = (
        -tf.math.log(tf.math.sqrt(2 * pi))
        - tf.math.log(sigma)
        - (((x - self.mu) ** 2) / (2 * sigma**2))
        - 0.5
    )

    # Compute final log probs
    log_probs = tf.math.reduce_sum(log_posteriors) - tf.math.reduce_sum(log_prior)

    return log_probs
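The computation above can be sketched without TensorFlow for a flat list of values. The function `log_prob_difference` and its argument names are illustrative only. It follows the same steps as the source (posterior log-density minus prior log-density, summed over elements) but omits the constant terms `log(sqrt(2 * pi))` and `0.5`, which appear in both halves and cancel in the difference when `x` has the same shape as `mu`.

```python
import math


def log_prob_difference(x, mu, sigma, mu_prior=0.0, std_prior=0.1):
    """Sum over elements of log q(x_i) - log p(x_i) for diagonal Gaussians."""
    total = 0.0
    for x_i, mu_i, sigma_i in zip(x, mu, sigma):
        # Posterior and prior log-densities, up to the shared constant terms.
        log_q = -math.log(sigma_i) - (x_i - mu_i) ** 2 / (2.0 * sigma_i**2)
        log_p = -math.log(std_prior) - (x_i - mu_prior) ** 2 / (2.0 * std_prior**2)
        total += log_q - log_p
    return total


# When posterior and prior coincide, the difference is zero for any sample.
same = log_prob_difference(x=[0.3, -0.2], mu=[0.0, 0.0], sigma=[0.1, 0.1])

# At the shared mean, a posterior ten times narrower than the prior gives log(10).
narrow = log_prob_difference(x=[0.0], mu=[0.0], sigma=[0.01])
```

Averaging this quantity over many samples drawn from the posterior gives a Monte Carlo estimate of the KL divergence from the prior to the posterior.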

7.2.4 `sample()`

Draw a sample from the Gaussian distribution.

Returns:

Type	Description
`Tensor`	A sample drawn from the distribution.

Source code in illia/distributions/tf/gaussian.py

def sample(self) -> tf.Tensor:
    """
    Draw a sample from the Gaussian distribution.

    Returns:
        tf.Tensor: A sample drawn from the distribution.
    """

    # Sampling with reparametrization trick
    eps: tf.Tensor = tf.random.normal(shape=self.rho.shape)
    sigma: tf.Tensor = tf.math.log1p(tf.math.exp(self.rho))

    return self.mu + sigma * eps
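The reparameterization step can be illustrated in plain Python for a single scalar. This is a sketch under simplifying assumptions (one element, the standard `random` module in place of `tf.random.normal`, and a hypothetical helper name `reparameterized_sample`), not the library's implementation.

```python
import math
import random


def reparameterized_sample(mu: float, rho: float, rng: random.Random) -> float:
    eps = rng.gauss(0.0, 1.0)          # noise drawn independently of the parameters
    sigma = math.log1p(math.exp(rho))  # softplus keeps sigma positive
    return mu + sigma * eps            # deterministic in (mu, rho) once eps is fixed


rng = random.Random(0)  # seeded so repeated runs give the same draws
draws = [reparameterized_sample(mu=1.0, rho=-7.0, rng=rng) for _ in range(1000)]
mean_draw = sum(draws) / len(draws)
```

Because the randomness enters only through `eps`, the sample is a differentiable function of `mu` and `rho`, which is what allows gradients to flow through it during training.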
