How to implement a tempered model in (Num)Pyro

Hi

I am wondering how I can get a tempered posterior using Pyro/NumPyro.
For example: the original model is p(w, z) = p(w | z) p(z), and I would like to change the likelihood term to p(w | z)^T.

This tempering operation is similar to the KL annealing trick used in VAEs and the cold posterior trick used in BNN training.

Thanks

If the temperature is a fixed parameter that you set, then you can simply use poutine.scale:

with pyro.poutine.scale(scale=T):
    pyro.sample("w", ...)

This multiplies the log_prob of every enclosed sample site by T, which is the same as raising its density to the power T.

What if I want to treat the temperature as a local, data-dependent latent variable (e.g. https://arxiv.org/abs/1411.1810)? Can poutine.scale still help, or do I need to use another method?

Thanks

I'm not sure; it depends. Certainly scale can be local, but whether or not it can be treated as e.g. a latent variable will depend on the details. Can you be (much) more specific about what you want to do? E.g. point to a specific objective function and the corresponding inference algorithm?

For example, a model like this, in which T_i stands for the temperature

And now I would like to perform MCMC on this model.

Problem solved. I implemented a wrapper class around the distribution that returns a scaled version of the log likelihood:

class tempered_XXX(dist.XXX):
    def __init__(self, T, *args, **kwargs):
        self.T = T
        super().__init__(*args, **kwargs)
    
    def log_prob(self, value):
        return 1 / self.T * super().log_prob(value)

return 1 / self.T * super().log_prob(value)

This is what scale does. The documentation entry scale (float) is probably unclear and makes you think a scalar is required there. We should change it to scale (float or ndarray) and mention that its shape should be broadcastable to the batch shape (i.e. the log_prob shape) of each site under its context. If this does what you want, could you open a PR to improve the docs? :)

Sure, I can do that.