I am training a hierarchical Bayesian model in NumPyro using SVI. As part of the model definition I have a variable `mu_product_sigma` defined by

```python
mu_product_sigma = numpyro.sample("mu_product_sigma", dist.HalfNormal(scale=1.0))
```
This is later used as the scale parameter of an affine transform:

```python
mu_product = numpyro.sample(
    "mu_product",
    dist.TransformedDistribution(
        dist.Normal(loc=jnp.zeros((n_products,)), scale=1.0),
        dist.transforms.AffineTransform(attr_product @ mu_product_beta, mu_product_sigma),
    ),
)
```
As the guide I use `AutoNormal`.
After training, I am examining the learned parameters and see that `mu_product_sigma_auto_loc = -0.9741603`.
Initially, I was confused as to why there is a `loc` at all for `mu_product_sigma`, given that it has a HalfNormal distribution. I believe this is because AutoNormal assumes a Normal distribution for every latent variable, and what I am seeing is the mean of that Normal. Is that correct?
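My working hypothesis is that the guide parameter lives in an unconstrained space and is mapped back to the positive reals before it is used in the model. A minimal sketch of that idea, assuming an exp-like bijection (I have not verified that this is exactly what AutoNormal does internally):

```python
import math

# Learned guide parameters for mu_product_sigma (from my run)
auto_loc = -0.9741603
auto_scale = 0.00923638

# Hypothesis: AutoNormal parameterises Normal(auto_loc, auto_scale) in an
# unconstrained space, and a bijection such as exp maps draws back to
# (0, inf), so the value fed into AffineTransform is always positive.
unconstrained_draw = auto_loc + auto_scale * 0.5  # e.g. eps = 0.5
constrained_sigma = math.exp(unconstrained_draw)

print(constrained_sigma > 0)   # a negative loc still yields a positive sigma
print(math.exp(auto_loc))      # roughly 0.38, a plausible scale value
```

If that hypothesis is right, the negative `loc` is harmless because it is never used directly as a scale.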
Furthermore, I am unsure how `mu_product_sigma_auto_loc` can take a negative value without causing problems (I can generate forecasts fine), given that it then goes into the AffineTransform as the scale parameter, which the documentation says is assumed to be > 0. For reference, `mu_product_sigma_auto_scale = 0.00923638`. I guess I do not understand what "When scale is a JAX tracer, we always assume that scale > 0 when calculating codomain." means in the documentation for AffineTransform.
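Reading that sentence, my best guess is that it concerns interval bookkeeping rather than a runtime check. A toy illustration (my own sketch, not numpyro code) of why the codomain of an affine map depends on the sign of the scale:

```python
# The map x -> loc + scale * x sends the interval (0, inf) to
# (loc, inf) when scale > 0, but to (-inf, loc) when scale < 0.
# With a concrete array the sign can be inspected; with an abstract
# JAX tracer it cannot, hence (I assume) the "assume scale > 0" note.
def affine_codomain_of_positive_reals(loc, scale):
    if scale > 0:
        return (loc, float("inf"))
    return (float("-inf"), loc)

print(affine_codomain_of_positive_reals(2.0, 1.5))   # (2.0, inf)
print(affine_codomain_of_positive_reals(2.0, -1.5))  # (-inf, 2.0)
```

Is that the right way to read the documentation, or does the assumption have other consequences?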
Thanks in advance for any guidance! Sam