Hi!
I have a BNN that I train with SVI. It works great, but the only problem I have is that when I sample the sigma like this:
    sigma = pyro.sample("sigma", dist.Uniform(0, 0.01))
    if y is not None:
        with pyro.plate("data", len(y)):
            obs = pyro.sample("obs", dist.Normal(y_hat[:, 0], sigma), obs=y)
the sigma always ends up at the upper limit of the uniform distribution. This causes the model not to learn the correct behaviour when the measurement noise is lower than that upper limit. For example, if the noise level is 0.05 and the upper limit is 0.1, the result is just a continuous line.
What could be done to fix this? Is there, for example, a way to sample sigma so that it starts at lower values and then increases as training continues?
If you have a hierarchy of latent variables, e.g. p(a)p(b|a)p(c|b), then SVI generically does a bad job of modeling uncertainty in top-level variables like a, since doing so properly would require learning, e.g., an approximate posterior over b that depends on a. But often at least one of b or c is high-dimensional, so mean-field approximations are usually used, in which case it's very unlikely that you'll get particularly meaningful or useful uncertainty estimates for a. In such scenarios I think it's generically a better idea to either treat a as an unregularized point parameter (via pyro.param) or as a regularized point parameter (via MAP estimation and a Delta guide).
Just to add that this is a general problem when trying to learn the noise variance alongside the NN. The simplest solution is usually to learn the (B)NN with a fixed noise variance first, and then train both the BNN and the noise together. We have fancier solutions for heteroscedastic regression that are equally applicable to the homoscedastic case, but the simple approach will likely be enough.
Thank you both, @martinjankowiak and @daknowles. I'll look into these methods.