Hi, glancing at your code I can see a few issues:
- You don’t need any observed sites in your guide - you can remove the final loop entirely
- You need to make sure parameter values aren’t shared between your model and guide, e.g. by cloning the initial tensors:
qw0_loc = pyro.param("qw0_loc", lambda: params['w0_loc'].clone())
- You don’t need to `softplus` your scale parameters if you’ve already constrained them to be positive via `constraints.positive`
- Bayesian neural networks are sensitive to initialization just like regular neural networks, you might try narrowing your prior and initial variational distributions e.g. by rescaling all the initial `scale` parameters by 0.5
- Your model would run much faster if you vectorized over data with `pyro.plate` rather than using a `for` loop
See this other recent topic for pointers to BNN examples in Pyro.