Why does NUTS show much better results than SVI on a simple case?

This discussion is relevant.

You can’t use .long() in your model in conjunction with a gradient-based inference algorithm like SVI or HMC; casting to an integer type is not differentiable, so this op blocks gradient flow.
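Here is a minimal sketch of the effect in plain PyTorch, outside of any Pyro model. The tensor `theta` is a hypothetical stand-in for a latent variable or parameter; it is not taken from the original model.

```python
import torch

# Hypothetical parameter standing in for a latent variable or learnable site.
theta = torch.tensor(2.7, requires_grad=True)

# Float path: the result stays attached to the autograd graph.
smooth = theta * 3.0
smooth.backward()
float_grad = theta.grad.item()  # derivative of 3 * theta with respect to theta

# Integer path: .long() returns an int64 tensor that autograd does not track.
as_int = theta.long()
int_tracks_grad = as_int.requires_grad
int_grad_fn = as_int.grad_fn

print("gradient through float op:", float_grad)
print("integer tensor requires_grad:", int_tracks_grad)
print("integer tensor grad_fn:", int_grad_fn)
```

Anything computed from `as_int` is cut off from `theta`, so a gradient-based algorithm gets no signal for that part of the model.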
