Hello,
I keep getting the impression that the diagonal normal guide performs better than the delta guide for most datasets. Under what circumstances is the Delta guide preferred over the DiagonalNormal guide?
Thank you,
this is very problem specific. a delta guide doesn’t give you any parameter uncertainty (you just get a point estimate). the AutoDiagonalNormal guide will give you some parameter uncertainty, but it also potentially makes the optimization problem more difficult. so there’s no simple answer here.
Is there any website or other resource that discusses this topic? Thank you,
i don’t know of any good resource for these kinds of things. you could read something like “Pattern Recognition and Machine Learning” by Christopher M. Bishop for an introduction to probabilistic machine learning and try to build intuition that way
Hello,
Thank you very much for your reply.
I’d hate to keep bugging you on this, but is there any publication linked to the AutoLaplaceApproximation guide that comes with Pyro?
http://docs.pyro.ai/en/stable/_modules/pyro/infer/autoguide/guides.html#AutoLaplaceApproximation
Thank you,
The “Statistical Rethinking” textbook has a nice introduction to this (a.k.a. quadratic approximation; see e.g. chapter 2) and to other basic methods:
https://xcelab.net/rm/statistical-rethinking/
(with code also available in NumPyro)
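As a rough illustration of what a quadratic (Laplace) approximation does, here is a hand-rolled sketch in plain Python. It is not Pyro's AutoLaplaceApproximation implementation. The data (6 successes in 9 trials) and the flat prior are assumptions chosen for the example. The idea is to find the posterior mode, measure the curvature of the log posterior there, and use a Gaussian with the matching mean and standard deviation.

```python
import math

# Hypothetical data: 6 successes in 9 binomial trials, flat prior on p.
successes, trials = 6, 9


def log_posterior(p):
    # With a flat prior, the log posterior equals the log-likelihood up to a constant.
    return successes * math.log(p) + (trials - successes) * math.log(1.0 - p)


def find_mode(f, lo=1e-9, hi=1.0 - 1e-9, iterations=200):
    # Ternary search; valid here because the log posterior is concave on (0, 1).
    for _ in range(iterations):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if f(m1) < f(m2):
            lo = m1
        else:
            hi = m2
    return 0.5 * (lo + hi)


def second_derivative(f, x, h=1e-4):
    # Central finite difference for the curvature at x.
    return (f(x + h) - 2.0 * f(x) + f(x - h)) / (h * h)


mode = find_mode(log_posterior)
curvature = second_derivative(log_posterior, mode)
# Gaussian approximation: mean = mode, variance = -1 / curvature.
approx_sd = math.sqrt(-1.0 / curvature)

print(f"mode = {mode:.3f}, sd = {approx_sd:.3f}")  # approximately 0.667 and 0.157
```

The mode alone is what a Delta guide would report; the curvature step is what turns that point estimate into an approximate posterior.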
In practice when I am creating new models, I often implement both Delta and AutoNormal guides. The Delta guides tend to converge more quickly and more robustly. Once I can get a Delta guide to train, I’ll switch to an AutoNormal guide with more training steps and a lower learning rate. After AutoNormal, I’ll often switch again to an AutoLowRankMultivariateNormal with an even lower learning rate and more steps. I find Delta is good for fast model iteration and as a sanity check that I can learn a decent point estimate before I start modeling uncertainty.
Hello,
When you implement both the AutoNormal and Delta guides, does the model with an AutoNormal guide usually perform considerably better than the model with the Delta guide? This is what is happening to me right now, and I am assuming this is because the Delta guide assigns all probability to a single value?
@h56cho it depends what you mean by “better”. Indeed the AutoDelta guide simply provides a single point estimate (corresponding to MAP inference). If you want any sort of uncertainty estimate at all, you’ll need to use something like AutoNormal. If all you want is a point estimate, then AutoDelta can sometimes “perform better” in the sense that it is more robust and converges more quickly.