Hello,

I keep getting the impression that the diagonal normal guide would perform better than the delta guide for most datasets. Under what circumstances is the Delta guide preferred over the DiagonalNormal guide?

Thank you,

this is very problem specific. a delta guide doesn’t give you any parameter uncertainty (you just get a point estimate). the `AutoDiagonalNormal` guide will give you some parameter uncertainty, but it also potentially makes the optimization problem more difficult. so there’s no simple answer here.
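To make the point-estimate versus uncertainty distinction concrete, here is a minimal closed-form sketch in plain Python (no Pyro involved). It assumes a hypothetical toy model: an unknown mean with a Normal prior and a Normal likelihood with known noise, with made-up data and hyperparameters. In this special case the posterior is exactly Gaussian, so the best delta fit is the posterior mode and the best normal fit is the posterior itself; real models are rarely this convenient.

```python
import math

def normal_posterior(data, prior_mean=0.0, prior_sd=10.0, noise_sd=1.0):
    """Exact posterior (mean, sd) of an unknown mean with a Normal prior
    and a Normal likelihood whose noise_sd is assumed known."""
    n = len(data)
    prior_precision = 1.0 / prior_sd**2
    likelihood_precision = n / noise_sd**2
    posterior_precision = prior_precision + likelihood_precision
    sample_mean = sum(data) / n
    posterior_mean = (
        prior_precision * prior_mean + likelihood_precision * sample_mean
    ) / posterior_precision
    return posterior_mean, math.sqrt(1.0 / posterior_precision)

# Made-up observations, for illustration only.
data = [1.8, 2.4, 2.1, 1.6, 2.6]
post_mean, post_sd = normal_posterior(data)

# What a delta guide can represent: a single number (the MAP estimate,
# which equals the posterior mean here because the posterior is Gaussian).
delta_fit = post_mean

# What a normal guide can represent: a location and a scale, which
# together give an approximate 95% credible interval.
normal_fit = (post_mean, post_sd)
interval = (post_mean - 1.96 * post_sd, post_mean + 1.96 * post_sd)

print(f"delta guide:  {delta_fit:.3f}")
print(f"normal guide: {normal_fit[0]:.3f} +/- {normal_fit[1]:.3f}")
```

Both fits agree on the location; only the normal fit says how far the parameter could plausibly be from it.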

Is there any website or other resource that discusses this topic? Thank you,

i don’t know of any good resource for these kinds of things. you could read something like “Pattern Recognition and Machine Learning” by Christopher M. Bishop for an introduction to probabilistic machine learning and try to build intuition that way

Hello,

Thank you very much for your reply.

I’d hate to keep bugging you on this, but is there any publication linked to the `AutoLaplaceApproximation` guide that comes with `Pyro`?

http://docs.pyro.ai/en/stable/_modules/pyro/infer/autoguide/guides.html#AutoLaplaceApproximation

Thank you,

The “Statistical Rethinking” textbook has a nice introduction to this (aka quadratic approximation, see e.g. chapter 2) and other basic methods:

https://xcelab.net/rm/statistical-rethinking/

(with code also available in NumPyro)
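For intuition, the quadratic (Laplace) approximation can be sketched by hand on a one-parameter problem: find the posterior mode, then use the curvature of the log posterior at the mode as the precision of a Gaussian. The sketch below assumes a binomial likelihood with a flat prior and toy counts (6 successes in 9 trials); the function name and numbers are illustrative and are not taken from Pyro.

```python
import math

def quadratic_approximation(successes, trials):
    """Gaussian approximation to the posterior of a binomial probability p
    under a flat prior. Requires 0 < successes < trials."""
    failures = trials - successes
    # Mode of log p(p | data) = successes*log(p) + failures*log(1 - p) + const.
    p_map = successes / trials
    # Negative second derivative of the log posterior at the mode.
    curvature = successes / p_map**2 + failures / (1.0 - p_map) ** 2
    # The Gaussian's standard deviation is 1 / sqrt(curvature).
    return p_map, math.sqrt(1.0 / curvature)

mode, sd = quadratic_approximation(successes=6, trials=9)
print(f"p is approximately Normal({mode:.3f}, {sd:.3f})")
```

As far as I understand the docs, `AutoLaplaceApproximation` follows the same idea in many dimensions: it fits a MAP point in unconstrained space and uses the inverse Hessian there as the covariance.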

In practice, when I am creating new models, I often implement both `Delta` and `AutoNormal` guides. The `Delta` guides tend to converge more quickly and more robustly. Once I can get a `Delta` guide to train, I’ll switch to an `AutoNormal` guide with more training steps and a lower learning rate. After `AutoNormal`, I’ll often switch again to an `AutoLowRankMultivariateNormal` with an even lower learning rate and more steps. I find `Delta` is good for fast model iteration and a good sanity check that I can learn a decent point estimate, before I start modeling uncertainty.

Hello,

When you implement both the `AutoNormal` and `Delta` guides, does the model with an `AutoNormal` guide usually perform considerably better than the model with a `Delta` guide? This is what is happening to me right now, and I am assuming this is because the `Delta` guide assigns all probability to a single value?

@h56cho it depends what you mean by “better”. Indeed the `AutoDelta` guide provides simply a single point estimate (corresponding to MAP inference). If you want any sort of uncertainty estimate at all, you’ll need to use something like `AutoNormal`. If all you want is a point estimate, then `AutoDelta` can sometimes “perform better” in the sense that it is more robust and converges more quickly.