Beginner question on SVI: how to find parameters of a normal distribution

adk · March 26, 2022, 7:00am

There seems to be a lot of complexity in the undocumented method _unpack_latent(); it’s possible that I didn’t explain my question well. All I would like is a mapping from the names I used in my model’s pyro.sample(“name”, distribution) statements to the corresponding (scale, loc) pairs in the autoguide. For simplicity, let’s say I’m using AutoDiagonalNormal. Is there no straightforward way to get a dictionary like {“name”: (scale, loc), …} for each “name” used in sample sites in the model?

edit: typos

fritzo · March 27, 2022, 3:25pm

If you want a simple mapping from latent variables to parameters in an autoguide, then I’d recommend using the AutoNormal guide. The AutoNormal guide has a simple parameter naming scheme (try printing guide.named_parameters()). The other guides have more complex relationships between latent variables and their location and scale parameters. But I have not found a need to use that relationship explicitly; you should be able to read off all sufficient statistics by drawing samples via guide().
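To make the "draw samples and read off statistics" idea concrete, here is a minimal stdlib-only sketch. It does not call Pyro: draw_guide_sample is a hypothetical stand-in for one call to guide() (which returns a dict from site name to sampled value), and the site names "mu" and "log_sigma" and their distributions are made up for illustration.

```python
import random
import statistics


def draw_guide_sample(rng):
    """Hypothetical stand-in for one call to guide().

    Returns a dict mapping site names to sampled values. The names and
    the Gaussian parameters below are assumptions for illustration only.
    """
    return {
        "mu": rng.gauss(1.5, 0.2),
        "log_sigma": rng.gauss(-0.3, 0.05),
    }


def summarize(num_samples=5000, seed=0):
    """Estimate a (loc, scale) pair per site from repeated draws."""
    rng = random.Random(seed)
    draws = [draw_guide_sample(rng) for _ in range(num_samples)]
    summary = {}
    for name in draws[0]:
        values = [d[name] for d in draws]
        # Empirical mean and standard deviation of the sampled values.
        summary[name] = (statistics.fmean(values), statistics.stdev(values))
    return summary


summary = summarize()
for name, (loc, scale) in summary.items():
    print(f"{name}: loc ~ {loc:.3f}, scale ~ {scale:.3f}")
```

One caveat: guide() returns values in the constrained space of each site, so for a site with constrained support (e.g. a positive scale) these empirical statistics describe the transformed value, not the Gaussian loc and scale that the guide stores in unconstrained space.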

adk · March 28, 2022, 3:32pm

Thanks Fritz! A couple of follow-up questions about this: guide.named_parameters() seems to return only keys suffixed with “_unconstrained”, the scale params I observe are negative, and the optimization doesn’t converge. Also, if my understanding is correct, for the non-diagonal version there should be a covariance matrix somewhere whose rows and columns correspond to the names of my latents. If that’s correct, how would I find the mapping from latent names to row index?

fritzo · March 30, 2022, 12:00pm

Read the code and focus on places where we use transform_to() and biject_to(). Grep is your friend. Focus on gaining an understanding of the correspondence between constrained and unconstrained spaces. Try to understand why event_shape can differ between constrained and unconstrained values. Read the distributions design doc. Think about nontrivial event shapes, how transforms and flattening are required to convert high-rank tensors to the vectors expected in a multivariate normal distribution, and how this would make the “mapping from latent names to row index” complex. As a goal, try to completely understand AutoContinuous._unpack_latent().
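As a rough illustration of both points, here is a stdlib-only sketch. It is not Pyro's implementation, and it rests on two assumptions. First, a positive scale is stored as an unconstrained real and mapped through softplus (the exact transform depends on the constraint and the Pyro version; exp is another common choice), which would explain a negative “_unconstrained” scale. Second, latents are flattened one after another in a fixed site order using their unconstrained shapes. The site names and shapes are hypothetical; "probs" stands for a 3-simplex, whose unconstrained representation under a stick-breaking transform has only 2 entries.

```python
import math


def softplus(x):
    """Numerically stable log(1 + exp(x)); maps any real to a positive value."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def build_index_map(unconstrained_shapes):
    """Map each site name to its (start, stop) slice in a flat latent vector.

    `unconstrained_shapes` is an ordered dict of name -> shape tuple, where
    each shape is that of the value AFTER mapping to unconstrained space.
    """
    index_map = {}
    pos = 0
    for name, shape in unconstrained_shapes.items():
        size = math.prod(shape)  # the empty shape () is a scalar: size 1
        index_map[name] = (pos, pos + size)
        pos += size
    return index_map, pos


# A negative unconstrained value still yields a positive scale.
raw_scale = -2.0
scale = softplus(raw_scale)

# Hypothetical sites and their unconstrained shapes.
shapes = {"weights": (3,), "bias": (), "probs": (2,)}
index_map, latent_dim = build_index_map(shapes)

print(f"softplus({raw_scale}) = {scale:.4f}")
print(index_map, latent_dim)
```

Under these assumptions, the rows and columns of the covariance matrix line up with individual flattened, unconstrained coordinates (slices like the ones in index_map) rather than with latent names one-to-one, which is why the mapping gets complicated for anything beyond scalar, unconstrained sites.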