Separating guide and model parameters in svi.step() to aggressively train the inference network


To avoid latent-variable collapse when using auto-regressive decoders, I found this easy-to-implement idea in the lagging-inference-networks paper. The idea (see Algorithm 1) is to optimize the inference network more aggressively in the first few epochs and then let normal VAE training take over.
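For illustration, here is a minimal sketch of that aggressive-training schedule on a toy Gaussian VAE in plain PyTorch. The network shapes, `AGGRESSIVE_EPOCHS` cutoff, and fixed `INNER_STEPS` budget are my own simplifications (the paper stops the aggressive phase with a mutual-information criterion and runs the inner loop to convergence):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

class Encoder(nn.Module):
    def __init__(self, x_dim=4, z_dim=2):
        super().__init__()
        self.net = nn.Linear(x_dim, 2 * z_dim)  # outputs mean and log-variance

    def forward(self, x):
        mu, logvar = self.net(x).chunk(2, dim=-1)
        return mu, logvar

class Decoder(nn.Module):
    def __init__(self, x_dim=4, z_dim=2):
        super().__init__()
        self.net = nn.Linear(z_dim, x_dim)

    def forward(self, z):
        return self.net(z)

def elbo_loss(enc, dec, x):
    mu, logvar = enc(x)
    std = (0.5 * logvar).exp()
    z = mu + std * torch.randn_like(std)                       # reparameterization trick
    recon = ((dec(z) - x) ** 2).sum(-1)                        # Gaussian reconstruction term
    kl = 0.5 * (mu ** 2 + logvar.exp() - logvar - 1).sum(-1)   # KL to N(0, I) prior
    return (recon + kl).mean()

x = torch.randn(64, 4)  # toy data
enc, dec = Encoder(), Decoder()
enc_opt = torch.optim.Adam(enc.parameters(), lr=1e-2)
dec_opt = torch.optim.Adam(dec.parameters(), lr=1e-2)

initial_loss = elbo_loss(enc, dec, x).item()  # for monitoring progress

AGGRESSIVE_EPOCHS = 5  # hypothetical cutoff; the paper uses a mutual-information test
INNER_STEPS = 10       # fixed inner-loop budget instead of an inner convergence check

for epoch in range(20):
    if epoch < AGGRESSIVE_EPOCHS:
        # aggressive phase: several encoder-only updates per decoder update
        for _ in range(INNER_STEPS):
            enc_opt.zero_grad()
            elbo_loss(enc, dec, x).backward()
            enc_opt.step()
        dec_opt.zero_grad()
        elbo_loss(enc, dec, x).backward()
        dec_opt.step()
    else:
        # standard VAE training: joint update of both networks
        enc_opt.zero_grad(); dec_opt.zero_grad()
        elbo_loss(enc, dec, x).backward()
        enc_opt.step(); dec_opt.step()
```

The key point is simply that the encoder and decoder have their own optimizers, so the inner loop can step one without touching the other.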

I believe we can implement it by separating out the guide and model parameters in the step function of the SVI class, at this line.

Any pointers on how I can separate out the guide and model parameters in the step function?


I think I can separate the parameters here: pyro/ at dev · pyro-ppl/pyro · GitHub

Do you think this is the correct approach?

Sorry, my bad.

Looking at their code: vae-lagging-encoder/ at master · jxhe/vae-lagging-encoder · GitHub

They use two different optimizers for the encoder and decoder parameters. That would be easy to do in Pyro (Customizing SVI objectives and training loops — Pyro Tutorials 1.8.4 documentation).
