I’m a little confused about model-side parameters.
The SVI tutorial says:
So, for a fixed θ, as we take steps in ϕ space that increase the ELBO, we decrease the KL divergence between the guide and the posterior, i.e. we move the guide towards the posterior. In the general case we take gradient steps in both θ and ϕ space simultaneously so that the guide and model play chase, with the guide tracking a moving posterior log p_θ(z|x).
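For reference, the identity I believe this passage relies on is the standard decomposition of the log evidence. Here q_ϕ(z) denotes the guide and p_θ(x, z) the model; this is my own restatement, not a quote from the tutorial:

```latex
\log p_\theta(x)
  = \underbrace{\mathbb{E}_{q_\phi(z)}\!\left[\log p_\theta(x, z) - \log q_\phi(z)\right]}_{\mathrm{ELBO}(\theta,\,\phi)}
  + \mathrm{KL}\!\left(q_\phi(z)\,\|\,p_\theta(z \mid x)\right)
```

For fixed θ the left-hand side is constant, so raising the ELBO in ϕ must lower the KL term. When θ also moves, the left-hand side changes too, which is the part I am unsure how to interpret.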
In the general case, what values does SVI give us for the model-side parameters? Are they maximum likelihood estimates (MLE) of those parameters?
Say we have a latent variable in the model whose mean and variance are set by two param sites, and in the guide the same latent variable also has its mean and variance set by two param sites. As I understand it, SVI optimizes the guide-side parameters so that the guide converges to the posterior. But that posterior also depends on the model-side parameters, so what values will SVI give us for them?