In SVI tutorial part 1, it says that the distributions for the latent variable that is aligned between `model` and `guide` may be different. For example:

```
import torch
import pyro
import pyro.distributions as dist

def model():
    pyro.sample("z_1", dist.Beta(torch.tensor(10.0), torch.tensor(10.0)))  # distribution 1

def guide():
    pyro.sample("z_1", dist.Beta(torch.tensor(15.0), torch.tensor(15.0)))  # distribution 2
```

My understanding is that distribution 2, `dist.Beta(torch.tensor(15.0), torch.tensor(15.0))`, is the initial distribution for `q(z)`, which will be iteratively tuned in the SVI optimization process. What is the use of distribution 1, `dist.Beta(torch.tensor(10.0), torch.tensor(10.0))`? Thanks.

SVI optimizes the ELBO objective: the first distribution corresponds to the `p(x, z)` term in the ELBO, while the second one corresponds to `q(z)`, and both are needed to construct the objective. Put another way, the model is what you are really interested in, but since exact inference to compute the posterior `p(z|x)` isn't feasible, you use a variational family `q(z)` to approximate this posterior and optimize the variational parameters of `q` to bring it close to `p` (i.e. minimize the KL divergence `KL(q(z) || p(z|x))`).

I am not sure that Pyro's tutorials are the best way to get started with Variational Inference, but you'll find plenty of other good tutorials (e.g. Eric Jang's "A Beginner's Guide to Variational Methods: Mean-Field Approximation"). I would suggest going through them to develop a good understanding of and intuition for VI, and then coming back to the introductory tutorials to see how it all works in Pyro.
