I am aware that the obs is in the model rather than the guide, and that the ELBO has two terms: the expected likelihood log p(y|z) and KL(q(z) || p(z)). What I want to ask is this: the model is basically p(z), and there are no trainable parameters inside it, so how does the obs in the model backpropagate to the guide to improve the likelihood? In a TensorFlow kernel I saw, the likelihood is computed from the loss between the correct label and the output of q(y|z). So in Pyro, does obs correspond to p(y|z) or q(y|z)?
To clarify my notation: p is the prior and q is the approximate (variational) distribution.
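To make the gradient-flow part of my question concrete, here is a minimal hand-rolled sketch (pure stdlib, not Pyro) of my current understanding: with the reparameterization trick, a sample from the guide is written as z = mu + sigma * eps, so log p(y|z) at the obs site depends on the guide parameters mu and sigma through z, and its gradient flows back to them. The model p(z), p(y|z) is assumed to be N(0,1) and N(z,1) with observed y = 2.0; all names here are my own for illustration.

```python
import math
import random

random.seed(0)

y = 2.0                    # observed value, as supplied via obs in the model
mu, log_sigma = 0.0, 0.0   # trainable guide parameters of q(z) = N(mu, sigma)
lr = 0.02

for step in range(2000):
    sigma = math.exp(log_sigma)

    # Reparameterized sample from the guide: z depends on mu and sigma,
    # so gradients of anything computed from z reach the guide parameters.
    eps = random.gauss(0.0, 1.0)
    z = mu + sigma * eps

    # Pathwise gradients of the ELBO integrand w.r.t. z:
    #   d/dz log p(y|z) for N(y; z, 1) is (y - z)   <- this is where obs enters
    #   d/dz log p(z)   for N(z; 0, 1) is -z
    dz = (y - z) + (-z)

    # Chain rule through the reparameterization: dz/dmu = 1, dz/dlog_sigma = sigma*eps.
    # The guide entropy H = log(sigma) + const is handled analytically (d/dlog_sigma = 1).
    grad_mu = dz
    grad_log_sigma = dz * sigma * eps + 1.0

    # Gradient ascent on the ELBO.
    mu += lr * grad_mu
    log_sigma += lr * grad_log_sigma
```

For this conjugate setup the exact posterior is N(1.0, 1/sqrt(2)), so after training mu should sit near 1.0 and exp(log_sigma) near 0.71, which (if my understanding is right) shows the obs term alone pulling the guide parameters even though the model has no trainable parameters.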