Black-box variational inference (BBVI) does not require the model to be differentiable, since the score-function gradient of the ELBO only needs the gradient of log q(z) with respect to the variational parameters. As I understand it, the instantaneous ELBO, log(p(z,x)/q(z)), acts as a kind of reward that weights the score gradient. However, the surrogate objective in Pyro's SVI Part III tutorial also involves a gradient of the instantaneous ELBO itself. I checked the proof in the BBVI paper and found that the expectation of this term with respect to the guide is zero, so the gradient estimator does not need it at all. Is this property not used in Pyro's SVI? Am I missing something?
Edit: Is it that the loss Pyro implements covers both the differentiable and the non-differentiable case? If the model is non-differentiable, does the extra term automatically vanish? And how does the model's differentiability affect the variance of the gradient estimates and the overall performance?
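To make the question concrete, here is a minimal sketch of the score-function (BBVI/REINFORCE) estimator I have in mind, on a toy conjugate model of my own choosing (this is not Pyro's implementation; the model, guide, and all variable names are assumptions for illustration). It also checks numerically that the extra term, the derivative of the instantaneous ELBO with z held fixed, averages to zero under the guide:

```python
import numpy as np

# Toy model (chosen for illustration, not from the tutorial):
#   prior      p(z)   = Normal(0, 1)
#   likelihood p(x|z) = Normal(z, 1), with observed x = 1
#   guide      q(z)   = Normal(mu, 1), variational parameter mu
rng = np.random.default_rng(0)
x, mu, n = 1.0, 0.0, 200_000

z = rng.normal(mu, 1.0, size=n)  # samples from the guide q(z)
log_p = -0.5 * z**2 - 0.5 * (x - z)**2 - np.log(2 * np.pi)  # log p(z, x)
log_q = -0.5 * (z - mu)**2 - 0.5 * np.log(2 * np.pi)        # log q(z)
score = z - mu  # d/dmu log q(z) for a unit-variance Gaussian guide

# BBVI score-function estimate of d/dmu ELBO:
#   E_q[ (d/dmu log q(z)) * (log p(z, x) - log q(z)) ]
grad_est = np.mean(score * (log_p - log_q))

# Extra term: d/dmu of the instantaneous ELBO with z held fixed is
# -d/dmu log q(z), whose expectation under q is zero.
extra_term = np.mean(-score)

# For this conjugate toy problem the exact gradient is x - 2*mu = 1.0.
print(grad_est)    # ~ 1.0 (Monte Carlo noise aside)
print(extra_term)  # ~ 0.0
```

The estimator matches the closed-form gradient, and the extra term vanishes in expectation, which is exactly why I am unsure what role it plays in the surrogate loss.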