I need to implement a model where the conditional likelihood itself is an intractable integral (this is essential to my problem), that is,
p(x|z) = ∫ p(x|r) p(r|z) dr,
and in my problem this integral has no closed form.
I was discussing this with @fehiepsi in another post the other day. The most straightforward idea, I guess, is to use an MC estimator for the likelihood itself. Say we use K = 10 samples of
r ~ p(r|z) to estimate the likelihood; I have been implementing it like this:
first define an unscaled model, where
r ~ p(r|z) is sampled using
torch.distributions rather than
pyro.sample, so that it is not treated as a latent variable. When writing the
pyro.sample statement for the observable, I expand the data along a new axis of size 10 and basically calculate
sum_r log p(x|r) in bulk. After that, I use
poutine.scale to scale it by 0.1.
Today, however, I realized this is wrong. The mistake I made is that when
q(z) is reparameterized, we need to calculate
d log p(x|z)/dz, and the gradient of the expectation is, of course, not the expectation of the gradient. In fact, with this MC estimator of the likelihood, the expectation of the gradient
E[d log p(x|r)/dz] is zero (the r samples are constants with respect to z, since .sample() does not track gradients), which is certainly not what I want.
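To convince myself, I checked the gap numerically on a toy compound model where the integral is tractable (all numbers here are just an illustration): scaling the summed log-densities by 1/K estimates E_r[log p(x|r)], whereas the true log-likelihood is log E_r[p(x|r)], and by Jensen's inequality the former is smaller.

```python
import math
import random

random.seed(0)

# Toy compound model: r ~ Normal(z, s_r), x | r ~ Normal(r, s_x),
# so the exact marginal is x | z ~ Normal(z, sqrt(s_r^2 + s_x^2)).
z, s_r, s_x, x = 0.0, 1.0, 0.5, 1.3

def log_normal(v, mean, sd):
    return -0.5 * math.log(2 * math.pi * sd ** 2) - (v - mean) ** 2 / (2 * sd ** 2)

K = 200_000
log_ps = [log_normal(x, random.gauss(z, s_r), s_x) for _ in range(K)]

# What the scaled-sum construction computes: (1/K) * sum_k log p(x | r_k)
mean_of_logs = sum(log_ps) / K

# What the likelihood actually needs: log of the MC average of p(x | r_k)
m = max(log_ps)
log_mean_of_ps = m + math.log(sum(math.exp(lp - m) for lp in log_ps) / K)

exact = log_normal(x, z, math.sqrt(s_r ** 2 + s_x ** 2))
```

With these numbers, mean_of_logs lands around -5.6 while both log_mean_of_ps and the exact value are around -1.7, so the scaled-sum construction is not even estimating the right quantity, before the gradient issue comes into play.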
In addition to that, an MC estimator is actually not a good idea for estimating this likelihood in my problem; I would rather use, say, numerical integration. However, I am not sure in that case how I can write the
pyro.sample statement for the observable. Essentially, I think this requires defining a type of distribution whose
.log_prob is allowed to be computed through numerical approximation.
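One way I can imagine doing this (a sketch, not tested against Pyro internals): subclass torch.distributions.Distribution and implement .log_prob with a fixed-grid quadrature, keeping the integrand differentiable in z so that d log p(x|z)/dz flows through the p(r|z) term. If I read Pyro's docs right, using it inside pyro.sample additionally requires mixing in TorchDistributionMixin from pyro.distributions.torch_distribution. The Gaussians below are placeholders chosen so the quadrature can be checked against the analytic marginal:

```python
import math
import torch
from torch.distributions import Distribution, Normal, constraints

class QuadratureCompound(Distribution):
    """p(x | z) = ∫ p(x | r) p(r | z) dr, with log_prob computed by quadrature.

    Placeholder densities: p(r | z) = Normal(z, s_r), p(x | r) = Normal(r, s_x),
    so the exact marginal Normal(z, sqrt(s_r^2 + s_x^2)) is available to check against.
    """
    arg_constraints = {}
    support = constraints.real
    has_rsample = False

    def __init__(self, z, s_r, s_x, lo=-6.0, hi=6.0, n_grid=201):
        # Fixed quadrature grid over r; z enters only through p(r | z),
        # which keeps log_prob differentiable w.r.t. z.
        self.z = torch.as_tensor(z)
        self.s_r, self.s_x = s_r, s_x
        self.grid = torch.linspace(lo, hi, n_grid)
        super().__init__(batch_shape=torch.Size(), validate_args=False)

    def log_prob(self, x):
        x = torch.as_tensor(x).unsqueeze(-1)                  # (..., 1)
        log_px_r = Normal(self.grid, self.s_x).log_prob(x)    # (..., n_grid)
        log_pr_z = Normal(self.z, self.s_r).log_prob(self.grid)
        dr = self.grid[1] - self.grid[0]
        # log ∫ f(r) dr ≈ logsumexp_k [log f(r_k)] + log(dr)  (rectangle rule)
        return torch.logsumexp(log_px_r + log_pr_z, dim=-1) + torch.log(dr)

    def sample(self, sample_shape=torch.Size()):
        r = Normal(self.z, self.s_r).sample(sample_shape)
        return Normal(r, self.s_x).sample()
```

The grid bounds and resolution would of course have to be chosen to cover the support of p(r|z) in the real model, but the same pattern should work for any 1-D r with a smooth integrand.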
Any suggestions on how to implement these things?