Hi Everyone,
I wanted to hear your thoughts on defining an (inefficient) Gibbs sampler as a composition of pyro models. Let me explain what I mean.
Say we have observed data x and our model has latent variables z1 and z2. I can try to sample from the posterior using either of the following two strategies:

Write a single model with priors z1~dist(), z2~dist() and observations x~dist(z1, z2).
Sample from the posterior P(z1, z2 | x).

Write two models:
M1{ z1~dist(), z2: observed (fixed), x~dist(z1, z2) }
M2{ z1: observed (fixed), z2~dist(), x~dist(z1, z2) }
Gibbs-sample by drawing from p(z2 | z1, x) with M2, then from p(z1 | z2, x) with M1, and iterating.

Questions:
A) I have a high-dimensional case in which strategy 1 is really inefficient (it takes forever to converge), but I can sample with strategy 2. Is it reasonable to guess that this is due to the high-dimensional latent space and the kind of phase transition that samplers undergo in high dimensions?
B) The second strategy is questionable because I can never be sure I am sampling from the true conditionals of the model. Still, if I determine that the chain has converged after n iterations, is there anything wrong with this strategy?

Hi Martin, thanks for getting back to me so quickly. What details would you like to see? Can you at least comment on point B? Do you think it is a reasonable strategy?

I am not trying to be vague; you just haven't told me what you want to know.

And can you tell me what you mean by my proposal and motivation being vague? I want to re-write a Gibbs sampler as a composition of pyro models. Is that correct?

to me “iterate this procedure” is vague. iterate what exactly? are we doing variational inference in the loop? are there indices on z1 and z2 that vary from iteration to iteration? or are some of these quantities fixed? etc

pyro cannot compute exact posterior conditionals for you (which is what gibbs sampling requires) so i’m afraid i can’t follow in detail.

Hi Martin,
ok, let me try to be clearer. For now I am focusing on strategy 2. Here it is in pseudocode:

z1 = list()
z2 = list()
init z1_tilde, z2_tilde
for n_ in np.arange(n_iterations):
    # HMC1's model: z1 ~ dist(), x observed with x ~ dist(z1, z2=z2_tilde),
    # i.e. z2 is clamped at its current value. With num_warmup=100, num_samples=1,
    # this draws one approximate sample from P(z1 | z2=z2_tilde, x).
    HMC1.run()
    z1_tilde = HMC1.get_samples()['z1']
    z1.append(z1_tilde)
    # HMC2's model: z2 ~ dist(), x observed with x ~ dist(z1=z1_tilde, z2),
    # i.e. z1 is clamped. One approximate sample from P(z2 | z1=z1_tilde, x).
    HMC2.run()
    z2_tilde = HMC2.get_samples()['z2']
    z2.append(z2_tilde)

something like that could work, but i wouldn't expect it to perform any better than doing HMC on the entire model in the regular fashion. probably better to try various tricks to get HMC to work better: integrate out any latent variables that you can analytically, use HMCGibbs if you can compute posterior conditionals analytically for any of the high-dimensional latent variables, etc.