General implementation of Metropolis-Hastings


#1

Hi all,

I’m really enjoying playing around with Pyro, but I have a few questions. I see that Pyro now has NUTS and HMC, which is awesome, but I often end up making models with some discrete latents (and variable numbers of discrete latents). I’d like to sample them using some form of MH. I know there is an open PR to implement single-site MH, but I was wondering about implementing some more general form of MH (e.g. including the RJ correction for introduction of new variables/deletion of variables). Anyway:

  • I really like the way that Pyro uses models and guides, and the freedom it provides. From my read of the docs, it seems like a guide is not just useful as a variational distribution, but (in principle) could also serve as an MCMC proposal distribution. So I thought I would implement an MH kernel using a guide as a proposal distribution. If I’m doing independence MH, there is no further question. But usually I want to condition my proposal on the current value(s) of the relevant sample site(s). What’s the best way to do this? The guide and model by definition have the same signature, so I can’t really pass the current value in via the guide’s arguments. Is there some other practical way to do this that would fit nicely into Pyro’s design?

  • Regarding a more general MH that allows for transdimensional proposals: I’ve implemented these in more specific settings before, but one thing I was wondering is how, in general, the inference algorithm would “know” what sorts of variables it could introduce. I imagined this would be solved by taking an approach like the one above: use a guide that defines what sorts of variables could be proposed (which leads back to the first question). Is there another reasonable way to do this? I’d imagine not, since running the model forward would sometimes not introduce the sample sites that you might want to propose…
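To make the first question concrete, here is the kind of kernel I have in mind, as a framework-agnostic Python sketch (nothing here is Pyro API; `log_target`, `propose`, and `log_q` are placeholder names I made up). The point is that an asymmetric, state-conditioned proposal needs the proposal density evaluated in both directions:

```python
import math
import random

def mh_step(x, log_target, propose, log_q):
    """One MH step with a proposal conditioned on the current state.

    propose(x) draws x' ~ q(. | x); log_q(x_new, x_old) evaluates
    log q(x_new | x_old). Because the proposal is asymmetric, the
    acceptance ratio needs log_q in both directions.
    """
    x_new = propose(x)
    log_alpha = (log_target(x_new) - log_target(x)
                 + log_q(x, x_new) - log_q(x_new, x))
    if random.random() < math.exp(min(0.0, log_alpha)):
        return x_new, True
    return x, False

# Toy example: standard-normal target with an autoregressive
# (hence asymmetric) Gaussian proposal x' ~ Normal(0.9 * x, 0.5).
scale = 0.5
log_target = lambda x: -0.5 * x * x
propose = lambda x: random.gauss(0.9 * x, scale)
log_q = lambda x_new, x_old: -0.5 * ((x_new - 0.9 * x_old) / scale) ** 2

random.seed(0)
x, n_accept, samples = 0.0, 0, []
for _ in range(5000):
    x, accepted = mh_step(x, log_target, propose, log_q)
    n_accept += accepted
    samples.append(x)
```

In Pyro terms, the role of `propose`/`log_q` would be played by a guide, and the open question is exactly how the guide would receive the current value of `x`.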

If I get a good intuition for the right way to go here, I’d be more than happy to contribute to Pyro if you’d like. Also, if this belongs on the issue tracker instead of the forum, just let me know.

Thanks for your time,

Patrick


#2

Hi, you’ve pretty much hit on the reason we don’t provide a generic MH implementation. Here’s a (very) old PR where we considered introducing a generic version and ultimately decided against it; the MH code is here. The loop body of the MH code, where the proposal is invoked and the acceptance ratio is calculated, should answer your first question.

Basically, your second point is a theoretical show-stopper without additional constraints on the model or proposal. Coming up with a completely general-purpose reversible jump correction scheme is a subject of active research, as is modifying the language, algorithm or theoretical justification to avoid the need for such a thing. I’m not completely up to date with those lines of research, but perhaps one promising idea that could be adapted to this context is to build static analysis tools for distinguishing between “structural” random choices that influence control flow and “non-structural” ones that don’t, as in e.g. this paper.

Personally, I think it would still be OK to have MH with user-specified proposals, but we don’t have any use cases to justify implementing the algorithm and the attendant RJ corrections and/or warning+validation logic ourselves. We tend to use Pyro’s enumeration features to handle discrete variables instead.
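To illustrate what enumeration does, here is a minimal hand-rolled sketch (plain Python, not Pyro's actual machinery) of marginalizing a single discrete latent exactly via a stabilized logsumexp, rather than sampling it with MH:

```python
import math

# Two-component Gaussian mixture: z ~ Categorical([0.7, 0.3]),
# x | z ~ Normal(mus[z], 1). Instead of sampling z, enumerate it:
#   log p(x) = logsumexp over z of [log p(z) + log p(x | z)]
weights = [0.7, 0.3]
mus = [-2.0, 2.0]

def log_normal(x, mu):
    # log density of Normal(mu, 1)
    return -0.5 * (x - mu) ** 2 - 0.5 * math.log(2 * math.pi)

def log_marginal(x):
    terms = [math.log(w) + log_normal(x, mu) for w, mu in zip(weights, mus)]
    hi = max(terms)  # subtract the max for numerical stability
    return hi + math.log(sum(math.exp(t - hi) for t in terms))
```

In Pyro, marking a discrete site with `infer={"enumerate": "parallel"}` (together with `TraceEnum_ELBO` for SVI) performs this sum automatically over the whole model, so the continuous part can be handled by HMC/NUTS or gradient-based methods without any MH over the discrete variables.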

Definitely feel free to open an issue or PR if you have any implementation ideas, though!


#3

Thanks for the quick response!

Hi, you’ve pretty much hit on the reason we don’t provide a generic MH implementation. Here’s a (very) old PR where we considered introducing a generic version and ultimately decided against it; the MH code is here.

Ah, ok, thanks. That clears things up for the first question.

Basically, your second point is a theoretical show-stopper without additional constraints on the model or proposal. Coming up with a completely general-purpose reversible jump correction scheme is a subject of active research, as is modifying the language, algorithm or theoretical justification to avoid the need for such a thing.

Ok, this is what I was coming to think. It’s quite a fascinating problem, though.

I’m not completely up to date with those lines of research, but perhaps one promising idea that could be adapted to this context is to build static analysis tools for distinguishing between “structural” random choices that influence control flow and “non-structural” ones that don’t, as in e.g. this paper.

Thanks for the reference! Looks super interesting. I’ll have a read through this weekend.

Personally, I think it would still be OK to have MH with user-specified proposals, but we don’t have any use cases to justify implementing the algorithm and the attendant RJ corrections and/or warning+validation logic ourselves.

Understood.

Another thought I had (again, this is a common use case for me, though I realize not for many others) was an RJMC sampler that uses something like NUTS or HMC for within-model sampling (same caveat as above: no stochastic introduction of new variables within a model), but that jumps between a user-specified list of models using either user-specified cross-model proposals or proposals drawn from the prior of the new variables (which would probably be terrible, but might sometimes work). I just wasn’t sure about the right way to go about this. I imagine that passing a list of models to a sampler is not really idiomatic Pyro, but short of solving the problem mentioned above, I’m not sure what the right approach would be.
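To sketch the simplest version of what I mean (plain Python, everything here hypothetical, and with the within-model NUTS/HMC moves omitted): if the new model’s parameters are proposed from their prior, the prior and proposal densities cancel in the RJ acceptance ratio, and with a uniform prior over a two-model list it reduces to a likelihood ratio:

```python
import math
import random

random.seed(1)
data = [1.8, 2.2, 2.0, 1.9]  # toy data that clearly favors a nonzero mean

# Two-model list: M0: y ~ Normal(0, 1)           (no parameters)
#                 M1: theta ~ Normal(0, 1), y ~ Normal(theta, 1)
def log_lik(model, theta):
    mu = theta if model == 1 else 0.0
    return sum(-0.5 * (y - mu) ** 2 for y in data)

def rj_step(model, theta):
    """One between-model jump, proposing the other model's parameters
    from their prior. The prior and proposal densities cancel, so with
    a uniform prior over the two models the acceptance ratio is just
    the likelihood ratio."""
    new_model = 1 - model
    new_theta = random.gauss(0.0, 1.0) if new_model == 1 else None
    log_alpha = log_lik(new_model, new_theta) - log_lik(model, theta)
    if random.random() < math.exp(min(0.0, log_alpha)):
        return new_model, new_theta
    return model, theta

model, theta = 0, None
visits = [0, 0]
for _ in range(2000):
    # (a real sampler would interleave within-model moves, e.g. NUTS, here)
    model, theta = rj_step(model, theta)
    visits[model] += 1
```

With the data above, M1 should dominate, so the chain spends most of its time there; a user-specified cross-model proposal would slot in where `random.gauss(0.0, 1.0)` currently draws from the prior (at the cost of reintroducing proposal-density terms in `log_alpha`).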

I’ve been busy with some other stuff, but I can try taking a stab at some of these things in a more restricted setting this weekend to fix ideas and see if there’s anything useful for you and the rest of the community.