Definitions of "inference" presented in tutorials

Hi! I’m going through the Pyro tutorials and I’m wrestling with this snippet from “(DEPRECATED) An Introduction to Inference in Pyro” (Pyro Tutorials 1.8.6 documentation):

In its most general formulation, inference in a universal probabilistic programming language like Pyro is the problem of constructing this marginal distribution given an arbitrary boolean constraint so that we can perform these computations. The constraint can be a deterministic function of the return value, the internal randomness, or both.

Bayesian inference or posterior inference is an important special case of this more general formulation that admits tractable approximations. In Bayesian inference, the return value is always the values of some subset of internal sample statements, and the constraint is an equality constraint on the other internal sample statements. Much of modern machine learning can be cast as approximate Bayesian inference and expressed succinctly in a language like Pyro.
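
For concreteness, here’s the kind of toy model I have in mind while reading that paragraph (my own sketch, with made-up names and numbers):

import pyro
import pyro.distributions as dist

def model():
    # two internal sample statements (the "internal randomness")
    cloudy = pyro.sample("cloudy", dist.Bernoulli(0.3))
    rain = pyro.sample("rain", dist.Bernoulli(0.8 if cloudy else 0.1))
    # the "return value" the quoted passage refers to
    return rain

If I’m reading it right, Bayesian inference here would mean something like “the distribution of rain under the equality constraint cloudy == 1”, but I’m not sure.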

I’m coming to Pyro after reading Bayesian Methods for Hackers, where the definition of Bayesian inference presented was “updating your beliefs after considering new evidence.” I’m having trouble reconciling these two worlds now. Specifically:

  • How is this “arbitrary boolean constraint” used in the construction of the marginal distribution? “Boolean” implies some type of binary decision-making process; is that the case?
  • How is the “subset of internal sample statements” in the Bayesian case separated from “the other internal sample statements”? What exactly is being compared between these two sets of samples?

Hope those make at least some sense. Any attempts at alternate explanations or pointers to helpful resources would be appreciated, thanks!

Hi @tiger, I believe the section you’re referring to is trying to make a distinction between two types of “new evidence” you could use to update your beliefs. One type is hard evidence, like “there are exactly 888,888 people living in San Francisco” or “this glass contains 0.400000000 litres of water”. The other type is approximate evidence, like “there are about 8.9e5 ± 1e4 people living in San Francisco (Poisson distributed)” or “this glass contains 0.4 ± 0.05 litres of water (log-normally distributed)”. Pyro actually only implements the latter kind, soft observations; i.e., observations must be approximate. This is why Pyro observe statements are always associated with a sample statement:

pyro.sample("census", dist.Poisson(1.0 / true_population),
            obs=torch.tensor(8.9e5))
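
The water-glass example would be a similar soft observation. A sketch (the LogNormal parameters here are just illustrative, picked so the measurement comes out to roughly 0.4 ± 0.05 litres):

import torch
import pyro
import pyro.distributions as dist

# latent true volume of water in the glass
true_volume = pyro.sample("true_volume", dist.LogNormal(-1.0, 0.5))
# soft observation: the measured volume is log-normally distributed around
# the true volume, with ~12.5% relative noise (about 0.05 L at 0.4 L)
pyro.sample("measurement", dist.LogNormal(true_volume.log(), 0.125),
            obs=torch.tensor(0.4))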

hope that helps!

@fritzo aha, thank you so much for the explanation. That makes way more sense.

Coming back to this after sleeping on it. Still having trouble building a mental model for this “arbitrary boolean constraint” – where does it come from, and how does it help construct the marginal distribution? Hand-wavily, I want to say that it acts as some sort of filter to determine whether a sample is in or out of the marginal distribution, but that could be totally off-base.

This formulation is taken from the rejection-sampling definition of conditional probability in Church (and its successor webPPL). See this chapter of the online textbook Probabilistic Models of Cognition for more on this perspective.
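
Concretely, rejection sampling conditions by filtering, which matches your intuition above: run the model many times, keep only the executions whose trace satisfies the boolean constraint, and the surviving runs are samples from the conditional distribution. A toy sketch in plain Python (model and numbers made up for illustration):

import random

def model():
    # internal randomness: two "sample statements"
    cloudy = random.random() < 0.3
    rain = random.random() < (0.8 if cloudy else 0.1)
    return cloudy, rain

def rejection_sample(constraint, n=100_000):
    # keep only the executions that satisfy the boolean constraint
    traces = (model() for _ in range(n))
    return [t for t in traces if constraint(t)]

# condition on the constraint "it rained"
samples = rejection_sample(lambda t: t[1])
print(sum(c for c, _ in samples) / len(samples))  # P(cloudy | rain) ≈ 0.77

The catch is that an exact equality constraint on a continuous variable is satisfied with probability zero, so naive rejection sampling never keeps anything; that gap is what the soft observations discussed above are bridging.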