Pyro for Directed Bayesian Network Inference

stone · February 4, 2020, 4:21am

I am new to Pyro. I would like to know if Pyro works for the inference of graphical models. I want to first generate a simple case.

The discrete Bayesian Network has three nodes: A, B, C and all of them are binary variables. The graphical structure is B \leftarrow A \rightarrowC. Hence, the joint probability can be expressed by
P(A, B, C) = P(A)P(B|A)P(C|A). Now, we have some observations of A, B, C, i.e. [0,1,0], [1,1,1],[0,0,1] … The goal is to compute A* = argmax_{A}P(A|B=0).

As we know, A, B, C all have the observations and there are also some dependencies among them. In a traditional way, we need to perform parameter learning of the conditional probability p(A), P(B|A), P(C|A) and then do inference. I am very interested in what Pyro can do!

Thank you so much for any help and attention.

I also wrote some codes but it does not work. ‘RuntimeError: result type Float can’t be cast to the desired output type Int’. I do not know if it is on the right track or not.

def bn(data):
	pa = pyro.sample(‘pa’,dist.Beta(1,1))
	pb0 = pyro.sample(‘pb0’,dist.Beta(1,1))
	pb1 = pyro.sample(‘pb1’, dist.Beta(1,1))
	pc0 = pyro.sample(‘pc0’, dist.Beta(1,1))
	pc1 = pyro.sample(‘pc1’, dist.Beta(1,1))
	for i in pyro.plate(“data_loop”, len(data)):
		A =pyro.sample(‘A_{}’.format(i),dist.Bernoulli(pa), obs = torch.tensor(data[:,0][i],dtype=torch.int))
		if A==0:
			B = pyro.sample(‘B_{}’.format(i),dist.Bernoulli(pb0), obs = torch.tensor(data[:,1][ i])
			C = pyro.sample(‘C_{}’.format(i),dist.Bernoulli(pc0),obs = torch.tensor(data[:,2][i]))
		if A==1:
			B = pyro.sample(‘B_{}’.format(i),dist.Bernoulli(pb1), obs = torch.tensor(data[:,1][i])
			C = pyro.sample(‘C_{}’.format(i),dist.Bernoulli(pc1), obs = torch.tensor(data[:,2][i])
	return B,C

def guide(data):
	p = pyro.param(“p”, torch.randn(5, 2).exp(), constraint=constraints.simplex)
	pa = pyro.sample(‘pa’, dist.Beta(p[0,0],p[0,1]))
	pb0 = pyro.sample(‘pb0’, dist.Beta(p[1,0],p[1,1]))
	pb1 = pyro.sample(‘pb1’, dist.Beta(p[2,0],p[2,1]))
	pc0 = pyro.sample(‘pc0’, dist.Beta(p[3,0],p[3,1]))
	pc1 = pyro.sample(‘pc1’, dist.Beta(p[4,0],p[4,1]))
	return pa, pb0, pb1, pc0, pc1

gdalle · February 4, 2020, 10:07am

Hi!
As for your error, are you sure your data tensor is integer-valued? It seems there is a problem with the conversion, and I would bet it happens on line 7 of your bn function.
Otherwise I think this looks pretty coherent, but I haven’t run the code

stone · February 4, 2020, 4:03pm

Thank you so much for your reply. My data is integer-valued which has numpy int32 format. But this code does not solve the problem A* = argmax_{A}P(A|B=0) even if it works. I would like to know first if it is on the right track or not.

eb8680_2 · February 7, 2020, 8:04pm

Hi, I haven’t run your code but your model seems fine. I would suggest vectorizing over observations and using indexing instead of Python conditionals so that it is compatible with Pyro’s enumeration machinery for discrete variables:

@pyro.infer.config_enumerate
def model(N):
    probs_a = pyro.param("probs_a", ...)
    probs_b = pyro.param("probs_b", ...)
    probs_c = pyro.param("probs_c", ...)
    with pyro.plate("data", N):
        a = pyro.sample("a", dist.Categorical(probs=probs_a))
        b = pyro.sample("b", dist.Categorical(probs=probs_b[..., a]))
        c = pyro.sample("c", dist.Categorical(probs=probs_c[..., a]))
        return a, b, c

I’ve used pyro.params for the conditional probability tables to keep the code short and avoid needing a guide, but you could be Bayesian about them as in your original model. You can use SVI with any ELBO for parameter learning since none of the variables here are latent:

conditioned_model = pyro.condition(model, data={"a": data_a, "b": data_b, "c": data_c})
optim = pyro.optim.SGD()
svi = pyro.infer.SVI(conditioned_model, lambda N: None, pyro.infer.TraceEnum_ELBO(), optim)
for step in range(num_steps):
    svi.step(N)

You can perform inference (i.e. computing p(A | C = 0)) with pyro.infer.TraceEnum_ELBO.compute_marginals:

conditioned_model = pyro.condition(model, data={"b": 0})
marginal_a = TraceEnum_ELBO().compute_marginals(
    conditioned_model, lambda N: None, 1)["a"]
argmaxp = torch.argmax(marginal_a.probs.squeeze())

You might also find it easier or more intuitive to work directly with the conditional probability tables using pyro.ops.contract.einsum for inference (this is what TraceEnum_ELBO does under the hood):

from pyro.ops.contract import einsum

logp_b_0 = dist.Categorical(logits=logp_b).log_prob(0)
logp_a_marginal = einsum("a,a,ac->a", logp_a, logp_b_0, logp_c,
                         backend="pyro.ops.einsum.torch_log")
argmaxlp = torch.argmax(logp)

stone · February 17, 2020, 6:40pm

Sorry for the late reply. Thank you so much and that helps me a lot!

stone · February 18, 2020, 4:30am

Thank you so much and I have tried your codes. I want to post a followup question and hope this will not bother you too much. Here I want to bayesian about the parameters and learn a posterior distribution of the parameters. The goal is to get some samples from the posterior probability distribution of the parameters.

import torch
import pyro
import pyro.distributions as dist

import numpy as np
import matplotlib.pyplot as plt
######### generate data###############

data_A = list(np.random.binomial(1,0.5,5000))
data_B = []
data_C= []
for i in data_A:
    if i==0:
        data_B.append(np.random.binomial(1,0.5,1)[0])
        data_C.append(np.random.binomial(1,0.6,1)[0])
    else:
        data_B.append(np.random.binomial(1,0.75,1)[0])
        data_C.append(np.random.binomial(1,0.25,1)[0])
        
data_A = torch.tensor(data_A)
data_B = torch.tensor(data_B)
data_C = torch.tensor(data_C)
pyro.clear_param_store()

 
@pyro.infer.config_enumerate
def model(N):
    # a1,b1,a2,b2,a3,b3,a4,b4,a5,b5
    beta_param = pyro.param('beta_param', torch.randn(1,10).exp())
    
    # parameter
    probs_a = pyro.sample("probs_a", dist.Beta(beta_param[0][0],beta_param[0][1]))
    probs_a0b = pyro.sample("probs_a0b", dist.Beta(beta_param[0][2],beta_param[0][3]))
    probs_a1b = pyro.sample("probs_a1b", dist.Beta(beta_param[0][4],beta_param[0][5]))
    probs_a0c = pyro.sample("probs_a0c", dist.Beta(beta_param[0][6],beta_param[0][7]))
    probs_a1c = pyro.sample("probs_a1c", dist.Beta(beta_param[0][8],beta_param[0][9]))
    
    probs_b = torch.tensor([[1-probs_a0b, probs_a0b],[1-probs_a1b, probs_a1b]])
    probs_c = torch.tensor([[1-probs_a0c, probs_a0c],[1-probs_a1c, probs_a1c]])
    with pyro.plate("data", N):
        a = pyro.sample("a", dist.Categorical(probs=torch.tensor([1-probs_a,probs_a])))
        b = pyro.sample("b", dist.Categorical(probs=probs_b[a.long(),:]))
        c = pyro.sample("c", dist.Categorical(probs=probs_c[a.long(),:]))
    return a, b, c
def guide(N):
    pass
N=5000
conditioned_model = pyro.condition(model, data={"a": data_A, "b": data_B, "c": data_C})
optim = pyro.optim.Adam({"lr": 0.0004, "betas": (0.90, 0.999)})
svi = pyro.infer.SVI(conditioned_model, guide, loss = pyro.infer.TraceEnum_ELBO(), optim= optim)
losses = []
for step in range(500):
    print(step)
    losses.append(svi.step(N))
plt.plot(losses)

But this code does not provide a good reasonable result comparing to the synthetic data I generated. The losses do not converge. I also have some codes that work well but I am not providing a prior for the parameters. Finally, I can not get a samples of the parameters from the posterior distribution.

@pyro.infer.config_enumerate
def model(N):
    
    probs_a = pyro.param("probs_a", torch.randn(1, 2).exp(), constraint=constraints.simplex)
    probs_b = pyro.param("probs_b", torch.randn(2, 2).exp(), constraint=constraints.simplex)
    probs_c = pyro.param("probs_c", torch.randn(2, 2).exp(), constraint=constraints.simplex)
    with pyro.plate("data", N):
        a = pyro.sample("a", dist.Categorical(probs=probs_a))
        b = pyro.sample("b", dist.Categorical(probs=probs_b[a.long(),:]))
        c = pyro.sample("c", dist.Categorical(probs=probs_c[a.long(),:]))
        return a, b, c
    
def guide(N):
    pass
N=5000
conditioned_model = pyro.condition(model, data={"a": data_A, "b": data_B, "c": data_C})
optim = pyro.optim.Adam({"lr": 0.04, "betas": (0.90, 0.999)})
svi = pyro.infer.SVI(conditioned_model, guide, loss = pyro.infer.TraceEnum_ELBO(), optim= optim)
losses = []
for step in range(500):
    print(step)
    losses.append(svi.step(N))

gbernstein · April 16, 2020, 2:38pm

Hi eb8680_2, could you clarify here what would change if there is a latent variable?

eb8680_2 · April 17, 2020, 11:30pm

could you clarify here what would change if there is a latent variable?

You’d need to provide a guide for the latent variables, as opposed to the empty guide (lambda N: None) in the snippet I was referring to there.

dschneider · May 3, 2022, 3:33pm

eb8680_2:

@pyro.infer.config_enumerate
def model(N):
    probs_a = pyro.param("probs_a", ...)
    probs_b = pyro.param("probs_b", ...)
    probs_c = pyro.param("probs_c", ...)
    with pyro.plate("data", N):
        a = pyro.sample("a", dist.Categorical(probs=probs_a))
        b = pyro.sample("b", dist.Categorical(probs=probs_b[..., a]))
        c = pyro.sample("c", dist.Categorical(probs=probs_c[..., a]))
        return a, b, c

Hi eb8680_2, could you please provide an example of a corresponding guide to this model, when let’s say variable b is latent? Also, should variables with incomplete observations, i.e. some samples have missing values, be considered in the guide or not?

eb8680_2 · May 3, 2022, 7:57pm

@dschneider as the earlier posts in this thread indicate, you can often use Pyro’s enumeration functionality to exactly integrate out latent discrete variables, so there is no need to perform approximate inference with SVI and a guide for b in that particular model. If b were continuous, you could use an autoguide.

Also, should variables with incomplete observations, i.e. some samples have missing values, be considered in the guide or not?

You can use the obs_mask keyword argument to pyro.sample to indicate that a variable is partially observed. As the documentation for pyro.sample says, this introduces an auxiliary pyro.sample site that must be accounted for, whether manually or automatically, by whatever inference method you use.