[Beginners] Finding Conditional Probabilities on different parameters

Hi guys, I’m very new to Pyro and PPLs. I have a question about how to make Pyro generate a conditional probability.
So, I’m trying to replicate this


in terms of Pyro code. I found a thread which explains how to do it here

So I was following the code provided by fehiepsi and it seemed to be going well.

I was able to generate the conditional probability P(grasswet|rain,sprinkler). But when I try to generate P(rain|grasswet,sprinkler), an error occurs.

Here is the code I used to generate P(grasswet|rain,sprinkler):

import torch
import pyro
import pyro.distributions as dist

@pyro.infer.config_enumerate
def model(rain = None,sprinkler=None,grasswet=None):
    rain = pyro.sample('rain', dist.Bernoulli(0.2))
    sprinkler = pyro.sample('sprinkler', dist.Bernoulli(0.4)) if rain.bool() else \
        pyro.sample('sprinkler', dist.Bernoulli(0.01))
    if (sprinkler == 0. and rain == 0.):
        pyro.sample('grasswet',dist.Bernoulli(0.), obs = grasswet)
    elif (sprinkler == 0. and rain == 1.):
        pyro.sample('grasswet',dist.Bernoulli(0.8), obs = grasswet)
    elif (sprinkler == 1. and rain == 0.):
        pyro.sample('grasswet',dist.Bernoulli(0.9), obs = grasswet)
    else: 
        pyro.sample('grasswet',dist.Bernoulli(0.99), obs = grasswet)

The chunk above defines the model.

p_r_s_g = pyro.do(model, data={'rain': torch.tensor(0.),'sprinkler': torch.tensor(0.)})
p_r_s_g_enum = pyro.poutine.enum(p_r_s_g, first_available_dim=-1)
trace = pyro.poutine.trace(p_r_s_g_enum).get_trace( grasswet=torch.tensor(0.) )
trace1 = pyro.poutine.trace(p_r_s_g_enum).get_trace( grasswet=torch.tensor(1.) )
log_prob_evaluate = pyro.infer.mcmc.util.TraceEinsumEvaluator(
trace, has_enumerable_sites=True, max_plate_nesting=1)
print("p(grasswet=F|rain=F,sprinkler=F):", log_prob_evaluate.log_prob(trace).exp())
print("p(grasswet=T|rain=F,sprinkler=F):", log_prob_evaluate.log_prob(trace1).exp())

This chunk generates the conditional probability value.

But when I change which variables are conditioned on, an error occurs.
The code is below:

p_r_s_g = pyro.do(model, data={'grasswet': torch.tensor(0.), 'sprinkler': torch.tensor(0.)})
p_r_s_g_enum = pyro.poutine.enum(p_r_s_g, first_available_dim=-1)
trace = pyro.poutine.trace(p_r_s_g_enum).get_trace( rain =torch.tensor(0.) )
trace1 = pyro.poutine.trace(p_r_s_g_enum).get_trace( rain =torch.tensor(1.) )
log_prob_evaluate = pyro.infer.mcmc.util.TraceEinsumEvaluator(
trace, has_enumerable_sites=True, max_plate_nesting=1)
print("p(rain=F|grasswet=F,sprinkler=F):", log_prob_evaluate.log_prob(trace).exp())
print("p(rain=T|grasswet=F,sprinkler=F):", log_prob_evaluate.log_prob(trace1).exp())

The error says

bool value of Tensor with more than one value is ambiguous
Trace Shapes:
Param Sites:
Sample Sites:
rain dist |
value 2 |

I wonder whether this has to do with plates and enumeration, as mentioned here, but I don’t know why this if-else wouldn’t work.

My guess is that under enumeration, rain is a bool tensor instead of a single bool value, and the if elif else you use may fail on this. E.g.

import torch
a = torch.tensor([0, 1]).bool()
if a:
    print('test')

will raise the same error.

Thanks for pointing that out. There’s definitely something wrong with the if-else.
I tried adding

print(rain,sprinkler,grasswet)

at the end of the model, and it prints

tensor(0.) tensor(0.) None

The type of rain is torch.FloatTensor.
I tried rain == 1., but it still generates the same error.
It seems like grasswet was never actually calculated.
If the if-else were fixed, would Pyro be able to trace through it?
Am I misunderstanding anything?

Hi @karnny123 and @dreamerlzl, in many situations you can compute conditional distributions without using the enumeration mechanism. For example (I rewrote the model a bit to make enumeration possible later; you can also use the topic’s implementation here),

import math
import torch
import pyro
import pyro.distributions as dist

def model(rain=None, sprinkler=None, grasswet=None):
    rain = pyro.sample('rain', dist.Bernoulli(0.2), obs=rain)
    sprinkler_probs = 0.4 * rain + 0.01 * (1 - rain)
    sprinkler = pyro.sample('sprinkler',dist.Bernoulli(sprinkler_probs), obs=sprinkler)
    grasswet_probs = 0. * (1 - sprinkler) * (1 - rain) + 0.8 * (1 - sprinkler) * rain \
        + 0.9 * sprinkler * (1 - rain) + 0.99 * sprinkler * rain
    pyro.sample('grasswet', dist.Bernoulli(grasswet_probs), obs=grasswet)

p_grasswet = pyro.poutine.block(model, hide=['rain', 'sprinkler'])
F, T = torch.tensor(0.), torch.tensor(1.)

print("p(grasswet=F|rain=F,sprinkler=F):",
      pyro.poutine.trace(p_grasswet).get_trace(F, F, F).log_prob_sum().exp().item())
print("p(grasswet=T|rain=F,sprinkler=F):",
      pyro.poutine.trace(p_grasswet).get_trace(F, F, T).log_prob_sum().exp().item())

which returns

p(grasswet=F|rain=F,sprinkler=F): 0.9999998807907104
p(grasswet=T|rain=F,sprinkler=F): 1.192093463942001e-07
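
The rewritten model swaps the if/else for arithmetic on 0/1 values, so nothing has to call bool() on a tensor that may hold several enumerated values. A minimal plain-Python sketch (no Pyro involved; `branch_probs` and `blend_probs` are hypothetical helper names) that checks the blend selects the same table entry as a branch would for every combination:

```python
import itertools
import math

# P(grasswet=T | sprinkler, rain), copied from the model above.
GRASSWET_TABLE = {(0.0, 0.0): 0.0, (0.0, 1.0): 0.8, (1.0, 0.0): 0.9, (1.0, 1.0): 0.99}

def branch_probs(sprinkler, rain):
    """Table lookup that needs one concrete (sprinkler, rain) pair."""
    return GRASSWET_TABLE[(sprinkler, rain)]

def blend_probs(sprinkler, rain):
    """Same lookup written as arithmetic, with no data-dependent control flow."""
    return (0.0 * (1 - sprinkler) * (1 - rain) + 0.8 * (1 - sprinkler) * rain
            + 0.9 * sprinkler * (1 - rain) + 0.99 * sprinkler * rain)

# Every 0/1 combination should pick out exactly one table entry.
for sprinkler, rain in itertools.product((0.0, 1.0), repeat=2):
    assert math.isclose(branch_probs(sprinkler, rain), blend_probs(sprinkler, rain))
```

Because the blend is ordinary multiplication and addition, the same expression also works elementwise on a tensor of values.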

Now, to compute p(rain=F|grasswet=F,sprinkler=F), we need to compute p(rain=F,grasswet=F,sprinkler=F) and p(grasswet=F,sprinkler=F).

For the former, we can do

p_rain_F_sprinkler_F_grasswet_F = pyro.poutine.trace(model).get_trace(F, F, F).log_prob_sum().exp()
print("p(rain=F,sprinkler=F,grasswet=F):", p_rain_F_sprinkler_F_grasswet_F)

For the latter, we need to marginalize out the rain latent variable. This is where we will use the enumeration mechanism:

enum_model = pyro.infer.config_enumerate(model)
guide = lambda **kwargs: None
elbo = pyro.infer.TraceEnum_ELBO(max_plate_nesting=0)
p_sprinkler_F_grasswet_F = math.exp(-elbo.loss(enum_model, guide, sprinkler=F, grasswet=F))

Finally,

print("p(rain=F|sprinkler=F,grasswet=F):", p_rain_F_sprinkler_F_grasswet_F / p_sprinkler_F_grasswet_F)

gives you the answer

p(rain=F|sprinkler=F,grasswet=F): tensor(0.9706)

TraceEnum_ELBO has a convenient method, compute_marginals, that combines the above two steps for you

conditional_marginals = elbo.compute_marginals(enum_model, guide, sprinkler=F, grasswet=F)
print("p(rain=T|sprinkler=F,grasswet=F):", conditional_marginals["rain"].log_prob(T).exp())

which returns

p(rain=T|sprinkler=F,grasswet=F): tensor(0.0294)
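
For intuition, the marginal that enumeration computes here is just a sum of the joint over the two values of rain. A plain-Python sketch of that sum, assuming the probability tables exactly as coded in the model in this reply (the names `joint` and `posterior_rain` are illustrative, not Pyro API):

```python
def joint(rain, sprinkler, grasswet):
    """p(rain, sprinkler, grasswet) with the tables as written in the model above."""
    p_rain = 0.2 if rain else 0.8
    p_on = 0.4 * rain + 0.01 * (1 - rain)
    p_sprinkler = p_on if sprinkler else 1 - p_on
    p_wet = (0.0 * (1 - sprinkler) * (1 - rain) + 0.8 * (1 - sprinkler) * rain
             + 0.9 * sprinkler * (1 - rain) + 0.99 * sprinkler * rain)
    p_grasswet = p_wet if grasswet else 1 - p_wet
    return p_rain * p_sprinkler * p_grasswet

# Marginalize rain by summing the joint over its two values.
p_s0_g0 = sum(joint(r, 0, 0) for r in (0, 1))

# Bayes' rule: divide each joint term by the marginal.
posterior_rain = {r: joint(r, 0, 0) / p_s0_g0 for r in (0, 1)}
print({r: round(p, 4) for r, p in posterior_rain.items()})
```

Rounded to four digits, the two entries agree with the 0.9706 and 0.0294 printed above.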

Hope this helps!

Wow, that’s gorgeous. It really helps

Thank you very much @fehiepsi .

Just one more thing. Do you have any recommendations on where I can learn about enumeration/TraceEnum_ELBO? Is there any mathematical background I should have before working with Pyro?

I believe there might be something wrong with the examples above, as the results do not look right.
Running the calculations using an approach presented previously on the forum:

p_z_x1 = pyro.condition(model, data={'sprinkler': torch.tensor(0.),'grasswet': torch.tensor(0.)})
p_z_x1_enum = pyro.poutine.enum(p_z_x1, first_available_dim=-1)

trace = pyro.poutine.trace(p_z_x1_enum).get_trace(rain=torch.tensor(0.))
log_prob_evaluate = pyro.infer.mcmc.util.TraceEinsumEvaluator(
    trace, has_enumerable_sites=True, max_plate_nesting=0)
print("p(R=0|S=0,G=0):", log_prob_evaluate.log_prob(trace).exp())

trace1 = pyro.poutine.trace(p_z_x1_enum).get_trace(rain=torch.tensor(1.))
log_prob_evaluate = pyro.infer.mcmc.util.TraceEinsumEvaluator(
    trace1, has_enumerable_sites=True, max_plate_nesting=0)
print("p(R=1|S=0,G=0):", log_prob_evaluate.log_prob(trace1).exp())

yields:

p(R=0|S=0,G=0): tensor(0.7920)
p(R=1|S=0,G=0): tensor(0.0240)

instead of

p(rain=F|sprinkler=F,grasswet=F): tensor(0.9706)
p(rain=T|sprinkler=F,grasswet=F): tensor(0.0294)

The calculation is straightforward for P(R=0, S=0, G=0), where the result should be 0.8 x 0.6 x 1.0 = 0.48, but the code:

p_rain_F_sprinkler_F_grasswet_F = pyro.poutine.trace(model).get_trace(F, F, F).log_prob_sum().exp()
print("p(rain=F,sprinkler=F,grasswet=F):", p_rain_F_sprinkler_F_grasswet_F)

returns:
p(rain=F,sprinkler=F,grasswet=F): tensor(0.7920)

Could someone have a look, please?

Yeah, thank you for pointing that out, @Bart. I believe the model I created was flawed, and the model suggested by @fehiepsi should be used instead. Also, the line defining

sprinkler_probs

should be changed to

sprinkler_probs = 0.4 * (1-rain) + 0.01 * rain

Thanks @karnny123, I was checking the code and did not spot this. The results now look OK.
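
A quick hand check of the effect of that fix, in plain Python with no Pyro involved. The helper `joint` and its `fixed` flag are hypothetical names used only for this sketch; the tables are the ones discussed in the thread:

```python
def joint(rain, sprinkler, grasswet, fixed=True):
    """p(rain, sprinkler, grasswet); `fixed` selects the corrected sprinkler table."""
    p_rain = 0.2 if rain else 0.8
    if fixed:
        p_on = 0.4 * (1 - rain) + 0.01 * rain   # corrected line from this thread
    else:
        p_on = 0.4 * rain + 0.01 * (1 - rain)   # line as first posted
    p_sprinkler = p_on if sprinkler else 1 - p_on
    p_wet = (0.0 * (1 - sprinkler) * (1 - rain) + 0.8 * (1 - sprinkler) * rain
             + 0.9 * sprinkler * (1 - rain) + 0.99 * sprinkler * rain)
    p_grasswet = p_wet if grasswet else 1 - p_wet
    return p_rain * p_sprinkler * p_grasswet

joint_before = joint(0, 0, 0, fixed=False)
joint_after = joint(0, 0, 0, fixed=True)
# Posterior of rain=F given sprinkler=F, grasswet=F under the corrected table.
posterior_after = joint_after / (joint_after + joint(1, 0, 0, fixed=True))

print(f"joint before fix: {joint_before:.4f}, after fix: {joint_after:.4f}")
print(f"p(rain=F|sprinkler=F,grasswet=F) after fix: {posterior_after:.4f}")
```

With the corrected line the joint matches the hand calculation 0.8 x 0.6 x 1.0 = 0.48, while the line as first posted gives 0.8 x 0.99 x 1.0 = 0.792.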

Hi @karnny123, the enumeration tutorial and the forum are the best resources that I know of (probably because I don’t have much background in these types of models). Many PPL programmers do marginalization by hand (working with log_prob directly, using logsumexp, and manually arranging dimensions…), which easily gets tricky as the model becomes more complicated, but it might be a good place to start to understand the underlying calculations.
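
As a starting point for doing the marginalization by hand, here is a small log-space sketch in plain Python. It assumes the corrected sprinkler table from earlier in the thread, and `logsumexp`, `safe_log` and `log_joint` are hand-written helpers for illustration rather than library calls:

```python
import math

def logsumexp(log_values):
    """Numerically stable log(sum(exp(v))) over a list of log-probabilities."""
    top = max(log_values)
    if top == -math.inf:
        return -math.inf
    return top + math.log(sum(math.exp(v - top) for v in log_values))

def safe_log(p):
    """log(p) that maps probability 0 to -inf instead of raising."""
    return math.log(p) if p > 0 else -math.inf

def log_joint(rain, sprinkler, grasswet):
    """log p(rain, sprinkler, grasswet), assuming the corrected sprinkler table."""
    p_rain = 0.2 if rain else 0.8
    p_on = 0.4 * (1 - rain) + 0.01 * rain
    p_sprinkler = p_on if sprinkler else 1 - p_on
    p_wet = (0.0 * (1 - sprinkler) * (1 - rain) + 0.8 * (1 - sprinkler) * rain
             + 0.9 * sprinkler * (1 - rain) + 0.99 * sprinkler * rain)
    p_grasswet = p_wet if grasswet else 1 - p_wet
    return safe_log(p_rain) + safe_log(p_sprinkler) + safe_log(p_grasswet)

# Marginalize rain in log space, then condition by subtracting log-probabilities.
log_marginal = logsumexp([log_joint(r, 0, 0) for r in (0, 1)])
log_posterior_rain_f = log_joint(0, 0, 0) - log_marginal
print(f"p(rain=F|sprinkler=F,grasswet=F): {math.exp(log_posterior_rain_f):.4f}")
```

Staying in log space mirrors how log_prob values are combined and avoids underflow once many factors are multiplied together.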

Hi, @fehiepsi, I would like to reopen the discussion.

On the enumeration model: I was able to get the conditional probability given two parameters. Now I would like to get the probability given one parameter, i.e.

p(sprinkler=F|rain=F)

I tried to hide grasswet using poutine.block and enumerate the result, as in the code below

model_no_grass = pyro.poutine.block(model, hide=['grasswet'])
enum_model2 = pyro.infer.config_enumerate(model_no_grass)

and when I try to use the compute_marginals function,

conditional_marginals = elbo.compute_marginals(enum_model2, guide, rain=F)
print("p(sprinkler=F|rain=F):", conditional_marginals["sprinkler"].log_prob(F).exp())

Given that I’ve hidden grasswet, I think I should only pass in rain. However, I got an error which said

Number of einsum subscripts must be equal to the number of operands.

This error also occurs when I input grasswet = T or grasswet = F in there as well.
Could you please help me with this? Thank you.

I think you want to compute the likelihood, rather than a “conditional”:

math.exp(-elbo.loss(enum_model2, guide, rain=F, sprinkler=F))
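
As far as I can tell (this is an assumption, not something checked against Pyro here), with grasswet hidden and both rain and sprinkler observed, that expression evaluates the joint p(rain=F, sprinkler=F). The conditional asked about then follows by dividing by p(rain=F). A plain-Python sketch of the arithmetic, assuming the corrected sprinkler table from earlier in the thread:

```python
# p(rain=F), from rain ~ Bernoulli(0.2).
p_rain_f = 0.8
# Corrected table: the sprinkler is on with probability 0.4 when it does not rain.
p_sprinkler_t_given_rain_f = 0.4
p_sprinkler_f_given_rain_f = 1 - p_sprinkler_t_given_rain_f

# Joint probability of the two observed sites.
joint = p_rain_f * p_sprinkler_f_given_rain_f
# Conditioning on rain=F divides the joint by p(rain=F).
conditional = joint / p_rain_f

print(f"p(rain=F,sprinkler=F): {joint:.2f}")
print(f"p(sprinkler=F|rain=F): {conditional:.2f}")
```

In this small network the conditional is simply the table entry 1 - 0.4, so the division only undoes the multiplication by p(rain=F).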

Thank you :)