[Beginners] Finding Conditional Probabilities on different parameters

karnny123 · June 26, 2020, 11:31am

Hi guys, I’m very new to Pyro and PPLs. I have a question about how to make Pyro compute a conditional probability.
So, I’m trying to replicate this

in terms of Pyro code. I have found a thread which explains how to do it here

So I was following the code provided by fehiepsi and it seemed to be going well.

I was able to compute the conditional probability P(grasswet|rain,sprinkler). But when I try to compute P(rain|grasswet,sprinkler), an error occurs.

Here is the code I used to compute P(grasswet|rain,sprinkler):

import torch
import pyro
import pyro.distributions as dist

@pyro.infer.config_enumerate
def model(rain = None,sprinkler=None,grasswet=None):
    rain = pyro.sample('rain', dist.Bernoulli(0.2))
    sprinkler = (pyro.sample('sprinkler', dist.Bernoulli(0.4)) if rain.bool()
                 else pyro.sample('sprinkler', dist.Bernoulli(0.01)))
    if (sprinkler == 0. and rain == 0.):
        pyro.sample('grasswet',dist.Bernoulli(0.), obs = grasswet)
    elif (sprinkler == 0. and rain == 1.):
        pyro.sample('grasswet',dist.Bernoulli(0.8), obs = grasswet)
    elif (sprinkler == 1. and rain == 0.):
        pyro.sample('grasswet',dist.Bernoulli(0.9), obs = grasswet)
    else: 
        pyro.sample('grasswet',dist.Bernoulli(0.99), obs = grasswet)

The chunk above defines the model.

p_r_s_g = pyro.do(model, data={'rain': torch.tensor(0.),'sprinkler': torch.tensor(0.)})
p_r_s_g_enum = pyro.poutine.enum(p_r_s_g, first_available_dim=-1)
trace = pyro.poutine.trace(p_r_s_g_enum).get_trace( grasswet=torch.tensor(0.) )
trace1 = pyro.poutine.trace(p_r_s_g_enum).get_trace( grasswet=torch.tensor(1.) )
log_prob_evaluate = pyro.infer.mcmc.util.TraceEinsumEvaluator(
    trace, has_enumerable_sites=True, max_plate_nesting=1)
print("p(grasswet=F|rain=F,sprinkler=F):", log_prob_evaluate.log_prob(trace).exp())
print("p(grasswet=T|rain=F,sprinkler=F):", log_prob_evaluate.log_prob(trace1).exp())

This chunk generates the conditional probability value.

But when I change which variables are conditioned on, the error occurs.
The code is below:

p_r_s_g = pyro.do(model, data={'grasswet': torch.tensor(0.), 'sprinkler': torch.tensor(0.)})
p_r_s_g_enum = pyro.poutine.enum(p_r_s_g, first_available_dim=-1)
trace = pyro.poutine.trace(p_r_s_g_enum).get_trace( rain =torch.tensor(0.) )
trace1 = pyro.poutine.trace(p_r_s_g_enum).get_trace( rain =torch.tensor(1.) )
log_prob_evaluate = pyro.infer.mcmc.util.TraceEinsumEvaluator(
    trace, has_enumerable_sites=True, max_plate_nesting=1)
print("p(rain=F|grasswet=F,sprinkler=F):", log_prob_evaluate.log_prob(trace).exp())
print("p(rain=T|grasswet=F,sprinkler=F):", log_prob_evaluate.log_prob(trace1).exp())

The error says

bool value of Tensor with more than one value is ambiguous
Trace Shapes:
 Param Sites:
Sample Sites:
   rain dist   |
       value 2 |

I wonder whether this has to do with plates and enumeration, as mentioned here, but I don’t know why this if-else wouldn’t work.

dreamerlzl · June 26, 2020, 2:37pm

My guess is that under enumeration, rain is a bool tensor with more than one element instead of a single bool value, and the if/elif/else you use may fail on this. E.g.

import torch
a = torch.tensor([0, 1]).bool()
if a:
    print('test')

raises the same error.
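
A possible way around the branch, sketched in plain Python rather than Pyro: when a variable may hold several enumerated values at once, pick the probability by arithmetic on the 0/1 values instead of `if`/`elif`. The helper names below are illustrative and not part of any library; the numbers are the ones from the model in the first post.

```python
# Illustrative helpers (hypothetical names); numbers come from the model above.

def sprinkler_prob(rain):
    # 0.4 when rain == 1, 0.01 when rain == 0, with no branching on `rain`.
    return 0.4 * rain + 0.01 * (1 - rain)

def grasswet_prob(sprinkler, rain):
    # Each product is 1 for exactly one (sprinkler, rain) pair and 0 otherwise,
    # so the sum selects a single row of the table.
    return (0.0 * (1 - sprinkler) * (1 - rain)
            + 0.8 * (1 - sprinkler) * rain
            + 0.9 * sprinkler * (1 - rain)
            + 0.99 * sprinkler * rain)

# The if/elif table from the model, keyed by (sprinkler, rain).
table = {(0, 0): 0.0, (0, 1): 0.8, (1, 0): 0.9, (1, 1): 0.99}

# The arithmetic form reproduces every row of the table.
for (s, r), expected in table.items():
    assert abs(grasswet_prob(s, r) - expected) < 1e-12

# Applied elementwise, the same formula handles both enumerated values of rain.
print([sprinkler_prob(r) for r in (0, 1)])
```

With tensors, expressions of this shape broadcast over all enumerated values at once, so no single true/false decision is ever needed.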

karnny123 · June 26, 2020, 3:00pm

Thanks for pointing that out. There’s definitely something wrong with the if-else.
I tried adding

print(rain,sprinkler,grasswet)

at the end of the model, and it prints

tensor(0.) tensor(0.) None

The type of rain is torch.FloatTensor.
I tried rain == 1., but it still generates the same error.
It seems like grasswet was never actually calculated.
If the if-else were fixed, would Pyro be able to trace through it?
Am I misunderstanding anything?

fehiepsi · June 26, 2020, 9:38pm

Hi @karnny123 and @dreamerlzl, in many situations you can compute conditional distributions without the enumeration mechanism. For example (I rewrote the model a bit to make enumeration possible for later use; you can also use the topic’s implementation here),

import math
import torch
import pyro
import pyro.distributions as dist

def model(rain=None, sprinkler=None, grasswet=None):
    rain = pyro.sample('rain', dist.Bernoulli(0.2), obs=rain)
    sprinkler_probs = 0.4 * rain + 0.01 * (1 - rain)
    sprinkler = pyro.sample('sprinkler',dist.Bernoulli(sprinkler_probs), obs=sprinkler)
    grasswet_probs = 0. * (1 - sprinkler) * (1 - rain) + 0.8 * (1 - sprinkler) * rain \
        + 0.9 * sprinkler * (1 - rain) + 0.99 * sprinkler * rain
    pyro.sample('grasswet', dist.Bernoulli(grasswet_probs), obs=grasswet)

p_grasswet = pyro.poutine.block(model, hide=['rain', 'sprinkler'])
F, T = torch.tensor(0.), torch.tensor(1.)

print("p(grasswet=F|rain=F,sprinkler=F):",
      pyro.poutine.trace(p_grasswet).get_trace(F, F, F).log_prob_sum().exp().item())
print("p(grasswet=T|rain=F,sprinkler=F):",
      pyro.poutine.trace(p_grasswet).get_trace(F, F, T).log_prob_sum().exp().item())

which returns

p(grasswet=F|rain=F,sprinkler=F): 0.9999998807907104
p(grasswet=T|rain=F,sprinkler=F): 1.192093463942001e-07

Now, to compute p(rain=F|grasswet=F,sprinkler=F), we need to compute p(rain=F,grasswet=F,sprinkler=F) and p(grasswet=F,sprinkler=F).

For the former, we can do

p_rain_F_sprinkler_F_grasswet_F = pyro.poutine.trace(model).get_trace(F, F, F).log_prob_sum().exp()
print("p(rain=F,sprinkler=F,grasswet=F):", p_rain_F_sprinkler_F_grasswet_F)

For the latter, we need to marginalize out the latent variable rain. This is where we will use the enumeration mechanism:

enum_model = pyro.infer.config_enumerate(model)
guide = lambda **kwargs: None
elbo = pyro.infer.TraceEnum_ELBO(max_plate_nesting=0)
p_sprinkler_F_grasswet_F = math.exp(-elbo.loss(enum_model, guide, sprinkler=F, grasswet=F))

Finally,

print("p(rain=F|sprinkler=F,grasswet=F):", p_rain_F_sprinkler_F_grasswet_F / p_sprinkler_F_grasswet_F)

gives you the answer

p(rain=F|sprinkler=F,grasswet=F): tensor(0.9706)
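
The same number can be checked by hand. The sketch below uses plain Python and the probabilities of the model exactly as posted above (including `sprinkler_probs = 0.4 * rain + 0.01 * (1 - rain)`); the variable and function names are only illustrative.

```python
# Conditional tables of the model as posted (illustrative variable names).
p_rain = {0: 0.8, 1: 0.2}                      # p(rain)
p_sprinkler_on = {0: 0.01, 1: 0.4}             # p(sprinkler=T | rain)
p_wet = {(0, 0): 0.0, (0, 1): 0.8,             # p(grasswet=T | sprinkler, rain)
         (1, 0): 0.9, (1, 1): 0.99}

def joint(rain, sprinkler, grasswet):
    """p(rain, sprinkler, grasswet) as a product of the three factors."""
    ps = p_sprinkler_on[rain] if sprinkler else 1 - p_sprinkler_on[rain]
    pg = p_wet[(sprinkler, rain)] if grasswet else 1 - p_wet[(sprinkler, rain)]
    return p_rain[rain] * ps * pg

numerator = joint(0, 0, 0)                       # p(rain=F, sprinkler=F, grasswet=F)
evidence = sum(joint(r, 0, 0) for r in (0, 1))   # p(sprinkler=F, grasswet=F)
posterior = numerator / evidence                 # p(rain=F | sprinkler=F, grasswet=F)
print(numerator, evidence, posterior)
```

Here the numerator is 0.8 × 0.99 × 1.0 and the evidence adds 0.2 × 0.6 × 0.2 for the rain=T case, so the ratio is about 0.9706.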

TraceEnum_ELBO has a convenient method, compute_marginals, that combines the above two steps for you:

conditional_marginals = elbo.compute_marginals(enum_model, guide, sprinkler=F, grasswet=F)
print("p(rain=T|sprinkler=F,grasswet=F):", conditional_marginals["rain"].log_prob(T).exp())

which returns

p(rain=T|sprinkler=F,grasswet=F): tensor(0.0294)

Hope this helps!

karnny123 · June 29, 2020, 5:07am

Wow, that’s gorgeous. It really helps

Thank you very much @fehiepsi .

Just one more thing. Do you have any recommendations on where I can learn about enumeration and TraceEnum_ELBO? Is there any mathematical background I should have before working with Pyro?

Bart · July 1, 2020, 5:30pm

I believe there might be something wrong with the examples above, as the results do not look right.
Running the calculations using an approach presented previously on the forum:

p_z_x1 = pyro.condition(model, data={'sprinkler': torch.tensor(0.),'grasswet': torch.tensor(0.)})
p_z_x1_enum = pyro.poutine.enum(p_z_x1, first_available_dim=-1)

trace = pyro.poutine.trace(p_z_x1_enum).get_trace(rain=torch.tensor(0.))
log_prob_evaluate = pyro.infer.mcmc.util.TraceEinsumEvaluator(
    trace, has_enumerable_sites=True, max_plate_nesting=0)
print("p(R=0|S=0,G=0):", log_prob_evaluate.log_prob(trace).exp())

trace1 = pyro.poutine.trace(p_z_x1_enum).get_trace(rain=torch.tensor(1.))
log_prob_evaluate = pyro.infer.mcmc.util.TraceEinsumEvaluator(
    trace1, has_enumerable_sites=True, max_plate_nesting=0)
print("p(R=1|S=0,G=0):", log_prob_evaluate.log_prob(trace1).exp())

yields:

p(R=0|S=0,G=0): tensor(0.7920)
p(R=1|S=0,G=0): tensor(0.0240)

instead of

p(rain=F|sprinkler=F,grasswet=F): tensor(0.9706)
p(rain=T|sprinkler=F,grasswet=F): tensor(0.0294)

Calculations are straightforward for P(R=0, S=0, G=0), where the result should be 0.8 × 0.6 × 1.0 = 0.48, but the code:

p_rain_F_sprinkler_F_grasswet_F = pyro.poutine.trace(model).get_trace(F, F, F).log_prob_sum().exp()
print("p(rain=F,sprinkler=F,grasswet=F):", p_rain_F_sprinkler_F_grasswet_F)

returns:
p(rain=F,sprinkler=F,grasswet=F): tensor(0.7920)

Could someone have a look, please?
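
One possible reading of the first pair of numbers, offered as an assumption rather than a confirmed diagnosis: 0.7920 and 0.0240 do not sum to 1, so they look like joint probabilities p(rain, sprinkler=F, grasswet=F) rather than conditionals. Normalizing them in plain Python gives the values from the earlier post:

```python
# Values printed by the snippet above, treated here as unnormalized joints.
unnormalized = {"rain=F": 0.7920, "rain=T": 0.0240}

# Under that reading, the total is p(sprinkler=F, grasswet=F).
total = sum(unnormalized.values())
conditionals = {name: value / total for name, value in unnormalized.items()}

for name, value in conditionals.items():
    print(name, round(value, 4))
```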

karnny123 · July 1, 2020, 11:25pm

Yeah, thank you for pointing that out @Bart. I believe the model I created was flawed and I should use the model suggested by @fehiepsi instead. Also, the

sprinkler_probs

should be changed to

sprinkler_probs = 0.4 * (1-rain) + 0.01 * rain
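
For reference, a plain-Python recomputation with this change applied, i.e. assuming p(sprinkler=T|rain=F) = 0.4 and p(sprinkler=T|rain=T) = 0.01 with the other tables unchanged. The names are illustrative, and the figures come from this arithmetic, not from a Pyro run.

```python
from itertools import product

p_rain_true = 0.2
# p(grasswet=T | sprinkler, rain), keyed by (sprinkler, rain).
p_wet = {(0, 0): 0.0, (0, 1): 0.8, (1, 0): 0.9, (1, 1): 0.99}

def bernoulli(p, value):
    """Probability of `value` (0 or 1) under Bernoulli(p)."""
    return p if value else 1 - p

def joint(rain, sprinkler, grasswet):
    sprinkler_probs = 0.4 * (1 - rain) + 0.01 * rain   # corrected line
    return (bernoulli(p_rain_true, rain)
            * bernoulli(sprinkler_probs, sprinkler)
            * bernoulli(p_wet[(sprinkler, rain)], grasswet))

# Sanity check: the eight joint probabilities sum to 1.
assert abs(sum(joint(*state) for state in product((0, 1), repeat=3)) - 1) < 1e-12

p_fff = joint(0, 0, 0)                                   # 0.8 * 0.6 * 1.0
p_rain_f_given = p_fff / (joint(0, 0, 0) + joint(1, 0, 0))
print(p_fff, p_rain_f_given)
```

Under these assumptions p(rain=F, sprinkler=F, grasswet=F) comes out at 0.48, matching the hand calculation above, and p(rain=F|sprinkler=F,grasswet=F) at roughly 0.924.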

Bart · July 2, 2020, 7:18am

Thanks @karnny123, I was checking the code and did not spot this. Results now look OK.

fehiepsi · July 3, 2020, 2:20am

Hi @karnny123, the enumeration tutorial and the forum are the best resources that I know of (probably because I don’t have much background in these types of models). Many PPL programmers do marginalization by hand (working with log_prob directly, using logsumexp, and manually arranging dimensions…), which easily gets tricky as the model becomes more complicated, but it might be a good place to start to understand the underlying calculations.
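
As a rough illustration of what marginalizing by hand looks like, here is a plain-Python sketch that sums out `rain` in log space. It assumes the sprinkler table with the correction discussed above (0.4 when it does not rain, 0.01 when it does); `logsumexp` and `log_joint_dry` are small helpers written for this example, not Pyro functions.

```python
import math

def logsumexp(values):
    """Numerically stable log(sum(exp(v))) for a list of log-values."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))

def log_joint_dry(rain):
    """log p(rain, sprinkler=F, grasswet=F), the only case needed here."""
    log_p_rain = math.log(0.2 if rain else 0.8)
    p_sprinkler_on = 0.01 if rain else 0.4    # corrected sprinkler table
    p_wet = 0.8 if rain else 0.0              # sprinkler is off in this case
    return log_p_rain + math.log(1 - p_sprinkler_on) + math.log(1 - p_wet)

# One log-joint per value of the latent variable rain.
log_joints = [log_joint_dry(r) for r in (0, 1)]

# Marginalizing rain is a logsumexp over that list.
log_evidence = logsumexp(log_joints)          # log p(sprinkler=F, grasswet=F)

# Subtracting the evidence in log space gives the log-posterior over rain.
log_posterior = [lj - log_evidence for lj in log_joints]
print([math.exp(lp) for lp in log_posterior])
```

With more latent variables the list becomes a multi-dimensional array, and keeping track of which axis belongs to which variable is the part that tends to get tricky.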

karnny123 · July 30, 2020, 6:45am

Hi, @fehiepsi, I would like to reopen the discussion.

On the enumeration model: I am now able to get the conditional probability given two variables. Next, I would like to get the probability given one variable, i.e.

p(sprinkler=F|rain=F)

I tried to hide grasswet using poutine.block and then enumerate, as in the code below

model_no_grass = pyro.poutine.block(model, hide=['grasswet'])
enum_model2 = pyro.infer.config_enumerate(model_no_grass)

and when I try to use the compute_marginals method,

conditional_marginals = elbo.compute_marginals(enum_model2, guide, rain=F)
print("p(sprinkler=F|rain=F):", conditional_marginals["sprinkler"].log_prob(F).exp())

Given that I’ve hidden grasswet, I think I should only pass rain. However, I got an error which said

Number of einsum subscripts must be equal to the number of operands.

The same error occurs when I also pass grasswet = T or grasswet = F.
Could you please help me with this? Thank you.

fehiepsi · July 31, 2020, 12:56am

I think you want to compute likelihood, rather than “conditional”:

math.exp(-elbo.loss(enum_model2, guide, rain=F, sprinkler=F))
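
If that expression evaluates the joint p(rain=F, sprinkler=F), which is an assumption about its behaviour here rather than something verified in this thread, the conditional follows by dividing by p(rain=F). In plain Python, with the corrected sprinkler table and illustrative names:

```python
p_rain_f = 0.8                      # p(rain=F), from Bernoulli(0.2)
p_sprinkler_f_given_rain_f = 0.6    # 1 - 0.4, corrected sprinkler table

# What a joint likelihood of (rain=F, sprinkler=F) would be.
p_joint = p_rain_f * p_sprinkler_f_given_rain_f

# Dividing by the probability of the conditioning event recovers the conditional.
p_conditional = p_joint / p_rain_f
print(p_joint, p_conditional)
```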

karnny123 · July 31, 2020, 1:58am

Thank you

dksahuji · June 18, 2024, 12:34pm

fehiepsi:

p_grasswet = pyro.poutine.block(model, hide=['rain', 'sprinkler'])
F, T = torch.tensor(0.), torch.tensor(1.)

print("p(grasswet=F|rain=F,sprinkler=F):",
      pyro.poutine.trace(p_grasswet).get_trace(F, F, F).log_prob_sum().exp().item())
print("p(grasswet=T|rain=F,sprinkler=F):",
      pyro.poutine.trace(p_grasswet).get_trace(F, F, T).log_prob_sum().exp().item())

This part is probably not correct, because the following code, written the same way, doesn’t match the other outputs in the same comment.

p_rain = pyro.poutine.block(model, hide=['grasswet', 'sprinkler'])

print("p(rain=F|grasswet=F,sprinkler=F):",
      pyro.poutine.trace(p_rain).get_trace(F, F, F).log_prob_sum().exp().item())
print("p(rain=T|grasswet=F,sprinkler=F):",
      pyro.poutine.trace(p_rain).get_trace(T, F, F).log_prob_sum().exp().item())

This gives:
p(rain=F|grasswet=F,sprinkler=F): 0.7999999523162842
p(rain=T|grasswet=F,sprinkler=F): 0.19999998807907104

But p(rain=F|sprinkler=F,grasswet=F) = 0.9706 from TraceEnum_ELBO or compute_marginals().
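
A possible explanation, sketched in plain Python: the log-probability of a trace is summed over the sites the trace can see, and each site contributes only its own factor. Hiding sprinkler and grasswet would then leave just the prior factor p(rain), whereas hiding rain and sprinkler leaves p(grasswet|rain,sprinkler), which happens to be the conditional that was wanted in the quoted code. The dictionary and helpers below are illustrative stand-ins, not Pyro objects, and use the model as originally posted.

```python
import math

def site_factors(rain, sprinkler, grasswet):
    """Per-site probabilities for one assignment, model as originally posted."""
    sprinkler_probs = 0.4 * rain + 0.01 * (1 - rain)
    wet_table = {(0, 0): 0.0, (0, 1): 0.8, (1, 0): 0.9, (1, 1): 0.99}
    wet = wet_table[(sprinkler, rain)]
    return {
        "rain": 0.2 if rain else 0.8,
        "sprinkler": sprinkler_probs if sprinkler else 1 - sprinkler_probs,
        "grasswet": wet if grasswet else 1 - wet,
    }

def visible_prob(factors, hide=()):
    """Product of the factors of all sites that are not hidden."""
    return math.prod(p for name, p in factors.items() if name not in hide)

fff = site_factors(0, 0, 0)   # rain=F, sprinkler=F, grasswet=F
tff = site_factors(1, 0, 0)   # rain=T, sprinkler=F, grasswet=F

print(visible_prob(fff, hide=("grasswet", "sprinkler")))   # prior p(rain=F)
print(visible_prob(tff, hide=("grasswet", "sprinkler")))   # prior p(rain=T)
print(visible_prob(fff, hide=("rain", "sprinkler")))       # p(grasswet=F | rain=F, sprinkler=F)

# The posterior needs the full joint for both values of rain, then normalization.
joint_f, joint_t = visible_prob(fff), visible_prob(tff)
print(joint_f / (joint_f + joint_t))
```

On this reading the hide-the-other-sites trick only yields a conditional for a variable given its parents; for rain given its descendants, the joint has to be normalized as in the enumeration-based answer.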