Using poutine.block vs calculating loss manually leads to different values

I wish to decompose the loss into the contribution of the observed data plus that of the remaining sample sites. That means I need to calculate -1 * the log prob sum of the observed sample site. What works is calculating the loss manually in the training loop:

with torch.no_grad():
    # trace the model and guide once; _get_trace also computes per-site log probs
    model_trace, guide_trace = elbo._get_trace(model, guide, [], params)
    # single-sample ELBO estimate: log p(x, z) - log q(z)
    total_elbo = model_trace.log_prob_sum().item() - guide_trace.log_prob_sum().item()
    # per-datapoint log prob of the observed site
    obs_elbo = model_trace.nodes['obs']['log_prob']
    # contribution of all remaining sample sites
    res_elbo = total_elbo - obs_elbo.sum().item()

Now obs_elbo contains what I want.
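
For context, elbo, model, guide and params above come from a setup roughly like the following (a minimal sketch only; the concrete model, guide and data here are placeholders, not my actual ones):

import torch
import pyro
import pyro.distributions as dist
from pyro.infer import Trace_ELBO

def model(data):
    # one global latent plus an observed site named 'obs'
    loc = pyro.sample('loc', dist.Normal(0., 1.))
    with pyro.plate('data', len(data)):
        pyro.sample('obs', dist.Normal(loc, 1.), obs=data)

def guide(data):
    loc_q = pyro.param('loc_q', torch.tensor(0.))
    pyro.sample('loc', dist.Normal(loc_q, 1.))

data = torch.randn(100)
params = {'data': data}  # passed as kwargs, hence the [] for args in _get_trace
elbo = Trace_ELBO()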

Originally I wanted to achieve this by using poutine.block as follows:

# Contribution of the observed sample site / likelihood
blocked_model = pyro.poutine.block(model, expose=['obs'])
blocked_guide = pyro.poutine.block(guide)  # hides all sites, so its log prob sum should equal 0
# The following should then just be -1 * log prob sum of 'obs'
obs_loss = svi.loss(blocked_model, blocked_guide, **params)

# Remaining contribution
blocked_model = pyro.poutine.block(model, hide=['obs'])
res_loss = svi.loss(blocked_model, guide, **params)

total_loss = res_loss + obs_loss
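
Here svi is the usual SVI object; I construct it roughly like this (optimizer settings are placeholders). As far as I understand, when the loss is an ELBO instance, svi.loss delegates to elbo.loss and returns a single Monte Carlo estimate of the negative ELBO:

from pyro.infer import SVI, Trace_ELBO
from pyro.optim import Adam

svi = SVI(model, guide, Adam({'lr': 0.01}), loss=elbo)
# one stochastic estimate of -ELBO for the unblocked model/guide:
total_loss_direct = svi.loss(model, guide, **params)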

However, this leads to very different values (they differ by an order of magnitude). Sorry for not having a working example, but conceptually shouldn’t they be the same?

Thanks!

Are you averaging over many samples? Each call to loss runs the corresponding guide, so the randomness in your two calls will be different.
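
e.g. something like the following will typically print two different numbers, since each call draws fresh samples from the guide (just a sketch):

loss_a = svi.loss(model, guide, **params)
loss_b = svi.loss(model, guide, **params)
print(loss_a, loss_b)  # single-sample estimates, so they generally differ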

Where should I be averaging? From what I gather, the loss is the total log prob sum of the model minus that of the guide. And yes, I have taken the randomness into account (in my code I average over 100 particles).
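
Concretely, the averaging I do looks roughly like this (a sketch; n_particles and the exact calls are stand-ins for my actual code):

n_particles = 100
obs_loss = sum(
    svi.loss(blocked_model, blocked_guide, **params) for _ in range(n_particles)
) / n_particles
# I believe Trace_ELBO(num_particles=100) would average over particles internally in a similar way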

I don’t see what might be wrong in your approach, but if you provide a complete code snippet I can take a closer look.