Using poutine.block vs calculating loss manually leads to different values

I wish to decompose the loss into the contribution of the observed data plus that of the remaining sample sites. That means I need to calculate -1 * the log prob sum of the observed sample site. What works is calculating the loss manually in the training loop:

with torch.no_grad():
    model_trace, guide_trace = elbo._get_trace(model, guide, [], params)
    total_elbo = model_trace.log_prob_sum().item() - guide_trace.log_prob_sum().item()
    obs_elbo = model_trace.nodes['obs']['log_prob']  # per-datapoint log prob of the observed site
    res_elbo = total_elbo - obs_elbo.sum().item()

Now obs_elbo contains what I want.

Originally I wanted to achieve this by using poutine.block as follows:

# Contribution of observed sample site / likelihood
blocked_model = pyro.poutine.block(model, expose=['obs'])
blocked_guide = pyro.poutine.block(guide) # log prob sum should equal 0
# The following should just be -1 * log prob sum of obs
obs_loss = svi.loss(blocked_model, blocked_guide, **params)

# Remaining contribution
blocked_model = pyro.poutine.block(model, hide=['obs'])
res_loss = svi.loss(blocked_model, guide, **params)

total_loss = res_loss + obs_loss  

However, this leads to very different values (off by an order of magnitude). Sorry for not having a working example, but conceptually shouldn't they be the same?


are you averaging over many samples? loss will call the corresponding guide so the randomness in your two calls will be different

Where should I be averaging? From what I gather, the loss is the total log prob sum of the model minus that of the guide. And yes, I have taken the randomness into account (in my code I average over 100 particles).

i don’t see what might be wrong in your approach but if you provide a complete code snippet i can take a closer look