ELBO loss on validation set

yoshy · March 1, 2022, 7:59pm

I’d like to calculate the ELBO on my validation set and have a few questions. First, does it make sense to do so? What other metrics can be used to evaluate the posterior, besides point metrics like MAE calculated from samples?

Next, am I doing this correctly? elbo_loss was taken from here

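from pyro import poutine

def elbo_loss(model, guide, inputs):
  # Sample the latents from the guide, then replay the model against those samples.
  guide_trace = poutine.trace(guide).get_trace(inputs)
  model_trace = poutine.trace(poutine.replay(model, trace=guide_trace)).get_trace(inputs)
  # Negative single-sample ELBO estimate: -(log p - log q).
  return -(model_trace.log_prob_sum() - guide_trace.log_prob_sum())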

import torch
from pyro.infer import Predictive

rs = ["obs", "_RETURN"]
predictive = Predictive(model=model, guide=guide, num_samples=500, return_sites=rs, parallel=True)
# Batch size equals the row count, so the whole validation set arrives as one batch.
with converter_val.make_torch_dataloader(batch_size=df_val.count()) as val_dataloader:
  val_dataloader_iter = iter(val_dataloader)
  pd_batch = next(val_dataloader_iter)
  # Stack the per-feature columns into a (num_rows, num_features) tensor.
  pd_batch['features'] = torch.transpose(torch.stack([pd_batch[x] for x in x_feat]), 0, 1)
  inputs = pd_batch['features'].to(device)
  labels = pd_batch[y_name].to(device)
  print(elbo_loss(model, guide, inputs))

Lastly, I intend to divide elbo_loss by the number of observations so it can be comparable between different batch sizes.

martinjankowiak · March 1, 2022, 8:37pm

can you give more details about the model? does it have global latent variables? local latent variables? both? etc

yoshy · March 1, 2022, 8:37pm

Actually I think I found the proper usage:

from pyro.infer import Trace_ELBO

test_loss = Trace_ELBO()
test_loss.loss(model, guide, inputs, labels)

This, of course, makes much more sense, since the labels are being used in this function.

yoshy · March 1, 2022, 8:38pm

Yes, sorry I forgot to link my previous question: AutoDiagonalNormal found no latent variables; Use an empty guide instead

My model code is included there.

martinjankowiak · March 1, 2022, 8:48pm

yes something like that should work. although evaluate_loss should be faster: SVI — Pyro documentation

also you need to take care if you’re doing data subsampling or the like

yoshy · March 1, 2022, 9:18pm

Awesome thanks. No subsampling at the moment.

If I’m following the documentation correctly, the default value for num_particles for all ELBO losses is 1. I.e., 1 sample is drawn for each observation to be used in the ELBO loss calculation. Is it recommended to increase that?

martinjankowiak · March 1, 2022, 9:29pm

certainly if you want accurate ELBO estimates. you can create one SVI object for training and another for evaluation, where the latter has significantly more particles