Hey all,
I’m trying to add a mean squared error loss to my objective for a variational autoencoder, so I’m following the custom objectives tutorial. My code is a similar, otherwise working version of the VAE tutorial. Specifically, I’m trying to implement the section “A Lower-Level Pattern”:
# define optimizer and loss function
optimizer = torch.optim.Adam(my_parameters, {"lr": 0.001, "betas": (0.90, 0.999)})
loss_fn = pyro.infer.Trace_ELBO.differentiable_loss
# compute loss
loss = loss_fn(model, guide)
loss.backward()
# take a step and zero the parameter gradients
optimizer.step()
optimizer.zero_grad()
I’d like some more detail on how to actually implement this in practice. This is what I have tried:
optimizer = torch.optim.Adam(vae.parameters(), lr=1e-3)
elbo_loss_fn = pyro.infer.Trace_ELBO.differentiable_loss
for epoch in range(1000):
    epoch_loss = 0.
    for x, _ in train_dl:
        x = x.cuda()
        loss = elbo_loss_fn(model=vae.model, guide=vae.guide)
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()
This gives me the error “TypeError: differentiable_loss() missing 1 required positional argument: ‘self’”, and the loss never looks at the data x. I figure that if I can get the ELBO loss to work, I can take the MSE between x and its reconstruction and add it to the ELBO loss before computing the gradients with loss.backward().
Thanks for taking the time to reply to my post, I really appreciate it.
Adding parentheses fixes that error, but the code still doesn’t run as in the tutorial. It returns the following error:
TypeError: guide() missing 1 required positional argument: ‘x’
I’m not really sure what x refers to, because if I pass data from my dataloader in its place, it returns:
TypeError: ‘Tensor’ object is not callable
Same for model.
Thank you for your time.
I think that your model/guide requires an argument x. In that case, you need to call
loss_fn(model, guide, x)
Sorry that the tutorial missed these important points. It might be better to look at the documentation first. If the above fix works well for you, I’ll update the tutorial to address these issues. Thanks!
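Both TypeErrors above can be reproduced without Pyro. The sketch below uses a hypothetical stand-in class, ToyELBO, whose method has the same shape as differentiable_loss(self, model, guide, *args, **kwargs); the toy model and guide are assumptions for illustration only, not Pyro code.

```python
class ToyELBO:
    """Hypothetical stand-in for a loss class such as Trace_ELBO (not Pyro code)."""

    def differentiable_loss(self, model, guide, *args, **kwargs):
        # Extra arguments are forwarded to both the model and the guide.
        return model(*args, **kwargs) + guide(*args, **kwargs)


def model(x):
    return sum(x)


def guide(x):
    return len(x)


data = [1.0, 2.0, 3.0]

# 1) Taking the method from the class (no parentheses) leaves `self` unfilled.
try:
    ToyELBO.differentiable_loss(model=model, guide=guide)
    unbound_error = None
except TypeError as err:
    unbound_error = str(err)

# 2) Instantiating first binds `self`, but model and guide still need x.
loss_fn = ToyELBO().differentiable_loss
try:
    loss_fn(model, guide)
    missing_x_error = None
except TypeError as err:
    missing_x_error = str(err)

# 3) Passing the data after model and guide forwards it to both callables.
loss = loss_fn(model, guide, data)
```

Note that the first two arguments stay the model and guide callables. Passing a tensor in place of one of them is what leads to “‘Tensor’ object is not callable”.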
Thanks for your continued help.
Using just mse_loss or just elbo_loss results in the same error.
Based on what you said, I thought of some things to test, and I’ve found the source of the problem! I have a test_loss += temp_loss line in my model testing code, which I run every epoch. The code is as follows; it comes just after my training code.
# initialize loss accumulator
test_loss = 0.
# compute the loss over the entire test set
for i, (x, _) in enumerate(test_dl):
    x = x.cuda()
    elbo_loss = elbo_loss_fn(vae.model, vae.guide, x)
    mse_loss = F.mse_loss(x, vae.reconstruct_img(x))
    temp_loss = elbo_loss + mse_loss
    test_loss += temp_loss
# report test diagnostics
normalizer_test = len(test_dl.dataset)
total_epoch_loss_test = test_loss / normalizer_test
test_elbo.append(total_epoch_loss_test)
torch.cuda.empty_cache()
gc.collect()
Removing the + in test_loss += temp_loss solves the problem.
If it matters I define my data loader as follows (batch_size=32):
You’re welcome! I believe you can resolve this issue by wrapping your testing code in the context manager with torch.no_grad():. Another way is to call test_loss += temp_loss.detach(). For testing, you don’t want PyTorch to remember the computation graph of all iterations (which is what test_loss += temp_loss does).
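A minimal sketch of both variants, using plain PyTorch only. The small linear layer and random batches are placeholders standing in for the VAE and test_dl from the thread, and F.mse_loss stands in for the combined ELBO + MSE loss.

```python
import torch
import torch.nn.functional as F

# Placeholder network and data (assumptions, not the poster's VAE or loader).
net = torch.nn.Linear(4, 4)
batches = [torch.randn(8, 4) for _ in range(3)]

# Problematic pattern: the accumulator becomes a tensor that is attached
# to the autograd graph of every batch it has seen.
leaky_total = 0.
for x in batches:
    leaky_total += F.mse_loss(net(x), x)


def evaluate(net, batches):
    """Average loss over batches without recording any autograd graph."""
    total = 0.
    with torch.no_grad():
        for x in batches:
            # .item() converts the 0-dim tensor to a plain Python float
            total += F.mse_loss(net(x), x).item()
    return total / len(batches)


def evaluate_detached(net, batches):
    """Same average, but detaching each loss instead of using no_grad."""
    total = 0.
    for x in batches:
        total += F.mse_loss(net(x), x).detach()
    return total / len(batches)


avg_loss = evaluate(net, batches)
avg_loss_detached = evaluate_detached(net, batches)
```

With no_grad the forward pass itself records no graph. With detach() a graph is still built for each batch, but it can be freed once the batch is done.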
Hi there!
I am struggling to implement the custom objective tutorial (some random error with retain_graph), but in fact I don’t need a custom SVI: all I want is to store the gradient after each SVI step and implement a stopping criterion that way. Is there maybe an easier way to proceed?
Thanks
Guillaume
can’t really help you without further details. probably you want to use something like pyro.infer.Trace_ELBO.differentiable_loss as is done in the tutorial
def step(self, *args, **kwargs):
    """
    :returns: estimate of the loss
    :rtype: float
    Take a gradient step on the loss function (and any auxiliary loss functions
    generated under the hood by `loss_and_grads`).
    Any args or kwargs are passed to the model and guide
    """
    # get loss and compute gradients
    with poutine.trace(param_only=True) as param_capture:
        loss = self.loss_and_grads(self.model, self.guide, *args, **kwargs)
    params = set(site["value"].unconstrained()
                 for site in param_capture.trace.nodes.values())
    # actually perform gradient steps
    # torch.optim objects gets instantiated for any params that haven't been seen yet
    self.optim(params)
    # zero gradients
    pyro.infer.util.zero_grads(params)
    return torch_item(loss)
The above is the step function for SVI. You can see how it gets the params. If you want to get the gradients, you can add code after self.optim(params) and before zero_grads; at that point each parameter’s .grad still holds the gradient from this step.
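The same idea can be sketched with the lower-level pattern from earlier in the thread: read the gradients after backward() and before they are zeroed, then stop once their norm is small. This is only an illustration under stated assumptions. A plain quadratic stands in for the ELBO, and fit_until_flat, tol and max_steps are hypothetical names, not Pyro API.

```python
import torch


def fit_until_flat(params, loss_fn, lr=0.1, tol=1e-4, max_steps=1000):
    """Run SGD until the global gradient norm drops below tol.

    params must be a list of leaf tensors with requires_grad=True.
    """
    optimizer = torch.optim.SGD(params, lr=lr)
    grad_norm = float("inf")
    step = 0
    for step in range(max_steps):
        optimizer.zero_grad()
        loss = loss_fn()
        loss.backward()
        # Gradients are available here, before the next zero_grad().
        grad_norm = torch.sqrt(sum((p.grad ** 2).sum() for p in params)).item()
        if grad_norm < tol:
            break  # stopping criterion met, skip the update
        optimizer.step()
    return step, grad_norm


# Toy objective (an assumption): minimize ||w||^2, whose gradient is 2w.
w = torch.tensor([3.0, -2.0], requires_grad=True)
steps, final_norm = fit_until_flat([w], lambda: (w ** 2).sum())
```

With a stochastic loss such as an ELBO estimate, the gradient norm is noisy, so a running average over several steps is probably a safer criterion than a single value.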