Example: Mixing Optimizers

I’m confused about the Example: Mixing Optimizers docs
https://pyro.ai/examples/custom_objectives.html#Example:-Mixing-Optimizers

In this snippet from those docs:

adam = torch.optim.Adam(adam_parameters, {"lr": 0.001, "betas": (0.90, 0.999)})
sgd = torch.optim.SGD(sgd_parameters, {"lr": 0.0001})
loss_fn = pyro.infer.Trace_ELBO().differentiable_loss
# compute loss
loss = loss_fn(model, guide)
loss.backward()
# take a step and zero the parameter gradients
adam.step()
sgd.step()
adam.zero_grad()
sgd.zero_grad()

What are adam_parameters and sgd_parameters? The string labels given to pyro.param? A list of them?

Also, where do the arguments for model/guide go?

I have been using pyro.infer.svi.SVI.step(data, dictionary), where model and guide are called as model(data, dictionary) / guide(data, dictionary), but now I want to use different step sizes and a custom learning-rate schedule, and am wondering where to pass data and dictionary.

I read the torch.optim docs (PyTorch 1.11.0) and got it working:

adam = torch.optim.Adam(pytorch_model.parameters(), lr=1e-3)
loss_fn = pyro.infer.Trace_ELBO().differentiable_loss
loss = loss_fn(model, guide, data, args_dict)
loss.backward()
adam.step()
adam.zero_grad()

Here pytorch_model is a deep net that inherits from torch.nn.Module, and that I registered with pyro.module("pytorch_model", pytorch_model).
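For the custom learning-rate schedule part: once you drop down to raw torch.optim, you can attach any torch.optim.lr_scheduler to the optimizer and call scheduler.step() after each update. A minimal sketch with a toy torch.nn.Linear model and an MSE loss standing in for the Pyro differentiable loss (the model, data, and schedule here are invented for illustration):

```python
import torch

# Hypothetical tiny model standing in for "pytorch_model".
model = torch.nn.Linear(3, 1)
adam = torch.optim.Adam(model.parameters(), lr=1e-3)
# Halve the learning rate every 10 steps -- one possible custom schedule.
scheduler = torch.optim.lr_scheduler.StepLR(adam, step_size=10, gamma=0.5)

data = torch.randn(8, 3)
target = torch.randn(8, 1)
for step in range(20):
    loss = torch.nn.functional.mse_loss(model(data), target)
    loss.backward()
    adam.step()
    adam.zero_grad()
    scheduler.step()  # advance the schedule after each optimizer step

print(adam.param_groups[0]["lr"])  # 1e-3 halved at steps 10 and 20 -> 0.00025
```

The same pattern applies when loss comes from pyro.infer.Trace_ELBO().differentiable_loss instead of mse_loss.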

I think these lines of the Pyro docs do not match the torch.optim API in the PyTorch docs (torch.optim optimizers take keyword arguments like lr=..., not a dict as the second positional argument):

adam = torch.optim.Adam(adam_parameters, {"lr": 0.001, "betas": (0.90, 0.999)})
sgd = torch.optim.SGD(sgd_parameters, {"lr": 0.0001})

Hi @geoffwoollard

An iterable of the parameter tensors themselves. Since these are plain torch.optim optimizers, you pass the tensors looked up by their pyro.param string labels (e.g. pyro.param("name").unconstrained() for each name), not the string labels directly.

They are passed to the loss function as *args, **kwargs:

loss = loss_fn(model, guide, *args, **kwargs)