Error when mixing optimizers for training a Pyro model

I want to apply different learning rates to different groups of parameters in my Pyro neural network model. I was trying to follow this example from the Pyro documentation:

adam = torch.optim.Adam(adam_parameters, {"lr": 0.001, "betas": (0.90, 0.999)})
sgd = torch.optim.SGD(sgd_parameters, {"lr": 0.0001})
loss_fn = pyro.infer.Trace_ELBO().differentiable_loss
# compute loss
loss = loss_fn(model, guide)
# take a step and zero the parameter gradients

However, when I do:

>>>  <generator object Module.parameters at 0x7ff2f8b82850>


I get the following error:

    if not 0.0 <= lr:

TypeError: '<=' not supported between instances of 'float' and 'dict'

What am I doing wrong here, and how can I fix this error?

Thank you,

It looks like you’re misusing the torch.optim.Adam API: hyperparameters such as lr and betas are passed as keyword arguments, not as a dict. Please check the PyTorch docs.
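As a minimal sketch of the keyword-argument form, assuming plain PyTorch only: adam_parameters and sgd_parameters below are hypothetical stand-ins for the two parameter groups, and the toy loss replaces the ELBO from the quoted example.

```python
import torch

# Hypothetical stand-ins for the two parameter groups in the question.
adam_parameters = [torch.nn.Parameter(torch.ones(3))]
sgd_parameters = [torch.nn.Parameter(torch.ones(3))]

# Hyperparameters are keyword arguments; a dict in the second position
# is interpreted as `lr` and fails the `0.0 <= lr` check.
adam = torch.optim.Adam(adam_parameters, lr=0.001, betas=(0.90, 0.999))
sgd = torch.optim.SGD(sgd_parameters, lr=0.0001)

# Toy loss standing in for loss_fn(model, guide).
loss = (adam_parameters[0] ** 2).sum() + (sgd_parameters[0] ** 2).sum()
loss.backward()

# Take a step with each optimizer, then zero the parameter gradients.
adam.step()
sgd.step()
adam.zero_grad()
sgd.zero_grad()
```

Each optimizer only updates the parameters it was constructed with, so both need their own step() and zero_grad() calls.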

Following the PyTorch documentation, I tried torch.optim.Adam(model.model.multiple_choice_head.parameters(), lr=0.001), but it gives me another error:

Traceback (most recent call last):

  File "<ipython-input-20-1c381729206f>", line 1, in <module>
    optimizer_3 = torch.optim.Adam(model.model.multiple_choice_head.parameters(), lr=0.001)

  File "/Users/hyunjindominiquecho/opt/anaconda3/lib/python3.7/site-packages/torch/optim/", line 44, in __init__
    super(Adam, self).__init__(params, defaults)

  File "/Users/hyunjindominiquecho/opt/anaconda3/lib/python3.7/site-packages/torch/optim/", line 46, in __init__
    raise ValueError("optimizer got an empty parameter list")

ValueError: optimizer got an empty parameter list

I think torch.optim is complaining that the parameter list is empty because I first converted the multiple_choice_head of my PyTorch model into a Pyro Bayesian network and only then called torch.optim.Adam.
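One way to check this hypothesis is to list what a module still registers as parameters before building the optimizer. The sketch below uses plain PyTorch; trainable_parameters is a hypothetical helper, and torch.nn.ReLU() is only a stand-in for a module with no registered parameters, not the converted head itself.

```python
import torch

def trainable_parameters(module):
    """Return (name, shape) for every parameter the module registers."""
    return [(name, tuple(p.shape)) for name, p in module.named_parameters()]

with_params = torch.nn.Linear(4, 2)   # registers weight and bias
without_params = torch.nn.ReLU()      # stand-in: registers no parameters

print(trainable_parameters(with_params))
print(trainable_parameters(without_params))

# An optimizer built from an empty parameter iterator raises ValueError.
try:
    torch.optim.Adam(without_params.parameters(), lr=0.001)
    error_message = None
except ValueError as err:
    error_message = str(err)
print(error_message)
```

If the same helper returns an empty list for multiple_choice_head, the optimizer has nothing to update from that module, which would match the error above.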

I converted only the multiple_choice_head portion of my PyTorch model into a Pyro Bayesian network and left the rest of the model in its original frequentist form.

I want to apply a higher learning rate to the multiple_choice_head (the part that I converted to a Pyro model) and a small learning rate to the remaining frequentist part.
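In plain PyTorch, this goal can be expressed with per-parameter options in a single optimizer. The sketch below is only an illustration under assumptions: head and body are hypothetical stand-ins for the two parts of the model, the learning rates are arbitrary, and it presumes the trainable parameters of each part can be collected as ordinary PyTorch parameters.

```python
import torch

# Hypothetical stand-ins: `head` gets the higher learning rate,
# `body` represents the remaining part of the model.
body = torch.nn.Linear(8, 4)
head = torch.nn.Linear(4, 2)

optimizer = torch.optim.Adam(
    [
        {"params": head.parameters(), "lr": 1e-2},  # group-specific lr
        {"params": body.parameters()},              # uses the default lr
    ],
    lr=1e-4,  # default for groups that do not set their own lr
)

# One toy training step.
x = torch.randn(16, 8)
loss = head(body(x)).pow(2).mean()
optimizer.zero_grad()
loss.backward()
optimizer.step()
```

The alternative is two separate optimizers, one per group, as in the documentation example; parameter groups keep a single step() and zero_grad() call.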

How should I tweak the code shown in the Pyro documentation to achieve this?

Thank you,