Thanks, @fritzo! I implemented the code like this:
```python
import torch
import pyro

# `param` and `name` come from the surrounding guide code
loc_initial = torch.randn(param.size(), dtype=torch.float32)
loc_learn = pyro.param('guide_' + name, loc_initial)
assignment = \
assignment = assignment.float()
assignment = torch.mul(loc_learn, assignment)
```
It works, but the ELBO loss is quite large, about 4 million at the initial stage. It decreases as the program runs. Is it normal for the loss to be this large? My model has about 0.35 million learnable parameters; is it possible that the divergence between the priors and posteriors induces the large loss?
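As a rough sanity check (a sketch, not based on your actual model), the KL term inside the ELBO scales linearly with the number of latent dimensions, so with ~0.35 M parameters and randomly initialized variational means, the KL contribution alone is already on the order of hundreds of thousands:

```python
import torch

# Hypothetical illustration: KL( N(loc, scale^2) || N(0, 1) ) summed over
# n_params independent Gaussian latents. With loc drawn from torch.randn
# (as in the guide initialization above) and unit scales, each dimension
# contributes ~0.5 on average, so the total is roughly n_params / 2.
n_params = 350_000
loc = torch.randn(n_params)    # initial variational means
scale = torch.ones(n_params)   # unit variational scales

# Closed-form per-dimension KL against a standard normal prior, summed.
kl = 0.5 * (scale**2 + loc**2 - 1.0 - torch.log(scale**2)).sum()
print(f"KL contribution to the ELBO: {kl.item():.0f}")
```

This prints a value near 175,000 for these assumed priors, so a multi-million initial ELBO is not surprising once the likelihood term is added; what matters is that the loss decreases steadily, which yours does.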