Subsampling, GPU, and autoguides?

Hello, I was wondering what the right protocol is for using a GPU together with subsampling. Surprisingly, I didn't find an example of it, and I couldn't find a good solution. This is my example code:

import torch.nn.functional as F
import pyro
import pyro.distributions as dist
from pyro.infer.autoguide import AutoNormal, init_to_value

def model(A, sc_obs=None, st_obs=None):
    with pyro.plate("n_celltypes", A.shape[1]):
        # u is celltype by genes
        u = pyro.sample("u", dist.Normal(sc_obs.new_zeros(sc_obs.shape[1]), sc_obs.new_ones(sc_obs.shape[1])).to_event(1)).to('cpu')
        # print(u.shape)

    # total counts per row of sc_obs, kept as a column vector
    sc_ls = sc_obs.sum(axis=1, keepdims=True)
    # sc_lsp = pyro.param("sc_lsp", x.new_ones(  ))
    # subsample 1024 rows of sc_obs per step; sc_ind holds the selected row indices
    with pyro.plate("n_sc", sc_obs.shape[0], subsample_size=1024) as sc_ind:
        sc_mean = F.softmax(A.index_select(0, sc_ind).to('cpu') @ u, dim=1)
        pyro.sample("sc_obs", dist.Poisson(sc_ls.index_select(0, sc_ind).to('cpu') * sc_mean).to_event(1), obs=sc_obs.index_select(0, sc_ind).to('cpu'))

guide = AutoNormal(model, init_loc_fn=init_to_value(values={"u": res.to('cpu')}))

To use the GPU, I change every 'cpu' to 'cuda'. However, the guide does not know that it should use the GPU. I have tried moving the guide to the GPU, but that doesn't work. Is my only option to write a custom dataloader to do the subsampling?