Hi all, I’m trying to train a deep GP on multiple GPUs via model parallelism. That is, I put each GP layer on its own GPU and manually move each layer's output to the next layer's GPU. While this strategy works for plain neural networks, with the DGP I run into this error:
In `trace_mean_field_elbo.py`, line 113, in `_differentiable_loss_particle`:

```python
elbo_particle = elbo_particle - kl_qp_sum
```

This fails because the two tensors live on different GPUs.
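For context, here is a minimal sketch of the model-parallel pattern I'm using, in plain PyTorch rather than Pyro (module and device names are hypothetical; it falls back to CPU when two GPUs aren't available). The last lines illustrate the underlying problem: per-layer loss terms (like the KL contributions in the ELBO) live on different devices and must be moved to a common device before they can be summed.

```python
import torch
import torch.nn as nn

# Hypothetical two-stage model-parallel setup: each "layer" lives on its
# own device; falls back to CPU so the sketch runs without two GPUs.
if torch.cuda.device_count() >= 2:
    dev0, dev1 = torch.device("cuda:0"), torch.device("cuda:1")
else:
    dev0 = dev1 = torch.device("cpu")

class TwoStage(nn.Module):
    def __init__(self):
        super().__init__()
        self.layer1 = nn.Linear(4, 8).to(dev0)   # stage 1 on GPU 0
        self.layer2 = nn.Linear(8, 2).to(dev1)   # stage 2 on GPU 1

    def forward(self, x):
        h = self.layer1(x.to(dev0))
        h = h.to(dev1)            # manual transfer between stages
        return self.layer2(h)

model = TwoStage()
out = model(torch.randn(3, 4))    # out lives on dev1

# Stand-ins for per-layer KL terms: each lives on its layer's device,
# so summing them directly fails when dev0 != dev1.
kl1 = model.layer1.weight.pow(2).sum()   # on dev0
kl2 = model.layer2.weight.pow(2).sum()   # on dev1
total_kl = kl1.to(dev1) + kl2            # OK: both on dev1
```

So one workaround I can imagine is patching the ELBO computation locally to `.to()` each KL term onto a common device before the subtraction, but I'd rather not modify library internals if there's a supported way.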
Is there a good way to resolve this issue? Thanks!