Hi,
I’m experimenting with training a simple Bayesian feedforward network on the GPU. The model trains on the CPU just fine, but when I try training on the GPU, it gives an error about tensors being on more than one device.
I’ve traced the error to the PyroSample statements that replace the nn.Linear layer’s parameters with Pyro priors. When I comment out these statements, the model runs fine on both the CPU and GPU. With them included, training fails, and I believe the reason is that the self.cuda() call I’m using does not move the PyroSample priors to the GPU.
So, my question is: what is the best way to put the PyroSample weight/bias priors (which override the PyTorch parameters of the Linear layer) on the GPU? I pasted my simple code below. It seems like there should be an easy solution, but self.cuda() doesn’t seem to work.
import torch
import torch.nn as nn
import pyro
import pyro.distributions as dist
from pyro.nn import PyroModule, PyroSample

# Put data on GPU.
X, y = X.cuda(), y.cuda()

# Specify model.
class TestNN(PyroModule):
    def __init__(self, in_features, out_features=1):
        super().__init__()
        self.fc1 = PyroModule[nn.Linear](in_features, out_features)
        # Replace network layer parameters with Pyro priors.
        # How do I put these two priors on the GPU?
        self.fc1.weight = PyroSample(dist.Normal(0., 1.).expand([out_features, in_features]).to_event(2))
        self.fc1.bias = PyroSample(dist.Normal(0., 10.).expand([out_features]).to_event(1))
        self.cuda()

    def forward(self, x, y=None):
        p = torch.sigmoid(self.fc1(x))
        # Likelihood.
        with pyro.plate('data', x.shape[0]):
            obs = pyro.sample('obs', dist.Bernoulli(p), obs=y)
        return p

# The below also doesn't seem to put the parameters on the GPU.
test_nn = TestNN(in_features=X.shape[1])
test_nn.cuda()  # Doesn't work?