Is there any way to decrease a tensor's size for my regression model?

Hello,

I’m getting a memory error for my data. It is torch.Size([657878, 82]), which comes to 431567968 bytes according to x_discrete.element_size() * x_discrete.nelement() (657878 × 82 elements at 8 bytes each, so the tensor is stored as float64).

This is a tensor of 1s and 0s.

Is there any way to decrease this or break it up? Can one batch data into a Bayesian model?

Hi @jordan.howell2, could you provide a bit more detail, say a simplified version of your model?

In general you should indeed be able to subsample data when using Pyro’s SVI. One approach I’ve used is to store a compressed representation of the data in memory, subsample minibatches in compressed form, and decompress each minibatch inside the training loop. That “decompression” might be the last step of preprocessing (e.g. scattering a sparse tensor into a {0,1} array), might convert a torch.bool array to a torch.float array, or might just move data from CPU to GPU. A minimal sketch of the idea follows.
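
For example (an illustrative sketch, not your exact pipeline; the names and sizes are placeholders mirroring your tensor):

import torch

# Stand-in for x_discrete: {0,1} data stored as float64 takes 8 bytes/element (~432 MB).
x_f64 = torch.randint(0, 2, (657878, 82)).double()

# Compressed copy: uint8 stores the same {0,1} values in 1 byte/element (~54 MB).
x_small = x_f64.to(torch.uint8)

# Inside the training loop: subsample first, then decompress only the minibatch.
batch = torch.randperm(len(x_small))[:1000]
x_batch = x_small[batch].float()  # float32 minibatch, ~0.3 MB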

Here is the model:

import torch.nn as nn
import pyro
import pyro.distributions as dist
from pyro.nn import PyroModule, PyroSample


class BayesianRegression(PyroModule):
    def __init__(self, in_features, out_features):
        super().__init__()
        self.linear = PyroModule[nn.Linear](in_features, out_features)
        self.linear.weight = PyroSample(dist.StudentT(5, 0., 1.).expand([out_features, in_features]).to_event(2))
        self.linear.bias = PyroSample(dist.Normal(0., 10.).expand([out_features]).to_event(1))

    def forward(self, x, y=None):
        sigma = pyro.sample("sigma", dist.Uniform(0., 10.))  # note: not used by the Poisson likelihood below
        mean = self.linear(x).squeeze(-1)
        with pyro.plate("data", x.shape[0]):
            rate = mean.exp()  # Poisson rate from the linear predictor
            obs = pyro.sample("obs", dist.Poisson(rate), obs=y)
        return mean

from pyro.infer import MCMC, NUTS

model = BayesianRegression(82, 1)  # 82 features, 1 output (per the data and trace shapes)
nuts_kernel = NUTS(model)
mcmc = MCMC(nuts_kernel, num_samples=1000, warmup_steps=200)
mcmc.run(x_discrete, y_discrete)

Here it is using SVI:

from pyro.infer import SVI, Trace_ELBO
from pyro.infer.autoguide import AutoDiagonalNormal
from pyro.optim import Adam

# Guide and optimizer setup (omitted in the original post; a typical choice):
pyro.clear_param_store()
guide = AutoDiagonalNormal(model)
svi = SVI(model, guide, Adam({"lr": 0.01}), loss=Trace_ELBO())

num_iterations = 1000
for j in range(num_iterations):
    # calculate the loss and take a gradient step
    loss = svi.step(x_discrete, y_discrete)
    if j % 100 == 0:
        print("[iteration %04d] loss: %.4f" % (j + 1, loss / len(x_discrete)))

Here’s a slight change to the model to allow subsampling:

class BayesianRegression(PyroModule):
    def __init__(self, in_features, out_features):
        super().__init__()
        self.linear = PyroModule[nn.Linear](in_features, out_features)
        self.linear.weight = PyroSample(dist.StudentT(5, 0., 1.).expand([out_features, in_features]).to_event(2))
        self.linear.bias = PyroSample(dist.Normal(0., 10.).expand([out_features]).to_event(1))

    def forward(self, full_size, x, y=None):  # <--- pass full_size to the model.
        sigma = pyro.sample("sigma", dist.Uniform(0., 10.))  # note: not used by the Poisson likelihood below
        mean = self.linear(x).squeeze(-1)
        with pyro.plate("data", full_size, subsample=x):  # <--- note the change.
            rate = mean.exp()
            pyro.sample("obs", dist.Poisson(rate), obs=y)
        return mean

Now in your SVI training loop you can subsample. Here’s an example where you might store x_discrete and y_discrete in uint8 arrays and convert to float (4x bigger) only inside the training loop:

full_size = len(x_discrete)
batch_size = 1000
for step in range(1000):
    batch = torch.randperm(full_size)[:batch_size]
    # Subsample and decompress.
    x_batch = x_discrete[batch].float()
    y_batch = y_discrete[batch].float()
    loss = svi.step(full_size, x_batch, y_batch)
    if step % 100 == 0:
        print("[iteration %04d] loss: %.4f" % (step + 1, loss / full_size))

You could probably save even more memory by moving more preprocessing steps into the training loop.


Thank you, it ran. Should I pass full_size into Predictive as well? It doesn’t seem to work.

When I run

predictive = Predictive(model, guide=guide, num_samples=1000,
                        return_sites=("linear.weight", "obs", "_RETURN"))
samples = predictive(x_discrete)

I get the following:

Traceback (most recent call last):

  File "<ipython-input-156-c6d7a45b4038>", line 1, in <module>
    samples = predictive(x_discrete)

  File "C:\Users\JORDAN.HOWELL.GITDIR\AppData\Local\Continuum\anaconda3\envs\torch_env\lib\site-packages\torch\nn\modules\module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)

  File "C:\Users\JORDAN.HOWELL.GITDIR\AppData\Local\Continuum\anaconda3\envs\torch_env\lib\site-packages\pyro\infer\predictive.py", line 201, in forward
    parallel=self.parallel, model_args=args, model_kwargs=kwargs)

  File "C:\Users\JORDAN.HOWELL.GITDIR\AppData\Local\Continuum\anaconda3\envs\torch_env\lib\site-packages\pyro\infer\predictive.py", line 53, in _predictive
    max_plate_nesting = _guess_max_plate_nesting(model, model_args, model_kwargs)

  File "C:\Users\JORDAN.HOWELL.GITDIR\AppData\Local\Continuum\anaconda3\envs\torch_env\lib\site-packages\pyro\infer\predictive.py", line 21, in _guess_max_plate_nesting
    model_trace = poutine.trace(model).get_trace(*args, **kwargs)

  File "C:\Users\JORDAN.HOWELL.GITDIR\AppData\Local\Continuum\anaconda3\envs\torch_env\lib\site-packages\pyro\poutine\trace_messenger.py", line 187, in get_trace
    self(*args, **kwargs)

  File "C:\Users\JORDAN.HOWELL.GITDIR\AppData\Local\Continuum\anaconda3\envs\torch_env\lib\site-packages\pyro\poutine\trace_messenger.py", line 165, in __call__
    ret = self.fn(*args, **kwargs)

  File "C:\Users\JORDAN.HOWELL.GITDIR\AppData\Local\Continuum\anaconda3\envs\torch_env\lib\site-packages\pyro\nn\module.py", line 413, in __call__
    return super().__call__(*args, **kwargs)

  File "C:\Users\JORDAN.HOWELL.GITDIR\AppData\Local\Continuum\anaconda3\envs\torch_env\lib\site-packages\torch\nn\modules\module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)

TypeError: forward() missing 1 required positional argument: 'x'

I thought I might need to pass full_size into the predictive call, but that gives me:


RuntimeError: Expected object of scalar type Float but got scalar type Double for argument #2 'mat1' in call to _th_addmm
     Trace Shapes:       
      Param Sites:       
     Sample Sites:       
        sigma dist |     
             value |     
linear.weight dist | 1 82
             value | 1 82
  linear.bias dist | 1   
             value | 1   

Yes, SVI.step() and Predictive.__call__() take the same arguments, so it looks like your error is elsewhere. Maybe you missed a .float() conversion? Something like the sketch below should work.
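
For example (a sketch, assuming the same model, guide, and full_size as above; the .float() call addresses the Float/Double mismatch):

from pyro.infer import Predictive

predictive = Predictive(model, guide=guide, num_samples=1000,
                        return_sites=("linear.weight", "obs", "_RETURN"))
# Pass the same positional arguments as svi.step(), with float32 inputs.
samples = predictive(full_size, x_discrete.float())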


All worked. Thank you for the help.