Batch computation of log_prob for different conditions

Hello, I have an instance of dist.ConditionalTransformedDistribution that I have conditioned on a batch of conditions, specifically [batch x 32], and for each of these conditions I have a list of points I would like to evaluate the log_prob for, for example [batch x 2000 x 6] (2000 points of 6 features each). Is there a way to do this in one go? The conditioning itself seems to work, but the log_prob call then errors. If I condition on one context at a time and pass [2000 x 6], it also works. If not, would looping over the batch dimension be the best course of action? Thanks in advance.
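To make the shapes concrete, here is an illustrative sketch of the call that works and the one that fails (flow_dist is the conditional flow defined further down the thread; the tensor names are made up):

batch = 3
context = torch.randn(32)                            # a single condition
points = torch.randn(2000, 6)                        # 2000 points of 6 features
flow_dist.condition(context).log_prob(points)        # works, returns [2000] log-probs

contexts = torch.randn(batch, 32)                    # a batch of conditions
point_sets = torch.randn(batch, 2000, 6)             # 2000 points per condition
flow_dist.condition(contexts).log_prob(point_sets)   # raises a RuntimeError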

Hello,

Unfortunately it’s hard to help with shape errors if a code snippet is not provided. Please provide one so that forum members are more likely to be able to help you.

Yes, my bad; I just thought I might be missing something really obvious. The error occurs in the computation of the log_prob.

flow_dist = dist.ConditionalTransformedDistribution(base_dist, transformations)

n_epochs = 3000
early_stop_margin = 0.01
optimizer = torch.optim.AdamW(parameters, lr=5e-3)

early_stop = Early_stop(patience=patience, min_perc_improvement=torch.tensor(early_stop_margin))

for epoch in range(n_epochs):
    for batch in tqdm(dataloader):
        extract_0, enumeration_0, extract_1, enumeration_1 = batch
        encodings = Pointnet2(extract_0[:, 3:], extract_0[:, :3], enumeration_0)  # one context per batch element
        loss = -flow_dist.condition(encodings).log_prob(extract_1).mean()  # fails here

Error:

Exception has occurred: RuntimeError       (note: full exception trace is shown but execution is paused at: _run_module_as_main)
The size of tensor a (2000) must match the size of tensor b (3) at non-singleton dimension 1
  File "C:\Users\samme\Anaconda3\envs\flow_change\Lib\site-packages\torch\tensor.py", line 27, in wrapped
    return f(*args, **kwargs)
  File "C:\Users\samme\Anaconda3\envs\flow_change\Lib\site-packages\pyro\distributions\transforms\spline.py", line 34, in _searchsorted
    values[..., None] >= sorted_sequence,
  File "C:\Users\samme\Anaconda3\envs\flow_change\Lib\site-packages\pyro\distributions\transforms\spline.py", line 146, in _monotonic_rational_spline
    bin_idx = _searchsorted(cumheights + eps if inverse else cumwidths + eps, inputs).unsqueeze(-1)
  File "C:\Users\samme\Anaconda3\envs\flow_change\Lib\site-packages\pyro\distributions\transforms\spline.py", line 313, in spline_op
    y, log_detJ = _monotonic_rational_spline(x, w, h, d, l, bound=self.bound, **kwargs)
  File "C:\Users\samme\Anaconda3\envs\flow_change\Lib\site-packages\pyro\distributions\transforms\spline.py", line 295, in _inverse
    x, log_detJ = self.spline_op(y, inverse=True)
  File "C:\Users\samme\Anaconda3\envs\flow_change\Lib\site-packages\torch\distributions\transforms.py", line 151, in _inv_call
    x = self._inverse(y)
  File "C:\Users\samme\Anaconda3\envs\flow_change\Lib\site-packages\torch\distributions\transforms.py", line 219, in __call__
    return self._inv._inv_call(x)
  File "C:\Users\samme\Anaconda3\envs\flow_change\Lib\site-packages\torch\distributions\transformed_distribution.py", line 111, in log_prob
    x = transform.inv(y)
  File "C:\code\flow_change\conditional_flow_compare.py", line 98, in <module>
    loss = -flow_dist.condition(encodings).log_prob(extract_1).mean()
  File "C:\Users\samme\Anaconda3\envs\flow_change\Lib\runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "C:\Users\samme\Anaconda3\envs\flow_change\Lib\runpy.py", line 97, in _run_module_code
    _run_code(code, mod_globals, init_globals,
  File "C:\Users\samme\Anaconda3\envs\flow_change\Lib\runpy.py", line 265, in run_path
    return _run_module_code(code, init_globals, run_name,
  File "C:\Users\samme\Anaconda3\envs\flow_change\Lib\runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "C:\Users\samme\Anaconda3\envs\flow_change\Lib\runpy.py", line 194, in _run_module_as_main (Current frame)
    return _run_code(code, main_globals, None,

If I instead loop over the batch dimension one condition at a time, it works:

loss = 0.0  # reset per batch; this snippet sits inside the batch loop above
for batch_ind in range(extract_1.shape[0]):
    encoding = encodings[batch_ind, :]            # [32]
    extract_1_points = extract_1[batch_ind, ...]  # [2000, 6]
    loss += -flow_dist.condition(encoding).log_prob(extract_1_points).mean()

From what I can see, I can either batch one point per condition or batch many points for a single condition. I would like to know if there is a way to go about this that is more efficient than the last snippet.
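One workaround I considered (untested sketch) is expanding the context so that it lines up with the points, although I would expect the conditioning network to then be evaluated once per point:

expanded = encodings.unsqueeze(1).expand(-1, extract_1.shape[1], -1)  # [3, 32] -> [3, 2000, 32]
loss = -flow_dist.condition(expanded).log_prob(extract_1).mean()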

Can you please expand your code snippet to include the important details? In particular:

  • the full constructor for flow_dist
  • the shapes of encodings, extract_1, …
Sure, here is the full setup:

input_dim = 6

base_dist = dist.Normal(torch.zeros(input_dim).to(device), torch.ones(input_dim).to(device))

count_bins = 16
context_dim = 32
patience = 50
n_layers = 20

Pointnet2 = Pointnet2(feature_dim=input_dim - 3, out_dim=context_dim).to(device)
permutations = [torch.randperm(input_dim) for x in range(n_layers - 1)]

class conditional_spline_flow:

    def __init__(self, input_dim, context_dim, permutations, count_bins, device):
        self.transformations = []
        self.parameters = []
        for i in range(len(permutations) + 1):
            hidden_dims = [128, 128]
            spline = T.conditional_spline(input_dim, context_dim, hidden_dims=hidden_dims, count_bins=count_bins, bound=1.0)
            spline = spline.to(device)
            self.parameters += spline.parameters()
            self.transformations.append(spline)
            if i < len(permutations):  # don't append a permutation after the final spline
                self.transformations.append(T.permute(input_dim, torch.LongTensor(permutations[i]).to(device), dim=-1))

    def save(self, path):
        torch.save(self, path)

conditional_flow_layers = conditional_spline_flow(input_dim, context_dim, permutations, count_bins, device)
parameters = conditional_flow_layers.parameters
transformations = conditional_flow_layers.transformations

flow_dist = dist.ConditionalTransformedDistribution(base_dist, transformations)

n_epochs = 3000
early_stop_margin = 0.01
optimizer = torch.optim.AdamW(parameters, lr=5e-3)

early_stop = Early_stop(patience=patience, min_perc_improvement=torch.tensor(early_stop_margin))

for epoch in range(n_epochs):
    for batch in tqdm(dataloader):
        extract_0, enumeration_0, extract_1, enumeration_1 = batch
        encodings = Pointnet2(extract_0[:, 3:], extract_0[:, :3], enumeration_0)
        loss = -flow_dist.condition(encodings).log_prob(extract_1).mean()

Sizes:

encodings.shape  # torch.Size([3, 32])
extract_1.shape  # torch.Size([3, 2000, 6])
extract_0.shape  # torch.Size([6000, 6])

That should be everything; let me know if you need anything else.

Kinda hard to parse all that code, but I’m guessing you just need some unsqueezing to make sure things broadcast correctly? Maybe something like:

flow_dist.condition(encodings.unsqueeze(-2)).log_prob(extract_1)
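If I’m reading the shapes right, the conditional spline turns the context into spline parameters with the context’s batch shape: conditioning on [3, 32] gives parameters batched as [3], which can’t broadcast against your [3, 2000] batch of points (hence the "2000 vs 3" error), while [3, 1, 32] gives parameters batched as [3, 1], which broadcasts fine. A quick sanity check (untested sketch):

conditioned = flow_dist.condition(encodings.unsqueeze(-2))  # context: [3, 1, 32]
lp = conditioned.log_prob(extract_1)                        # extract_1: [3, 2000, 6]
assert lp.shape == (3, 2000)                                # one log-density per point
loss = -lp.mean()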

That seems to do the trick, and it gives a ~3x performance increase over the loop too! Thanks for bearing with me.