Doubts regarding Tensor slicing within model and guide

Below is a simple example of what I am trying to achieve. The model takes in a label y (ignore x for now) of shape (_, 7), where the first three indices belong to label type 1 (action) and the last four indices belong to label type 2 (reaction). Together they form the labels as observed.

def model(self, x, y):
    # A sample y looks like 0 0 1 0 0 1 0.
    # The first three indices cover action and the next four cover reaction.
    # register PyTorch module `decoder` with Pyro
    options = dict(dtype=x.dtype, device=x.device)
    with pyro.plate("data", x.shape[0]):
        # Using a pre-defined cpt.
        action_type = pyro.sample("action_type", dist.Categorical(cpts["action_type"]))
        reaction_type = pyro.sample("reaction_type", dist.Categorical(cpts["reaction_type"]))
        # Slicing the labels
        action = pyro.sample("action", dist.Categorical(cpts["action"][action_type]), obs=y.squeeze(0)[:3])
        reaction = pyro.sample("reaction", dist.Categorical(cpts["character"]), obs=y.squeeze(0)[3:])


def guide(self, x, y):
    # register PyTorch module `encoder` with Pyro
    with pyro.plate("data", x.shape[0]):  # Iterate over every batch
        print(f"size of y is {y.shape}")  # (batch_size, 7)
        # A sample y looks like 0 0 1 0 0 1 0.
        # The first three indices cover action and the next four cover reaction.
        # Trying to get the index of the non-zero entry:
        # if the label is 0 0 1, then action should be 2.
        action = (y.squeeze(0)[:3] != 0).nonzero().squeeze(1)[0]
        reaction = (y.squeeze(0)[3:] != 0).nonzero().squeeze(1)[0]
        action_type = pyro.sample("action_type", dist.Categorical(inverse_cpts["action_type"][action]))
        reaction_type = pyro.sample("reaction_type", dist.Categorical(inverse_cpts["reaction_type"][reaction]))
            

I am not able to slice the y tensor the way I want. Whenever I try to slice y, I get the whole batch tensor, whereas inverse_cpts needs a scalar tensor for each observation.
How do I slice the tensor correctly, or is there another way to do this?

@martinjankowiak @fritzo @jpchen @eb8680_2 @eb8680

If y.shape == (batch_shape, 7) and you want to slice along dimension 2, you need to add an ellipsis to your slicing expression to indicate that you want the rightmost dimension: action, reaction = y[..., :3], y[..., 3:]. All of the .squeeze() calls in your code have no effect and should be removed.
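As a minimal sketch of the difference (the batch of four labels below is made up for illustration), compare slicing without and with the ellipsis:

```python
import torch

# Hypothetical batch of 4 labels: 3 one-hot "action" slots followed by 4 one-hot "reaction" slots.
y = torch.tensor([
    [0, 0, 1, 0, 0, 1, 0],
    [1, 0, 0, 0, 0, 0, 1],
    [0, 1, 0, 1, 0, 0, 0],
    [0, 0, 1, 0, 1, 0, 0],
])

# Without the ellipsis, the slice applies to the first (batch) dimension:
# this picks the first three rows, not the first three label columns.
rows = y[:3]

# squeeze(0) only removes dimension 0 when it has size 1, so here it changes nothing.
squeezed = y.squeeze(0)

# The ellipsis keeps all leading dimensions and slices only the rightmost one.
action, reaction = y[..., :3], y[..., 3:]

print(rows.shape, squeezed.shape, action.shape, reaction.shape)
```

The same expressions keep working if extra batch dimensions are later added on the left, which is the main reason to index from the right.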

As background, the NumPy advanced indexing tutorial is a good reference for understanding the rules and principles of tensor indexing, and applies equally to PyTorch.

I guess I didn't explain my doubt well.
action = y[..., :3] would give me the first 3 columns (the shape would be 32 x 3).

action_index = action???
# original implementation
(action != 0).nonzero().squeeze(1)[0]
# If action is 0 0 1 for the first sample, then I expect action_index = 2.
action_type = pyro.sample("action_type", dist.Categorical(inverse_cpts["action_type"][action_index]))

In the statement above, I expect action to be a scalar for each sample in the batch. How do I go from one-hot encoded labels to the actual index of the label, so that I can sample from the correct cpt? Also, inside the sample statements, the tensor would be iterated over accordingly, right?

I see. It looks like (edit: torch.nonzero(y[..., :3])[:, 1] - note the ellipsis) will do what you want.
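For reference, a small sketch on made-up labels showing what that expression returns, next to argmax as a possible alternative (argmax is not something suggested in this thread, only an assumption about what may be convenient):

```python
import torch

# Hypothetical batch of 3 labels in the same 3 + 4 one-hot layout.
y = torch.tensor([
    [0, 0, 1, 0, 0, 1, 0],
    [1, 0, 0, 0, 0, 0, 1],
    [0, 1, 0, 1, 0, 0, 0],
])
action_onehot = y[..., :3]

# nonzero() returns one (row, column) pair per non-zero entry;
# column 1 of that result holds the class index of each row.
via_nonzero = torch.nonzero(action_onehot)[:, 1]

# argmax over the last dimension returns exactly one index per row.
via_argmax = action_onehot.argmax(dim=-1)

print(via_nonzero, via_argmax)
```

The two agree only when every row contains exactly one non-zero entry. If a row had none or several, nonzero would return fewer or more rows than the batch size, while argmax would still return one index per row.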

Why are action and reaction encoded in this way (into y) in the first place, instead of just being encoded as two tensors of integers? Are you sure the model you wrote is going to behave the way you expect? In particular, the action and reaction sample sites in the model expect integer observations, not one-hot vectors (though by changing their distributions from Categorical to OneHotCategorical you could continue working with the current encoding).
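To illustrate the difference in expected observations, here is a sketch using torch.distributions directly, on the assumption that Pyro's dist.Categorical and dist.OneHotCategorical wrap these classes and share their shape semantics. The uniform probabilities and the labels are made up:

```python
import torch
from torch.distributions import Categorical, OneHotCategorical

batch_size, num_actions = 4, 3
# Hypothetical uniform probabilities, one row per item in the batch.
probs = torch.full((batch_size, num_actions), 1.0 / num_actions)

# The same four observations in both encodings.
onehot = torch.tensor([[0., 0., 1.], [1., 0., 0.], [0., 1., 0.], [0., 0., 1.]])
index = onehot.argmax(dim=-1)

cat = Categorical(probs=probs)        # scores integer class indices
ohc = OneHotCategorical(probs=probs)  # scores one-hot vectors

lp_index = cat.log_prob(index)     # observation shape (batch_size,)
lp_onehot = ohc.log_prob(onehot)   # observation shape (batch_size, num_actions)

print(cat.event_shape, ohc.event_shape)
print(lp_index.shape, lp_onehot.shape)
```

Both distributions have batch shape (batch_size,) and assign the same log-probabilities; they differ only in the event shape of the value they expect.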

The action and reaction are concatenated this way because I feed them to a CNN.

I made a mistake in this example. Action and reaction are one-hot categorical in my actual code.

torch.nonzero(y[:3])[:, 1]

computes what I want for the entire batch, but it changes the dimension of action_type from a scalar to (batch_size, 1) in the guide, and there is a dimension mismatch error at the action_type sites.

Summary of what I am trying to achieve:

  1. In the model, action_type causes action. I observe action, and it follows a one-hot categorical distribution.
  2. In the guide, I need to sample action_type, so I basically need an inverse_cpt that gives the probability of an action type given an action. This is where I get stuck.

Edit: Oops, I meant torch.nonzero(y[..., :3])[:, 1] - note the ellipsis. Does that work?

The following code works fine:

import torch
import pyro.distributions as dist

batch_size, num_actions, num_action_types = 3, 2, 4
y = torch.tensor([[0, 1, 0], [1, 0, 1], [1, 0, 1]])
assert y.shape == (batch_size, num_actions + 1)
a = y[..., :2].nonzero()[:, 1]
assert a.shape == (batch_size,)
inverse_cpt = torch.rand(num_actions, num_action_types)
assert inverse_cpt[a].shape == (batch_size, num_action_types)
d = dist.Categorical(probs=inverse_cpt[a])
assert d.batch_shape == (batch_size,)

What is the shape of inverse_cpts["action_type"]? If this doesn’t work, can you provide a complete, runnable example that reproduces your errors?

I am getting a dimension mismatch at the sites: the model and guide aren't producing the same shape for action_type.

My data resides on a drive, and I am not sure how best to share the code and data with you.

You don’t need to share your code or data, but debugging subtle tensor shape errors can be difficult without tracebacks and a minimal runnable example that reproduces your errors on fake data in a simplified/anonymized model.

Otherwise, beyond the snippet I wrote above that should be correct for the indexing logic in your guide, I can only offer generic advice (set pyro.enable_validation(True), index from the right, add shape assertions before/after every sample statement, use pyro.ops.indexing.Vindex) and point you to the tensor shape tutorial.

def model(x, y):
    with pyro.plate("data", x.shape[0]):
        action_type = pyro.sample("action_type", dist.Categorical(cpts["action_type"]))

def guide(x, y):
    with pyro.plate("data", x.shape[0]):
        action = torch.nonzero(y[..., :3])[:, 1]  # Produces a (batch_size,) tensor
        action_type = pyro.sample("action_type", dist.Categorical(inverse_cpts["action_type"][action]))

In the model, does action_type come out as a (batch_size,) tensor because it is enclosed within the plate?
In the guide, action itself is a (batch_size,) tensor, and when I substitute it into the pyro.sample statement for action_type, I expect action_type there to be (batch_size,) as well. Is my view consistent?

Before sending the tracebacks, I'll make sure on my side that I am trying to do the right thing.

Thanks for taking the time.

Is my view consistent?

Yes.

Before sending the tracebacks, I'll make sure on my side that I am trying to do the right thing.

As a general tip, the more information you can provide when you’re asking for help or reporting a bug, the more useful we (or other open-source communities) can be. It’s almost always better to provide a runnable example and a full traceback in your initial requests for support even if it seems too verbose :slight_smile: .

Figured out the issue. I thought the following syntax behaved consistently for both scalar and vector indices:

var[idx1][idx2] -> Here idx1 and idx2 are scalars.
var[idx1][idx2] -> Here idx1 and idx2 are vectors.

The statements above produce different shapes. var[idx1, idx2] seems to be consistent.
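A small sketch of that difference on a made-up 4 x 6 table (the values of var, idx1 and idx2 are only illustrative):

```python
import torch

var = torch.arange(24).reshape(4, 6)  # hypothetical 2-D table

# Scalar indices: chained and joint indexing select the same element.
scalar_chained = var[1][2]
scalar_joint = var[1, 2]

# Vector indices: the two forms diverge.
idx1 = torch.tensor([0, 1, 3])
idx2 = torch.tensor([2, 2, 0])

# var[idx1] has shape (3, 6); the second [idx2] then indexes its ROWS again.
chained = var[idx1][idx2]

# Joint indexing pairs the indices up: (0, 2), (1, 2), (3, 0).
joint = var[idx1, idx2]

print(scalar_chained, scalar_joint)
print(chained.shape, joint.shape)
```

With vector indices, the chained form applies both index tensors to the first dimension one after the other, whereas the joint form picks one element per (idx1[i], idx2[i]) pair and so yields one value per batch item.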