Discrete enumeration in bilinear model

Hello,
I am a novice with PPLs in general and Pyro in particular. I have a very simple model that I am trying to write in Pyro but can’t wrap my head around it.
For context, I have a set of 134 objects, 17 contexts and 17 evaluators.
I have surveyed the evaluators, who answered yes/no/unknown to the questions “is object i suitable in context j?”, and now I want to model the data and make some inferences.

The model I am considering is pretty straightforward: I have a hidden binary variable z_{ij} for each pair (\textrm{object}_i, \textrm{context}_j), representing the true answer to the question “is object i suitable in context j?”, and it is distributed as a \textrm{Bernoulli} variable with parameter p_{ij}. In addition, I have a parameter \alpha_k associated with each evaluator (measuring the expertise or the “adversarialness” of the evaluator) and a parameter \beta_j which models the “difficulty” of each context. I collected each answer in a list y_{ijk} and I am modeling the likelihood as p(y_{ijk} = 1 \mid z_{ij}) = \sigma(\alpha_k \beta_j)^{z_{ij}} \, (1 - \sigma(\alpha_k \beta_j))^{1 - z_{ij}}, where \sigma is the sigmoid function.

This is how I implemented it:

import torch
import pyro
import pyro.distributions as dist
from pyro.distributions import constraints
from pyro.infer import config_enumerate
from pyro.ops.indexing import Vindex

@config_enumerate
def model(object, context, evaluator, y=None):
    n_samples, n_objects, n_contexts, n_evaluators = (
        len(object),
        object.max().item() + 1,
        context.max().item() + 1,
        evaluator.max().item() + 1,
    )
    expertise = pyro.param("expertise", torch.zeros(n_evaluators))
    difficulty = pyro.param("difficulty", torch.ones(n_contexts), constraint=constraints.positive)
    p = pyro.param("p", torch.ones(n_objects, n_contexts) * 0.5, constraint=constraints.unit_interval)
    with pyro.plate("object", size=n_objects, dim=-3):
        with pyro.plate("context", size=n_contexts, dim=-2):
            hidden_label = pyro.sample("hidden_label", dist.Bernoulli(p))
            with pyro.plate("data", size=n_samples) as k:
                e = expertise[evaluator[k]]
                diff = difficulty[context[k]]
                z = Vindex(hidden_label)[object[k], context[k]]  # <- IndexError here
                sigma = torch.sigmoid(e * diff)
                pyro.sample("obs", dist.Bernoulli(probs=(sigma ** z) * ((1 - sigma) ** (1 - z))), obs=y[k])

When I try to run it, I get the error “IndexError: index 100 is out of bounds for dimension 0 with size 2” on the line where I assign the value to z. I understand that this is likely due to the enumeration taking place along the first dimension.

What would be the correct way of doing this?

You should use hidden_label as is, without Vindex-ing (z = hidden_label); under enumeration it has shape (2, 1, 1, 1) (https://pyro.ai/examples/enumeration.html). The enumerated values of hidden_label ([0, 1] at dim=-4) will be marginalized out of your model, so if you use TraceEnum_ELBO it will compute the marginalized likelihood \sum_{z_{ij}} p(z_{ij}) \, p(y_{ijk} \mid z_{ij}).
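If you want to inspect the enumerated shapes yourself, here is a minimal toy sketch (a single plate, not your model) using poutine.enum:

import pyro
import pyro.distributions as dist
from pyro import poutine
from pyro.infer import config_enumerate

@config_enumerate
def tiny():
    with pyro.plate("i", 3, dim=-1):
        pyro.sample("z", dist.Bernoulli(0.5))

# poutine.enum replays the model with discrete sites enumerated in parallel;
# first_available_dim is the first dim (from the right) free for enumeration
trace = poutine.trace(poutine.enum(tiny, first_available_dim=-2)).get_trace()
print(trace.nodes["z"]["value"].shape)  # torch.Size([2, 1]): support [0, 1] left of the plate dim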

Thanks for taking the time to answer, but if I remove Vindex I still get the same error, and if I remove the indexing from hidden_label altogether I get this shape mismatch error:

Shape mismatch inside plate('data') at site obs dim -3, 16519 vs 2
    Trace Shapes:               
     Param Sites:               
        expertise             17
       difficulty             17
                p         134 17
    Sample Sites:               
      object dist              |
            value         134  |
     context dist              |
            value          17  |
hidden_label dist   134    17  |
            value 2   1     1  |
        data dist              |
            value       16519  |

I should specify that my data y is a long array (16k) of zeros and ones corresponding to the answers, and object, context and evaluator are also arrays of the same length containing the corresponding identities.

Judging from the shape of hidden_label being (2, 1, 1), you are using max_plate_nesting=2. You should set max_plate_nesting=3 in TraceEnum_ELBO since you have three plates (see the tutorial).

Then the obs site should only be nested in the data plate (not also inside the object and context plates).
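For example (model, guide, and optimizer as in your own setup):

from pyro.infer import SVI, TraceEnum_ELBO

# with three plates, the enumeration dim for hidden_label is allocated at dim -4
svi = SVI(model, guide, optimizer, loss=TraceEnum_ELBO(max_plate_nesting=3))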

I think it would help if you first wrote down the factorization of your model. Plates correspond to (conditionally independent) products in your factorization, i.e. p(a) \prod_i p(b_i \mid a) would correspond to

a = pyro.sample("a", ...)
with pyro.plate("i_plate", ...):
    b = pyro.sample("b", ...)  # dist depends on 'a'
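To make that concrete, here is a minimal runnable version (toy Normal distributions assumed):

import pyro
import pyro.distributions as dist

def toy_model():
    a = pyro.sample("a", dist.Normal(0.0, 1.0))
    with pyro.plate("i_plate", 10):
        # ten conditionally independent b_i, each depending on a
        pyro.sample("b", dist.Normal(a, 1.0))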

Thank you for the suggestion, it was indeed helpful to go back and look at my complete-data likelihood. The model is supposed to have this structure: p(\mathbf{Y}, \mathbf{Z}) = \prod_{i,j} p(z_{ij}) \prod_{k} p(y_{ijk} \mid z_{ij}), where z_{ij} \sim \textrm{Bernoulli}(p_{ij}) and p_{ij} is the probability of the hidden label z_{ij} being 1. In words, the true latent answers to the question “is object i suitable in context j?” are independent of each other (for now; later I plan to tie all the answers for the same object i together with another parameter), and, given the latent labels, the evaluators’ answers are conditionally independent.
So, to accommodate this structure, I rearranged the observations into a tensor with y.shape == (134, 17, 17). Now the model looks like this:

@config_enumerate
def model(y=None):
    n_objects, n_contexts, n_evaluators = y.shape
    expertise = pyro.param("expertise", torch.zeros(n_evaluators))
    difficulty = pyro.param("difficulty", torch.ones(n_contexts), constraint=constraints.positive)
    p = pyro.param("p", torch.ones(n_objects, n_contexts) * 0.5, constraint=constraints.unit_interval)
    mask = ~y.isnan()  # True where an answer was observed
    with pyro.plate("object", size=n_objects, dim=-3):
        with pyro.plate("context", size=n_contexts, dim=-2):
            hidden_label = pyro.sample("hidden_label", dist.Bernoulli(p))
            with pyro.plate("evaluator", size=n_evaluators, dim=-1):
                sigma = torch.sigmoid(expertise * difficulty)
                with pyro.poutine.mask(mask=mask):
                    pyro.sample("obs", dist.Bernoulli(probs=(sigma ** hidden_label) * ((1-sigma)**(1-hidden_label))), obs=y)
[...]
svi = SVI(model, guide, optimizer, loss=TraceEnum_ELBO(max_plate_nesting=3))

I used a mask to condition on the observed data because there are missing values in the y tensor; for now I just want to ignore them and let the model see only the observed data. I hope this is the correct way.
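For reference, this is roughly how I pivoted the long-format arrays into the dense tensor (a sketch; object, context, evaluator, and y here are the long arrays from my earlier post):

# start from all-NaN so any (object, context, evaluator) cell without an answer stays missing
y_dense = torch.full((134, 17, 17), float("nan"))
y_dense[object, context, evaluator] = y.float()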

However, I still get an error; a different one, but still:

ValueError: Shape mismatch inside plate('context') at site hidden_label dim -2, 17 vs 134
Trace Shapes:       
 Param Sites:       
    expertise     17
   difficulty     17
            p 134 17
Sample Sites:       
   object dist      |
         value 134  |
  context dist      |
         value  17  |
evaluator dist      |
         value  17  |

I don’t get why it messes up the dimensions of hidden_label.

That’s probably because the distribution dist.Bernoulli(p) of hidden_label has a batch shape of (n_objects, n_contexts), which corresponds to dims (-2, -1) (counting from the right). Pyro plates are used to declare batch dimensions as independent, but they are mis-aligned in your code: for example, pyro.plate("context", size=n_contexts, dim=-2) says that dim=-2 is conditionally independent and has size n_contexts, but as we saw above, hidden_label’s distribution has size n_objects at dim=-2.

You have to either reshape your distribution or fix the plates so that their dims are aligned. There is a great tutorial on tensor shapes in Pyro (https://pyro.ai/examples/tensor_shapes.html) that explains all of this in more detail.
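For example, here is one way to align everything while keeping your plate dims (a sketch, not tested against your data; torch.nan_to_num on the obs is only there so the masked-out NaN entries stay inside Bernoulli’s support):

@config_enumerate
def model(y=None):
    n_objects, n_contexts, n_evaluators = y.shape
    expertise = pyro.param("expertise", torch.zeros(n_evaluators))
    difficulty = pyro.param("difficulty", torch.ones(n_contexts), constraint=constraints.positive)
    p = pyro.param("p", torch.ones(n_objects, n_contexts) * 0.5, constraint=constraints.unit_interval)
    mask = ~y.isnan()
    with pyro.plate("object", size=n_objects, dim=-3):
        with pyro.plate("context", size=n_contexts, dim=-2):
            # p is (n_objects, n_contexts); add a trailing dim so its batch
            # dims sit at (-3, -2), matching the object and context plates
            hidden_label = pyro.sample("hidden_label", dist.Bernoulli(p.unsqueeze(-1)))
            with pyro.plate("evaluator", size=n_evaluators, dim=-1):
                # difficulty varies along the context dim (-2), expertise along
                # the evaluator dim (-1): shape (n_contexts, n_evaluators)
                sigma = torch.sigmoid(expertise * difficulty.unsqueeze(-1))
                probs = sigma ** hidden_label * (1 - sigma) ** (1 - hidden_label)
                with pyro.poutine.mask(mask=mask):
                    # missing entries are zeroed out of the ELBO by the mask
                    pyro.sample("obs", dist.Bernoulli(probs=probs), obs=torch.nan_to_num(y))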

Thank you very much for your explanation! I spent some time (more than I would like to admit!) digging through the code and the tutorials. Starting from simpler models and building up, I managed to implement what I wanted, even though the library still hasn’t completely “clicked” for me yet. I guess it is a matter of using it more. Anyway, thank you for spending the time helping me!