Does my parameter make sense?


Does the y=None parameter in the forward function of the MyModel class make sense in my case? Please see below for my full model definition. After training my Pyro model, the validation error is significantly reduced, but the success rate on the test set is not increasing accordingly. I am not sure why this is happening.

Thank you,

# define likelihood function for our Bayesian layer.
class MyModel(PyroModule):
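    def __init__(self, model):
        # nn.Module (and hence PyroModule) must be initialized before
        # submodules can be assigned as attributes.
        super().__init__()
        self.model = model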

    def forward(self, input_ids, attention_mask,
                token_type_ids, mc_labels = None, y = None):
        # retrieve prediction_scores (`y`).
        if mc_labels is not None:
            prediction_scores = self.model(input_ids=input_ids,
                                       token_type_ids = token_type_ids)[1]
            softmax_tensor = nn.Softmax(dim=-1)(prediction_scores)
            # encode the mc_label in a form that is adequate to
            # use with the Multinomial function.
            if mc_labels == torch.tensor([0]):
                 mc_label_tensor = torch.tensor([[1.,0.,0.,0.]])

            elif mc_labels == torch.tensor([1]):
                 mc_label_tensor = torch.tensor([[0.,1.,0.,0.]])

            elif mc_labels == torch.tensor([2]):
                 mc_label_tensor = torch.tensor([[0.,0.,1.,0.]])

            elif mc_labels == torch.tensor([3]):
                 mc_label_tensor = torch.tensor([[0.,0.,0.,1.]])
        else:
            prediction_scores = self.model(input_ids=input_ids,
                                    token_type_ids = token_type_ids)[1]
            softmax_tensor = nn.Softmax(dim=-1)(prediction_scores)
            mc_label_tensor = None
        # for each question, we choose the `y` from 4 classes (mc options).
        # Hence, the multinomial distribution with total_count = 1 and
        # probs = nn.Softmax(prediction_scores) is adequate for our likelihood.
        return pyro.sample("y",
                    dist.Multinomial(1, probs = softmax_tensor),
                    obs = mc_label_tensor)

If you set mc_labels=None, then obs is None and the model returns a sample drawn at sample site "y". I don't see where the input y is used in your model, so as written that parameter has no effect.