Hello,
I have been trying to use my Pyro neural network model to make predictions.
But I keep scratching my head: my training SVI loss is ~1.40, yet the accuracy of the Pyro model is only ~26%. I think this might have to do with the way I specified the likelihood function for y.
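For context on those two numbers, here is a minimal sketch (plain Python, not part of the model) of what a 4-class classifier that always predicts uniform probabilities would score. It assumes the reported loss can be read as an average negative log-likelihood per observation, which may not hold for an SVI loss that also includes prior/guide terms or is summed over the data.

```python
import math

NUM_CLASSES = 4  # class 0, 1, 2, 3, as in the post

# A model that has learned nothing predicts 1/4 for every class.
uniform_probs = [1.0 / NUM_CLASSES] * NUM_CLASSES

# Negative log-likelihood of the true class under uniform predictions.
# It is the same whichever class is correct, so class 0 is used here.
chance_nll = -math.log(uniform_probs[0])

# Expected accuracy when guessing uniformly at random among the classes.
chance_accuracy = 1.0 / NUM_CLASSES

print(f"chance-level NLL per observation: {chance_nll:.3f}")
print(f"chance-level accuracy: {chance_accuracy:.2f}")
```

Under that assumption, a loss near 1.39 together with an accuracy near 25% would both be consistent with predictions that are close to uniform.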
My Pyro neural network model performs a 4-class classification task: for a given input, the model predicts which of the 4 classes (class 0, class 1, class 2, class 3) the input most likely belongs to. More specifically, the model outputs a score for each of the 4 classes (the vector prediction_scores), which a softmax turns into classification probabilities; a user can then select the class with the highest predicted probability as the predicted class. So, to make my original frequentist neural network Bayesian, I used the Multinomial distribution as the likelihood function for y. My actual code for this model class is shown below:
import torch
import torch.nn as nn
import pyro
import pyro.distributions as dist
from pyro.nn import PyroModule

class MyModel(PyroModule):
    def __init__(self, model):
        super().__init__()
        self.model = model

    def forward(self, my_input, mc_labels = None, y = None):
        # compute the vector `prediction_scores`
        # (unnormalized class scores, i.e. the values before softmax)
        prediction_scores = self.model(my_input)

        # `softmax_tensor` is a tensor of size 4 that stores the predicted
        # probability that an observation belongs to class 0, 1, 2, and 3.
        # for example, if softmax_tensor = torch.tensor([0.1, 0.4, 0.3, 0.2]),
        # the model predicts a 0.1 chance that the observation is classified
        # under class 0, a 0.4 chance that it is classified under class 1, etc.
        softmax_tensor = nn.Softmax(dim=-1)(prediction_scores)

        # `mc_labels` is the actual correct class (i.e. the "right" answer)
        #
        # case 1: if `mc_labels` is given
        if mc_labels is not None:
            # encode `mc_labels` in a form that is adequate to
            # use with the Multinomial distribution (one-hot vector).
            if mc_labels == torch.tensor([0]):
                mc_label_tensor = torch.tensor([[1.,0.,0.,0.]])
            elif mc_labels == torch.tensor([1]):
                mc_label_tensor = torch.tensor([[0.,1.,0.,0.]])
            elif mc_labels == torch.tensor([2]):
                mc_label_tensor = torch.tensor([[0.,0.,1.,0.]])
            elif mc_labels == torch.tensor([3]):
                mc_label_tensor = torch.tensor([[0.,0.,0.,1.]])
        # case 2: if `mc_labels` is not given
        else:
            mc_label_tensor = None

        # `y` here stands for the predicted class type for the observation.
        return pyro.sample("y",
                           dist.Multinomial(1, probs = softmax_tensor),
                           obs = mc_label_tensor)
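To make the likelihood question concrete, the following is a minimal sketch, not the model above, that compares the log-likelihood of a one-hot label under a single-trial Multinomial with that of the integer label under a Categorical. The scores and label are hypothetical values, and the sketch uses torch.distributions directly on the assumption that the Pyro wrappers behave the same way. It also uses torch.nn.functional.one_hot in place of the if/elif chain.

```python
import torch
import torch.nn.functional as F
from torch.distributions import Categorical, Multinomial

NUM_CLASSES = 4

# Hypothetical scores for one observation (shape [1, 4]); in the model
# these would come from `self.model(my_input)`.
prediction_scores = torch.tensor([[0.2, 1.5, -0.3, 0.7]])
mc_labels = torch.tensor([1])  # correct class, shape [1]

# One-hot encode the label in a single call instead of an if/elif chain.
mc_label_tensor = F.one_hot(mc_labels, num_classes=NUM_CLASSES).float()

# Log-likelihood under a Multinomial with one trial and a one-hot count.
multinomial_lp = Multinomial(
    total_count=1, logits=prediction_scores
).log_prob(mc_label_tensor)

# Log-likelihood under a Categorical with the integer label directly.
categorical_lp = Categorical(logits=prediction_scores).log_prob(mc_labels)

# True if both parameterizations assign the same log-likelihood.
same_likelihood = torch.allclose(multinomial_lp, categorical_lp)
print(mc_label_tensor)
print(multinomial_lp, categorical_lp, same_likelihood)
```

If the two values agree, the likelihood in the model is the usual categorical log-likelihood, so something like dist.Categorical(logits=prediction_scores) with obs=mc_labels might be a simpler way to express it. That would not by itself explain the low accuracy.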
I know that this is rather cumbersome, but could you tell me whether the way I assigned the likelihood function for y (or the way I specified the model class in general) is incorrect?
Thank you,
PS: I am wondering whether, instead of return pyro.sample("y", dist.Multinomial(1, probs = softmax_tensor), obs = mc_label_tensor), I should do something like return pyro.sample("y", dist.Multinomial(100, probs = softmax_tensor), obs = mc_label_tensor). Would this improve my accuracy rate greatly? :S Thank you again,
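On the total-count question, a small plain-Python sketch of the multinomial log pmf may help. The helper multinomial_log_pmf is hypothetical and written only for illustration, and the probabilities are the example values from the code comment. It shows the mathematical constraint that observed counts must sum to the number of trials; how Pyro itself reacts to a mismatched observation is not checked here.

```python
import math

def multinomial_log_pmf(counts, probs, total_count):
    """Log pmf of a multinomial (hypothetical helper for illustration)."""
    if sum(counts) != total_count:
        raise ValueError("counts must sum to total_count")
    # log( n! / (x_1! * ... * x_k!) )
    log_coeff = math.lgamma(total_count + 1) - sum(
        math.lgamma(c + 1) for c in counts
    )
    # sum_i x_i * log(p_i), skipping zero counts
    log_powers = sum(c * math.log(p) for c, p in zip(counts, probs) if c > 0)
    return log_coeff + log_powers

probs = [0.1, 0.4, 0.3, 0.2]  # example probabilities from the code comment
one_hot = [0, 1, 0, 0]        # a single observed label (class 1)

# With one trial, the one-hot label is a valid outcome and the
# log-likelihood reduces to log(probs[1]).
lp_one_trial = multinomial_log_pmf(one_hot, probs, total_count=1)

# With 100 trials, a one-hot vector is not a valid outcome, because the
# counts would have to sum to 100.
try:
    multinomial_log_pmf(one_hot, probs, total_count=100)
    valid_with_100 = True
except ValueError:
    valid_with_100 = False

print(lp_one_trial, valid_with_100)
```

Read this way, a single label per input corresponds to one trial, so raising the count to 100 would not match one-hot observations.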