How to handle unknown/missing/extra labels in an HMM emitter

For example, in an HMM emitter I have 2 labels, classA and classB. Some labels in my data are missing, and I use the label MIS to mark them. The indices of these three labels are 0, 1, and 2 respectively.

Problem:
When I build the emitter (from hidden state to observations of these labels), I only need a softmax that outputs a probability distribution over the 2 real labels, i.e. [classA, classB]. I don't need a distribution over [classA, classB, MIS].

The problem appears when I call the sample function, e.g.

pyro.sample(
    "obs",  # site name (placeholder)
    dist.Categorical(probs).mask(mask),  # probs: softmax over classA and classB; mask: False where the label is MIS
    obs=observation,
)

The function does not seem to handle the extra class (MIS): it raises an error, because my observations contain missing entries, e.g. [0, 1, 2], and 2 is not a valid value for a two-class Categorical. As a result, I have to make the softmax output a distribution over 3 classes, even though I don't need the last one (MIS).

Is there a better way to deal with this situation? For now, maybe I could just assign an arbitrary valid value (either 0 or 1) to the missing entries, since they will be masked out by the corresponding mask anyway.

Many thanks to fritzo for the helpful reply to a similar question in the GitHub issue "Unknown/missing/extra labels not be handled well in the sample function of a distribution" (Issue #1823, pyro-ppl/pyro).

Thanks for linking to the GitHub issue!