I’m trying to work with the ProdLDA tutorial, and it works well on my own data.
Currently I have some prior knowledge about beta: for some topics, I know a set of popular words, for example:
topic1: {Investment, Loan, Mortgage, Financial, Services, …}
topic2: {athletics, arena, beat, award, captain, …}
…
Could you please guide me on how to encode this knowledge into the model? It seems the model becomes semi-supervised with respect to beta, but I don’t know how to do this with Pyro. For reference, here is the Decoder from the tutorial:
import torch.nn as nn
import torch.nn.functional as F

class Decoder(nn.Module):
    # Base class for the decoder net, used in the model
    def __init__(self, vocab_size, num_topics, dropout):
        super().__init__()
        self.beta = nn.Linear(num_topics, vocab_size, bias=False)
        # I want to add some knowledge to beta
        self.bn = nn.BatchNorm1d(vocab_size, affine=False)
        self.drop = nn.Dropout(dropout)

    def forward(self, inputs):
        inputs = self.drop(inputs)
        # the output is σ(βθ)
        return F.softmax(self.bn(self.beta(inputs)), dim=1)
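One idea I had is to warm-start the Linear layer from my seed words before training, something like the sketch below (seed_words, vocab (my word-to-index map), and boost are all my own names and values, not anything from the tutorial; I am not sure this is the right approach):

import torch

# Seed words per topic index, taken from my lists above (toy example)
seed_words = {
    0: ["investment", "loan", "mortgage", "financial", "services"],
    1: ["athletics", "arena", "beat", "award", "captain"],
}

def seed_beta(decoder, vocab, seed_words, boost=1.0):
    # nn.Linear(num_topics, vocab_size) stores its weight as (vocab_size, num_topics),
    # so weight[word_id, topic_id] is the unnormalized score of a word under a topic
    with torch.no_grad():
        for topic_id, words in seed_words.items():
            for word in words:
                if word in vocab:
                    decoder.beta.weight[vocab[word], topic_id] += boost

But calling seed_beta(decoder, vocab, seed_words) before running SVI only biases the initialization; the weights are still free to drift away during training, so I suspect this is not real semi-supervision.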
I have read the original paper again and seen that β is unconstrained; the authors just define it as a weight matrix, one row of topic-word probabilities per topic. That is why the tutorial defines it as a Linear layer. Moreover, our model tries to decode the observed word distribution from θ and β, i.e. the decoder outputs σ(βθ), where σ is the softmax.
You gave me two methods to sample the topic prior (β), and we also have the Dirichlet prior on θ. But the problem here is how to incorporate these two distributions into the decoder network; my current (non-working) idea is sketched below.
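What I imagine is replacing the point estimate of β with a sample from an asymmetric Dirichlet whose concentration is large on the seed words, roughly like this (build_eta, sample_beta, base, and boost are all made-up names and values of mine):

import torch
import pyro
import pyro.distributions as dist

def build_eta(num_topics, vocab_size, vocab, seed_words, base=0.1, boost=10.0):
    # Asymmetric Dirichlet concentration: extra mass on each topic's seed words
    eta = torch.full((num_topics, vocab_size), base)
    for topic_id, words in seed_words.items():
        for word in words:
            if word in vocab:
                eta[topic_id, vocab[word]] = boost
    return eta

def sample_beta(eta):
    # Each row of beta is a distribution over the vocabulary for one topic
    with pyro.plate("topics", eta.shape[0]):
        return pyro.sample("beta", dist.Dirichlet(eta))

But I don’t see how to connect such a pyro.sample site to the Decoder above, whose β lives inside an nn.Linear that the tutorial registers with pyro.module.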
I’m sorry if this is basic; I have only just started studying and working with Bayesian learning.