For example, if the observation in the dataset is words, i.e. the data is articles, documents. The observation state size would be those words. But the vocabulary size is very large.
If I just use the Trace_ELBO() for training, would the performance will be stable and good enough? And will the learning be stable?