I have designed a simple Bayesian neural network with one hidden layer for a multi-class classification problem.
With a small hidden layer (about 16 neurons), it works very well on easy classification tasks.
When the classification problem gets harder and the hidden layer needs more neurons, the mean and standard deviation parameters of some neurons diverge to infinity, leading to invalid (NaN/Inf) values in the network output.
I could not find similar issues on the forum. Has anyone experienced this problem? Do you have any tips?
(I tried clamping the parameters to a bounded interval. That does prevent training from crashing, but then the loss no longer converges…)
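For context, here is a minimal sketch (assuming PyTorch, which the post does not specify) of the kind of alternative I am considering instead of hard clipping: parameterizing the standard deviation through an unconstrained variable `rho` with a softplus, so `sigma` stays positive without any interval constraint. All names here (`BayesianLinear`, `w_mu`, `w_rho`) are my own illustrative choices, not from any particular library:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class BayesianLinear(nn.Module):
    """Sketch of a mean-field Bayesian linear layer.

    The weight std is not a free parameter: it is derived from an
    unconstrained tensor ``w_rho`` via softplus, so it is always
    positive and never needs hard clamping to an interval.
    """

    def __init__(self, in_features: int, out_features: int):
        super().__init__()
        self.w_mu = nn.Parameter(torch.empty(out_features, in_features))
        # Initialize rho around -3 so the initial std is small (~0.05).
        self.w_rho = nn.Parameter(torch.full((out_features, in_features), -3.0))
        nn.init.kaiming_uniform_(self.w_mu, a=5 ** 0.5)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        sigma = F.softplus(self.w_rho)       # strictly positive std
        eps = torch.randn_like(sigma)        # reparameterization trick
        w = self.w_mu + sigma * eps          # sample a weight matrix
        return F.linear(x, w)

layer = BayesianLinear(4, 3)
out = layer(torch.randn(2, 4))
print(out.shape)
```

This keeps the optimizer working in an unconstrained space, which in my understanding is gentler on gradients than projecting parameters back into an interval after each step.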
Thank you very much.