What about the following two-stage approach for handling complex model structures (such as a CNN):
- Start with standard NN optimization, minimizing the MSE with a standard PyTorch optimizer, to reach a reasonable local minimum.
- Transfer the weights to a Bayesian neural network and finish training with Pyro SVI steps.
Has anyone tried to implement this kind of process?