Hi, I’m working on a model that includes a non-reparametrizable distribution, so I’m trying different strategies to reduce the variance of the gradient estimates and improve inference.
So far I have tried two baselines: a decaying-average baseline and a neural baseline using an LSTM.
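
For context, the decaying-average baseline idea can be sketched in plain Python as below. This is only an illustration under stated assumptions, not Pyro's implementation: the function name `score_function_grads`, the toy cost `f`, and the parameters `beta` and `seed` are all hypothetical. It estimates the gradient of an expected cost under a Bernoulli variable with the score-function (REINFORCE) estimator, and optionally subtracts a running average of past costs.

```python
import random


def score_function_grads(theta, n_steps, beta=None, seed=0):
    """Per-step score-function estimates of d/dtheta E_{z ~ Bernoulli(theta)}[f(z)].

    If beta is given, a decaying average of *past* costs is subtracted as a
    baseline. Because the baseline never depends on the current sample, the
    estimate stays unbiased.
    """
    rng = random.Random(seed)

    def f(z):
        # Hypothetical toy cost; the large offset makes the raw estimator noisy.
        return (z - 0.45) ** 2 + 10.0

    baseline = 0.0
    grads = []
    for _ in range(n_steps):
        z = 1.0 if rng.random() < theta else 0.0
        # d/dtheta log Bernoulli(z; theta)
        score = z / theta - (1.0 - z) / (1.0 - theta)
        cost = f(z)
        grads.append((cost - baseline) * score)
        if beta is not None:
            # Update the running average only after it has been used.
            baseline = beta * baseline + (1.0 - beta) * cost
    return grads


def variance(xs):
    mean = sum(xs) / len(xs)
    return sum((x - mean) ** 2 for x in xs) / len(xs)


plain = score_function_grads(0.5, 2000)
with_baseline = score_function_grads(0.5, 2000, beta=0.9)
print("variance without baseline:", variance(plain))
print("variance with decaying-average baseline:", variance(with_baseline))
```

In this toy setting the baseline removes most of the constant offset in the cost, which is where most of the variance comes from. A relaxation-based control variate such as REBAR or RELAX would replace the running average with a learned, sample-dependent term.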
Is it possible to use relaxation-based control variates such as REBAR and RELAX in Pyro?
Thank you very much for any help.