Hi, I’m new to all things Bayesian and I’ve been reading through the SVI tutorials. I had a few questions regarding gradient estimators:

It says that removing non-downstream variables is Rao-Blackwellization. Doesn't the Rao-Blackwell theorem state that an estimator conditioned on a sufficient statistic is equal to or better than the unconditioned estimator? How does removing/integrating variables out correspond to this? I can't connect these two concepts in my head.

In the part about the score function estimators, if q is non-reparameterizable, won't taking grad(log(q(z))) be equally problematic, since you can't differentiate q(z) with respect to its parameters? In the last equation in this section, you still differentiate f wrt phi, but I thought the point was that f was non-differentiable.
Thanks for your help.