Dear Pyro Users,
I am trying to understand how Variational Sparse GP (VSGP) is implemented in Pyro GP. To ask my question more clearly, let me first recap the main framework of VSGP. The ELBO to be maximized in this case takes the following form:

sELBO = E_{q(f)}[log p(y|f)] - KL[q(u) || p(u)]

where q(u) is the approximate posterior and q(f) = \int p(f|u) q(u) du, with p(f|u) the GP conditional. This form of the ELBO is not an instance of the general form

gELBO = E_{q(z)}[log p(y|z)] - KL[q(z) || p(z)]

where z is the latent variable. In the GP case above, z = {f, u}.
My understanding is that Trace_ELBO computes gELBO, with the latent z required to be defined in both the model and the guide. Clearly the sELBO for VSGP is not of the gELBO form.
What confuses me is how the Variational Sparse GP regression model is defined; see
http://docs.pyro.ai/en/stable/_modules/pyro/contrib/gp/models/sgpr.html,

According to this definition, the only latent variable is u. Hence, when this model is combined with the Trace_ELBO loss in SVI, it is computing something like gELBO with z = u. Then where is the latent variable f?

As it is not clear to me how Pyro calculates the ELBO (it is too complicated to trace through), I am not sure whether the
VariationalSparseGP
model above is doing the right thing.
I also don't understand what self._load_pyro_samples() does.
Any ideas and advice appreciated.
Thanks
J