Origin of the DCT and Haar reparameterizations in Pyro


I’m a PhD student studying the properties of black-box variational inference (called SVI in Pyro).
I’ve noticed the DCT and Haar reparameterization functions in Pyro, which appear to be intended for modeling the covariance structure of time-series models. I’ve never seen these reparameterizations discussed in the literature, and I’m curious how they came about, who first proposed them, and whether they have been written up anywhere. Any pointers?


Hi @kyrkim,

After reading Maria Gorinova’s reparametrization paper, we implemented a collection of reparametrizers in that paper’s formalism. Some of our reparametrizers are well known (e.g. LocScaleReparam, NeuTraReparam), some express older ideas in the paper’s formalism (e.g. GumbelSoftmaxReparam, ProjectedNormalReparam, DiscreteCosineReparam), and some are new ideas (e.g. StableReparam).

Our motivation for the DiscreteCosineReparam and HaarReparam was to improve posterior accuracy in time-series models, specifically the epidemiological models we were building in collaboration with Lucy Li. Our intuition was that the posteriors were highly correlated over time, so naive diagonal-normal variational models performed poorly; after changing coordinates via a DCT or Haar wavelet transform, a diagonal approximation (diagonal in the new coordinates) was a much better fit. Moreover, a low-rank multivariate normal was an even better fit, spending its limited capacity on corrections to the already good diagonal model, and HMC moves were empirically able to travel farther in the decorrelated space. I haven’t seen any publications about effect-based reparametrizers combined with the DCT or Haar wavelets, but the two reparametrizers are straightforward instances of the paper’s recipe: take a coordinate transform and implement it as a reparametrizer.
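The decorrelation intuition can be checked with plain NumPy, independent of Pyro. This is a hypothetical illustration, not code from our models: for an AR(1)-style covariance over time (a common stand-in for "highly correlated over time"), an orthonormal DCT change of basis moves most of the covariance mass onto the diagonal, which is exactly when a diagonal variational approximation becomes a good fit:

```python
import numpy as np

def dct_matrix(n):
    # Orthonormal DCT-II basis: rows are basis vectors, so D @ D.T == I.
    k = np.arange(n)[:, None]
    i = np.arange(n)[None, :]
    D = np.sqrt(2.0 / n) * np.cos(np.pi * (i + 0.5) * k / n)
    D[0] /= np.sqrt(2.0)
    return D

def offdiag_fraction(M):
    # Fraction of the Frobenius norm carried by off-diagonal entries.
    off = M - np.diag(np.diag(M))
    return np.linalg.norm(off) / np.linalg.norm(M)

n, rho = 64, 0.95
idx = np.arange(n)
# AR(1)-style covariance: strong correlation between nearby time steps.
C = rho ** np.abs(idx[:, None] - idx[None, :])
D = dct_matrix(n)
C_dct = D @ C @ D.T  # the same covariance, expressed in DCT coordinates

print(offdiag_fraction(C))      # close to 1: almost all mass is off-diagonal
print(offdiag_fraction(C_dct))  # much smaller: nearly diagonal
```

The same experiment with a Haar wavelet matrix in place of `dct_matrix` gives a qualitatively similar drop, which is why both transforms make useful reparametrizers.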


Thanks for the explanation!