Origin of the DCT, Haar reparameterizations in Pyro

Hi,

I’m a PhD student studying the properties of black-box variational inference (called SVI in Pyro).
I’ve noticed the DCT and Haar reparameterization functions in Pyro, which appear to be intended for modeling the covariance structure of time-series models. I haven’t seen these reparameterizations discussed in the literature, and I’m curious how they came about: who first came up with them, and have they been written up anywhere? Any pointers?

Thanks!

Hi @kyrkim,

When we read Maria Gorinova’s reparametrization paper, we implemented a collection of reparametrizers in that paper’s formalism. Some of our reparametrizers are well known (e.g. LocScaleReparam, NeuTraReparam), some are old ideas newly expressed in the paper’s formalism (e.g. GumbelSoftmaxReparam, ProjectedNormalReparam, DiscreteCosineReparam), and some were new ideas (e.g. StableReparam).

Our motivation for the DiscreteCosineReparam and HaarReparam was to improve posterior accuracy in time series models, specifically epidemiological models we were building in collaboration with Lucy Li. Our intuition was that the posteriors were highly correlated over time, so naive diagonal normal variational models performed poorly; after changing coordinates via a DCT or Haar wavelet transform, the diagonal approximation (diagonal in the new coordinates) was a much better fit. Moreover, the low-rank multivariate normal was an even better fit (spending its limited capacity on corrections to the already good diagonal model), and HMC was empirically able to move farther in the correlated space. I haven’t seen any publications about effect-based reparametrizers + DCT or Haar wavelets, but the two reparametrizers are straightforward instances of the recipe: take a coordinate transform and implement it as a reparametrizer.
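To make the intuition concrete, here is a minimal sketch in plain Python (not Pyro code, and not taken from our models). It assumes a toy stand-in for a time-correlated posterior: a Gaussian random walk with unit-variance steps, whose covariance is `min(s, t) + 1`. As a proxy for how much a diagonal normal loses, it computes the KL divergence from the full Gaussian to the diagonal Gaussian with matched marginal variances, first in the original coordinates and then after an orthonormal DCT-II change of coordinates. All function names are illustrative.

```python
import math


def dct_matrix(n):
    """Orthonormal DCT-II matrix; row k is the k-th cosine basis vector."""
    rows = []
    for k in range(n):
        scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
        rows.append([scale * math.cos(math.pi * (t + 0.5) * k / n) for t in range(n)])
    return rows


def random_walk_cov(n):
    """Toy assumption: Cov[x_s, x_t] = min(s, t) + 1 for a unit-step random walk."""
    return [[float(min(s, t) + 1) for t in range(n)] for s in range(n)]


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def log_det_spd(c):
    """Log-determinant of a symmetric positive definite matrix via Cholesky."""
    n = len(c)
    chol = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = c[i][j] - sum(chol[i][k] * chol[j][k] for k in range(j))
            chol[i][j] = math.sqrt(s) if i == j else s / chol[j][j]
    return 2.0 * sum(math.log(chol[i][i]) for i in range(n))


def mean_field_gap(c):
    """KL(N(0, C) || N(0, diag(C))) = 0.5 * (sum_i log C_ii - log det C)."""
    return 0.5 * (sum(math.log(c[i][i]) for i in range(len(c))) - log_det_spd(c))


T = 16
cov = random_walk_cov(T)
D = dct_matrix(T)
cov_dct = matmul(matmul(D, cov), transpose(D))  # covariance in DCT coordinates

gap_raw = mean_field_gap(cov)      # diagonal fit in the original time coordinates
gap_dct = mean_field_gap(cov_dct)  # diagonal fit after the DCT change of coordinates
print(f"diagonal-approximation gap, time coordinates: {gap_raw:.3f}")
print(f"diagonal-approximation gap, DCT coordinates:  {gap_dct:.3f}")
```

For this toy covariance the gap in DCT coordinates comes out considerably smaller than in time coordinates, which is the effect described above. In Pyro itself the change of coordinates is applied by wrapping the model with `pyro.poutine.reparam` and passing `DiscreteCosineReparam()` or `HaarReparam()` in the `config` entry for the relevant sample site.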


Thanks for the explanation!