I use NumPyro and the NUTS sampler to train several thousand models, each on about a year's worth of historical data. As this takes many hours, I'd like to know the best practice for persisting a model, say to a pickle file. I use both the posterior and the model. Then, at a later date when new data has accumulated, I'd like to reload the model and posterior and use the posterior as a prior for training the model on the new data.
In this and later steps, can I (or should I) skip the warm-up step?
Is there a recommended metric or diagnostic for estimating how far the new data has drifted from the prior distribution?
Then, at a later date when new data has accumulated, I'd like to reload the model and posterior and use the posterior as a prior for training the model on the new data.
NUTS works on a (differentiable) model density. Posterior samples are just samples, basically a set of delta functions, so they don't define a density that NUTS can use. Creating a prior from previously collected posterior samples would require fitting a density to those samples. Needless to say, learning an appropriate density is hard in general, and wouldn't necessarily save you any compute time.
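To make "fitting a density to the posterior samples" concrete, here is a minimal sketch of the crudest option: moment-matching a multivariate normal to the draws. The helper `fit_gaussian` and the synthetic `posterior_draws` are made up for illustration, and a Gaussian is only an assumption that will be poor for skewed, multimodal, or constrained posteriors.

```python
import numpy as np


def fit_gaussian(samples):
    """Moment-match a multivariate normal to draws of shape (n_draws, dim)."""
    samples = np.asarray(samples, dtype=float)
    if samples.ndim == 1:
        samples = samples[:, None]  # treat a 1-D array as dim == 1
    mean = samples.mean(axis=0)
    cov = np.atleast_2d(np.cov(samples, rowvar=False))
    return mean, cov


# Synthetic stand-in for posterior draws from an earlier run (illustration only).
rng = np.random.default_rng(0)
true_mean = np.array([1.0, -2.0])
true_cov = np.array([[1.0, 0.6], [0.6, 2.0]])
posterior_draws = rng.multivariate_normal(true_mean, true_cov, size=5000)

prior_mean, prior_cov = fit_gaussian(posterior_draws)

# In a NumPyro model the fitted density could then serve as the prior, e.g.
#   theta = numpyro.sample(
#       "theta", dist.MultivariateNormal(prior_mean, covariance_matrix=prior_cov))
# For constrained parameters, fit in unconstrained space (e.g. log of a scale).
```

Anything the Gaussian fails to capture about the old posterior is silently lost, which is part of why this route is hard to get right in general.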
A more straightforward approach is to reuse the mass matrix and step size from previous NUTS runs. This will make subsequent runs a bit faster, since you can shorten the burn-in/adaptation period accordingly. However, it will not result in gigantic speed-ups.