MCMC for BNN with big data?

I’m trying to fit a BNN with a lot more data than I can hold in memory. I initially looked at HMCECS to handle this, but it seems that even HMCECS requires the full dataset to be in memory so that it can be passed to mcmc.run(). I will try SVI in the meantime, but I fear the quality of the results will suffer for it.

Is there a way to run MCMC with mini-batches, such that data can be read in from the file system when required?

You might want to use consensus MCMC: see "Tutorial or example on embarrassingly parallel/consensus MCMC" (Issue #417, pyro-ppl/numpyro on GitHub). See also numpyro/notebooks/source/covtype.ipynb at commit 07c801d420b87bf79bd10fb790cb409158b257e5 in pyro-ppl/numpyro for some ideas.
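
To make the idea concrete, here is a minimal sketch of the consensus step in plain Python, not the code from the linked issue or notebook. It assumes a toy model (Gaussian data with unknown mean, known unit variance, Normal(0, 10) prior) so that each shard's subposterior can be drawn exactly. `load_shard` and `sample_subposterior` are hypothetical stand-ins for reading one shard from disk and for a real per-shard MCMC run (for example NUTS in NumPyro). Only one shard is in memory at a time, and the prior is flattened to prior^(1/S) so that it is not counted S times.

```python
import random
import statistics


def load_shard(shard_id, n=1000, true_mu=2.0):
    """Hypothetical stand-in for reading one shard of data from disk."""
    rng = random.Random(shard_id)
    return [rng.gauss(true_mu, 1.0) for _ in range(n)]


def sample_subposterior(y, num_shards, num_draws, rng, prior_sd=10.0, sigma=1.0):
    """Hypothetical stand-in for a per-shard MCMC run.

    Draws exactly from the shard's subposterior for the toy model.
    The prior is raised to the power 1/num_shards, which for a Gaussian
    prior multiplies its variance by num_shards.
    """
    prior_prec = 1.0 / (num_shards * prior_sd**2)
    post_prec = prior_prec + len(y) / sigma**2
    post_mean = (sum(y) / sigma**2) / post_prec
    post_sd = post_prec**-0.5
    return [rng.gauss(post_mean, post_sd) for _ in range(num_draws)]


def consensus_combine(draws_per_shard):
    """Average the i-th draw of every shard, weighted by inverse sample variance."""
    weights = [1.0 / statistics.variance(d) for d in draws_per_shard]
    total = sum(weights)
    num_draws = len(draws_per_shard[0])
    return [
        sum(w * d[i] for w, d in zip(weights, draws_per_shard)) / total
        for i in range(num_draws)
    ]


rng = random.Random(0)
num_shards, num_draws = 4, 2000

draws_per_shard = []
for shard_id in range(num_shards):
    y = load_shard(shard_id)  # only this shard is held in memory
    draws_per_shard.append(sample_subposterior(y, num_shards, num_draws, rng))
    del y  # release the shard before loading the next one

combined = consensus_combine(draws_per_shard)
print("consensus mean:", statistics.mean(combined))
print("consensus sd:  ", statistics.stdev(combined))
```

The weighted average is exact only when every subposterior is Gaussian, as in this toy model. For a BNN the subposteriors are typically far from Gaussian, so treat the combined draws as an approximation and check them against a run on a subset that does fit in memory.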

Wow! I did not know about this! Maybe I could work on making these examples more visible :)

Please feel free to take it over. I remember the approach worked well; the only blocker was organizing the content in a nice way.