Saving intermediate steps?


I have an HMC inference use case that takes ~24h on a single GPU, and the job may be halted by the resource manager. Do you know if there is a way

  1. to save the JIT compilation state and resume at run() afterwards?
  2. and, if I ask for 5,000 samples (after warm-up), to save the samples in batches of 1,000?


I think you can use post_warmup_state for this: just perform 5 runs of 1,000 samples each to get the samples in batches, then concatenate them.

Ha, fine @fehiepsi, but is there a numpyro.save_state(<file name with an extension>, mcmc.post_warmup_state) and, symmetrically, a state = numpyro.load_state(<file name with an extension>)?

Currently, we don’t have support for it. I think using post_warmup_state is convenient enough. You can build up your own pipeline from it.