Hi,
I have an HMC inference use case that takes ~24h on a single GPU, and the job may be halted by the resource manager. Do you know if there is a way
- to save the JIT compilation state and resume later, proceeding straight to run()?
- if I ask for 5,000 samples (after warm-up), to save the samples in batches of 1,000?
Thanks.
I think you can use post_warmup_state for this: call mcmc.run(...) 5 times
to get the samples in batches, then concatenate them.
Ah, fine @fehiepsi, but is there a numpyro.save_state(<file name with an extension>, mcmc.post_warmup_state)
and, symmetrically, a state = numpyro.load_state(<file name with an extension>)?
Thanks
Currently, we don’t have support for it. I think using post_warmup_state
is convenient enough. You can build your own pipeline on top of it.
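One possible building block for such a pipeline, as a rough sketch: save_state and load_state below are hypothetical helpers written with the standard pickle module, not NumPyro functions, and they assume the state object pickles cleanly. The write goes through a temporary file so that a job killed mid-save does not leave a truncated checkpoint behind.

```python
import os
import pickle
import tempfile


def save_state(path, state):
    """Pickle `state` to `path`, replacing any previous file atomically."""
    directory = os.path.dirname(os.path.abspath(path))
    # Write to a temporary file in the same directory first.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as f:
            pickle.dump(state, f)
        # os.replace swaps the file in one step, so readers never see
        # a half-written checkpoint.
        os.replace(tmp_path, path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


def load_state(path):
    """Load a state previously written by save_state."""
    with open(path, "rb") as f:
        return pickle.load(f)
```

The intended use would be to call save_state("state.pkl", mcmc.last_state) after each batch, and in a restarted job to set mcmc.post_warmup_state = load_state("state.pkl") before calling mcmc.run(...). Note that this only saves the sampler state: the JIT-compiled code is not stored, so the restarted job compiles again.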