Are divergences discarded?

When running a model with different seeds, I’ve often observed that more divergences come with a lower R-hat. I would have expected the opposite, but I’m not an expert in Hamiltonian Monte Carlo, so maybe someone can answer whether (a) this is due to some counterintuitive property of Hamiltonian Monte Carlo (I’m using NUTS, by the way), or (b) divergent draws are discarded and don’t count toward the R-hat.


In a NUTS step, if we observe a divergence, we stop expanding the trajectory. The proposal from that truncated trajectory is still usually a new sample (the chance of staying at the previous sample can be small), though it may be a bit more correlated with the last sample. So divergent draws are not discarded: they stay in the chain and count toward R-hat, which only measures whether the chains are mixing well. I guess this article might be helpful for you: it shows that even with a good R-hat, divergences can still be numerous, and that indicates some problem with the model.
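To make the "divergences don't enter R-hat" point concrete, here is a minimal sketch of the split-R-hat computation over an array of draws (following the standard Gelman-Rubin/split-R-hat formulation; the function name and toy data are my own, not from any particular library). Note the input is just the draws themselves: there is no divergence flag anywhere in the formula.

```python
import numpy as np

def split_rhat(chains):
    """Split-Rhat over an array of shape (n_chains, n_draws).

    Every draw contributes, including draws produced on divergent
    iterations -- a divergence is a diagnostic flag, not a filter.
    """
    n_chains, n_draws = chains.shape
    half = n_draws // 2
    # Split each chain in half so within-chain drift also shows up.
    halves = np.concatenate([chains[:, :half], chains[:, half:2 * half]], axis=0)
    m, n = halves.shape
    chain_means = halves.mean(axis=1)
    between = n * chain_means.var(ddof=1)            # between-chain variance B
    within = halves.var(axis=1, ddof=1).mean()       # within-chain variance W
    var_plus = (n - 1) / n * within + between / n    # pooled variance estimate
    return np.sqrt(var_plus / within)

rng = np.random.default_rng(0)
well_mixed = rng.normal(size=(4, 1000))      # 4 chains that agree -> Rhat near 1
shifted = well_mixed + np.arange(4)[:, None] # chains stuck at different means
print(split_rhat(well_mixed))
print(split_rhat(shifted))
```

The well-mixed chains give a value close to 1, while the shifted chains give a value well above 1.1, regardless of how many of the individual draws happened to come from divergent iterations.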


Thanks, I’ll look at this article!