Hello devs.

I’m using the NUTS sampler. I have noticed that whenever the posterior is bimodal, the rhats are greater than 1 (1.04 - 1.08). Because my data is sparse, it makes sense for the model to produce a bimodal distribution. Should I be worried about the rhat values in this case? Or is it OK for me to use the posterior samples even when the rhats are in the range 1.04 - 1.08, given that my data is sparse?

Also, I’d prefer to have unimodal posteriors over bimodal ones. So, are there any general tips on how to avoid bimodal posteriors with sparse data?

1.08 isn’t terrible, but yes: if your goal is a high-fidelity approximation of the posterior and the posterior is multi-modal, then the inference problem can be quite hard and large rhats may result. How much this matters depends on whether you care about the posterior itself or about something downstream, such as predictive performance. Note that in some cases increasing the number of warm-up steps and the total number of samples may help.
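
To make the link between multi-modality and rhat concrete, here is a minimal sketch that uses only the Python standard library. It computes the classic (non-split) Gelman-Rubin statistic on simulated draws; your library's diagnostic may use a split or rank-normalized variant, so the numbers will not match exactly. The helper names (`gelman_rubin_rhat`, `bimodal_draw`) and the two-mode target at -2 and +2 are assumptions made up for illustration, and the draws are independent rather than produced by NUTS.

```python
import math
import random
import statistics


def gelman_rubin_rhat(chains):
    """Classic (non-split) R-hat for equal-length chains of scalar draws."""
    n = len(chains[0])
    chain_means = [statistics.fmean(c) for c in chains]
    # W: average of the within-chain sample variances.
    within = statistics.fmean([statistics.variance(c) for c in chains])
    # B / n: sample variance of the chain means.
    between_over_n = statistics.variance(chain_means)
    # Pooled variance estimate, then the ratio to the within-chain variance.
    var_plus = (n - 1) / n * within + between_over_n
    return math.sqrt(var_plus / within)


def bimodal_draw(rng, weight_right):
    """One draw from a two-mode mixture: N(+2, 1) with prob. weight_right, else N(-2, 1)."""
    center = 2.0 if rng.random() < weight_right else -2.0
    return rng.gauss(center, 1.0)


rng = random.Random(0)  # fixed seed so the sketch is reproducible
n_chains, n_draws = 4, 2000

# Every chain visits both modes equally often.
mixed = [[bimodal_draw(rng, 0.5) for _ in range(n_draws)] for _ in range(n_chains)]

# Two chains are stuck in the left mode and two in the right mode.
stuck = [[bimodal_draw(rng, w) for _ in range(n_draws)] for w in (0.0, 0.0, 1.0, 1.0)]

rhat_mixed = gelman_rubin_rhat(mixed)
rhat_stuck = gelman_rubin_rhat(stuck)
print(f"rhat when chains mix across modes: {rhat_mixed:.3f}")
print(f"rhat when chains are stuck in one mode each: {rhat_stuck:.3f}")
```

When every chain covers both modes, the between-chain term is small and rhat stays close to 1. When chains disagree about which mode they occupy, the between-chain term dominates and rhat grows. Values like 1.04 - 1.08 sit between these extremes, which is consistent with chains that visit both modes but in somewhat different proportions.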

So, are there any general tips on how to avoid bimodal posteriors with sparse data?

I’m afraid this question is far too vague to answer in general.
