ELPD calculation problem

jim · April 10, 2024, 3:32pm

Hi again,

I’m having some difficulty with an ELPD calculation for the model that I’m working on. The model is a discrete mixture model that seems to work well: the fitted parameters closely match the parameters used to generate a test dataset, and the posterior predictive checks look reasonable to me.

I need to do some model selection: I will fit several models with different numbers of clusters and pick the best one, so I thought I would use ELPD for the comparison. The problem is that when I calculate the ELPD using the loo function in ArviZ, I get a lot of warnings about the Pareto shape diagnostic:

UserWarning: Estimated shape parameter of Pareto distribution is greater than 0.7 for one or more samples. You should consider using a more robust model, this is because importance sampling is less likely to work well if the marginal posterior and LOO posterior are very different. This is more likely to happen with a non-robust model and highly influential observations.

The code is here, in a Google Colab notebook. I’d be grateful for any tips on how to solve this.
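
For context on what the warning refers to, here is a minimal sketch of the plain importance-sampling LOO estimate, assuming a pointwise log-likelihood table with one row per posterior draw and one column per observation. This is not the Pareto-smoothed version that ArviZ uses and it is not the code from my notebook; the function names and the numbers in `log_lik` are hypothetical and only there for illustration.

```python
import math


def logsumexp(values):
    """Numerically stable log(sum(exp(v))) for a list of floats."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def is_loo_pointwise(log_lik):
    """Plain (unsmoothed) importance-sampling LOO estimate per observation.

    log_lik[s][i] is log p(y_i | theta_s) for posterior draw s.
    The importance weight of draw s for observation i is 1 / p(y_i | theta_s),
    which gives elpd_i = log(S) - logsumexp(-log_lik[:, i]).
    """
    n_draws = len(log_lik)
    n_obs = len(log_lik[0])
    elpd = []
    for i in range(n_obs):
        neg = [-draw[i] for draw in log_lik]
        elpd.append(math.log(n_draws) - logsumexp(neg))
    return elpd


def max_weight_share(log_lik, i):
    """Fraction of the total importance weight carried by the largest weight."""
    neg = [-draw[i] for draw in log_lik]
    return math.exp(max(neg) - logsumexp(neg))


# Hypothetical values: 3 draws x 3 observations.
# The last observation has one draw with a very low likelihood.
log_lik = [
    [-1.2, -0.8, -6.0],
    [-1.0, -0.9, -0.5],
    [-1.1, -0.7, -0.6],
]
print(is_loo_pointwise(log_lik))
print([max_weight_share(log_lik, i) for i in range(3)])
```

In this toy table, a single draw carries nearly all of the weight for the last observation. As I understand it, that kind of heavy-tailed weight distribution is what a large Pareto shape estimate is flagging.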