Beginner question on speed of NUTS sampler for DP Mixture model

Hi there,

Modelling setting
I am modelling 1D continuous grouped data with a Dirichlet Process (DP) mixture model. I use a DP to set an a priori level of conservativeness for the number of components. The component distribution is currently a truncated Normal. The prior on its location is a Normal centred at the mean of the observations, and the prior on its scale is Uniform(0.1, 20). I break non-identifiability with an argsort operation on the means of the component distributions.

I am modelling 76 groups of 200 samples each, using 5 components, 100 warmup steps, and 1000 samples.

Questions I have
When I run the NUTS sampler, everything goes well in the sense that my posterior predictive looks like a good fit, but I find it hard to reason about the speed of the sampler. The acceptance probability is high, but sampling can be quite slow and then sometimes suddenly speeds up. I am wondering what could cause this volatile speed.

Could it be due to unfortunate prior choices? Or an unfortunate family of component distributions? Or are there hyperparameters of the NUTS sampler I should take into account?

I am new to NumPyro and Bayesian modelling in general, so I am mainly asking for pointers to get more into the theory as most sources I find online are not so easy to get into without substantial prior knowledge.

Any help / pointers are welcome, thanks!


hard to say without more details.

but argsort sounds non-differentiable: the joint density stops being smooth wherever two component means swap order. nuts expects differentiable joint densities, so you might try enforcing identifiability differently.
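
one common alternative is to make the means ordered by construction: sample an unconstrained vector and turn it into a first mean plus strictly positive gaps. a minimal sketch of that idea in plain numpy (the function name and the example values are made up for illustration, this is not your model):

```python
import numpy as np

def ordered_means(raw):
    """Map an unconstrained vector to strictly increasing component means.

    raw[0] is used as the first mean; raw[1:] are log-gaps between neighbours.
    (hypothetical helper, for illustration only)
    """
    raw = np.asarray(raw, dtype=float)
    gaps = np.exp(raw[1:])  # strictly positive increments
    return np.concatenate([raw[:1], raw[0] + np.cumsum(gaps)])

# hypothetical unconstrained values for K = 5 components
raw = np.array([-1.0, 0.0, 0.5, -2.0, 1.0])
mu = ordered_means(raw)
print(mu)  # increasing by construction, no sorting involved
```

the map is smooth everywhere, so gradients are well defined. as far as i know numpyro ships the same idea as `numpyro.distributions.transforms.OrderedTransform`, which also takes care of the jacobian term when used inside a `TransformedDistribution`.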

also note that nuts is an adaptive algorithm. it is expected to have variable speed as it explores e.g. easier/harder regions of the posterior.
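
to make the variable speed concrete: each nuts iteration builds its trajectory by repeated doubling, so an iteration that stops at tree depth d costs roughly 2^d - 1 gradient evaluations. a small sketch of that arithmetic (the function name is made up for illustration):

```python
def leapfrog_steps(tree_depth):
    """Leapfrog (gradient) steps for a NUTS trajectory doubled `tree_depth` times."""
    return 2 ** tree_depth - 1

# cost per iteration grows exponentially with the depth the sampler needs
for depth in (3, 6, 10):
    print(depth, leapfrog_steps(depth))
```

so an iteration at depth 10 costs over a hundred times more than one at depth 3. in numpyro you should be able to record this per iteration by passing `extra_fields=("num_steps",)` to `mcmc.run` and reading it back with `mcmc.get_extra_fields()`; many iterations at 1023 steps would mean the sampler keeps hitting the default `max_tree_depth=10`.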