Hi everybody,
I’ve been running a large model using NUTS, and I find that the maximum treedepth is hit very regularly. To increase statistical efficiency (in terms of effective sample size), it’s usually recommended to increase the maximum treedepth, and therefore the number of steps that NUTS can take. I decided to give this a go and, as expected, the statistical efficiency is much better, but the runtime is much worse.
However, what I find surprising is that the sampler builds very large trees at the start of the warmup phase, with a treedepth of up to 17, but not during sampling. For example, the image shows the treedepth over time, where purple is during sampling and red is during warmup.
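For a sense of scale, here is a rough sketch of how the cost per iteration grows with treedepth. It assumes the usual binary-tree construction, where a trajectory of depth d takes at most 2^d - 1 leapfrog steps (one gradient evaluation each). That bound is how Stan-style implementations count steps; other NUTS implementations may differ slightly. The function name is only illustrative.

```python
def max_leapfrog_steps(treedepth: int) -> int:
    """Upper bound on leapfrog steps for one NUTS iteration.

    Assumes a binary trajectory tree in which depth d allows
    at most 2**d - 1 leapfrog steps (an assumption, see above).
    """
    if treedepth < 0:
        raise ValueError("treedepth must be non-negative")
    return 2 ** treedepth - 1


# Compare a common default cap (10) with the depth seen early in warmup (17).
for depth in (10, 12, 17):
    print(f"treedepth {depth}: up to {max_leapfrog_steps(depth)} leapfrog steps")
```

Under that assumption, a single depth-17 iteration can cost about as much as 128 depth-10 iterations, so even a short stretch of very deep trees early in warmup can account for a large share of the runtime.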
This makes me think that the geometry of the posterior is actually okay, but that NUTS is starting very far from the typical set, and is generally having a hard time adapting. The initial point is the prior median at the moment.
However, I’m not sure if my diagnosis is correct. Any suggestions would be much appreciated!