Understanding Effective Sample Size

I was pleased to see that both ESS and Gelman-Rubin diagnostics are available in Pyro. I have a question regarding their usage.

With reference to pyro/stats.py at master · pyro-ppl/pyro · GitHub: suppose I have a batch of Markov chains (say `chains`) with shape `L x B x D`, where `L` is the length of each chain, `B` is the number of chains, and `D` is the dimension of the support.

Is this the correct usage?

```python
import pyro.ops.stats as stats

stats.effective_sample_size(chains, chain_dim=0, sample_dim=-1)
```

Strangely, I am getting negative values in certain trials, and I want to make sure I understand the semantics of this routine correctly.

cc @fehiepsi (since you’ve implemented those parts.)

Is `L` the number of chains? If so then you are correct.

Also note that @fehiepsi recently added documentation to these diagnostics. Feel free to submit an issue or PR to clarify the docs.

`L` is the length of the chains. `B` is the batch size (number of chains).

After your comment, I tried the following. `chains` (`50 x 1000 x 2`) is a tensor containing 50 independent chains each of length 1000 from a 2-D Gaussian.

Running the routine, I get a 1000-D tensor as the result. Is this the right output? If I understand ESS correctly, shouldn't we have an ESS value for each dimension of each chain?

Additionally, I am getting negative values for ESS. Is that expected?

Glad to help once I get this conceptually clarified!

`chain_dim` is the dimension for your chains.
`sample_dim` is the dimension for your samples.
If you have a tensor `L x B x D` where `L`, `B`, `D` are interpreted as: `B` chains, each chain containing `L` samples, and each sample having shape `D`, then `chain_dim=1`, `sample_dim=0`. We enumerate `dim` from left to right, as for a usual PyTorch tensor.
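To spell out the dim conventions above, here is a small NumPy stand-in (not Pyro's implementation, just shape bookkeeping): the diagnostic reduces over `sample_dim` and `chain_dim`, so for an `L x B x D` layout you should expect a `D`-shaped result, one ESS value per dimension of the support.

```python
import numpy as np

# Layout from the thread: L samples per chain, B chains, D-dim support.
L, B, D = 1000, 50, 2
chains = np.random.default_rng(0).normal(size=(L, B, D))

# Dims are enumerated left to right, so for this layout:
sample_dim, chain_dim = 0, 1

# ESS reduces over the sample and chain dims, leaving one value per
# support dimension -- a D-shaped result.
expected_output_shape = tuple(
    s for i, s in enumerate(chains.shape) if i not in (sample_dim, chain_dim)
)
print(expected_output_shape)  # (2,)
```

This matches the 2-element tensor reported later in the thread for a `1000 x 50 x 2` input.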

I am guessing that your `batch_dim` is `chain_dim` and your `length_dim` is `sample_dim`? I had thought that by "length of the chains" you meant the number of chains.

Ok, I think I understand this. I have questions more on the conceptual side now.

Using a `1000 x 50 x 2` sized tensor which represents 50 chains containing 1000 samples each,

```python
# chain_x: 1000 x 50 x 2
stats.effective_sample_size(chain_x, chain_dim=1, sample_dim=0)
```

I get the following result,

```
tensor([123393.8203, 126691.3281])
```

I was expecting one ESS value per dimension of the sample space, so this result seems to fall in line with my understanding.

However, I don’t know how to interpret these numbers. I was thinking that there is no way ESS can be greater than 1000. Am I missing something here?

Also, if I report ESS in results, do I report the `min` among all dimensions?

Edit: I generated the chains using my own implementation of HMC and have visualized one of them below. It seems to be doing reasonably well, so I am hoping at least my samples are reasonable.

This thread answers the question of why ESS can exceed the number of samples: bayesian - Effective Sample Size greater than Actual Sample Size - Cross Validated
It happens when your samples are negatively correlated. You can test against ArviZ to confirm things are calculated correctly. As for reporting the `min` of ESS, I am not sure whether that is the right approach.
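To make the negative-correlation effect concrete, here is a toy single-chain calculation in plain NumPy (a naive sketch of the textbook formula ESS ≈ N / (1 + 2 Σₜ ρₜ), not Pyro's estimator): an AR(1) chain with a negative coefficient yields negatively correlated samples, the autocorrelation sum goes negative, the denominator drops below 1, and the ESS estimate comes out larger than N.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate an AR(1) chain x_t = phi * x_{t-1} + noise.
# With phi < 0, consecutive samples are negatively correlated.
phi, n = -0.5, 10_000
x = np.empty(n)
x[0] = rng.normal()
for t in range(1, n):
    x[t] = phi * x[t - 1] + rng.normal()

def ess(samples, max_lag=50):
    """Naive single-chain ESS: N / (1 + 2 * sum of autocorrelations)."""
    s = samples - samples.mean()
    var = np.dot(s, s) / len(s)
    rho = np.array([np.dot(s[:-t], s[t:]) / (len(s) * var)
                    for t in range(1, max_lag)])
    return len(samples) / (1 + 2 * rho.sum())

# For phi = -0.5 the autocorrelation sum is about -1/3, so the
# estimate typically lands near 3 * N -- well above N.
print(f"ESS estimate: {ess(x):.0f} (N = {n})")
```

Production estimators (Pyro, ArviZ, Stan) additionally combine chains and truncate the autocorrelation sum carefully, which is also what keeps negative ESS values in check.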

Thanks for that!

Is there a reading you’d recommend for these diagnostics? I still feel there’s a gap in my knowledge and don’t feel too comfortable with these statistics.

I think the Stan Manual is a good reference (in addition to Google).


Thanks! They had nice summaries in there.