Independence of Delta Distribution


Consider the difference between Delta and Dirichlet distributions:

import torch
import pyro.distributions as dist

dist1 = dist.Delta(0.5 * torch.ones([2]))
dist2 = dist.Dirichlet(0.5 * torch.ones([2]))

The Delta distribution has event_shape=() and batch_shape=(2,).
The Dirichlet distribution has event_shape=(2,) and batch_shape=().

Why is the behavior different between the two distributions? I thought that dependence of variables was assumed unless stated otherwise?

How would I construct the dist.Delta above so that its event_shape is (2,)? I cannot figure this out, yet it is needed for a model with the Dirichlet distribution above and a guide with the Delta distribution. AutoDelta does this automatically, and I am trying to understand what really happens under the hood. Thanks.


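The difference comes from the distributions themselves: Dirichlet is inherently multivariate (a single draw is a whole vector on the simplex), so its rightmost dimension is always an event dimension, whereas Delta is univariate by default and treats each element of its argument as a separate batch element. To move dimensions from a distribution’s batch_shape to its event_shape, use .independent(...) (renamed .to_event(...) in later Pyro releases):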

dist1 = dist.Delta(0.5*torch.ones([2])).independent(1)

For more details and background, see the tensor shape tutorial.