If your variables are not conditionally independent (e.g. if a downstream observe statement combines values among either iarange), then you have two options.
The cleaner option is to convert one parallel iarange into a sequential irange:
@config_enumerate(default="parallel")
def model_and_guide():
logits = torch.tensor([[1.,2.,3.], [4.,5.,6.]])
with pyro.iarange('y_axis', 2):
values = []
for i in pyro.irange('x_axis', 3):
value = pyro.sample('limit_{}'.format(i),
dist.Bernoulli(logits=logits[:, i]))
values.append(value)
# This observe statement forces us to sequentially sample along x_axis:
pyro.sample('aggregate', dist.Binomial(3, 0.1), obs=sum(values))
The second option is to manually compute the cartesian product by replacing the Bernoulli with a Categorical over the product space; this is kind of gross.
Yes, that is to be expected: each new variable needs to be enumerated in a different dimension. Broadcasted together they form a cartesian product (whose volume grows exponentially with the number of variables, as expected). Often you can work with these differently-shaped tensors efficiently without broadcasting. I’d recommend reading (or re-reading) the Tensor Shapes Tutorial to help understand how Pyro’s enumeration works.
I’ve read the Tensor Shapes Tutorial several times already, I’ll go through it one more time :-). I’m still trying to develop a good mental model for what Pyro is doing. Is there a good paper that walks through the basic algorithms?
In this case, isn’t the point of declaring independence to avoid doing the cartesian product? i.e. you only need two parallel searches of R^3 instead of a cartesian search of R^3 x R^3
We’re working on a tutorial and tech report about enumeration. Let us know what you’d like to see in those
isn’t the point of declaring independence to avoid doing the cartesian product?
Yes. I’m happy to explain if you post a more complete model, ideally a complete model, guide, and training loop. There are different tricks to avoid the cartesian product, and to recommend a trick we’ll need to se how you consume values downstream of the sample statements.