SVI with aggregated observations

Hello,
I need to run SVI, as in the tutorial

https://pyro.ai/examples/svi_part_i.html

but my data is aggregated. To keep things concrete, let’s talk about the tutorial itself. The observations there are not aggregated:

```
import torch

# create some data with 6 observed heads and 4 observed tails
data = []
for _ in range(6):
    data.append(torch.tensor(1.0))
for _ in range(4):
    data.append(torch.tensor(0.0))
```

But clearly, such an array of observations carries the same information as its aggregated version, which could come, for instance, in the following format:

```
data_frequencies = torch.tensor([4,6])
data_values = torch.tensor([0,1])
```

Furthermore, sometimes the data already arrives aggregated, as in my case.
Aggregated data can sometimes be disaggregated in a canonical way, but not mine: we weight past observations less than recent ones, so instead of `data_frequencies` we have `data_weights`, which need not be integers.

And even when disaggregation is possible, it seems more efficient to keep the data aggregated and write the log posterior directly in terms of the aggregated data.
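To make that concrete, here is a minimal sketch in plain Python (no Pyro) of a Bernoulli log-likelihood written directly over (weight, value) rows. The function names are hypothetical, chosen only for illustration. Each row contributes its weight times the per-observation log-probability, so integer frequencies reproduce the disaggregated sum and fractional weights work unchanged.

```python
import math

def bernoulli_log_prob(value, f):
    """Log-probability of a single 0/1 observation under Bernoulli(f)."""
    return math.log(f) if value == 1 else math.log(1.0 - f)

def disaggregated_log_lik(observations, f):
    """Sum of log-probabilities over individual observations."""
    return sum(bernoulli_log_prob(v, f) for v in observations)

def aggregated_log_lik(weights, values, f):
    """Weighted sum: each distinct value counts `weight` times."""
    return sum(w * bernoulli_log_prob(v, f) for w, v in zip(weights, values))

f = 0.6
full = disaggregated_log_lik([1] * 6 + [0] * 4, f)  # 10 separate observations
agg = aggregated_log_lik([4, 6], [0, 1], f)         # 2 aggregated rows
same = math.isclose(full, agg)                      # the two agree

# Non-integer weights need no special handling.
weighted = aggregated_log_lik([0.5, 2.0], [0, 1], 0.5)
```

The aggregated version does work proportional to the number of distinct rows rather than the number of raw observations.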

Is there a way to pose this problem without disaggregating the data?

I made the following change to the model:

```
import torch
import pyro
import pyro.distributions as dist

def model(agg_data):
    # define the hyperparameters that control the Beta prior
    alpha0 = torch.tensor(10.0)
    beta0 = torch.tensor(10.0)
    # sample f from the Beta prior
    f = pyro.sample("latent_fairness", dist.Beta(alpha0, beta0))
    # loop over the aggregated rows, each a (frequency, value) pair
    counter = 0
    for row in agg_data:
        freq, value = row
        # observe `value` freq times using the Bernoulli likelihood
        # Bernoulli(f); int(freq) truncates, so this only handles
        # whole-number frequencies
        for _ in range(int(freq)):
            pyro.sample("obs_{}".format(counter), dist.Bernoulli(f), obs=value)
            counter += 1
```

and the corresponding change to the training data:

```
data = torch.tensor([
    [4., 0.],
    [6., 1.]
])
```

I get the same solution. Any thoughts? Do you think this code can be improved?

Please refer to the tutorial "Tensor shapes in Pyro" (Pyro Tutorials 1.9.0 documentation).
