I am interested in estimating the posterior of a model from a weighted data set.
Every data point has an associated weight. For a data point with weight W, I want the inference algorithm to update the posterior as if there were W copies of that data point in the observed data set (even if W is not an integer).
Is there a way to weight data points during inference in Pyro, and if not, how difficult would it be to extend Pyro to do this?
I believe this functionality should be a priority for Pyro going forward, because weighted coreset methods allow extending Bayesian approaches to much larger datasets. See: [1710.05053] Automated Scalable Bayesian Inference via Hilbert Coresets
You should be able to use `pyro.poutine.scale`, which multiplies log-probabilities by a constant, for that:
```python
def model(data):
    ...
    latent = pyro.sample("latent", ...)
    ...
    with pyro.plate("data", N), pyro.poutine.scale(scale=weights_tensor):
        ...
        pyro.sample("observed", ..., obs=data)
        ...
```
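To see why scaling is the right operation: W i.i.d. copies of a data point contribute W · log p(x | θ) to the model's log-joint, which is exactly what multiplying the log-probability by W produces, and the scaled form extends naturally to non-integer W. A stdlib-only sketch of that identity (the Gaussian likelihood here is just an illustrative stand-in, not part of any particular model):

```python
import math

def normal_logpdf(x, mu, sigma):
    """Log-density of a Gaussian, used as a stand-in likelihood."""
    return -0.5 * math.log(2 * math.pi * sigma**2) - (x - mu)**2 / (2 * sigma**2)

x, mu, sigma = 1.3, 0.0, 1.0
w = 3  # integer weight, so we can compare against literal copies

# W literal copies of the data point...
copies = sum(normal_logpdf(x, mu, sigma) for _ in range(w))

# ...equal one copy with its log-probability scaled by W,
# which is what poutine.scale(scale=w) applies during inference.
scaled = w * normal_logpdf(x, mu, sigma)

assert abs(copies - scaled) < 1e-12
```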
Is there currently a way (or workaround) to do sample-wise weighting with TraceEnum_ELBO for discrete latent variables?
Hi @dschneider, I’m not sure what you mean exactly by “sample-wise weighting”, but the code above should work correctly for both discrete and continuous random variables and be compatible with `TraceEnum_ELBO`.
Hi, sorry for being unspecific. Using `poutine.scale(scale=weights)`, with `weights` being a tensor holding a different weight for each sample in the batch, in conjunction with `TraceEnum_ELBO` currently (v1.8.1) gives

```
ValueError: enumeration only supports scalar poutine.scale
```
I found a similar question here:
https://github.com/pyro-ppl/pyro/issues/1897
Is there another way to achieve the same functionality?
Here is a short example of my motivation: suppose I have n uncertain observations, e.g. each with its own discrete distribution P(X_i = a) = p_i, P(X_i = b) = 1 - p_i for each i < n. How can I use these observations for model (Bayesian network) training? As I understand it, I could sample k data points from each of those distributions, which would bloat the dataset to size n * k and only be sufficiently accurate for k ≫ 1. Or I could be exact and split each observation i into two samples (X_i1 = a with weight p_i and X_i2 = b with weight 1 - p_i), bloating the dataset only to size 2n. Or is there a better way to use discrete distributions as observations?
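(For what it's worth, the "split into two weighted samples" construction can be checked numerically: the weighted log-likelihood p_i · log p(a | θ) + (1 − p_i) · log p(b | θ) is exactly the expectation of the log-likelihood under the observation distribution, which k-fold sampling only approaches as k grows. A stdlib sketch with made-up likelihood values:)

```python
import math
import random

random.seed(0)

p_i = 0.7                    # P(X_i = a), made-up value
lik = {"a": 0.4, "b": 0.1}   # assumed model likelihoods p(x | theta)

# Exact: two weighted pseudo-observations with weights p_i and 1 - p_i.
weighted = p_i * math.log(lik["a"]) + (1 - p_i) * math.log(lik["b"])

# Approximate: average log-likelihood over k draws from the observation distribution.
k = 100_000
draws = ("a" if random.random() < p_i else "b" for _ in range(k))
monte_carlo = sum(math.log(lik[x]) for x in draws) / k

# The Monte Carlo estimate converges to the weighted value as k grows.
assert abs(weighted - monte_carlo) < 0.02
```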
Yes, you’re right, I’d forgotten we hadn’t implemented support for that in `TraceEnum_ELBO`, though it is mathematically valid.
In your particular case, it sounds like what you actually want is something like the following, which is compatible with enumeration:
```python
xi_dist = ...  # generative model's conditional distribution of X_i
xi_obs_is_a = pyro.sample("xi_obs_is_a", Bernoulli(p_i)) == 1
pyro.sample("xi", xi_dist, obs=torch.where(xi_obs_is_a, a, b))
```
See also this old issue for further discussion of distribution-valued observations.
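One subtlety worth noting (my own observation, not a claim from the thread): enumerating out the auxiliary Bernoulli marginalizes it, giving the mixture likelihood p_i · p(a | θ) + (1 − p_i) · p(b | θ), whereas per-sample weight scaling exponentiates, giving the geometric form p(a | θ)^p_i · p(b | θ)^(1 − p_i). In general the two objectives differ, so the constructions are not interchangeable. A stdlib sketch with made-up likelihoods:

```python
p_i = 0.7                  # P(observation says a), made-up value
lik_a, lik_b = 0.4, 0.1    # assumed model likelihoods p(X_i = a | theta), p(X_i = b | theta)

# Enumerating the auxiliary Bernoulli marginalizes it out: a mixture likelihood.
mixture = p_i * lik_a + (1 - p_i) * lik_b

# Per-sample weighting (poutine.scale) instead yields a geometric average.
geometric = lik_a ** p_i * lik_b ** (1 - p_i)

# By Jensen's inequality the mixture is at least the geometric mean,
# with equality only in degenerate cases (p_i in {0, 1} or lik_a == lik_b).
assert mixture > geometric
```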