New thread: setting a Mask?

junbin.gao · January 6, 2021, 11:20pm

Hi All

Is there an example of setting a mask in a model when dealing with missing data? For example, some likelihood terms should not be calculated because the corresponding y is missing.

J

fritzo · January 7, 2021, 4:39am

Hi @junbin.gao, many of the examples and tutorials use masking:

$ grep -Rl mask examples tutorial | grep '\.py\|\.ipynb$'
examples/cvae/baseline.py
examples/cvae/util.py
examples/cvae/cvae.py
examples/cvae/mnist.py
examples/sir_hmc.py
examples/air/air.py
examples/air/main.py
examples/contrib/funsor/hmm.py
examples/mixed_hmm/seal_data.py
examples/mixed_hmm/model.py
examples/scanvi/data.py
examples/dmm.py
examples/hmm.py
examples/capture_recapture/cjs.py
tutorial/source/effect_handlers.ipynb
tutorial/source/boosting_bbvi.ipynb
tutorial/source/dmm.ipynb
tutorial/source/cvae.ipynb
tutorial/source/air.ipynb
tutorial/source/tracking_1d.ipynb
tutorial/build/doctrees/nbsphinx/effect_handlers.ipynb
tutorial/build/doctrees/nbsphinx/boosting_bbvi.ipynb
tutorial/build/doctrees/nbsphinx/dmm.ipynb
tutorial/build/doctrees/nbsphinx/air.ipynb
tutorial/build/doctrees/nbsphinx/tracking_1d.ipynb

junbin.gao · January 9, 2021, 12:59am

Thanks Fritzo. However, I did not find an appropriate example. For instance, the CVAE case is not what I am after. Here is what I want: suppose y is an observed vector with some missing values. I would like to pass this information to the sample function so that the likelihood at those locations won't be calculated.

Thanks

J.

dave · January 12, 2021, 2:03pm

You can wrap the sample site with pyro.poutine.mask in which you pass a masking tensor. That is what @fritzo is talking about. That’s all there is to it.

Input:

import torch
import pyro
import pyro.distributions as dist


def mask_model(y, mask):
    loc = pyro.sample("loc", dist.Normal(0.0, 1.0))
    # Sites inside this context have their log-prob zeroed wherever mask is False.
    with pyro.poutine.mask(mask=mask):
        obs = pyro.sample("obs", dist.Normal(loc, 1.0), obs=y)
    return obs


N = 20
y = torch.randn(N)
# Random boolean mask: True keeps an observation, False excludes it.
mask = torch.randint(2, (N,)).bool()

trace = pyro.poutine.trace(mask_model).get_trace(y, mask)
trace.nodes['obs']

Output:

{'type': 'sample',
 'name': 'obs',
 'fn': Normal(loc: -2.9471242427825928, scale: 1.0),
 'is_observed': True,
 'args': (),
 'kwargs': {},
 'value': tensor([-0.7209,  0.3980, -0.2284, -2.1061,  1.5540, -0.5424,  0.4061,  0.1533,
          0.6535, -0.0964, -0.4650, -2.1123, -0.6702,  2.0065,  1.4525,  0.2473,
          0.0139,  0.9315,  0.6688,  0.7431]),
 'infer': {},
 'scale': 1.0,
 'mask': tensor([ True, False,  True, False, False,  True, False, False,  True, False,
         False,  True,  True,  True,  True, False, False, False, False,  True]),
 'cond_indep_stack': (),
 'done': True,
 'stop': False,
 'continuation': None}

junbin.gao · January 12, 2021, 9:04pm

Dave, this is really helpful. Thank you so much.

One further question: your example only has a batch shape. If my observed y is a Gaussian vector in which some components are missing, can Pyro calculate the log_prob for it? Or will a partial log_prob be calculated?

J.

jmabry · July 12, 2024, 11:06pm

I wanted to validate for myself how to properly do masking with NumPyro. I wrote up a toy problem and validated that the inference converged correctly. Check it out! Masked Observations with Numpyro – PainpointsPurelyTechnical