# How to ignore NaN values when doing hierarchical forecasting?

The tutorial at http://pyro.ai/examples/forecasting_iii.html does not cover a problem that comes up in real data: NaN values.
For example:

• In 2010, there were only 48 stations.
• In 2011, 3 stations were closed and 5 new ones opened, giving 50 stations.
• In 2012, 2 of the stations closed in 2011 reopened.

As another example:
My data is the sales count of various products in many stores.
I have reshaped it to `torch.Size([44, 103, 671, 1])`, which means
44 stores, 103 products, and 671 days of sales counts. Stores may be closed on different days, and products may be taken off the shelf for many reasons.

[Figure: sales count history of 10 random products]

[Figure: prediction at the store-product level, which is poor]

The resulting matrix necessarily contains NaN values, and

• We can’t fill them with 0, because a missing value is different from a true 0.
• NaN values should not be taken into account.
• We can’t drop the NaNs when training, because that breaks the time series order.
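The difference between zero-filling and masking can be illustrated with a tiny sketch. The sales numbers below are made up for illustration; only the idea (keep every time step, but count only observed ones) comes from the points above.

```python
import torch

# Hypothetical daily sales for one product; NaN marks days the store was closed.
sales = torch.tensor([3.0, float("nan"), 0.0, 5.0, float("nan"), 4.0])
observed = ~torch.isnan(sales)  # True where a value was actually recorded

# Zero-filling treats closed days as real zero-sale days and drags the mean down.
zero_filled_mean = torch.nan_to_num(sales, nan=0.0).mean()

# Masking keeps every time step in place but averages only the observed days.
masked_sum = torch.where(observed, sales, torch.zeros_like(sales)).sum()
masked_mean = masked_sum / observed.sum()

print(zero_filled_mean.item(), masked_mean.item())
```

The time axis keeps its full length in both cases, so the series order is preserved; only the masked version distinguishes a closed day from a genuine zero.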

These are real cases.

Training happens here:

``````
class ForecastingModel(PyroModule, metaclass=_ForecastingModelMeta):
    ....
    def predict( ...
        ...
        if t_obs == t_cov:  # training
            pyro.sample("residual", noise_dist, obs=data - prediction)
            self._forecast = data.new_zeros(data.shape[:-2] + (0,) + data.shape[-1:])
``````

So could we add a mask here, so that only non-NaN values are used for training?

Hi @twinmegami,
I would recommend masking out NaN values using either `poutine.mask()` or, if you’re not using `pyro.plate`, the distribution’s `.mask()` method. Take care that the actual data values are not NaN but rather some plausible value like zero: Pyro will ignore them if you mask them out, but PyTorch has weird behavior and may produce NaN grads unless those ignored values are finite.
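The gradient issue can be reproduced without Pyro. The sketch below is only an illustration in plain PyTorch: the negative squared error stands in for a log-density, and the helper name `masked_grad` is hypothetical. It shows that masking the output is not enough when the masked-out input is NaN.

```python
import torch

def masked_grad(data, mask):
    """Gradient of a masked negative squared error w.r.t. a scalar location."""
    loc = torch.zeros((), requires_grad=True)
    log_prob = -(data - loc) ** 2  # stand-in for a log-density
    masked = torch.where(mask, log_prob, torch.zeros_like(log_prob))
    masked.sum().backward()
    return loc.grad

raw = torch.tensor([1.0, float("nan"), 3.0])
mask = ~torch.isnan(raw)  # True = observed
filled = torch.where(mask, raw, torch.zeros_like(raw))  # NaN replaced by 0

grad_raw = masked_grad(raw, mask)        # NaN leaks in through the backward pass
grad_filled = masked_grad(filled, mask)  # finite, uses only the observed entries

print(grad_raw, grad_filled)
```

In the backward pass the masked entry receives a zero upstream gradient, but zero times NaN is still NaN, which is why the placeholder value has to be finite.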

Here’s a rough example of a masked model:

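``````
class Model(ForecastingModel):
    def __init__(self, mask):
        super().__init__()
        self.mask = mask  # boolean tensor: True = observed, False = missing

    def model(self, zero_data, covariates):
        ...  # as in https://pyro.ai/examples/forecasting_iii.html
        obs_scale = pyro.sample("obs_scale", dist.LogNormal(-5, 5))
        noise_dist = dist.Normal(0, obs_scale.unsqueeze(-1))
        noise_dist = noise_dist.mask(self.mask)  # ignore missing entries
        self.predict(noise_dist, prediction)
``````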

Feel free to paste part of your actual model code in the dense case, and we can try to help adapt that to a masked version.

Sorry, there was a bug in my code; the problems above are fixed.

The current problem is:

``````
  File "/home/ufo/anaconda3/envs/dl/lib/python3.7/site-packages/pyro/contrib/forecast/forecaster.py", line 130, in predict
    noise_dist = reshape_batch(noise_dist, noise_dist.batch_shape + (1,))
  File "/home/ufo/anaconda3/envs/dl/lib/python3.7/functools.py", line 840, in wrapper
    return dispatch(args[0].__class__)(*args, **kw)
  File "/home/ufo/anaconda3/envs/dl/lib/python3.7/site-packages/pyro/contrib/forecast/util.py", line 278, in _
    base_dist = reshape_batch(d.base_dist, base_shape)
  File "/home/ufo/anaconda3/envs/dl/lib/python3.7/functools.py", line 840, in wrapper
    return dispatch(args[0].__class__)(*args, **kw)
  File "/home/ufo/anaconda3/envs/dl/lib/python3.7/site-packages/pyro/contrib/forecast/util.py", line 272, in reshape_batch
    raise NotImplementedError("reshape_batch() does not suport {}".format(type(d)))
NotImplementedError: reshape_batch() does not suport <class 'pyro.distributions.torch_distribution.MaskedDistribution'>
``````

Looking at the `reshape_batch` function:

``````
@singledispatch
def reshape_batch(d, batch_shape):
    """
    EXPERIMENTAL Given a distribution ``d``, reshape to different batch shape
    of same number of elements.

    This is typically used to move the the rightmost batch dimension "time" to
    an event dimension, while preserving the positions of other batch
    dimensions.

    :param d: A distribution.
    :type d: ~pyro.distributions.Distribution
    :param tuple batch_shape: A new batch shape.
    :returns: A distribution with the same type but given batch shape.
    :rtype: ~pyro.distributions.Distribution
    """
    raise NotImplementedError("reshape_batch() does not suport {}".format(type(d)))
``````
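The error follows from how `functools.singledispatch` works: the decorated function is only the fallback, and each supported type needs its own registered handler. The sketch below uses hypothetical stand-in classes (`Base`, `Wrapped`) and a toy `reshape` function rather than Pyro's real distributions, so it only illustrates the dispatch mechanism.

```python
from functools import singledispatch

class Base:  # hypothetical stand-in for a supported distribution
    pass

class Wrapped:  # hypothetical stand-in for a wrapper such as MaskedDistribution
    def __init__(self, inner):
        self.inner = inner

@singledispatch
def reshape(d, shape):
    # Fallback: runs for any type without a registered handler.
    raise NotImplementedError("reshape() does not support {}".format(type(d)))

@reshape.register(Base)
def _(d, shape):
    return ("Base", shape)

try:
    reshape(Wrapped(Base()), (2, 1))
    failed = False
except NotImplementedError:
    failed = True  # no handler for Wrapped yet

@reshape.register(Wrapped)
def _(d, shape):
    # Recurse into the wrapped object, then re-wrap the result.
    return ("Wrapped", reshape(d.inner, shape))

result = reshape(Wrapped(Base()), (2, 1))
print(failed, result)
```

A handler for a wrapper type typically has to both recurse into the wrapped object and return the re-wrapped result.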

I tried to register a handler as below:

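``````
@reshape_batch.register(dist.MaskedDistribution)
def _(d, batch_shape):
    base_dist = reshape_batch(d.base_dist, batch_shape)
    # Assumes a tensor mask; reshape it alongside the base distribution.
    return base_dist.mask(d._mask.reshape(batch_shape))
``````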

But I got this error when running it:

``````
  File "/home/ufo/anaconda3/envs/dl/lib/python3.7/site-packages/pyro/contrib/forecast/forecaster.py", line 289, in __init__
    elbo._guess_max_plate_nesting(model, guide, (data, covariates), {})
  File "/home/ufo/anaconda3/envs/dl/lib/python3.7/site-packages/pyro/infer/elbo.py", line 109, in _guess_max_plate_nesting
    model_trace.compute_log_prob()
  File "/home/ufo/anaconda3/envs/dl/lib/python3.7/site-packages/pyro/poutine/trace_struct.py", line 221, in compute_log_prob
    .format(name, exc_value, shapes)).with_traceback(traceback) from e
  File "/home/ufo/anaconda3/envs/dl/lib/python3.7/site-packages/pyro/poutine/trace_struct.py", line 216, in compute_log_prob
    log_p = site["fn"].log_prob(site["value"], *site["args"], **site["kwargs"])
  File "/home/ufo/anaconda3/envs/dl/lib/python3.7/site-packages/torch/distributions/independent.py", line 88, in log_prob
    log_prob = self.base_dist.log_prob(value)
  File "/home/ufo/anaconda3/envs/dl/lib/python3.7/site-packages/pyro/distributions/torch_distribution.py", line 303, in log_prob
  File "/home/ufo/anaconda3/envs/dl/lib/python3.7/site-packages/torch/distributions/normal.py", line 72, in log_prob
    self._validate_sample(value)
  File "/home/ufo/anaconda3/envs/dl/lib/python3.7/site-packages/torch/distributions/distribution.py", line 253, in _validate_sample
    raise ValueError('The value argument must be within the support')
ValueError: Error while computing log_prob at site 'residual':
The value argument must be within the support
``````

This is caused by the NaN values in the input data.
Is this why you mentioned NaN? So I need NaN-free input together with a mask, is that correct?

Then I filled the NaNs with zero, and training succeeded.

But how do I handle this with `Forecaster`? It doesn’t accept a `mask` parameter.

``````
pyro.set_rng_seed(1)
pyro.clear_param_store()
# test_data = torch.Tensor(msc)
test_data = torch.Tensor(np.nan_to_num(msc))  # replace NaN with 0

covariates = torch.zeros(test_data.size(-2), 0)
forecaster = Forecaster(Model(mask), test_data[..., :T1, :], covariates[:T1],
                        learning_rate=0.1, learning_rate_decay=1, num_steps=501, log_every=50)

samples = forecaster(test_data[..., T0:T1, :], covariates[T1:T2], num_samples=100)
samples

# output: tensor([], size=(44, 103, 0, 1))
``````

Actually, the `Forecaster` class is strange: the first argument `data` of its `__call__` method seems to have no effect on the prediction. In the tutorials the data length does not equal the covariates length, yet it still produces correct results.

Hi @twinmegami,
Thanks for clearly reporting this bug. I’ve tried to fix it in the forecast-mask branch. Could you see if that works for you?

> `Forecaster` class is strange, `__call__` method first argument `data` seems take no effect in prediction

I agree the signature is a little unusual. The motivation is that we need to pass a prototype tensor to the `.__call__()` method during training. This tensor basically bundles metadata (`.shape`, `.dtype`, `.device`). Since the model is generative, it should not look at the actual data it is supposed to be generating; it should only look at the metadata. That’s why we pass in `torch.zeros_like(data)` rather than the actual data. Note that I think you may have accidentally inverted the mask previously (`True` should mean observed, `False` should mean missing), which may have led to all data being ignored.
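Both points can be checked on a toy tensor. The values below are made up; the sketch only shows what the prototype tensor carries and what an inverted mask would select.

```python
import torch

nan = float("nan")
data = torch.tensor([[1.0, nan, 2.0], [nan, nan, 4.0]], dtype=torch.float64)

# Prototype: same shape, dtype and device as the data, but none of its values.
zero_data = torch.zeros_like(data)

# Convention from the reply above: True = observed, False = missing.
observed = ~torch.isnan(data)
inverted = torch.isnan(data)  # the accidental opposite

# Under the inverted mask, every entry that is kept is a missing one,
# so no real observation would contribute to the likelihood.
kept_by_inverted = data[inverted]

print(zero_data.shape, zero_data.dtype, observed.sum().item())
```

A quick sanity check such as `observed.sum()` against the expected number of recorded values is a cheap way to catch an inverted mask before training.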

The reason `data` can be a different length than `covariates` is exactly for forecasting: we might observe three weeks of data and want to forecast forward one more week. We could add covariates for all four weeks. The difference in length (4 weeks of covariates - 3 weeks of data = 1 week) is exactly the size of the window you’d like to predict.
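The window arithmetic can be written out as a small sketch. The indices `T0`, `T1`, `T2` below are hypothetical day offsets chosen to match the three-weeks-plus-one-week example, and the tensors are placeholders.

```python
import torch

T0, T1, T2 = 0, 21, 28  # hypothetical: 3 weeks observed, then 1 week to forecast
data = torch.zeros(T2, 1)        # placeholder series, shape (time, obs_dim)
covariates = torch.zeros(T2, 0)  # no features, but the time axis still sets the horizon

observed = data[T0:T1]          # three weeks of data
cov_window = covariates[T0:T2]  # covariates span the observed and forecast windows

# The forecast horizon is the extra length of the covariates along the time axis.
horizon = cov_window.size(-2) - observed.size(-2)

print(horizon)
```

By the same arithmetic, a covariates slice exactly as long as the data slice gives a horizon of 0, so the covariates passed at prediction time need to start at the same index as the data and extend past it.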