Estimating missing variable using Deep Markov Model (DMM)

Hello there, I’m new to probabilistic programming. I’ve followed the Deep Markov Model tutorial (https://github.com/uber/pyro/tree/dev/examples/dmm), and I’m now testing the model on a custom dataset. The dataset I’m using is the “Appliances Energy Prediction Data Set” from the UCI Machine Learning Repository.

What I’m trying to do is to predict/estimate the “Appliances Energy Use” variable after training the model on the dataset, but I’m at a loss about how to proceed. During training I used all of the variables in the dataset (except time), including “Appliances Energy Use”, as the state variables/observations (x). However, I don’t know how to test the trained model on a sequence of observations in which the variable I want to predict (“Appliances Energy Use”) is not given. In other words, I’m trying to infer a missing variable given a set of observations.

What I want to ask is:

  1. Can DMM do this? Since I’m new to this model and to probabilistic programming, I may have misunderstood it and assumed this kind of inference is possible when it isn’t.
  2. If it can, what are the steps I need to take? I’m thinking of using all the variables except the missing one as the input to the guide, and then using the inferred latent variables as input to the model to emit the full set of observations (since that includes the missing variable, this would predict it too).

I’m using Pyro v0.3.0 and PyTorch v1.0. Any help is appreciated!

can you please be much more specific about the exact form of the dataset and modeling setup you’re interested in? otherwise it’s difficult to be of much help.

I seem to have solved the problem by myself, but I’m not sure whether my approach is right. I’ll try to explain in detail what I’m trying to achieve here.

The dataset has 29 variables, listed below (the names are taken directly from the file header):
date time, Appliances, lights, T1, RH_1, T2, RH_2, T3, RH_3, T4, RH_4, T5, RH_5, T6, RH_6, T7, RH_7, T8, RH_8, T9, RH_9, To, Pressure, RH_out, Wind speed, Visibility, Tdewpoint, rv1, rv2

I then used all of the variables as observations (x): “Appliances” is the first element of x (x_1), “lights” the second (x_2), “T1” the third (x_3), and so on, so each row of the dataset is one observation vector and the rows run forward in time.
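Roughly, the preprocessing I have in mind looks like the sketch below (the filename and the “date” column name are placeholders for whatever the actual CSV uses, so this is an illustration rather than my exact code):

```python
import pandas as pd
import torch

# load the UCI CSV and drop the timestamp column, keeping the 28 numeric variables
df = pd.read_csv("energydata_complete.csv")      # placeholder filename
df = df.drop(columns=["date"])                   # placeholder column name
x = torch.tensor(df.values, dtype=torch.float)   # shape (T, 28), rows are time steps

# chop the long series into fixed-length sequences with a batch dimension,
# matching the (batch, time, features) layout the tutorial code expects
seq_len = 100
n_seqs = x.size(0) // seq_len
mini_batch = x[: n_seqs * seq_len].reshape(n_seqs, seq_len, -1)
```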

I processed the dataset to follow the format used in the tutorial (Deep Markov Model — Pyro Tutorials 1.8.4 documentation), and the only change I made to the model is to have the Emitter network produce the mean and variance of a Normal distribution, since the dataset consists of real values.
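Concretely, the Emitter change looks something like the sketch below (the layer names follow the tutorial, but the two-headed loc/scale design and the sizes are just my illustration):

```python
import torch.nn as nn

class GaussianEmitter(nn.Module):
    """Parameterizes a Normal emission p(x_t | z_t) instead of the
    tutorial's Bernoulli, by producing a loc and a positive scale."""
    def __init__(self, input_dim, z_dim, emission_dim):
        super().__init__()
        self.lin_z_to_hidden = nn.Linear(z_dim, emission_dim)
        self.lin_hidden_to_hidden = nn.Linear(emission_dim, emission_dim)
        self.lin_hidden_to_loc = nn.Linear(emission_dim, input_dim)
        self.lin_hidden_to_scale = nn.Linear(emission_dim, input_dim)
        self.relu = nn.ReLU()
        self.softplus = nn.Softplus()

    def forward(self, z_t):
        h1 = self.relu(self.lin_z_to_hidden(z_t))
        h2 = self.relu(self.lin_hidden_to_hidden(h1))
        loc = self.lin_hidden_to_loc(h2)
        scale = self.softplus(self.lin_hidden_to_scale(h2))  # keep scale > 0
        return loc, scale
```

The emission in the model then uses a `dist.Normal(loc, scale).to_event(1)` at the `obs_x_%d` sample sites instead of the Bernoulli.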

What I’m trying to do here is to train the model on the dataset with all the variables intact, but then remove one of the variables during evaluation and have the model infer/predict the missing variable from the latent states produced by the guide.

To achieve this, I changed the guide to take all the variables except the missing one as input, while the model still emits all the variables (including the missing one). I’m training the model right now, and the NLL seems to be decreasing; I don’t know yet whether it will work.
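In code, the split is basically this (here I assume the “Appliances” column sits at index 0; the helper name is mine):

```python
import torch

def split_for_guide(x_full, missing_idx=0):
    """Drop the missing variable's column from the guide's input while the
    model keeps the full observation tensor for its emission likelihood."""
    keep = [i for i in range(x_full.size(-1)) if i != missing_idx]
    return x_full[..., keep]  # (batch, T, num_vars - 1)
```

So the model still conditions its emission sites (`obs_x_%d`) on the full observation at each time step, while the guide’s RNN is fed `split_for_guide(x_full)` and has its `input_size` reduced by one.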

If there are some details that are still unclear, I’ll try to clarify them. Thanks for the help!

do you expect the missing variables to affect the entire time series or do you expect the missing variable(s) to be missing at random (e.g. absent at time steps 3 and 5 but present at 1, 2, 4, and 6)?

I expect the missing variables to affect the whole time series. If that’s the case, should the missing variables be classified as latent variables?

yes, probably. but you need to be careful about your problem setup. presumably the missing variable is present in the training data but missing in the test data (otherwise how could you ever infer it in a grounded way on test data where it is missing?). the variable should be an observed latent variable during training and an unobserved latent variable that needs to be inferred at test time.
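in pyro terms one way to express this is a sample site whose `obs` argument is only supplied during training, e.g. something like the following sketch (the site name and the normal emission are illustrative, not taken from your model):

```python
import pyro
import pyro.distributions as dist

def appliances_site(t, loc, scale, observed_value=None):
    # during training observed_value is the data for this variable at time t,
    # so the site is conditioned on it; at test time pass None and the site
    # becomes an ordinary latent that inference has to deal with
    return pyro.sample("appliances_%d" % t, dist.Normal(loc, scale),
                       obs=observed_value)
```

at test time you then need some way to infer that latent, e.g. a guide distribution for the site or sampling it from the generative model given the inferred z’s.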

Yeah, that’s exactly what I was trying to achieve. I trained with the missing variable present in the observations emitted by the generator network, but not as an input to the guide. Is this the right way to do it?

I have one more question: should I treat the missing variable as a latent variable or as a missing observation? If it should be a latent variable, how do I add the latent loss to the current ELBO loss?

Thanks in advance!