Hi @martinjankowiak,

Thanks for your interest in answering my question. I will try to give more context on what I’m trying to do.

I am working with a spatiotemporal dataset of measurements collected across a city. Essentially, it consists of multiple time series that are spatially correlated. Since some locations in the city are not covered during data collection, I need a model that estimates the variable at the locations where measurements are missing. In addition, the model must be able to update its outputs incrementally when new (streaming) data arrive.

I tried some other ML approaches, but after some research, it seems to me that a Bayesian approach would be the most suitable here, given the nature of the problem. In particular, I was trying to model this dataset as a GP because it lets me model the spatiotemporal dependency between my measurements explicitly. It is a natural way to correlate the outputs of a model (based on the distance between the locations where the measurements are collected, for example). I don’t know if I’m missing some obvious approach here, but that’s how I imagined solving the problem.
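
To make the distance-based correlation concrete, here is a minimal NumPy sketch of what I have in mind. It assumes an RBF (squared-exponential) kernel over 2-D coordinates; the function name `rbf_kernel`, the example coordinates, and the hyperparameter values are made up for illustration and are not taken from my actual model.

```python
import numpy as np

def rbf_kernel(coords_a, coords_b, lengthscale=1.0, variance=1.0):
    """Squared-exponential covariance between two sets of 2-D locations."""
    # Pairwise differences, shape (n_a, n_b, 2).
    diff = coords_a[:, None, :] - coords_b[None, :, :]
    sq_dist = np.sum(diff ** 2, axis=-1)
    return variance * np.exp(-0.5 * sq_dist / lengthscale ** 2)

# Hypothetical sensor locations: two close together, one far away.
coords = np.array([[0.0, 0.0], [0.0, 1.0], [3.0, 4.0]])
K = rbf_kernel(coords, coords, lengthscale=2.0)
```

With this kind of covariance, nearby locations get a correlation close to 1 and distant ones a correlation close to 0, which is the behaviour I want for filling in locations without measurements.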

Right now I am working with a GP whose multivariate normal distribution is 1200-dimensional, and this dimensionality could increase in the future. I got a first model working in PyMC3, but I couldn’t manage to do Bayesian updating/incremental learning with it efficiently. If I could have a simple GP model that performs this incremental updating (posterior -> new prior -> posterior -> new prior -> …), that would already be good progress for me.
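
To illustrate the posterior -> new prior loop I am after, here is a rough NumPy sketch (not PyMC3 or Pyro code). It assumes a fixed set of locations, fixed kernel hyperparameters, and independent Gaussian observation noise, in which case conditioning the multivariate normal on each new batch is a closed-form update. The names `gaussian_update`, `rbf_kernel`, and all numbers below are hypothetical.

```python
import numpy as np

def rbf_kernel(x_a, x_b, lengthscale=1.0, variance=1.0):
    """Squared-exponential covariance between 1-D locations."""
    sq_dist = (x_a[:, None] - x_b[None, :]) ** 2
    return variance * np.exp(-0.5 * sq_dist / lengthscale ** 2)

def gaussian_update(mean, cov, obs_idx, obs_values, noise_var):
    """Condition N(mean, cov) on noisy observations at the indices obs_idx.

    Returns the posterior mean and covariance, which can be passed back
    in as the prior for the next batch.
    """
    obs_idx = np.asarray(obs_idx)
    obs_values = np.asarray(obs_values, dtype=float)
    cross = cov[:, obs_idx]  # covariance between all sites and observed sites
    innov_cov = cov[np.ix_(obs_idx, obs_idx)] + noise_var * np.eye(len(obs_idx))
    gain = np.linalg.solve(innov_cov, cross.T).T  # equals cross @ inv(innov_cov)
    new_mean = mean + gain @ (obs_values - mean[obs_idx])
    new_cov = cov - gain @ cross.T
    return new_mean, new_cov

# Hypothetical setup: 6 locations on a line, zero-mean GP prior.
locations = np.linspace(0.0, 5.0, 6)
prior_mean = np.zeros(6)
prior_cov = rbf_kernel(locations, locations, lengthscale=1.5)
noise_var = 0.1

# Two streaming batches: posterior after batch 1 is the prior for batch 2.
mean_1, cov_1 = gaussian_update(prior_mean, prior_cov, [0, 2], [1.0, 0.5], noise_var)
seq_mean, seq_cov = gaussian_update(mean_1, cov_1, [1, 4], [0.8, -0.3], noise_var)

# Same data processed in a single batch, for comparison.
batch_mean, batch_cov = gaussian_update(
    prior_mean, prior_cov, [0, 2, 1, 4], [1.0, 0.5, 0.8, -0.3], noise_var
)
```

Under these assumptions the sequential and single-batch results should agree up to floating-point error, and each update only touches the current mean and covariance rather than all past data. What this sketch does not do is re-learn the kernel hyperparameters as data arrive, which is the part I am unsure how to handle.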

Thanks again for your reply!