I would post model code, but I’m honestly not sure how to start.

I have T similar tasks. They are plausibly independent, but every task has the same kind of input and output (e.g., the temperature over time of T different machines).

It’s somewhat expensive to model this with T independent Gaussian Processes, and I’m fairly sure that many of the GPs would end up with very similar parameters, so I would like to treat this as a single multi-task problem.

My understanding is that I can create one long array X with all the time inputs stacked vertically, and use one-hot encoding to mark which task each row belongs to. Then X has shape (total_samples, T + 1), with T one-hot columns and 1 actual input column, and Y has shape (total_samples,).
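To make the layout concrete, here is a minimal NumPy sketch of the stacking I have in mind. The function name `stack_tasks`, the column order (one-hot columns first, time last), and the toy data are all made up for illustration; this only builds the arrays and is not a model.

```python
import numpy as np


def stack_tasks(times_per_task, values_per_task):
    """Stack per-task series into X of shape (N, T + 1) and Y of shape (N,).

    Hypothetical helper: columns 0..T-1 are the one-hot task indicator,
    and the last column is the actual input (time).
    """
    num_tasks = len(times_per_task)
    rows, targets = [], []
    for task_id, (t, y) in enumerate(zip(times_per_task, values_per_task)):
        t = np.asarray(t, dtype=float)
        y = np.asarray(y, dtype=float)
        # One-hot block: every row of this task gets a 1 in column task_id.
        one_hot = np.zeros((t.shape[0], num_tasks))
        one_hot[:, task_id] = 1.0
        rows.append(np.column_stack([one_hot, t]))
        targets.append(y)
    return np.vstack(rows), np.concatenate(targets)


# Toy data (assumed): 3 machines with 4, 2 and 3 samples respectively.
times = [np.array([0.0, 1.0, 2.0, 3.0]),
         np.array([0.5, 1.5]),
         np.array([0.0, 2.0, 4.0])]
temps = [np.array([20.0, 21.0, 22.0, 23.0]),
         np.array([30.0, 31.0]),
         np.array([25.0, 26.0, 27.0])]

X, Y = stack_tasks(times, temps)
print(X.shape, Y.shape)
```

Tasks with different numbers of samples just contribute blocks of different heights, so nothing needs to be padded.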

Since the stacked dataset is now very large, I would like to use VariationalSparseGP to speed up computation. My question is: how should I choose the inducing points Xu, and how should I formulate my kernel?

I should add that each task may have a different number of samples, but there is only one well-defined input per task (time, for example).