I am trying to replicate the behavior of least squares to predict multiple features for multiple series at once. Below is a toy example of the problem.

The pyro code here runs successfully, but generates a nonsense answer. It’s also rather slow. It doesn’t have any partial-pooling, which I plan to add after I get past this data shaping hurdle.

```
import numpy as np
import pyro
import pyro.distributions as dist
import torch
from pyro.infer.mcmc import NUTS, MCMC
# simulate 2 time series made from 3 common features with 20 observations, no noise for the sake of simplicity
n_obs = 20
x = np.vstack([np.arange(0, n_obs), np.arange(0, 2 * n_obs, 2), np.random.normal(size=n_obs)]).T
# each series has different coefficients, unpooled or partially-pooled
coefs = np.array([[0.5, 1, 1], [1, 2, 0.5]])
y = np.dot(x, coefs.T)
##### NUMPY LSTSQS
est_params = np.linalg.lstsq(x, y, rcond=None)[0]
############## PYRO Approach
def model(x, y):
# hyperprior = pyro.sample("u", dist.Normal(0., 100))
# sigma_hyper = pyro.sample("s", dist.HalfNormal(100))
dim = y.shape[1]
b_r = pyro.sample("bR", dist.HalfNormal(torch.ones(dim)).expand([x.shape[1], dim]).to_event(1)) # is shape (x_features, num_series)
# sigma = pyro.sample("sigma", dist.HalfCauchy(0.5).expand([1]).to_event(1))
mean = torch.matmul(x, b_r) # [series]
with pyro.plate("data"):
pyro.sample("obs", dist.Normal(mean, 1.0), obs=y)
kernel = NUTS(model)
mcmc = MCMC(kernel, num_samples=2000) # warmup_steps, num_chains
mcmc.run(
torch.from_numpy(x.astype(np.float32)), torch.from_numpy(y.astype(np.float32))
)
posterior_samples = mcmc.get_samples()
bayes_est_params = posterior_samples['bR'].mean(axis=0).numpy().T
```

Any help is appreciated, thanks!