I'm trying to run a regression in which the data quality changes as a function of time: early on, only a few of my contexts have enough data to be used, while by the end all of them do. Is there a simple way to mask out some of the regression variables in a vector and selectively exclude them from the regression? Here is some sample code:
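
    import jax.numpy
    import numpy
    import numpyro
    import numpyro.distributions as dist

    def model_regression_test(goals, contexts, sigma):
        dim = contexts.shape[1]
        # Flag the context columns that have at least 10 nonzero observations.
        mask = [True for i in range(dim)]
        for i in range(dim):
            if numpy.count_nonzero(contexts[:, i]) < 10:
                mask[i] = False
        # mask is computed but not yet applied; this is the part I am asking about.
        assert sigma.shape == (dim, dim), (sigma.shape, dim)
        beta = numpyro.sample("beta", dist.MultivariateNormal(loc=jax.numpy.zeros(dim), precision_matrix=sigma))
        with numpyro.plate("samples", len(goals)):
            # Note: the (dim, dim) sigma is reused here as the Normal scale.
            numpyro.sample("goals", dist.Normal(jax.numpy.dot(contexts, beta), sigma),
                           obs=goals)
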
Clearly, I could change the dimensionality of the regression myself outside the call and keep track of the variable indices, but I was hoping there might be a more elegant way of doing this. Many thanks.
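
To make the idea concrete, here is a minimal plain-NumPy sketch of the kind of masking I have in mind. It is not tied to NumPyro, and the names `column_mask`, `masked_predictor`, and the `min_nonzero` argument are made up for illustration. It zeroes the coefficients of sparse columns instead of dropping the columns, which should give the same linear predictor as reducing the dimensionality by hand:

```python
import numpy as np


def column_mask(contexts, min_nonzero=10):
    """Boolean mask marking columns with at least `min_nonzero` nonzero entries.

    Hypothetical helper; the threshold of 10 mirrors the loop in my model.
    """
    return np.count_nonzero(contexts, axis=0) >= min_nonzero


def masked_predictor(contexts, beta, mask):
    """Linear predictor in which masked-out coefficients contribute nothing."""
    # Multiplying by the mask zeroes the excluded coefficients while keeping
    # beta at its full length, so variable indices never need to be remapped.
    return contexts @ (beta * mask)


# Toy data: column 1 has only 3 nonzero entries, so it falls below the threshold.
rng = np.random.default_rng(0)
contexts = rng.normal(size=(50, 3))
contexts[3:, 1] = 0.0
beta = np.array([0.5, -2.0, 1.5])

mask = column_mask(contexts)
full = masked_predictor(contexts, beta, mask)

# Same result as dropping the sparse column and its coefficient by hand.
reduced = contexts[:, mask] @ beta[mask]
```

I assume the same elementwise multiplication would work inside the model with `jax.numpy.dot(contexts, beta * mask)`, in which case the masked coefficients would still be sampled but would not be informed by the data. I have not verified that this is the recommended approach.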