Predicting from regression model that depends on T

clgarciga · December 14, 2021, 7:02pm

Hello,

I’m trying to generate predictions from the following model:

import jax.numpy as jnp
import numpyro
import numpyro.distributions as dist

def BQR(tau=0.5, a=5, b=.04, sigma_beta_a=3, sigma_beta_b=1, X=None, y=None):
    T, K = X.shape
    
    # Deterministic
    theta            = (1-2*tau)/(tau*(1-tau))
    tau_star_squared = 2/(tau*(1-tau))
    
    # Non-Beta Priors
    sigma = numpyro.sample('sigma', dist.InverseGamma(a,1/b))
    z     = numpyro.sample('z', dist.Exponential((1/sigma)*jnp.ones(T)))
    
    # Beta Priors
    sigma_beta     = numpyro.sample('sigma_beta', dist.InverseGamma(sigma_beta_a,sigma_beta_b))
    unscaled_betas = numpyro.sample("unscaled_betas", dist.Normal(0., jnp.ones(K)))
    beta           = numpyro.deterministic("beta",  sigma_beta * unscaled_betas)
    beta0          = numpyro.sample('beta0', dist.Normal(0., 1.))

    # Likelihood
    mean_function   = beta0+jnp.matmul(X,beta)+theta*z
    stddev_function = jnp.sqrt(tau_star_squared*sigma*z)
    numpyro.sample("y",dist.Normal(mean_function,stddev_function),obs=y)

I use the following code (where model is BQR from above):

from jax import random
from numpyro.infer import MCMC, NUTS, Predictive

mcmc_obj = MCMC(NUTS(BQR,
              max_tree_depth=20,
              target_accept_prob=0.99),
          num_warmup=5000,
          num_samples=10000,
          num_chains=1,
          progress_bar=False)

def forecast_loop_w_predictive(X, y, mcmc_obj, model, fTs):
    # Collect one array of predictive draws per (t, quantile) pair
    forecasts = {}

    for f in fTs:
        t, quantile = f
        # Training Data up to and including t
        y_train, X_train = y.copy().loc[:t].to_numpy(), X.copy().loc[:t].to_numpy()
        # Testing Data for t+1
        y_test, X_test   = y.copy().loc[t:].iloc[1]   , X.copy().loc[t:].iloc[[1]].to_numpy()
        print("Size of X_train =",X_train.shape)

        # Run model
        mcmc_obj.run(random.PRNGKey(0),tau=quantile, X=X_train,y=y_train)

        # Forecast (pass tau explicitly; otherwise the default tau=0.5 is used)
        predictive     = Predictive(model, posterior_samples=mcmc_obj.get_samples())
        y_pred_samples = predictive(random.PRNGKey(0), tau=quantile, X=X_test)["y"]
        forecasts[(t, quantile)] = y_pred_samples

    # Return after the loop so that every entry of fTs is forecast
    return forecasts

Here t is a date, K is the number of independent variables, and T is the number of time periods. X_train is T x K, X_test is 1 x K, and y_test is a single value. I expect y_pred_samples to be num_samples x 1, but instead it is num_samples x T.

I think the issue is that BQR depends on T: even though X_test is 1 x K, the model still “remembers” T from X_train.
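
The shape behaviour can be reproduced without NumPyro. The sketch below is plain NumPy with made-up sizes and random stand-ins for the posterior draws (none of these values come from the actual model fit). It only illustrates how a latent of length T, added to a prediction for a single row, broadcasts the result back to T columns:

```python
import numpy as np

rng = np.random.default_rng(0)
num_samples, T, K = 1000, 50, 3   # hypothetical sizes

# Random stand-ins for posterior draws (shapes only, not real MCMC output)
beta0 = rng.normal(size=(num_samples,))
beta = rng.normal(size=(num_samples, K))
z = rng.exponential(size=(num_samples, T))   # one z per training period

X_test = rng.normal(size=(1, K))             # a single test row
theta = 0.0                                  # tau = 0.5 gives theta = 0

# Per-draw linear predictor for the test row: shape (num_samples, 1)
linear = beta0[:, None] + beta @ X_test.T

# Adding the training-length z broadcasts the result to (num_samples, T)
mean_function = linear + theta * z
print(linear.shape, mean_function.shape)     # (1000, 1) (1000, 50)
```

So the T columns would come from the stored z draws rather than from X_test itself.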

My question is: how can I modify my BQR model so that calling Predictive returns a num_samples x 1 array of predictions for y?

Thank you in advance.

fehiepsi · December 14, 2021, 10:45pm

Your latent z has shape T, so MCMC will give you z samples of shape (num_samples, T). What shape do you expect z to have when you make a prediction?
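
One possible reading of the model is that each period has its own z_t, so a new period would need a fresh z drawn from Exponential(1/sigma) for each posterior draw of sigma, rather than any of the T training values. Below is a minimal NumPy sketch of that idea. The sizes are hypothetical and the "posterior" draws are random stand-ins, so it shows shapes only, not a fitted forecast. In NumPyro, the analogous step would presumably be to drop "z" from the samples dict before passing it to Predictive so that the site is sampled again at test time; treat that as an assumption to check against the Predictive documentation.

```python
import numpy as np

rng = np.random.default_rng(1)
num_samples, K = 1000, 3          # hypothetical sizes
tau = 0.5
theta = (1 - 2 * tau) / (tau * (1 - tau))
tau_star_squared = 2 / (tau * (1 - tau))

# Random positive/real stand-ins for posterior draws of the global parameters
sigma = 1.0 / rng.gamma(shape=5.0, scale=1.0, size=num_samples)
beta0 = rng.normal(size=num_samples)
beta = rng.normal(size=(num_samples, K))
X_test = rng.normal(size=(1, K))

# Fresh latent for the new period: Exponential with rate 1/sigma (mean sigma)
z_new = rng.exponential(scale=sigma)[:, None]            # (num_samples, 1)

# Same mean and standard deviation formulas as in the model's likelihood
mean = beta0[:, None] + beta @ X_test.T + theta * z_new
std = np.sqrt(tau_star_squared * sigma[:, None] * z_new)
y_pred = rng.normal(mean, std)                           # (num_samples, 1)
print(y_pred.shape)
```

With one fresh z per posterior draw, every term has shape (num_samples, 1), which matches the shape asked for in the question.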