Forecasting with a stochastic volatility model in numpyro

clgarciga · October 17, 2021, 6:46pm

Hello,

I’m trying to forecast with the following model. First I tried:

def scan_fn(carry, error):
    sigma, coeff_prev = carry
    coeff = coeff_prev + sigma*error
    coeff_prev = coeff
    
    return (sigma, coeff_prev), coeff

def weights_regression_sv(X=None, y=None):
    
    T, K = X.shape
    
    beta_unweighted   = numpyro.sample('beta_unweighted', dist.HalfNormal(jnp.ones(K)))
    beta              = numpyro.deterministic("beta",  beta_unweighted/jnp.sum(beta_unweighted))
    
    # SV
    lambda_errors      = numpyro.sample('lambda_errors', dist.Normal(0.,jnp.ones(T)))
    log_lambda_0       = numpyro.sample('log_lambda_0', dist.Normal(0.,4))
    phi                = numpyro.sample('phi', dist.InverseGamma(5,.15))
    __, log_lambda_t   = scan(scan_fn, (phi, log_lambda_0), lambda_errors, T)
    log_lambdas        = numpyro.deterministic("log_lambdas",  log_lambda_t)
    lambdas            = numpyro.deterministic("lambdas",  jnp.exp(log_lambda_t))
    
    return numpyro.sample("y",dist.Normal(jnp.matmul(X,beta),jnp.sqrt(lambdas)),obs=y)

and used

predictive = Predictive(weights_regression_sv, posterior_samples=m_sv.get_samples())
y_pred = predictive(random.PRNGKey(0), X=X_[[-1],:])["y"]
az.plot_kde(100*y_pred)
plt.show();

but got the error:

Then I tried:

def scan_fn(carry, t):
    phi, log_lambda_prev, xbeta, lambda_errors = carry
    
    log_lambda = log_lambda_prev + phi*lambda_errors[t]
    lambda_ = jnp.exp(log_lambda)
    
    y_ = numpyro.sample("y", dist.Normal(xbeta[t],jnp.sqrt(lambda_)))
    
    log_lambda_prev = log_lambda
    return (phi, log_lambda_prev, xbeta, lambda_errors), y_

def weights_regression_sv_fcast(X=None, y=None, forecast=None):
    
    T, K = X.shape
    
    if forecast:
        T_ = T+1
    else:
        T_ = T
    
    beta_unweighted   = numpyro.sample('beta_unweighted', dist.HalfNormal(jnp.ones(K)))
    beta              = numpyro.deterministic("beta",  beta_unweighted/jnp.sum(beta_unweighted))
    
    # SV
    lambda_errors      = numpyro.sample('lambda_errors', dist.Normal(0.,jnp.ones(T_)))
    log_lambda_0       = numpyro.sample('log_lambda_0', dist.Normal(0.,4))
    phi                = numpyro.sample('phi', dist.InverseGamma(5,.15))
    
    with numpyro.handlers.condition(data={"y": y}):
        xbeta = jnp.matmul(X,beta)
        __, ys   = scan(scan_fn, (phi, log_lambda_0, xbeta, lambda_errors), jnp.arange(0,T_))
    if forecast:
        numpyro.deterministic("y_forecast", ys[-1])

But got this error during model estimation:

I was working off Time Series Forecasting — NumPyro documentation, but this doesn’t really seem applicable given the use of X variables? Also I looked at Forecasting with Dynamic Linear Model (DLM) — Pyro Tutorials 1.7.0 documentation but this also doesn’t seem applicable given the absence of the Forecast module in numpyro?

clgarciga · October 18, 2021, 3:53am

Update:

This worked:

def scan_fn(carry, error):
    sigma, coeff_prev = carry
    coeff = coeff_prev + sigma*error
    coeff_prev = coeff
    
    return (sigma, coeff_prev), coeff

def weights_regression_sv_fcast(X=None, y=None, forecast=0):
    
    T, K = X.shape
    
    beta_unweighted   = numpyro.sample('beta_unweighted', dist.HalfNormal(jnp.ones(K)))
    beta              = numpyro.deterministic("beta",  beta_unweighted/jnp.sum(beta_unweighted))
    
    # SV
    lambda_errors      = numpyro.sample('lambda_errors', dist.Normal(0.,jnp.ones(T)))
    log_lambda_0       = numpyro.sample('log_lambda_0', dist.Normal(0.,4))
    phi                = numpyro.sample('phi', dist.InverseGamma(5,.15))
    __, log_lambda_t   = scan(scan_fn, (phi, log_lambda_0), lambda_errors, T)
    log_lambdas        = numpyro.deterministic("log_lambdas",  log_lambda_t)
    lambdas            = numpyro.deterministic("lambdas",  jnp.exp(log_lambda_t))
    
    T_END = T-forecast
    numpyro.sample("y_obs",dist.Normal(jnp.matmul(X[:T_END],beta),jnp.sqrt(lambdas[:T_END])),obs=y)
    numpyro.sample("y_pred",dist.Normal(jnp.matmul(X[T_END:],beta),jnp.sqrt(lambdas[T_END:])),obs=None)

Is getting forecasts this way meaningfully different from getting them using

predictive = Predictive(weights_regression_sv, posterior_samples=m_sv.get_samples())
y_pred = predictive(random.PRNGKey(0), X=X_[[-1],:])["y"]

?

P.S.: Thank you to Using lax.scan for time series in NumPyro - numpyro - Pyro Discussion Forum

fehiepsi · October 18, 2021, 12:17pm

I think you lambda_errors is a time-local variable (i.e. a latent variable that depends on time). The posterior will give you the samples of lambda_errors for all of your time steps. You need to resolve that logic in your prediction, where you seem to only need samples from the last step.

clgarciga · October 18, 2021, 4:01pm

Thanks! Could you elaborate on what you mean when you say I need to resolve that logic?

fehiepsi · October 18, 2021, 6:03pm

In the model, you defined T, K = X.shape. For prediction, T=1, so your latent variable

lambda_errors = numpyro.sample('lambda_errors', dist.Normal(0.,jnp.ones(T)))

needs to have shape (1,). But your posterior gives lambda_errors with shape T != 1 because you were running MCMC on the data X with X.shape[0] != 1.

For time-local variables, it is better to move them to scan body_fn, like in gaussian_hmm example. But I’m not sure what you want to predict, so the first step might be to reconsider how you construct your forecasting model.

clgarciga · October 19, 2021, 12:49am

Mmmm, sorry, I’m still confused. Let me reset and try this:

My model is:

y_t = \beta x_t + \lambda_t^{0.5} \epsilon_t where \epsilon_t \sim N(0,1)
\log(\lambda_t) = \log(\lambda_{t-1}) + \phi v_t where v_t \sim N(0,1)

So it follows (I think) that to forecast y_{T+1} given x_{T+1}, I would need (\lambda_1,...,\lambda_T, \lambda_{T+1}). I implement the entire model as follows (note that lambda_errors=v_{1:T}):

def scan_fn(carry, error):
    sigma, coeff_prev = carry
    coeff = coeff_prev + sigma*error
    coeff_prev = coeff
    
    return (sigma, coeff_prev), coeff

def weights_regression_sv_fcast(X=None, y=None, X_forecast=None):
    
    T, K = X.shape
    
    if X_forecast is not None:
        T_ = T+X_forecast.shape[0]
    else:
        T_ = T
    
    beta_unweighted   = numpyro.sample('beta_unweighted', dist.HalfNormal(jnp.ones(K)))
    beta              = numpyro.deterministic("beta",  beta_unweighted/jnp.sum(beta_unweighted))
    
    # SV
    lambda_errors      = numpyro.sample('lambda_errors', dist.Normal(0.,jnp.ones(T_)))
    log_lambda_0       = numpyro.sample('log_lambda_0', dist.Normal(0.,4))
    phi                = numpyro.sample('phi', dist.InverseGamma(5,.15))
    __, log_lambda_t   = scan(scan_fn, (phi, log_lambda_0), lambda_errors, T_)
    log_lambdas        = numpyro.deterministic("log_lambdas",  log_lambda_t)
    lambdas            = numpyro.deterministic("lambdas",  jnp.exp(log_lambda_t))
    
    numpyro.sample("y_obs",dist.Normal(jnp.matmul(X,beta),jnp.sqrt(lambdas[:T])),obs=y)
    if X_forecast is not None:
        numpyro.sample("y_pred",dist.Normal(jnp.matmul(X_forecast,beta),jnp.sqrt(lambdas[T:])),obs=None)

My goal is to do 1-step-ahead forecasts but this code should work for h-step-ahead for any positive integer h. But, let’s assume I just want a 1-step-ahead. So, I need X (containing x_1,...,x_T), y containing y_1,...,y_T, and then an x_{T+1}. Given that, for a run of the code above, I should get (\lambda_1,...,\lambda_{T+1}) from:

    # SV
    lambda_errors      = numpyro.sample('lambda_errors', dist.Normal(0.,jnp.ones(T_)))
    log_lambda_0       = numpyro.sample('log_lambda_0', dist.Normal(0.,4))
    phi                = numpyro.sample('phi', dist.InverseGamma(5,.15))
    __, log_lambda_t   = scan(scan_fn, (phi, log_lambda_0), lambda_errors, T_)
    log_lambdas        = numpyro.deterministic("log_lambdas",  log_lambda_t)
    lambdas            = numpyro.deterministic("lambdas",  jnp.exp(log_lambda_t))

And then I plug in (\lambda_1,...,\lambda_{T}) into the likelihood and save \lambda_{T+1} to generate my forecast in:

numpyro.sample("y_pred",dist.Normal(jnp.matmul(X_forecast,beta),jnp.sqrt(lambdas[T:])),obs=None)

For example:

X_train = X_[:(len(X_)-1)]
y_train = y_[:(len(X_)-1)]
X_test = X_[[-1]]
m_sv_fcast.run(random.PRNGKey(0),X=X_train,y=y_train,X_forecast=X_test)
m_sv_fcast.print_summary(0.9,exclude_deterministic=False)

So my questions would be:

Does this look right?
If it does look right, what’s the difference between this and excluding the “y_pred” line, calculating only T lambdas, and plugging in the X_forecast into a call to predictive
If it looks wrong, then (how) can I fix it?
And, to your earlier comment, is it better to just sample v_t \sim N(0,1) in the scan body_fn?

fehiepsi · October 19, 2021, 2:02am

The point is your lambda error has length T but when it is used for prediction, it has length T_.

To fix the issue, you should move lambda error sample statement to scan function, like in the gaussian hmm example. Otherwise, you need to sample something like lambda_error_of_forecast, then concatenate with lambda_error_posterior_samples. Both are equivalent in my opinion.

clgarciga · October 19, 2021, 4:23am

I think I understand? How’s this:

def scan_fn(carry, error):
    sigma, coeff_prev = carry
    coeff = coeff_prev + sigma*error
    coeff_prev = coeff
    
    return (sigma, coeff_prev), coeff

def weights_regression_sv_fcast_3(X=None, y=None, X_forecast=None):
    
    T, K = X.shape
    
    # Beta
    beta_unweighted   = numpyro.sample('beta_unweighted', dist.HalfNormal(jnp.ones(K)))
    beta              = numpyro.deterministic("beta",  beta_unweighted/jnp.sum(beta_unweighted))
    
    # SV
    lambda_errors      = numpyro.sample('lambda_errors', dist.Normal(0.,jnp.ones(T)))
    log_lambda_0       = numpyro.sample('log_lambda_0', dist.Normal(0.,4))
    phi                = numpyro.sample('phi', dist.InverseGamma(5,.15))
    __, log_lambda_t   = scan(scan_fn, (phi, log_lambda_0), lambda_errors, T)
    log_lambdas        = numpyro.deterministic("log_lambdas",  log_lambda_t)
    lambdas            = numpyro.deterministic("lambdas",  jnp.exp(log_lambda_t))
    
    # Likelihood
    numpyro.sample("y_obs",dist.Normal(jnp.matmul(X,beta),jnp.sqrt(lambdas)),obs=y)
    
    # Forecasts
    if X_forecast is not None:
        T_f               = X_forecast.shape[0]
        lambda_errors_f   = numpyro.sample('lambda_errors_f', dist.Normal(0.,jnp.ones(T_f)))
        __, log_lambdas_f = scan(scan_fn, (phi, log_lambdas[-1]), lambda_errors_f, T)
        lambdas_f         = numpyro.deterministic("lambdas_f",  jnp.exp(log_lambdas_f))
        numpyro.sample("y_pred",dist.Normal(jnp.matmul(X_forecast,beta),jnp.sqrt(lambdas_f)),obs=None)

I think part of my confusion was that for numpyro there aren’t any examples like this; the forecasting examples rely on a single data series y from which you can generate iterative forecasts, like you can always do \hat{y}_{T+1} = \beta y_{T} + \epsilon_T using the posterior and then keep iterating forward to get \hat{y}_{T+2} =\beta \hat{y}_{T+1}+ \epsilon_T and so on.

(Assuming this is right, would it be useful example to contribute? I’ll connect it to a paper.)

fehiepsi · October 19, 2021, 12:52pm

Yeah, having an example would be great. But I would advocate moving all time-local latent sample statement to scan body function. That’s what scan primitive used for. Otherwise, you can just use jax.lax.scan, which will require more engineering hack (like drawing a different latent site, then concatenating at correct dimensions) when dealing with more complicated models. Why not just (using the same pattern as in gaussian hmm example):

def scan_fn(carry, x):
    del x
    error = numpyro.sample('lambda_error', dist.Normal(0, 1))
    ...
    return (sigma, coeff_prev), coeff

clgarciga · October 19, 2021, 3:10pm

I understand what you’re saying…I was trying to avoid it because I didn’t know what argument to use for xs in that case. But yeah, I guess I can just delete it. Or, would the GaussianRandomWalk dist work in this case maybe?

fehiepsi · October 19, 2021, 3:59pm

I think so (see stochastic volatility example). Its computation is vectorized, so the sampling might be faster. For forecasting, you will need to draw an additional GaussianRandomWalk like your current approach.

clgarciga · October 19, 2021, 5:02pm

Hmmm, actually I don’t think it’ll work because GaussianRandomWalk doesn’t except an initial value. Sounds like an Github issue…

fehiepsi · October 19, 2021, 11:37pm

You can just add the init value to the gaussian random walk outputs. By definition, “random walk” pattern should not depend on the init value.

clgarciga · October 20, 2021, 12:10am

You can just add the init value to the gaussian random walk outputs.

You’re right. Whoops!

By definition, “random walk” pattern should not depend on the init value.

True, but papers I’ve seen usually make a big deal about putting a prior on the initial value. I’m sure if it matters or not, but others seem to think so.

clgarciga · October 20, 2021, 12:25am

Voila:

def weights_regression_sv_fcast_2(X=None, y=None, X_forecast=None):
    
    T, K = X.shape
    
    # Beta
    beta_unweighted   = numpyro.sample('beta_unweighted', dist.HalfNormal(jnp.ones(K)))
    beta              = numpyro.deterministic("beta",  beta_unweighted/jnp.sum(beta_unweighted))
    
    # SV
    log_lambda_0           = numpyro.sample('log_lambda_0', dist.Normal(0.,4))
    phi                    = numpyro.sample('phi', dist.InverseGamma(5,.15))
    log_lambdas_ex_init    = numpyro.sample("log_lambdas_ex_init",  dist.GaussianRandomWalk(scale=phi,num_steps=T))
    log_lambdas            = numpyro.deterministic("log_lambdas",  log_lambdas_ex_init+log_lambda_0)
    lambdas                = numpyro.deterministic("lambdas",  jnp.exp(log_lambdas))
    
    # Likelihood
    numpyro.sample("y_obs",dist.Normal(jnp.matmul(X,beta),jnp.sqrt(lambdas)),obs=y)
    
    # Forecasts
    if X_forecast is not None:
        T_f                    = X_forecast.shape[0]
        log_lambdas_ex_init_f  = numpyro.sample("log_lambdas_ex_init_f",  dist.GaussianRandomWalk(scale=phi,num_steps=T_f))
        log_lambdas_f          = numpyro.deterministic("log_lambdas_f",  jnp.exp(log_lambdas_ex_init_f+log_lambdas[-1]))
        lambdas_f              = numpyro.deterministic("lambdas_f",  jnp.exp(log_lambdas_f))

        numpyro.sample("y_pred",dist.Normal(jnp.matmul(X_forecast,beta),jnp.sqrt(lambdas_f)),obs=None)

Thanks for bearing with me

fehiepsi · October 20, 2021, 12:59am

Glad it works! The prior for init value is used because there is no guarantee that the random walk starts at zero. I should put “pattern” in a quote.

djchester · March 26, 2024, 6:46pm

Hello! I just wanted to add a link here for a new, but very-much related, topic if anyone in this thread might be interested. I could be wrong, but I linked a google colab notebook (in my post) that displays scan as providing backward smoothing behavior in the parameters through time. This is of course OK, if this of interest, but raises issues of look-ahead bias for forecasts… any and all feedback appreciated!