 # State-space model: is lax.scan compatible with numpyro.sample?

I wanted to test out coding a state-space model in numpyro using lax.scan. I’m running into issues that make me suspect this is not supported — or perhaps I’m just getting something else wrong! Here’s my model:

``````def target(T=10, q=1, r=1, phi=0., beta=0.):

def transition(state, i):
x0, mu0 = state
x1 = numpyro.sample(f'x_{i}', dist.Normal(phi*x0, q))
mu1 = beta * mu0 + x1
y1 = numpyro.sample(f'y_{i}', dist.Normal(mu1, r))
return (x1, mu1), (x1, y1)

mu0 = x0 = numpyro.sample('x_0', dist.Normal(0, q))
y0 = numpyro.sample('y_0', dist.Normal(mu0, r))

_, xy = jax.lax.scan(transition, (x0, mu0), np.arange(1, T))
x, y = xy

return np.append(x0, x), np.append(y0, y)
``````

This returns:

``````x [-1.1470195 -0.3285517 -0.3285517 -0.3285517 -0.3285517 -0.3285517
-0.3285517 -0.3285517 -0.3285517 -0.3285517]
y [-2.2391834   0.32762653  0.32762653  0.32762653  0.32762653  0.32762653
0.32762653  0.32762653  0.32762653  0.32762653]
``````

It appears the sample statements within `transition` only generate one random value, which is repeated in each iteration. When I try to use this model within `Predictive`, I get an error:

``````prior = Predictive(target, posterior_samples = {}, num_samples = 10)
prior_samples = prior(PRNGKey(2), T=10)
``````
``````UnexpectedTracerError: Encountered an unexpected tracer. Perhaps this tracer escaped through global state from a previously traced function.
The functions being transformed should not save traced values to global state.
Details: Can't lift level Traced<ShapedArray(float32[]):JaxprTrace(level=1/0)> to JaxprTrace(level=0/0).
``````

I don’t need to get this model running, and understand that I could reparameterize it to generate all of the random variables outside the loop. I’m just wondering about more general state-space models with transitions that are not as easily re-parameterized: can one put sampling statements within a loop that is executed by lax.scan?

Thanks!

Update: I checked that this behaves as expected if I replace `jax.lax.scan` with this reference implementation (slightly edited from the jax docs to handle nested containers):

``````def scan(f, init, xs, length=None):
if xs is None:
xs = [None] * length
carry = init
ys = []
for x in xs:
carry, y = f(carry, x)
y_flat, y_tree = tree_flatten(y)
ys.append(y_flat)
ys_stacked = [np.stack([y[i] for y in ys]) for i in range(len(y_flat))]
return carry, tree_unflatten(y_tree, ys_stacked)
``````

This gives:

``````x [-1.1470195  -0.3285517  -1.1249288   0.7978287   2.298611    1.3821741
-0.71665144 -1.2928588   0.54819393 -0.929248  ]
y [-2.2391834   0.32762653 -0.07294703  0.45133084  0.5909368   2.4635558
0.39577067 -1.1304063   0.41633856 -0.15525198]
``````

So I guess this confirms that numypro.sample and jax.lax.scan are not compatible. (I can certainly understand this given the complexity involved!) Does that seem right? Also, if correct: I find this a very elegant way to write a time-series model — would there be a chance of this being supported in the future?

Hi @sheldon, currently they are not composable like you expected. It seems to be complicated to support the general pattern but something like pyro.infer.reparam can work here. Basically, your program can be reparameterized (which is very helpful for inference algorithms) as

``````x_noise = numpyro.sample('x_noise', dist.Normal(0, 1), sample_shape=(T - 1,))
y_noise = numpyro.sample('y_noise', dist.Normal(0, 1), sample_shape=(T - 1,))
# then use x_noise, y_noise in transition, e.g.
#    x1 = phi * x0 + q * x_noise[i]
``````

What do you think?

In case you want to support those common patterns, could you open an issue in github so that we can follow up this FR after porting `pyro.infer.reparam` to NumPyro? Thanks!

@neerajprad what do you think about having something like `reparam.ScanReparam(transition, length, {"x": LocScaleReparam, "y": LocScaleReparam})`? I just have a vague idea that it will work.

@sheldon - Unfortunately, numpyro primitives like `sample` have side effects which need to be captured by the tracer for the programs to work correctly (in this case `lax.scan` doesn’t have that visibility). Parameterizing as suggested by @fehiepsi will work best. We should highlight these gotchas in our README.

what do you think about having something like `reparam.ScanReparam(transition, length, {"x": LocScaleReparam, "y": LocScaleReparam})` ? I just have a vague idea that it will work.

Do you think that it is a general enough solution? We can open up an issue and discuss if this is a common pattern that we’d like to support. An alternative is to highlight these use cases through specific examples that users can use as templates for their models.

Hi @fehiepsi and @neerajprad. Thanks for the responses! Reparameterization seems like a good solution, at least for many models. As a user, it’s very clear how to do it for this model, but I’m not sure it would be completely obvious to me for a more complex model. If there are routines to reparameterize automatically, when possible, that sounds like a nice feature.

I was actually thinking about this from the perspective of inference algorithms and not any specific model. The nice thing about the lax.scan design pattern is that it explicitly reveals the sequential nature of the model, and it seems like it would be a short leap from a model described in that format to something like SMC for that model. That would seem harder if noise is pulled out of the loop.

Understood! I agree that it is less engineering and the model looks more intuitive if we do reparameterize automatically. Looking like we can support this in the future.

@neerajprad Yes, I think it will work (though complicated) for all reparameterized sites. Something like

``````def scan_reparam(transition, carry, xs):
# inspect latent sites by running the first step of `transition(carry)`
# replace `sample(site, dist)` statements
# by `sample(site, noise_dist, sample_shape=(len(xs),))`,
# and store the results in a dict `site_values`

def new_transition(carry, x_):
i, x = xs_
# use effect handler `reparam_transition_fn` for `transition`
# to make `sample(site, ...)` returns `loc + scale * site_values[site][i]`
noises =  {site: values[i] for site, value in site_values}
return block(reparam_given_noise(transform_fn, noises))(carry, x)

# run scan for the remaining steps (I use the same
# xs and carry here for simplicity)
return lax.scan(new_transition, carry, (np.arange(len(xs)), xs))
``````

While writing the sketch, I feel more confident that it will work. WDYT?

@sheldon - Reparameterization will more generally make it easier for the NUTS sampler to sample from this model due to non-centering and should be much faster.

If you would like to use `lax.scan` in other models where it might not be easy to reparameterize, you will need to pass in the `PRNGKey` explicitly to the scan’s body function, otherwise the same source of randomness will be used each time. This is just a limitation of how numpyro’s effect handlers that carry state like `numpyro.seed` interacts with JAX’s transformations (which need the functions to be deterministic functions of the input parameters). The workaround should be simple - you just need to pass in the rng key explicitly and use that in your sample statements.

``````def target(T=10, q=1, r=1, phi=0., beta=0.):
def transition(state, xs):
i, key = xs
# different keys for the two sample statements
key1, key2 = random.split(key)
x0, mu0 = state
x1 = numpyro.sample(f'x_{i}', dist.Normal(phi * x0, q), rng_key=key1)
mu1 = beta * mu0 + x1
y1 = numpyro.sample(f'y_{i}', dist.Normal(mu1, r), rng_key=key2)
return (x1, mu1), (x1, y1)

mu0 = x0 = numpyro.sample('x_0', dist.Normal(0, q))
y0 = numpyro.sample('y_0', dist.Normal(mu0, r))

# Sample a rng_key and pass it to `scan`
rng_key = numpyro.sample('key', dist.PRNGIdentity())
_, xy = jax.lax.scan(transition, (x0, mu0), (np.arange(1, T), random.split(rng_key, T-1)))
x, y = xy

return np.append(x0, x), np.append(y0, y)
``````

The only change wrt to your snippet is that we are passing and using explicit rng keys in the scanned function. Does that work for your use case?

@fehiepsi - This seems like an interesting utility function. So the basic idea is that if all the distributions inside `transition` are reparameterizable as loc-scale, we should be able to sample from the noise distribution beforehand and make the function a deterministic one? Once we have the reparameterizers in numpyro, this will be an interesting use case. Hi @neerajprad. Yes, this makes complete sense, thanks! I had considered a solution like this where the key is passed around as part of the state:

``````def transition(state, i, q=1., r=1., phi=0.5, beta=0.5):
x0, mu0, key = state
key, subkey1, subkey2 = jax.random.split(key, 3)
x1 = numpyro.sample(f'x_{i}', dist.Normal(phi*x0, q), rng_key=subkey1)
mu1 = beta * mu0 + x1
y1 = numpyro.sample(f'y_{i}', dist.Normal(mu1, r), rng_key=subkey2)
return (x1, mu1, key), (x1, y1)
``````

but I didn’t know how to connect this kind of explicit key handling to the numpyro handlers (i.e. `PRNGIdentity()` distribution and the `rng_key` argument to numpyro.sample). This works perfectly, thanks!

Hi @neerajprad. Oops, with your workaround I can now generate from the model by executing `target` with the `seed` handler. But when I try to use the `Predictive` distribution it fails. I’m not sure if you were expecting that to work. Here’s my code:

``````def target(T=10, q=1., r=1., phi=0.5, beta=0.5):

def transition(state, xs):
i, key = xs
key1, key2 = jax.random.split(key)
x0, mu0 = state
x1 = numpyro.sample(f'x_{i}', dist.Normal(phi * x0, q), rng_key=key1)
mu1 = beta * mu0 + x1
y1 = numpyro.sample(f'y_{i}', dist.Normal(mu1, r), rng_key=key2)
return (x1, mu1), (x1, y1)

mu0 = x0 = numpyro.sample('x_0', dist.Normal(0, q))
y0 = numpyro.sample('y_0', dist.Normal(mu0, r))

key = numpyro.sample('key', dist.PRNGIdentity())
_, xy = jax.lax.scan(transition, (x0, mu0), (np.arange(1, T), jax.random.split(key, T-1)))
x, y = xy

return np.append(x0, x), np.append(y0, y)

prior = Predictive(target, posterior_samples = {}, num_samples = 10)
prior_samples = prior(PRNGKey(2), T=10)
``````

And the result

``````UnexpectedTracerError: Encountered an unexpected tracer. Perhaps this tracer escaped through global state from a previously traced function.
The functions being transformed should not save traced values to global state.
Details: Can't lift level Traced<ShapedArray(float32[]):JaxprTrace(level=1/0)> to JaxprTrace(level=0/0).
``````

Very interesting, I didn’t expect that but this isn’t a pattern that we have used in the past, so I’m happy to see these bugs getting percolated up and getting fixed. I have filed an issue (https://github.com/pyro-ppl/numpyro/issues/566) and will be posting a follow up on that.