Calculating log_likelihood for model with scan

julianstastny · August 7, 2021, 11:31am

Yeah, treating y_i as a latent variable is what I want to do. But I can’t figure out how to do it in practice with scan. Maybe my issue becomes clearer if I say what I tried:

Given some model

def model(X, y=None):
    # Model where y_i depends on x_i and y_{i-1}
    ...
    def transition(y_prev, data_curr):
        x_curr, y_curr = data_curr
        probs = foo(y_prev, x_curr)
        obs = numpyro.sample(
            "y", dist.Bernoulli(probs=probs), obs=y_curr
        )
        return obs, (obs)
    _, (obs) = scan(transition, (init_y), (X, y), length=len(y))

My first idea was to fit the model with y[i] = None for the unobserved entries. As far as I can tell, y then can’t be a NumPy array and has to be a Python list. But it seems that neither NumPyro nor scan can handle y being a Python list.

I then thought I could keep y as a NumPy array but set y[i] = -1 for the unobserved entries, and add control flow inside scan that essentially says y_curr = None if y_curr == -1. But it seems that regular Python control flow isn’t allowed inside scan, and

y_curr = cond(
    y_curr != -1,
    lambda _: numpyro.deterministic('y_curr', y_curr),
    lambda _: numpyro.deterministic('y_curr', None),
    None,
)

also doesn’t seem to work.
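A likely reason the cond attempt fails is that jax.lax.cond requires both branches to return outputs of the same type and structure, so one branch cannot return an array while the other returns None. A plain Python `if y_curr == -1` fails for a related reason: inside scan, y_curr is a traced value with no concrete truth value. The sketch below shows the usual workaround in plain JAX: compute both candidates and pick one with jnp.where. It only forward-simulates the chain (observed y_i are kept, entries marked -1 are drawn). It is not NumPyro inference and not a full answer to the question. The names `MISSING` and `fill_missing`, and the body of `foo`, are hypothetical stand-ins, since the post does not define `foo`.

```python
import jax
import jax.numpy as jnp

MISSING = -1  # sentinel from the post: y_i was not observed


def foo(y_prev, x_curr):
    # Hypothetical stand-in for the post's undefined `foo`: maps the
    # previous outcome and the current covariate to a probability.
    return jax.nn.sigmoid(x_curr + 2.0 * y_prev - 1.0)


def fill_missing(key, X, y, init_y=0):
    """Forward-simulate the chain, keeping observed y_i and drawing the rest."""
    keys = jax.random.split(key, X.shape[0])  # one PRNG key per time step

    def transition(y_prev, data_curr):
        x_curr, y_curr, key_curr = data_curr
        probs = foo(y_prev, x_curr)
        draw = jax.random.bernoulli(key_curr, probs).astype(y.dtype)
        # Both candidates are arrays of the same dtype, so jnp.where can
        # select between them without any Python branching on y_curr.
        y_next = jnp.where(y_curr == MISSING, draw, y_curr)
        return y_next, y_next

    init = jnp.asarray(init_y, dtype=y.dtype)
    _, ys = jax.lax.scan(transition, init, (X, y, keys))
    return ys


X = jnp.array([0.5, -1.0, 2.0, 0.0, 1.5])
y = jnp.array([1, MISSING, 0, MISSING, 1])
filled = fill_missing(jax.random.PRNGKey(0), X, y)
```

Here `filled` has the same shape as `y`, with observed entries unchanged and each -1 replaced by a 0/1 draw that feeds into the next step. Treating the unobserved y_i as latent variables during inference would still need NumPyro machinery for discrete latent sites, which this sketch does not cover.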

I guess there ought to be a simple and embarrassing solution, but I can’t think of any.

Maybe I also don’t understand what you mean by putting an improper prior on y_i.

[Edit: Changed example model to reflect the issue outlined in the comment below]