Inference on test data - simple Bayesian regression

This is my first Pyro example (apologies if the question is overly simple), and I am having difficulty trying out the model on a fresh test sample.


I am working with NCAA® basketball games played between Division I women’s teams in 2017.


I assume that each team has a latent power score drawn from a N(0, 1) distribution, and I model the score difference between the home and away teams as N(home team power − away team power, 10):

def normal_winner_model(home_index, away_index, score):
    num_teams = teams_2017.shape[0]
    num_games = home_index.shape[0]
    mu = torch.zeros(num_teams, 1)
    sigma = torch.ones(num_teams, 1)
    # latent power score per team, ~ N(0, 1)
    prior_power = pyro.sample("prior_power", Normal(mu, sigma))
    # one-hot masks: maskhome[t, g] == 1 iff team t is the home team in game g
    maskhome = torch.zeros(num_teams, num_games, dtype=torch.float)\
        .scatter_(0, home_index[None, :], 1.)
    maskaway = torch.zeros(num_teams, num_games, dtype=torch.float)\
        .scatter_(0, away_index[None, :], 1.)
    # per-game mean: home team power minus away team power
    score_mu = prior_power.transpose(0, 1).matmul(maskhome - maskaway).squeeze()
    with pyro.plate("data"):
        pyro.sample("score", Normal(score_mu, 10.), obs=score)


I am using an AutoMultivariateNormal guide

guide = AutoMultivariateNormal(normal_winner_model)

and perform SVI with the Adam optimizer.

Testing the model

As I understand it, I need to replay the guide's prior_power sample and feed the frozen model the test data:

preds = []
for _ in range(1000):
    guide_trace = pyro.poutine.trace(guide).get_trace(hloc_test, aloc_test, None)
    # assuming that the original model took in data as (x1, x2, y) where y is observed
    lifted_model = pyro.poutine.replay(normal_winner_model, guide_trace)
    preds.append(lifted_model(hloc_test, aloc_test, None))

However, all the preds are None. What am I missing here?


Hi @noam, I think you are on the right track in using the low-level poutine API for predictions. Let me explain what each line of your code does, so it will be easier to figure out what is missing:


lifted_model = pyro.poutine.replay(normal_winner_model, guide_trace)

rewrites normal_winner_model so that the returned value of each sample statement is taken from the trace guide_trace.

When you call lifted_model with the inputs (hloc_test, aloc_test, None), it runs the stochastic function lifted_model with the replay effect described above. Because there is no return statement in normal_winner_model, the call returns None. That is why all your preds are None.

What we can do instead is capture a trace when applying lifted_model to the testing inputs (I hope it will play nicely with the pyro.plate("data") statement). For example,

pred_trace = poutine.trace(lifted_model).get_trace(hloc_test, aloc_test, None)

Instead of using the low-level API as above, you can use TracePosterior or TracePredictive (but those utilities are being refactored for the next Pyro release).


sorry for the late response…
Thanks, worked like a charm!