Problems with the PSIS diagnostic for testing SVI

I want to check the convergence of SVI. I found the PSIS diagnostic but don't know how to use it. SVI itself runs correctly before I call psis_diagnostic. Does anyone know of other methods to test SVI?

from pyro.infer.importance import psis_diagnostic
k = psis_diagnostic(model=model, guide=guide)
print(k)

The errors are as follows:
TypeError: forward() missing 1 required positional argument: 'x'

Hi @everli,
It would help to see your model and guide (or sketches of them). Without seeing them in detail, I'm guessing you have a model(x, y) and guide(x, y)? In that case, I believe you'll need to pass x and y to psis_diagnostic, as in

k = psis_diagnostic(model, guide, x, y)
print(k)
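
For reference, psis_diagnostic returns the fitted shape parameter k of a generalized Pareto distribution over the importance weights. The usual rule of thumb (from Yao et al. 2018, "Yes, but Did It Work?", which the diagnostic is based on) is that k < 0.5 means the guide is a good approximation to the posterior, 0.5 <= k <= 0.7 means it is imperfect but may still be usable, and k > 0.7 means it is likely unreliable. A minimal sketch of acting on the value (the thresholds come from the paper, not from anything Pyro enforces):

k = psis_diagnostic(model, guide, x, y)
if k < 0.5:
    print("k = %.3f: guide looks like a good posterior approximation" % k)
elif k <= 0.7:
    print("k = %.3f: guide is imperfect but may still be usable" % k)
else:
    print("k = %.3f: guide is likely a poor posterior approximation" % k)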

If that doesn't work, could you please provide more information on your model and guide?

Hi @fritzo,
I used Pyro's example code (the Bayesian regression tutorial). The code is as follows:

import os
from functools import partial
import torch
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

import pyro
import pyro.distributions as dist
import timeit
start=timeit.default_timer()

# for CI testing
smoke_test = ('CI' in os.environ)
#assert pyro.__version__.startswith('1.2.1')
pyro.enable_validation(True)
pyro.set_rng_seed(1)
# Set matplotlib settings

plt.style.use('default')

DATA_URL = "https://d2hg8soec8ck9v.cloudfront.net/datasets/rugged_data.csv"
data = pd.read_csv(DATA_URL, encoding="ISO-8859-1")
df = data[["cont_africa", "rugged", "rgdppc_2000"]]
df = df[np.isfinite(df.rgdppc_2000)]
df["rgdppc_2000"] = np.log(df["rgdppc_2000"])

fig, ax = plt.subplots(nrows=1, ncols=2, figsize=(12, 6), sharey=True)
african_nations = df[df["cont_africa"] == 1]
non_african_nations = df[df["cont_africa"] == 0]
 
from torch import nn
from pyro.nn import PyroModule

assert issubclass(PyroModule[nn.Linear], nn.Linear)
assert issubclass(PyroModule[nn.Linear], PyroModule)

from pyro.nn import PyroSample

class BayesianRegression(PyroModule):
    def __init__(self, in_features, out_features):
        super().__init__()
        self.linear = PyroModule[nn.Linear](in_features, out_features)
        #ww=dist.Normal(0., 1.).expand([out_features, in_features]).to_event(2)
        self.linear.weight = PyroSample(dist.Normal(0., 1.).expand([out_features, in_features]).to_event(2))
        self.linear.bias = PyroSample(dist.Normal(0., 5.).expand([out_features]).to_event(1))

    def forward(self, x, y=None):         
        sigma = pyro.sample("sigma", dist.HalfNormal(1.))
        mean = self.linear(x).squeeze(-1)
        with pyro.plate("data", x.shape[0]):
            obs = pyro.sample("obs", dist.Normal(mean, sigma), obs=y)
        return mean

from pyro.infer.autoguide import AutoDiagonalNormal

# Dataset: Add a feature to capture the interaction between "cont_africa" and "rugged"
df["cont_africa_x_rugged"] = df["cont_africa"] * df["rugged"]
data = torch.tensor(df[["cont_africa", "rugged", "cont_africa_x_rugged", "rgdppc_2000"]].values,
                        dtype=torch.float)
x_data, y_data = data[:, :-1], data[:, -1]

# Regression model
#linear_reg_model = PyroModule[nn.Linear](3, 1)

model = BayesianRegression(3, 1)
guide = AutoDiagonalNormal(model)

#########################
#SVI
from pyro.distributions import Normal
from pyro.optim import Adam
from pyro.infer import SVI, Predictive
from pyro.infer import Trace_ELBO, TraceEnum_ELBO, config_enumerate
from pyro.infer.importance import psis_diagnostic


optim = Adam({"lr": 0.01})

svi = SVI(model, guide, optim, loss=Trace_ELBO())
loss = 0


pyro.clear_param_store()

# Train the model
for iteration in range(2000):

    loss = svi.step(x_data, y_data)
    if iteration % 100 == 0:
        print("Iteration: %d, loss: %1.5f" % (iteration, loss))

k = psis_diagnostic(model, guide, x_data, y_data)

The error:

RuntimeError: t() expects a tensor with <= 2 dimensions, but self is 10D
           Trace Shapes:
            Param Sites:
           Sample Sites:
num_particles_vectorized  dist                   |
                         value 1000              |
                   sigma  dist 1000 1 1 1 1 1 1 1 |
                         value 1000 1 1 1 1 1 1 1 |
           linear.weight  dist 1000 1 1 1 1 1 1 1 | 1 3
                         value 1000 1 1 1 1 1 1 1 | 1 3
             linear.bias  dist 1000 1 1 1 1 1 1 1 | 1
                         value 1000 1 1 1 1 1 1 1 | 1

Hi, have you found the problem?

I think the trouble here is that nn.Linear expects the weight matrix to be 2-dimensional (it computes ret = torch.addmm(bias, input, weight.t())), but psis_diagnostic draws large batches of samples, so the weight tensor becomes 3-or-more-dimensional. Basically, you need to replace nn.Linear with a custom module that can handle "batched" weights; see the sketch below.
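
For concreteness, here is a minimal, untested sketch of that idea. The helper batched_linear is a made-up name for this example (it is not part of PyTorch or Pyro); it replaces nn.Linear's weight.t() with matmul broadcasting, so weights carrying extra leading sample dimensions still work:

import torch

def batched_linear(x, weight, bias):
    # x:      (N, in_features)
    # weight: (*batch, out_features, in_features) -- extra leading dims are fine
    # bias:   (*batch, out_features)
    # returns (*batch, N, out_features); matmul broadcasts over *batch,
    # whereas nn.Linear's weight.t() requires a 2-D weight
    return x.matmul(weight.transpose(-1, -2)) + bias.unsqueeze(-2)

In the model above, you would move the PyroSample weight and bias onto the module itself and replace mean = self.linear(x).squeeze(-1) with something like mean = batched_linear(x, self.weight, self.bias).squeeze(-1); depending on max_plate_nesting, you may still need to line the resulting sample dimensions up with Pyro's plate dimensions.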

Thank you! You are right. nn.Linear is a simple module, so we can replace it by writing a custom one. However, when the model uses nn.GRU or other complex modules, how can this be solved? I am not sure I understand your point. Could you explain it in more detail? Thank you again.

I am not sure how you might solve it in general; PyTorch modules were not designed with this in mind. In any case, variational inference over neural networks with many parameters (like a GRU) generally doesn't work very well.

Ok, I get it. Thanks.