MCMC giving strange results for logistic regression example

Hey there, I’m completely new to probabilistic programming, so it may very well be that I made an obvious mistake.

Now to my problem. After reading through the tutorial, Pyro forum and some of my lecture notes, I wanted to implement a logistic regression model based on the bayesian regression tutorial. I finally ended up with the following:

class LogisticRegressionModel(nn.Module):
    def __init__(self, n_features):
        self.linear = nn.Linear(n_features, 1)
        self.sigmoid = nn.Sigmoid()

    def forward(self, x):
        return self.sigmoid(self.linear(x))

def model(x, y):
    n_features = x.shape[-1]
    n_samples = x.shape[-2]
    loc, scale = torch.zeros(1, n_features),  torch.ones(1, n_features)
    bias_loc, bias_scale = torch.zeros(1), 50 * torch.ones(1)
    w_prior = Normal(loc, scale)
    b_prior = Normal(bias_loc, bias_scale)
    priors = {'linear.weight': w_prior, 'linear.bias': b_prior}
    lifted_module = pyro.random_module("module", regression_model, priors)
    lifted_reg_model = lifted_module()
    with pyro.plate("map", n_samples):
        model_logits = lifted_reg_model(x).squeeze(-1)
        pyro.sample("obs", Bernoulli(logits=model_logits, validate_args=True), obs=y)

When I now estimate the latent variables through SVI:

optim = Adam({"lr": 0.01})
svi = SVI(model, guide, optim, loss=Trace_ELBO())
regression_model = LogisticRegressionModel(X_train_t.shape[-1])

num_iterations = 10000
for j in range(num_iterations):
    loss = svi.step(X_train_t, y_train_t)
    if j % (num_iterations / 10) == 0:
        print("[iteration %04d] loss: %.4f" % (j + 1, loss / float(X_train_t.shape[-2])))

With a guide that I found on forum here and an automatically generated guide, I get reasonable results, however still considerably worse, when compared to a regular logistic regression model:

def guide(x, y):
    n_features = 29
    w_loc = torch.zeros(1, n_features)
    w_log_sig = -3.0 * torch.ones(1, n_features) + 0.05 * torch.randn(1, n_features)
    b_loc = torch.zeros(1)
    b_log_sig = -3.0 * torch.ones(1) + 0.05 * torch.randn(1)
    mw_param = pyro.param("guide_mean_weight", w_loc)
    sw_param = softplus(pyro.param("guide_log_scale_weight", w_log_sig))
    mb_param = pyro.param("guide_mean_bias", b_loc)
    sb_param = softplus(pyro.param("guide_log_scale_bias", b_log_sig))
    w_dist = Normal(mw_param, sw_param).independent(1)
    b_dist = Normal(mb_param, sb_param).independent(1)
    dists = {'linear.weight': w_dist, 'linear.bias': b_dist}
    lifted_module = pyro.random_module("module", regression_model, dists)
    return lifted_module()

from pyro.infer.autoguide import AutoDiagonalNormal
guide = AutoDiagonalNormal(model)

y=[guide(None, None)(X_test_t).data.numpy().reshape(-1) for i in range(1000)]
y_pred = np.mean(y,axis=0)
roc_auc_score(y_true=y_test, y_score=y_pred)

I then tried use MCMC sampling using the NUTS kernel:

init_params, potential_fn, transforms, _ = initialize_model(model, model_args=(X_train_t, y_train_t),
nuts_kernel = NUTS(potential_fn=potential_fn)
mcmc = MCMC(nuts_kernel,
            transforms=transforms), y_train_t)

y=predictive(model, mcmc.get_samples(1), X_test_t, None)
y_pred = y["obs"].mean(axis=0).numpy()

roc_auc_score(y_true=y_test, y_score=y_pred)

When with SVI I got receiver operating characteristic AUC values around .75, the MCMC sampling leads to estimates around .5, thus random. A simple sklearn l2 penalized logistic regression on the same data reaches .84. Now first, there must be a problem somewhere in the MCMC sampling example but I have no clue what it is, could anyone point out the mistake? Second, is it to be expected that a bayesian model has considerably worse performance than its regular frequentist counterpart?

As a final note, a small bug I encountered when initializing multiple chains, it throws the following exception when run in jupyter notebook. I suppose it somehow fails to display the progress bar in this case:

Exception in thread Thread-5:
Traceback (most recent call last):
  File "/home/test/anaconda3/lib/python3.7/", line 917, in _bootstrap_inner
  File "/home/test/anaconda3/lib/python3.7/", line 865, in run
    self._target(*self._args, **self._kwargs)
  File "/home/test/anaconda3/lib/python3.7/site-packages/pyro/infer/mcmc/", line 34, in logger_thread
    progress_bars = ProgressBar(warmup_steps, num_samples, disable=disable_progbar, num_bars=num_chains)
  File "/home/test/anaconda3/lib/python3.7/site-packages/pyro/infer/mcmc/", line 69, in __init__
    position=i, file=sys.stderr, disable=disable)
  File "/home/test/anaconda3/lib/python3.7/site-packages/tqdm/", line 201, in __init__
    kwargs['bar_format'] = kwargs['bar_format'].replace('{bar}', '<bar/>')
AttributeError: 'NoneType' object has no attribute 'replace'

Exception ignored in: <function tqdm.__del__ at 0x7fb11d94d598>
Traceback (most recent call last):
  File "/home/test/anaconda3/lib/python3.7/site-packages/tqdm/", line 1039, in __del__
  File "/home/test/anaconda3/lib/python3.7/site-packages/tqdm/", line 240, in close
    super(tqdm_notebook, self).close(*args, **kwargs)
  File "/home/test/anaconda3/lib/python3.7/site-packages/tqdm/", line 1215, in close
    if self.disable:
AttributeError: 'tqdm_notebook' object has no attribute 'disable'

pyro v: 0.4.1
pytorch v: 1.3.0
Python 3.7
Ubuntu 18.04 LTS

@Mew If you use Bernoulli(logits=model_logits) then it is better to skip sigmoid layer of your LogisticRegressionModel. Otherwise, your LogisticRegressionModel will always give you probs > 0.5. I think that’s why you get a bad estimation of around 0.5. Please let me know if you still get the same problem after removing that layer.

Regarding the progress bar, I guess it is tqdm issue. I am using tqdm=4.32.1 and it works fine in my system. Could you provide your tqdm version so I can take a test for it? In case you are using the latest version of tqdm and still getting progress bar error, could you open an issue at Thanks!

Thanks a lot for the quick answer. You are obviously right with the logits, I should’ve thought about it instead of just copying ;), now results look much better and MCMC is equal to SVI, still not quite like the regular LR but I guess that can be explained by suboptimal priors.

Regarding tqdm I have the most recent version 4.36.1, I’ll quickly check your version to make sure it’s not some other dependency problem and open an issue if it isn’t.