Toy Example of Bayesian Logistic Regression

colinconwell · October 23, 2018, 3:21am

Pyro seems to be a phenomenal tool, and I’ve been desperate for an opportunity to use it.

Unabashed greenhorn (and frequentist by habit) that I am, though, I have great trouble even getting off the ground.

I am wondering if anyone has a toy example of a Bayesian logistic regression (similar to the Bayesian linear regression tutorial) that they’d be willing to share by way of instruction!

Much obliged!

jpchen · October 23, 2018, 7:32am

logistic regression should be a straightforward extension of the linear regression. instead of observing against a gaussian, you now observe against a bernoulli.

jtw · October 23, 2018, 11:45pm

I was also trying to implement a toy logistic regression before building something more complex. I found it somewhat hard to follow the existing regression tutorial because of the way the code is structured.

I prefer the structure of the VAE tutorial, in which the model and guide are contained within a reusable module and aren’t relying on any global variables located elsewhere in the notebook.

It took a lot of debugging and messing around with dimensions (I still don’t really grok how Pyro does this but I’m slowly learning), but I have a working example here:

gist.github.com

https://gist.github.com/jamestwebber/b93e486e32a33da5bd99a330608a2e9d

pyro_logistic_regression.ipynb

{
 "cells": [
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "import scipy.special as ssp\n",

This file has been truncated. show original

colinconwell · October 24, 2018, 9:27pm

Phenomenal. Thanks so much! Straightforward for most people is most certainly not straightforward for an amateur like me. I’m wondering how easy it would be to transform this into a softmax regression with torch.nn.softmax as the final layer of the model, and what I suppose would be just gaussians over the regression parameters?

jtw · October 25, 2018, 3:55am

Colin asked me a question that raised something else I was wondering: if I wanted to use nn.Linear instead of explicitly creating weight and bias tensors, what is the idiomatic way to introduce priors to the weights of that module?

It seems like the answer is pyro.random_module but I’m confused about where it needs to be called. In __init__? Once in model and then again in guide?

jpchen · October 25, 2018, 8:51pm

It seems like the answer is pyro.random_module but I’m confused about where it needs to be called

please refer to the docs on using random_module. the tutorial shows the usage. in your example, you would establish the architecture in __init__, then in your model and guide, everywhere you have a sample(), you instantiate that distribution and throw them into a dict. the torch.matmul is now running the nn forward so you’d want:

dict of priors = {name_of_param: prior_dist, ...}
nn_dist = random_module(self, dict_of_priors)
nn = nn_dist()  #sample a nn 
model_logits = nn(data)
...

jtw · October 26, 2018, 3:14am

Ah that makes sense. I didn’t realize that random_module(self, ...) would work like that, that’s cool.

colinconwell · October 26, 2018, 6:38pm

I’m attempting to combine the successful elements of James’ code with the base tutorial: even though my suspicion is that the model itself needs a nonlinear nn.Sigmoid layer, I’ve left that out for now. Here’s how I’ve specified the model call (I’ve left the guide the same as in the tutorial) and the data generator (straight from James’ code):

def build_linear_dataset(N, p=1, noise_std=0.01):
    X = np.random.rand(N, p)
    # w = 3
    w = 3 * np.ones(p)
    # b = 1
    y = np.matmul(X, w) + np.repeat(1, N) + np.random.normal(0, noise_std, size=N)
    y = y.reshape(N, 1)
    X, y = torch.tensor(X).type(torch.Tensor), torch.tensor(y).type(torch.Tensor)
    data = torch.cat((X, y), 1)
    assert data.shape == (N, p + 1)
    return data

def model(data):
    # Create unit normal priors over the parameters
    loc, scale = torch.zeros(1, 1), 10 * torch.ones(1, 1)
    bias_loc, bias_scale = torch.zeros(1), 10 * torch.ones(1)
    w_prior = Normal(loc, scale).independent(1)
    b_prior = Normal(bias_loc, bias_scale).independent(1)
    priors = {'linear.weight': w_prior, 'linear.bias': b_prior}
    lifted_module = pyro.random_module("module", regression_model, priors)
    lifted_reg_model = lifted_module()
    with pyro.iarange("map", N):
        x_data = data[:, :-1]
        y_data = data[:, -1]
        
        model_logits = lifted_reg_model(x_data)
        
        #model_logits = (torch.matmul(x, weights_prior.permute(1, 0)) + bias_prior).squeeze()

        pyro.sample("obs", Bernoulli(logits=model_logits, validate_args=True), obs=y_data.squeeze())

Here’s the error I’m getting when I attempt to run the SVI:

I based this on jpchen’s suggestions. What am I doing wrong?

jpchen · October 26, 2018, 8:51pm

i think model_logits has shape (500, 1) and it is broadcasting. try adding:

model_logits = lifted_reg_model(x_data).squeeze(-1)

jtw · October 26, 2018, 11:33pm

Also I want to clarify: this is a logistic regression model, because you’re passing the output from the linear model as logits rather than as probs. The sigmoid layer is implicit there in the usage of the output of the linear layer as the logit of the Bernoulli distribution.

colinconwell · November 1, 2018, 10:34pm

In case any one stumbles on this page in the future, I’m uploading my own chimeric version of a working model:

gist.github.com

https://gist.github.com/ColinConwell/2fc48c7e6425846171d228257d5d156f

PyroLogistic.ipynb

{
  "cells": [
    {
      "cell_type": "code",
      "source": [
        "import os\n",
        "import numpy as np\n",
        "import torch\n",
        "import torch.nn as nn\n",
        "import scipy.special as ssp\n",

This file has been truncated. show original

Thanks again to @jtw and @jpchen.

kamathhrishi · December 2, 2018, 10:28am

I definitely think it should be part of our example since we assume no probabilitic programming experience we could have people who wouldnt find it intuitive or the examples could make things easier for anybody.