How to write a guide function for neural networks with embedding layers?

Yujian · November 24, 2019, 4:31am

Hi!

I’m new to Pyro and want to build a neural network with embedding layers to act as a collaborative filtering system. My model has two embedding layers, which take two categorical variables as inputs and convert them to embedding vectors. The two vectors are then concatenated and fed to four fully connected layers. The output is a single number.

It’s somewhat like Bayesian regression. I’ve gone through the Bayesian regression tutorial and finished the model, but I got stuck writing my guide. How do I write a guide function for a neural network with two embedding layers and four FC layers?

Thank you!

Here’s my model:

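import torch
import torch.nn as nn
import pyro
import pyro.distributions as dist
from pyro.nn import PyroModule, PyroSample

class EmbeddingNet(PyroModule):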
    def __init__(self, n_users, n_movies, n_factors, hidden, dropouts, embedding_dropout=0.02):
        super(EmbeddingNet, self).__init__()
        # embedding layers
        self.u = PyroModule[nn.Embedding](n_users, n_factors)

        self.m = PyroModule[nn.Embedding](n_movies, n_factors)

        self.drop = PyroModule[nn.Dropout](embedding_dropout)
        # activation
        self.act = nn.ELU()

        # FC layers
        lb = -0.05
        ub = 0.05

        self.fc1 = PyroModule[nn.Linear](n_factors * 2, hidden[0])
        self.fc1.weight = PyroSample(dist.Uniform(lb, ub).expand([hidden[0], n_factors * 2]).to_event(2))
        self.fc1.bias = PyroSample(dist.Uniform(lb, ub).expand([hidden[0]]).to_event(1))
        self.dp1 = PyroModule[nn.Dropout](dropouts[0])

        self.fc2 = PyroModule[nn.Linear](hidden[0], hidden[1])
        self.fc2.weight = PyroSample(dist.Uniform(lb, ub).expand([hidden[1], hidden[0]]).to_event(2))
        self.fc2.bias = PyroSample(dist.Uniform(lb, ub).expand([hidden[1]]).to_event(1))
        self.dp2 = PyroModule[nn.Dropout](dropouts[1])

        self.fc3 = PyroModule[nn.Linear](hidden[1], hidden[2])
        self.fc3.weight = PyroSample(dist.Uniform(lb, ub).expand([hidden[2], hidden[1]]).to_event(2))
        self.fc3.bias = PyroSample(dist.Uniform(lb, ub).expand([hidden[2]]).to_event(1))
        self.dp3 = PyroModule[nn.Dropout](dropouts[2])

        self.fc4 = PyroModule[nn.Linear](hidden[2], 1)
        self.fc4.weight = PyroSample(dist.Uniform(lb, ub).expand([1, hidden[2]]).to_event(2))
        self.fc4.bias = PyroSample(dist.Uniform(lb, ub).expand([1]).to_event(1))

    def forward(self, user, movie, ratings, minmax=None):
        # get the embedded features
        features = torch.cat([self.u(user), self.m(movie)], dim=1)
        x = self.drop(features)
        # pass features to fc layers
        x = self.dp1(self.act(self.fc1(x)))
        x = self.dp2(self.act(self.fc2(x)))
        x = self.dp3(self.act(self.fc3(x)))
        x_mean = torch.sigmoid(self.fc4(x)).squeeze(-1)

        x_sigma = pyro.sample('sigma', dist.Uniform(0., 1.0))
        if minmax is not None:
            min_rating, max_rating = minmax
            x_mean = x_mean*(max_rating - min_rating + 1) + min_rating - 0.5

        with pyro.plate('data', x.shape[0]):
            obs = pyro.sample('obs', dist.Normal(x_mean, x_sigma), obs=ratings)
        return x_mean

fritzo · November 30, 2019, 5:55pm

Hi @Yujian, I usually start with autoguides or easyguides before I write a custom guide. You might first ensure that an AutoDelta works (this is just MAP estimation), then you could try a fully mean-field AutoDiagonalNormal guide, and then an AutoLowRankMultivariateNormal guide.

macio232 · March 25, 2020, 3:38pm

@fritzo and what if I wanted to go the custom way (or use EasyGuide)? How can I relate the sites in the model to the sites in the guide? With PyroModule and PyroSample I do not assign them any names, and I cannot find any examples of such usage.

macio232 · March 25, 2020, 7:53pm

Anybody? I really need some help with this.

fritzo · March 27, 2020, 12:41am

Hi @macio232, you can use EasyGuide.group(my_regex) to match a collection of site names in the model, then use

group.sample("some_new_name",
             dist.Normal(0,1).expand(group.event_shape).to_event(1))

to sample a big diagonal normal that will automatically be split into appropriate model sites.

If you can give more context and a bit of example code, maybe we could comment more specifically.

macio232 · March 27, 2020, 4:04pm

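import torch.nn as nn
import pyro
import pyro.distributions as dist
from pyro.nn import PyroModule, PyroSample

class BayesianLinearRegression(PyroModule):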
    
    def __init__(self, n_input, intercept=True):
        super().__init__()
        self.linear = PyroModule[nn.Linear](n_input, 1, intercept)
        self.linear.weight = PyroSample(
            dist.Normal(0., 1.).expand([1, n_input]).to_event(2)
        )
        if intercept:
            self.linear.bias = PyroSample(
                dist.Normal(0., 10.).expand([1]).to_event(1)
            )
        self.n_input = n_input
        self.has_intercept = intercept
        
    def model(self, x, y=None):
        sigma = pyro.sample("sigma", dist.Uniform(0., 10.))
        mean = self.linear(x).squeeze(-1)
        with pyro.plate("data", x.shape[0]):
            obs = pyro.sample("obs", dist.Normal(mean, sigma), obs=y)
        return mean

How can I write a guide for such a model without using an AutoGuide? Or, alternatively, with AutoGuideList and poutine.block? The goal is to specify different distributions for the sites self.linear.weight and self.linear.bias.
Or in other words

fritzo · April 2, 2020, 6:04pm

Hi @macio232,

First, a nit: you’ll need to rename the .model() method to .forward().

To see Pyro’s automatically created site names you can inspect using poutine.trace, e.g.

import torch
from pyro import poutine

model = BayesianLinearRegression(2)
data = torch.randn(3, 2)
with poutine.trace() as tr:
    model(data)
print(tr.trace.nodes.keys())

odict_keys(['sigma', 'linear.weight', 'linear.bias', 'data', 'obs'])

We can even examine shapes:

for name, site in tr.trace.nodes.items():
    print("{}: {}".format(name, site["value"].shape))

sigma: torch.Size([])
linear.weight: torch.Size([1, 2])
linear.bias: torch.Size([1])
data: torch.Size([3])
obs: torch.Size([3])

Now we can write a custom guide using those names and shapes:

import torch
import pyro
import pyro.distributions as dist
from pyro.distributions import constraints
from pyro.nn import PyroModule, PyroParam

class Guide(PyroModule):
    def __init__(self, n_input):
        super().__init__()
        # Let's point estimate sigma.
        self.sigma_loc = PyroParam(torch.tensor(1.),
                                   constraint=constraints.interval(0., 10.))
        # We can be Bayesian about the linear parts.
        self.weight_loc = PyroParam(torch.zeros(1, n_input))
        self.weight_scale = PyroParam(torch.ones(1, n_input),
                                      constraint=constraints.positive)
        self.bias_loc = PyroParam(torch.zeros(1))
        self.bias_scale = PyroParam(torch.ones(1),
                                    constraint=constraints.positive)

    def forward(self, x, y=None):
        # Site names must match the names found in the model trace.
        pyro.sample("sigma", dist.Delta(self.sigma_loc))
        pyro.sample("linear.weight",
                    dist.Normal(self.weight_loc, self.weight_scale)
                        .to_event(2))
        pyro.sample("linear.bias",
                    dist.Normal(self.bias_loc, self.bias_scale)
                        .to_event(1))

macio232 · April 5, 2020, 7:04pm

I prefer to call my method model (because it is a model) and then write forward as

    def forward(self, *args, **kwargs):
        return self.model(*args, **kwargs)

In the meantime, I came up with the trace idea, but I was hoping for a more straightforward solution. Don’t you think this should be handled in future releases?

fritzo · April 7, 2020, 5:56pm

Hi @macio232,
We’d be happy to make guides easier to write, if you have suggestions. What sort of interface are you envisioning? What are the ugliest parts of the above manual Guide class, in your opinion?
Thanks for any concrete feedback!