Parameters controlling the transformation of other parameters

Hi everyone,

I’m exploring Bayesian Linear Regression for the first time, and I’m confused about how to define a model in which some learnable parameters control the functional form of a transformation applied to other learnable parameters.

For example, I have a diminishing returns transformation applied to some parameters.

def diminishing_returns(spend, half_sat, shape):
    return 1 / (1 + np.power(np.exp(spend / half_sat), -shape))

I want the shape and the saturation point of that transformation to be learnable parameters as well, so that I could use it like this:

def model():
    sigma = pyro.sample('noise', pyro.distributions.HalfNormal(scale=10))
    baseline = pyro.sample('baseline', pyro.distributions.Normal(loc=0, scale=1))

    half_sat = pyro.sample('half_sat', pyro.distributions.Normal(loc=0, scale=1))
    shape = pyro.sample('shape', pyro.distributions.Normal(loc=0, scale=1))

    channel = pyro.sample('channel', pyro.distributions.HalfNormal(scale=1)) * diminishing_returns(feature[:, "channel"], half_sat, shape)

    mean = baseline + channel

    with pyro.plate('data', len(train_features.index)):
        pyro.sample('obs', pyro.distributions.Normal(loc=mean, scale=sigma), obs=torch.from_numpy(train_labels.to_numpy()))

How can I adjust my diminishing returns function to make this work? Are there any references that address a similar problem?

Hi @tmitchel, your implementation is almost correct; I believe all you’ll need to do is avoid using NumPy inside Pyro models. I usually convert everything from NumPy to PyTorch before starting my training loop. That ensures gradients can propagate, and it’s also faster.

def diminishing_returns(spend, half_sat, shape):
    return 1 / (1 + torch.pow(torch.exp(spend / half_sat), -shape))
    # Though that seems odd: it is equivalent to
    #   return 1 / (1 + torch.exp(-shape * spend / half_sat))
    # so only the ratio shape / half_sat enters the curve, and
    # (shape, half_sat) appear to be nonidentifiable?
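To make that nonidentifiability concrete, here is a quick standalone check (plain Python with `math` rather than torch, so it runs without any dependencies) that two different `(half_sat, shape)` pairs with the same ratio produce identical curves:

```python
import math

# Same transformation as above, written with math.exp for a dependency-free check.
def diminishing_returns(spend, half_sat, shape):
    return 1 / (1 + math.exp(spend / half_sat) ** (-shape))

# (half_sat=1, shape=2) and (half_sat=2, shape=4) share the ratio
# shape / half_sat = 2, so they give identical outputs at every spend level.
for spend in [0.0, 0.5, 1.0, 5.0]:
    a = diminishing_returns(spend, half_sat=1.0, shape=2.0)
    b = diminishing_returns(spend, half_sat=2.0, shape=4.0)
    assert math.isclose(a, b)
print("curves match")
```

Since the model only ever sees the ratio, it’s worth keeping this in mind when interpreting posteriors over the two parameters separately.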

# Convert NumPy/pandas -> torch before training. Casting to float32 matches
# PyTorch's default dtype (pandas usually produces float64).
feature_channel = torch.as_tensor(feature[:, "channel"], dtype=torch.float32)
data = torch.as_tensor(train_labels.to_numpy(), dtype=torch.float32)

def model():
    sigma = pyro.sample('noise', pyro.distributions.HalfNormal(scale=10))
    baseline = pyro.sample('baseline', pyro.distributions.Normal(loc=0, scale=1))
    half_sat = pyro.sample('half_sat', pyro.distributions.Normal(loc=0, scale=1))
    shape = pyro.sample('shape', pyro.distributions.Normal(loc=0, scale=1))
    returns = diminishing_returns(feature_channel, half_sat, shape)
    channel = pyro.sample('channel', pyro.distributions.HalfNormal(scale=1)) * returns
    mean = baseline + channel
    with pyro.plate('data', len(data)):
        pyro.sample('obs', pyro.distributions.Normal(loc=mean, scale=sigma), obs=data)

Thanks so much for the fast response. It’s good to know I was on the right track!