Better way to apply a transform function

Hello!

Is there a better way to apply a transform function to the data?

Right now, to find theta I have to define the model using an explicit loop (and an otherwise unnecessary sigma):

import torch
import pyro
import pyro.distributions as dist

def adstock(x, y):
    theta = pyro.sample('theta', dist.Uniform(0.1, 1))
    sigma = pyro.sample('sigma', dist.Uniform(0., 1.))
    d = 0
    z = torch.zeros(x.shape)

    # one observation site per time step; the carry-over d depends on theta
    for i in range(len(x)):
        d = x[i] + theta * d
        z[i] = pyro.sample(f"obs_{i}", dist.Normal(d, sigma), obs=y[i])

And it runs quite slowly.

I want to use this transform as part of another model, like this:

def adstock(x, theta):
    d = torch.tensor(0.)
    y = torch.zeros(x.size())
    for i in range(x.size(0)):
        d = x[i] + theta * d
        y[i] = d  # in-place write into a pre-allocated tensor
    return y

def mix(x, y):
    alpha = pyro.sample('alpha', dist.Uniform(0.1, 10))
    gamma = pyro.sample('gamma', dist.Uniform(0.1, 1))

    theta = pyro.sample('theta', dist.Uniform(0.1, 1))
    adstock_x = adstock(x, theta)

    s_curved = s_curve(adstock_x, alpha, gamma)

    sigma = pyro.sample("sigma", dist.Uniform(0., 1.))

    with pyro.plate("data", len(s_curved)):
        pyro.sample("obs", dist.Normal(s_curved, sigma), obs=y)

But in this case I can't link theta to the model.


Please help me understand how to work with Pyro better and find a good solution.

Thank you in advance!

How is the s_curve function defined?

In fact the s_curve function operates on each value independently of the rest of the sequence:

def s_curve(x, alpha=3, gamma=0.5):
    # gamma is treated as a quantile of the range of x
    gamma = torch.quantile(torch.linspace(x.min().item(), x.max().item(), steps=100), gamma)
    return torch.pow(x, alpha) / (torch.pow(x, alpha) + torch.pow(gamma, alpha))

If I leave only this transform, Pyro works pretty quickly and estimates alpha and gamma accurately.

I suppose the puzzle is that the adstock function applies a transform in which the variables in the sequence depend on each other.

adstock is a polynomial in x and theta. It can be computed in vectorized form by, e.g., pre-computing the coefficients and passing in an appropriate tensor of x and theta raised to the necessary powers.
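To spell out the "polynomial" part: unrolling the loop gives

    d_i = x_i + theta * d_{i-1}
        = x_i + theta * x_{i-1} + theta**2 * x_{i-2} + ... + theta**i * x_0

so the whole transform is a lower-triangular matrix of powers of theta applied to x, i.e. a single matrix-vector product.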

Note that this is not a Pyro question; it is really a question about writing vectorized PyTorch code.


Yes, you’re right.

I tried vectorizing the calculation, and theta is now successfully taken into account.

For that, the adstock function was modified to:

import numpy as np

def roll(A):
    # shift row i of A to the right by i + 1 positions (with wrap-around)
    r = np.arange(1, A.shape[1] + 1)
    rows, column_indices = np.ogrid[:A.shape[0], :A.shape[1]]
    column_indices = column_indices - r[:, np.newaxis]
    return A[rows, column_indices]

def adstock(x, theta):
    # x_r[i, j] = x[i - j] for j <= i, zeros above the diagonal
    x_m = x.flip(0).repeat((len(x), 1))
    x_r = torch.tril(roll(x_m)).double()

    # th[j] = theta ** j
    th = torch.pow(theta.repeat(len(x)), torch.arange(0, len(x))).double()

    # y[i] = sum_{j <= i} theta**j * x[i - j]
    return torch.matmul(x_r, th)

Maybe it's not the optimal vectorization, but it's the best I could do (who knows whether it's possible to do better without roll?).
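One idea I haven't verified against the version above (just a sketch, adstock_no_roll is a made-up name): skip roll entirely and build the lower-triangular matrix of theta powers directly from index differences:

def adstock_no_roll(x, theta):
    n = x.size(0)
    idx = torch.arange(n)
    # exponents[i, j] = i - j (clamped to 0 above the diagonal)
    exponents = (idx.unsqueeze(1) - idx.unsqueeze(0)).clamp(min=0).to(x.dtype)
    # M[i, j] = theta**(i - j) for j <= i, zeros above the diagonal
    M = torch.tril(torch.pow(theta, exponents))
    # y[i] = sum_{j <= i} theta**(i - j) * x[j]
    return M @ x

It should compute the same thing, just from an index-difference matrix instead of rolled copies of x.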

As a result, a single adstock call is computed much faster, great!

Thank you!

The conclusion: if explicit loops are used, the computational graph breaks (interesting, I wonder why?).

When the calculations are done over vectors, all the connections are preserved.

The for-loop version makes extensive use of in-place tensor ops; these presumably cause problems for provenance tracking.
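For example (a minimal sketch, adstock_loop is a made-up name), the loop can be rewritten without in-place writes by collecting the intermediate values in a list and stacking them, which keeps theta connected to the output:

def adstock_loop(x, theta):
    d = torch.zeros(())  # running carry-over
    ys = []
    for i in range(x.size(0)):
        d = x[i] + theta * d  # each d is a fresh tensor, no in-place writes
        ys.append(d)
    return torch.stack(ys)

It is still a Python loop (so slower than the matmul version), but it avoids writing into a pre-allocated tensor.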
