# Understanding working memory tutorial

• What tutorial are you running?

Designing Adaptive Experiments to Study Working Memory

• What version of Pyro are you using?
1.8.2

I am attempting to build off the working memory tutorial, and my goal is to try this analysis with a logistic regression model with more features (i.e. more than the single feature `l` used in the tutorial). I'm having a hard time understanding the model as implemented:

```python
def model(l):
    # Dimension -1 of `l` represents the number of rounds
    # Other dimensions are batch dimensions: we indicate this with a plate_stack
    with pyro.plate_stack("plate", l.shape[:-1]):
        theta = pyro.sample("theta", dist.Normal(prior_mean, prior_sd))
        # Share theta across the number of rounds of the experiment
        # This represents repeatedly testing the same participant
        theta = theta.unsqueeze(-1)
        # This defines a *logistic regression* model for y
        logit_p = sensitivity * (theta - l)
        # The event shape represents responses from the same participant
        y = pyro.sample("y", dist.Bernoulli(logits=logit_p).to_event(1))
        return y
```

Specifically, I'm confused about `logit_p = sensitivity * (theta - l)`, since I've always seen the log odds written as `logit_p = beta_0 + beta_1 * x_1 + beta_2 * x_2 + ... + beta_d * x_d`. I tried to implement the model using that approach, like so:

```python
def model(l):
    # Dimension -1 of `l` represents the number of rounds
    # Other dimensions are batch dimensions: we indicate this with a plate_stack
    with pyro.plate_stack("plate", l.shape[:-1]):
        b_0 = pyro.sample("b_0", dist.Normal(prior_mean, prior_sd)).unsqueeze(-1)
        b_1 = pyro.sample("b_1", dist.Normal(1, 1)).unsqueeze(-1)

        logit_p = b_0 + b_1 * l
        y = pyro.sample("y", dist.Bernoulli(logits=logit_p).to_event(1))
        return y
```

but I cannot replicate the results from the tutorial. Any thoughts on what I’m doing wrong here?

Hi dky,

I tried using your logistic model instead of the original one. Could it be that you have to use N(-1, 1) as the prior for `b_1`, rather than N(1, 1)? The data-generating process assumes that the effect of `l` on `p` is negative (the longer the sequence, the harder it is to remember). I ran the notebook with the model and guide below and got reasonable estimates.

```python
import torch
import pyro
import pyro.distributions as dist
from torch.distributions.constraints import positive

prior_mean0 = torch.tensor(7.0)
prior_sd0 = torch.tensor(2.0)

prior_mean1 = torch.tensor(-1.0)  # PRIOR MEAN IS -1 INSTEAD OF 1
prior_sd1 = torch.tensor(1.0)

def model(l):
    # Dimension -1 of `l` represents the number of rounds
    # Other dimensions are batch dimensions: we indicate this with a plate_stack
    with pyro.plate_stack("plate", l.shape[:-1]):
        theta0 = pyro.sample("theta0", dist.Normal(prior_mean0, prior_sd0)).unsqueeze(-1)
        theta1 = pyro.sample("theta1", dist.Normal(prior_mean1, prior_sd1)).unsqueeze(-1)
        # theta0 and theta1 are shared across the rounds of the experiment
        # This represents repeatedly testing the same participant
        # This defines a *logistic regression* model for y
        logit_p = theta0 + theta1 * l
        # The event shape represents responses from the same participant
        y = pyro.sample("y", dist.Bernoulli(logits=logit_p).to_event(1))
        return y

def guide(l):
    # The guide is initialised at the prior
    posterior_mean0 = pyro.param("posterior_mean0", prior_mean0.clone())
    posterior_sd0 = pyro.param("posterior_sd0", prior_sd0.clone(), constraint=positive)
    pyro.sample("theta0", dist.Normal(posterior_mean0, posterior_sd0))

    posterior_mean1 = pyro.param("posterior_mean1", prior_mean1.clone())
    posterior_sd1 = pyro.param("posterior_sd1", prior_sd1.clone(), constraint=positive)
    pyro.sample("theta1", dist.Normal(posterior_mean1, posterior_sd1))
```

Hope this helps!

Hi chvandorp, thank you very much for the response! After posting this yesterday I also realized that the issue was in fact the priors. The `logit_p` in the tutorial is equivalent to `beta_0 + beta_1 * l` where `beta_0 = theta` and `beta_1 = -1` (since `sensitivity = 1` in the tutorial).

1 Like

Hi, original author of the working memory tutorial here. You are completely correct that the model in the tutorial is equivalent to the conventional form with `beta_1 = -1`. This incorporates the assumption that longer sequences are harder to recall. We also incorporated a `sensitivity` parameter which we did not learn. If `sensitivity` became a regular Pyro latent variable, then this would be equivalent to a reparametrised version of the conventional logistic regression model in which `beta_1 = -sensitivity` and `beta_0 = sensitivity * theta`. My reason for parameterizing the model in this way was to give `theta` a real-world interpretation: the length at which the participant has a 50-50 chance of correctly recalling the sequence.
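The equivalence of the two parameterizations is easy to check numerically. A minimal sketch (the values of `sensitivity`, `theta`, and `l` below are made up purely for illustration):

```python
import torch

# Illustrative values only
sensitivity = torch.tensor(1.0)           # fixed scale parameter from the tutorial
theta = torch.tensor(7.0)                 # participant's 50-50 recall threshold
l = torch.tensor([3.0, 5.0, 7.0, 9.0])    # a batch of sequence lengths

# Tutorial parameterization
logit_tutorial = sensitivity * (theta - l)

# Conventional logistic-regression parameterization
beta_0 = sensitivity * theta   # intercept
beta_1 = -sensitivity          # slope
logit_conventional = beta_0 + beta_1 * l

assert torch.allclose(logit_tutorial, logit_conventional)

# At l == theta the log-odds are 0, i.e. p == 0.5,
# which is exactly the interpretation of theta described above
assert torch.sigmoid(logit_tutorial)[2] == 0.5
```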

1 Like

Hey, thank you so much for your message! Re: the tutorial, that totally makes sense to only learn one latent variable - having played around with adding more variables, it quickly becomes challenging to interpret how model behavior changes…

The EIG package is great and I’m excited to continue using it!

1 Like