Understanding working memory tutorial

  • What tutorial are you running?

Designing Adaptive Experiments to Study Working Memory

  • What version of Pyro are you using?
    1.8.2
  • Please link or paste relevant code, and steps to reproduce.

I am attempting to build on the working memory tutorial, and my goal is to try this analysis with a logistic regression model that has more features (i.e. more than the single feature used in the tutorial). I’m having a hard time understanding the model as implemented:

def model(l):
    # Dimension -1 of `l` represents the number of rounds
    # Other dimensions are batch dimensions: we indicate this with a plate_stack
    with pyro.plate_stack("plate", l.shape[:-1]):
        theta = pyro.sample("theta", dist.Normal(prior_mean, prior_sd))
        # Share theta across the number of rounds of the experiment
        # This represents repeatedly testing the same participant
        theta = theta.unsqueeze(-1)
        # This defines a *logistic regression* model for y
        logit_p = sensitivity * (theta - l)
        # The event shape represents responses from the same participant
        y = pyro.sample("y", dist.Bernoulli(logits=logit_p).to_event(1))
        return y

Specifically, I’m confused about logit_p = sensitivity * (theta - l), since I’ve always seen the log odds written as logit_p = beta_0 + beta_1 * x_1 + beta_2 * x_2 + ... + beta_d * x_d. I tried to implement the model using that approach, like so:

def model(l):
    # Dimension -1 of `l` represents the number of rounds
    # Other dimensions are batch dimensions: we indicate this with a plate_stack
    with pyro.plate_stack("plate", l.shape[:-1]):
        b_0 = pyro.sample("b_0", dist.Normal(prior_mean, prior_sd)).unsqueeze(-1)
        b_1 = pyro.sample("b_1", dist.Normal(1, 1)).unsqueeze(-1)
        
        logit_p = b_0 + b_1 * l
        y = pyro.sample("y", dist.Bernoulli(logits=logit_p).to_event(1))
        return y

but I cannot replicate the results from the tutorial. Any thoughts on what I’m doing wrong here?

Hi dky,

I tried using your logistic model instead of the original one. Could it be that you have to use N(-1, 1) as a prior for b_1, and not N(1, 1)? The data-generating process assumes that the effect of l on p is negative (the longer the sequence, the harder it is to remember). I tried running the notebook with the model and guide below and got reasonable estimates.

import torch
import pyro
import pyro.distributions as dist
from torch.distributions.constraints import positive

prior_mean0 = torch.tensor(7.0)
prior_sd0 = torch.tensor(2.0)

prior_mean1 = torch.tensor(-1.0) ## PRIOR MEAN IS -1 INSTEAD OF 1
prior_sd1 = torch.tensor(1.0)
    
def model(l):
    # Dimension -1 of `l` represents the number of rounds
    # Other dimensions are batch dimensions: we indicate this with a plate_stack
    with pyro.plate_stack("plate", l.shape[:-1]):
        theta0 = pyro.sample("theta0", dist.Normal(prior_mean0, prior_sd0)).unsqueeze(-1)
        theta1 = pyro.sample("theta1", dist.Normal(prior_mean1, prior_sd1)).unsqueeze(-1)
        # Share theta0 and theta1 across the number of rounds of the experiment
        # This represents repeatedly testing the same participant
        # This defines a *logistic regression* model for y
        logit_p = theta0 + theta1 * l
        # The event shape represents responses from the same participant
        y = pyro.sample("y", dist.Bernoulli(logits=logit_p).to_event(1))
        return y
        
def guide(l):
    # The guide is initialised at the prior
    posterior_mean0 = pyro.param("posterior_mean0", prior_mean0.clone())
    posterior_sd0 = pyro.param("posterior_sd0", prior_sd0.clone(), constraint=positive)
    
    pyro.sample("theta0", dist.Normal(posterior_mean0, posterior_sd0))
    
    posterior_mean1 = pyro.param("posterior_mean1", prior_mean1.clone())
    posterior_sd1 = pyro.param("posterior_sd1", prior_sd1.clone(), constraint=positive)
    
    pyro.sample("theta1", dist.Normal(posterior_mean1, posterior_sd1))

Hope this helps!

Hi chvandorp, thank you very much for the response! After posting this yesterday I also realized that the issue was in fact the priors. The logit_p in the tutorial is equivalent to beta_0 + beta_1 * l with beta_0 = theta and beta_1 = -1, since sensitivity is fixed to 1 in the tutorial.
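
To make that concrete, here is a quick numerical check of the equivalence, using the tutorial’s fixed sensitivity = 1.0 and some arbitrary values for theta and l (chosen just for illustration):

import torch

sensitivity = torch.tensor(1.0)       # fixed to 1.0 in the tutorial
theta = torch.tensor(7.0)             # arbitrary value, for illustration only
l = torch.tensor([3.0, 5.0, 9.0])     # arbitrary sequence lengths

# Tutorial parameterisation
logit_tutorial = sensitivity * (theta - l)

# Conventional parameterisation: beta_0 + beta_1 * l
beta_0 = sensitivity * theta          # = theta when sensitivity = 1
beta_1 = -sensitivity                 # = -1 when sensitivity = 1
logit_conventional = beta_0 + beta_1 * l

assert torch.allclose(logit_tutorial, logit_conventional)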


Hi, original author of the working memory tutorial here. You are completely correct that the model in the tutorial is equivalent to the conventional form with beta_1 = -1. This incorporates the assumption that longer sequences are harder to recall. We also incorporated a sensitivity parameter which we did not learn. If sensitivity became a regular Pyro latent variable, then this would be equivalent to a reparametrised version of the conventional logistic regression model in which beta_1 = -sensitivity and beta_0 = sensitivity * theta. My reason for parameterizing the model in this way was to give theta a real-world interpretation: the sequence length at which the participant has a 50-50 chance of correctly recalling the sequence.
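
As a quick sketch of that interpretation (with an arbitrary theta, and the tutorial’s sensitivity = 1.0): when the presented length l equals theta, the logit is zero, so the probability of correct recall is exactly 0.5.

import torch

sensitivity = torch.tensor(1.0)        # fixed in the tutorial
theta = torch.tensor(7.0)              # arbitrary value, for illustration only

l = theta                              # present a sequence of the threshold length
logit_p = sensitivity * (theta - l)    # = 0 when l == theta
p = torch.sigmoid(logit_p)             # sigmoid(0) = 0.5: a 50-50 chance of recall
print(p)                               # tensor(0.5000)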


Hey, thank you so much for your message! Re: the tutorial, that totally makes sense to only learn one latent variable - having played around with adding more variables, it quickly becomes challenging to interpret how model behavior changes…

The EIG package is great and I’m excited to continue using it!
