Usage of GammaPossion - statistical rethinking practice problem

rbudnar · February 26, 2022, 5:53pm

Hi all,

I’m learning bayesian modeling using Statistical rethinking with numpyro (without Overview | Statistical Rethinking (2nd ed.) with NumPyro would be very much more challenging) and I’m a bit stumped on how to reproduce the author’s results in practice problem 12H2 (a follow-up question for 12H1).

The author provides the following R code as a solution for 12H1:

library(rethinking)
data(Hurricanes)
d <- Hurricanes
d$fmnnty_std <- ( d$femininity - mean(d$femininity) )/sd(d$femininity)
dat <- list( D=d$deaths , F=d$fmnnty_std )
# model formula - no fitting yet
f <- alist(
  D ~ dpois(lambda),
  log(lambda) <- a + bF*F,
  a ~ dnorm(1,1),
  bF ~ dnorm(0,1) )

m1 <- ulam( f , data=dat , chains=4 , log_lik=TRUE )
precis( m1 )

The output from my solution for 12H1 matches his output and looks like this:

data = pd.read_csv("../../data/Hurricanes.csv", sep=";")

std = lambda x: (x - x.mean())/x.std()
data["femininity_std"] = data["femininity"].pipe(std)

dat = {"F": data["femininity_std"].values, "D": data["deaths"].values}
def model(F, D=None):
    a = numpyro.sample("a", dist.Normal(1,1))
    bF = numpyro.sample("bF", dist.Normal(0,1))
    lambda_ = numpyro.deterministic("lambda_", jnp.exp(a + bF*F))
    numpyro.sample("D", dist.Poisson(lambda_), obs=D)

m12_h1 = MCMC(NUTS(model), num_warmup=500, num_samples=500, num_chains=4)
m12_h1.run(random.PRNGKey(0), **dat)
m12_h1.print_summary(.89)

The summary output looks like this and is pretty much the same as his:

                mean       std    median      5.5%     94.5%     n_eff     r_hat
         a      3.00      0.02      3.00      2.96      3.04   1555.18      1.00
        bF      0.24      0.03      0.24      0.20      0.28   1285.37      1.00

So far so good. However, I move on to 12H2 which asks you to handle over the problem of overdispersion and I get dramatically different results after my conversion. Here’s the author’s model code:

m2 <- ulam(
  alist(
    D ~ dgampois( lambda , scale ),
    log(lambda) <- a + bF*F,
    a ~ dnorm(1,1),
    bF ~ dnorm(0,1),
    scale ~ dexp(1)
  ), data=dat , chains=4 , log_lik=TRUE )

With an output like this:

mean sd 5.5% 94.5% n_eff Rhat4
a 2.98 0.16 2.73 3.23 1597 1
bF 0.21 0.16 -0.05 0.46 1435 1
scale 0.45 0.06 0.35 0.56 1652 1

The code is pretty much the same except he added a scale parameter and swapped out the poisson distribution for a gamma poisson distribution. According to the R docs, dgampois is parameterized by a mu and scale parameter, which is different from numpyro which uses a shape and rate parameter instead. I followed the style of the example code from the numpyro version of R code 12.6 here and did the following:

def model_12_h2(F, D=None):
    a = numpyro.sample("a", dist.Normal(1,1))
    bF =  numpyro.sample("bF", dist.Normal(0,1))
    scale = numpyro.sample("scale", dist.Exponential(1))
    lambda_ = numpyro.deterministic("lambda_", jnp.exp(a + bF*F))
    numpyro.sample("D", dist.GammaPoisson(lambda_/scale, 1/scale), obs=D)

m12_h2 = MCMC(NUTS(model_12_h2), num_warmup=500, num_samples=500, num_chains=4)
m12_h2.run(random.PRNGKey(10), **dat)
m12_h2.print_summary(.89)
m12_h2_samples = m12_h2.get_samples()

With this output:

                mean       std    median      5.5%     94.5%     n_eff     r_hat
         a      2.65      0.11      2.65      2.48      2.84   1357.56      1.00
        bF      0.03      0.10      0.03     -0.13      0.19   1596.73      1.00
     scale     23.63      2.80     23.46     19.22     28.03   1495.34      1.00

The output is off quite a bit on the scale parameter and after a lot of tinkering I’m at a loss as to why this would be the case (the others are off as well but not so drastically).

I’ve been trying to understand how the r-to-numpyro translator came up with the GammaPoisson parameterization. I wasn’t convinced of the above GammaPoission parameterization and I reworked it more explicitly so I could understand it better from what I found in wikipedia and in this blog post:

def model_12_h2(F, D=None):
    a = numpyro.sample("a", dist.Normal(1,1))
    bF =  numpyro.sample("bF", dist.Normal(0,1))

    scale = numpyro.sample("scale", dist.Exponential(1))
    beta = numpyro.deterministic("beta", 1 / scale)
    mu = numpyro.deterministic("mu", jnp.exp(a + bF*F))
    alpha = numpyro.deterministic("alpha", mu * beta)
    numpyro.sample("D", dist.GammaPoisson(alpha, beta), obs=D)

# Output:
                mean       std    median      5.5%     94.5%     n_eff     r_hat
         a      2.65      0.11      2.65      2.49      2.85   1330.39      1.00
        bF      0.03      0.10      0.03     -0.11      0.21   1861.32      1.00
     scale     23.62      2.77     23.43     19.26     27.92   1279.12      1.00

And interestingly enough its pretty much the same as what I had with the first version of my GammaPoisson implementation. That being said, can anyone shed some light on how/why my result is quite different from the SR author’s output?

numpyro.enable_x64() has indeed been run in the code.

Thanks in advance and sorry for the long post!

fehiepsi · February 26, 2022, 11:24pm

Looking like two models are similar to each other. I guess there are bugs either in R implementation or in numpyro. Maybe you can try to simulate two versions to see which one has bugs.

Maybe Rversion has a bug?

rbudnar · February 27, 2022, 2:19am

Thanks for pointing out that issue! It is referencing the exact same practice problem I am; I just assumed I was doing something wrong or misunderstanding something. I’ll take a deeper look into that.

@fehiepsi By the way thanks so much for porting the code for the book over to numpyro! It has been invaluable.