Hi all,
I’m learning bayesian modeling using Statistical rethinking with numpyro (without Overview | Statistical Rethinking (2nd ed.) with NumPyro would be very much more challenging) and I’m a bit stumped on how to reproduce the author’s results in practice problem 12H2 (a follow-up question for 12H1).
The author provides the following R code as a solution for 12H1:
library(rethinking)
data(Hurricanes)
d <- Hurricanes
d$fmnnty_std <- ( d$femininity - mean(d$femininity) )/sd(d$femininity)
dat <- list( D=d$deaths , F=d$fmnnty_std )
# model formula - no fitting yet
f <- alist(
D ~ dpois(lambda),
log(lambda) <- a + bF*F,
a ~ dnorm(1,1),
bF ~ dnorm(0,1) )
m1 <- ulam( f , data=dat , chains=4 , log_lik=TRUE )
precis( m1 )
The output from my solution for 12H1 matches his output and looks like this:
data = pd.read_csv("../../data/Hurricanes.csv", sep=";")
std = lambda x: (x - x.mean())/x.std()
data["femininity_std"] = data["femininity"].pipe(std)
dat = {"F": data["femininity_std"].values, "D": data["deaths"].values}
def model(F, D=None):
a = numpyro.sample("a", dist.Normal(1,1))
bF = numpyro.sample("bF", dist.Normal(0,1))
lambda_ = numpyro.deterministic("lambda_", jnp.exp(a + bF*F))
numpyro.sample("D", dist.Poisson(lambda_), obs=D)
m12_h1 = MCMC(NUTS(model), num_warmup=500, num_samples=500, num_chains=4)
m12_h1.run(random.PRNGKey(0), **dat)
m12_h1.print_summary(.89)
The summary output looks like this and is pretty much the same as his:
mean std median 5.5% 94.5% n_eff r_hat
a 3.00 0.02 3.00 2.96 3.04 1555.18 1.00
bF 0.24 0.03 0.24 0.20 0.28 1285.37 1.00
So far so good. However, I move on to 12H2 which asks you to handle over the problem of overdispersion and I get dramatically different results after my conversion. Here’s the author’s model code:
m2 <- ulam(
alist(
D ~ dgampois( lambda , scale ),
log(lambda) <- a + bF*F,
a ~ dnorm(1,1),
bF ~ dnorm(0,1),
scale ~ dexp(1)
), data=dat , chains=4 , log_lik=TRUE )
With an output like this:
mean sd 5.5% 94.5% n_eff Rhat4
a 2.98 0.16 2.73 3.23 1597 1
bF 0.21 0.16 -0.05 0.46 1435 1
scale 0.45 0.06 0.35 0.56 1652 1
The code is pretty much the same except he added a scale parameter and swapped out the poisson distribution for a gamma poisson distribution. According to the R docs, dgampois is parameterized by a mu and scale parameter, which is different from numpyro which uses a shape and rate parameter instead. I followed the style of the example code from the numpyro version of R code 12.6 here and did the following:
def model_12_h2(F, D=None):
a = numpyro.sample("a", dist.Normal(1,1))
bF = numpyro.sample("bF", dist.Normal(0,1))
scale = numpyro.sample("scale", dist.Exponential(1))
lambda_ = numpyro.deterministic("lambda_", jnp.exp(a + bF*F))
numpyro.sample("D", dist.GammaPoisson(lambda_/scale, 1/scale), obs=D)
m12_h2 = MCMC(NUTS(model_12_h2), num_warmup=500, num_samples=500, num_chains=4)
m12_h2.run(random.PRNGKey(10), **dat)
m12_h2.print_summary(.89)
m12_h2_samples = m12_h2.get_samples()
With this output:
mean std median 5.5% 94.5% n_eff r_hat
a 2.65 0.11 2.65 2.48 2.84 1357.56 1.00
bF 0.03 0.10 0.03 -0.13 0.19 1596.73 1.00
scale 23.63 2.80 23.46 19.22 28.03 1495.34 1.00
The output is off quite a bit on the scale parameter and after a lot of tinkering I’m at a loss as to why this would be the case (the others are off as well but not so drastically).
I’ve been trying to understand how the r-to-numpyro translator came up with the GammaPoisson parameterization. I wasn’t convinced of the above GammaPoission parameterization and I reworked it more explicitly so I could understand it better from what I found in wikipedia and in this blog post:
def model_12_h2(F, D=None):
a = numpyro.sample("a", dist.Normal(1,1))
bF = numpyro.sample("bF", dist.Normal(0,1))
scale = numpyro.sample("scale", dist.Exponential(1))
beta = numpyro.deterministic("beta", 1 / scale)
mu = numpyro.deterministic("mu", jnp.exp(a + bF*F))
alpha = numpyro.deterministic("alpha", mu * beta)
numpyro.sample("D", dist.GammaPoisson(alpha, beta), obs=D)
# Output:
mean std median 5.5% 94.5% n_eff r_hat
a 2.65 0.11 2.65 2.49 2.85 1330.39 1.00
bF 0.03 0.10 0.03 -0.11 0.21 1861.32 1.00
scale 23.62 2.77 23.43 19.26 27.92 1279.12 1.00
And interestingly enough its pretty much the same as what I had with the first version of my GammaPoisson implementation. That being said, can anyone shed some light on how/why my result is quite different from the SR author’s output?
numpyro.enable_x64()
has indeed been run in the code.
Thanks in advance and sorry for the long post!