I was reading some Pyro code samples from a set of blog posts that translate the R code from McElreath's Statistical Rethinking book into Pyro. One thing that confused me was the use of the log probability when drawing samples from a distribution. I was trying to understand why the log is used here: is it an artifact of using Pyro, or is there some deeper technical reason? I suspect it is the former.
First, here is the original R code.
p_grid <- seq( from=0 , to=1 , length.out=1000 )
prob_p <- rep( 1 , 1000 )
prob_data <- dbinom( 6 , size=9 , prob=p_grid )
posterior <- prob_data * prob_p
posterior <- posterior / sum(posterior)
# NOTE THAT THE POSTERIOR IS USED UNLOGGED
samples <- sample( p_grid , prob=posterior , size=1e4 , replace=TRUE )
Note that the posterior variable holds the probability of each grid value under the posterior distribution.
Here is the corresponding Pyro code.
import torch
import pyro
import pyro.distributions as dist

p_grid = torch.linspace(start=0, end=1, steps=1000)
prior = torch.tensor(1.).repeat(1000)
likelihood = dist.Binomial(total_count=9, probs=p_grid).log_prob(torch.tensor(6.)).exp()
posterior = likelihood * prior
posterior = posterior / sum(posterior)
# NOTE THE `POSTERIOR` VARIABLE IS NOW LOGGED????
samples = pyro.distributions.Empirical(p_grid, posterior.log()).sample(torch.Size([int(1e4)]))
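As a sanity check, I convinced myself that the logged weights describe the same distribution as the unlogged posterior: drawing from Empirical with posterior.log() and drawing grid values directly with torch.multinomial on the unlogged posterior give essentially the same samples. (The multinomial comparison below is my own sketch, not from the blog posts.)

import torch
import pyro.distributions as dist

# Recompute the grid posterior from the snippet above.
p_grid = torch.linspace(start=0, end=1, steps=1000)
likelihood = dist.Binomial(total_count=9, probs=p_grid).log_prob(torch.tensor(6.)).exp()
posterior = likelihood / likelihood.sum()  # flat prior, so normalizing the likelihood suffices

# Way 1: Pyro's Empirical with *logged* weights.
emp_samples = dist.Empirical(p_grid, posterior.log()).sample(torch.Size([10000]))

# Way 2: categorical sampling with the *unlogged* weights, like R's sample().
idx = torch.multinomial(posterior, num_samples=10000, replacement=True)
mult_samples = p_grid[idx]

print(emp_samples.mean(), mult_samples.mean())  # both ~0.64 (the Beta(7, 4) posterior mean, 7/11)

Both sample means land around 0.64, so the log() call does not change the distribution being sampled.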
Again, I was just trying to understand why posterior.log() is used in that last line of Pyro code. Is that an implementation detail of using Pyro, since the R code does nothing equivalent?
Note that I understand that the pyro.distributions.Empirical() constructor takes log_weights, so that part makes sense. BUT, is there a computational advantage to providing logged weights rather than unlogged weights?
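For what it is worth, my current guess is that the advantage is numerical stability rather than anything Pyro-specific: weights often arise as products of many small likelihood terms, and those products underflow to zero in floating point, while the equivalent sum of logs stays representable. A toy illustration (my own sketch, not from the blog posts):

import torch

# Imagine a weight built from 2000 likelihood terms, each with probability exp(-5).
log_liks = torch.full((2000,), -5.0)

# Multiplying the raw probabilities underflows to exactly zero in float32...
print(log_liks.exp().prod())  # tensor(0.)

# ...while the same quantity is perfectly representable as a log weight.
print(log_liks.sum())         # tensor(-10000.)

In this grid example the 1000 posterior probabilities are well scaled, which presumably is why the R code gets away with unlogged weights; the log form just seems to be the safer general-purpose convention, and it happens to be the parameterization Empirical exposes.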