I am developing a model that requires calculating a posterior Beta CDF. I spent some time looking for guidance on implementing CDFs in NumPyro, but wasn’t able to find much. I did see that the suggested approach for Pyro is to import from Torch distributions. Is there a similar approach for NumPyro?
EDIT: I was just looking at Torch distributions and noticed that no cdf method is implemented for Beta.
I’ve included a model below with the line:
cdf_50 = numpyro.deterministic("cdf_50", dist.Beta(a,b).cdf(.5))
Obviously, this doesn’t work, but I thought it might be useful to provide an example of the estimate I’m trying to obtain.
Thanks!
# Generate Data
# specify number of observations for each group
n1 = 2000
n2 = 1000
# Sample observations from different beta distributions for each group, given specified sample size
np.random.seed(seed=123)
y1 = beta.rvs(.5, .5, size=n1)
np.random.seed(seed=321)
y2 = beta.rvs(3, 1, size=n2)
pads = numpy.ones((n1-n2)) * -1
y2 = np.concatenate([y2, pads])
# Combine samples
y = np.array([y1, y2], order='K').T
sample_sizes = np.array([n1, n2])
def model(sample_sizes, y = None):
# Define the number of groups from which we've attained samples
n_groups = sample_sizes.shape[0]
# Initialize a plate for each group
with numpyro.plate("n_groups", n_groups):
# Set priors on a and b within each group
a = numpyro.sample('a', dist.Gamma(.5, .5))
b = numpyro.sample('b', dist.Gamma(3, 3))
# Calculate maximum group size
I = sample_sizes.max().item()
# Create a range of 0:I
i = torch.arange(I).unsqueeze(-1).numpy()
# Initialize a plate for each observation
with numpyro.plate('data', I):
# Mask observations that exceed sample size in the respective plate
with numpyro.handlers.mask(mask = i < sample_sizes):
# Estimate y_hat
y_hat = numpyro.sample('y_hat', dist.Beta(a, b), obs=y)
# Example code, doesn't work
cdf_50 = numpyro.deterministic("cdf_50", dist.Beta(a,b).cdf(.5))
rng_key = random.PRNGKey(0)
rng_key, rng_key_ = random.split(rng_key)
num_warmup, num_samples = 500, 2000
# Run NUTS.
kernel = NUTS(model)
mcmc = MCMC(kernel, num_warmup, num_samples, num_chains = 3)
start = time.time()
mcmc.run(rng_key_, sample_sizes=sample_sizes, y = y)
end = time.time()
mcmc.print_summary()