Beta CDF in NumPyro

joehoover88 · January 12, 2021, 9:21pm

I am developing a model that requires calculating a posterior Beta CDF. I spent some time looking for guidance on implementing CDFs in NumPyro, but wasn’t able to find much. I did see that the suggested approach for Pyro is to import from Torch distributions. Is there a similar approach for NumPyro?

EDIT: I was just looking at Torch distributions and noticed that no cdf method is implemented for Beta.

I’ve included a model below with the line:

cdf_50 = numpyro.deterministic("cdf_50", dist.Beta(a,b).cdf(.5))

Obviously, this doesn’t work, but I thought it might be useful to provide an example of the estimate I’m trying to obtain.

Thanks!

# Generate Data

# specify number of observations for each group
n1 = 2000
n2 = 1000

# Sample observations from different beta distributions for each group, given specified sample size
np.random.seed(seed=123)
y1 = beta.rvs(.5, .5, size=n1)
np.random.seed(seed=321)
y2 = beta.rvs(3, 1, size=n2)

pads = numpy.ones((n1-n2)) * -1
y2 = np.concatenate([y2, pads])

# Combine samples
y = np.array([y1, y2], order='K').T


sample_sizes =  np.array([n1, n2])

def model(sample_sizes, y = None):
    
    # Define the number of groups from which we've attained samples
    n_groups =  sample_sizes.shape[0]
    
    # Initialize a plate for each group
    with numpyro.plate("n_groups", n_groups):
        
        # Set priors on a and b within each group
        a = numpyro.sample('a', dist.Gamma(.5, .5))
        b =  numpyro.sample('b', dist.Gamma(3, 3))
        
        # Calculate maximum group size
        I = sample_sizes.max().item()
        # Create a range of 0:I
        i = torch.arange(I).unsqueeze(-1).numpy()
        
        # Initialize a plate for each observation
        with numpyro.plate('data', I):
            # Mask observations that exceed sample size in the respective plate
            with numpyro.handlers.mask(mask = i < sample_sizes): 
                # Estimate y_hat
                y_hat = numpyro.sample('y_hat', dist.Beta(a, b), obs=y)
                
        # Example code, doesn't work         
        cdf_50 = numpyro.deterministic("cdf_50", dist.Beta(a,b).cdf(.5))

rng_key = random.PRNGKey(0)
rng_key, rng_key_ = random.split(rng_key)

num_warmup, num_samples = 500, 2000

# Run NUTS.
kernel = NUTS(model)
mcmc = MCMC(kernel, num_warmup, num_samples, num_chains = 3)

start = time.time()
mcmc.run(rng_key_, sample_sizes=sample_sizes, y = y)
end = time.time()
mcmc.print_summary()

fehiepsi · January 12, 2021, 10:18pm

Could you make a github issue or contribute? This method can be computed using betainc function.

joehoover88 · January 12, 2021, 10:19pm

Yep, I’d be happy to open an issue and make a contribution; I just wanted to make sure I wasn’t missing an implementation. Thanks!

fehiepsi · January 14, 2021, 10:00pm

For reference,

dist.Beta(a,b).cdf(.5)

can be computed as

jax.scipy.special.betainc(.5, a, b)

dmpowell · May 13, 2021, 12:51am

Depending on what you’re trying to accomplish, one issue here is that betainc function does not have all gradients implemented. There’s a relevant github issue on the jax repo.