Beta CDF in NumPyro

I am developing a model that requires calculating a posterior Beta CDF. I spent some time looking for guidance on implementing CDFs in NumPyro, but wasn’t able to find much. I did see that the suggested approach for Pyro is to import from Torch distributions. Is there a similar approach for NumPyro?

EDIT: I was just looking at Torch distributions and noticed that no cdf method is implemented for Beta.

I’ve included a model below with the line:

cdf_50 = numpyro.deterministic("cdf_50", dist.Beta(a,b).cdf(.5))

Obviously, this doesn’t work, but I thought it might be useful to provide an example of the estimate I’m trying to obtain.

Thanks!

# Generate Data

# specify number of observations for each group
n1 = 2000
n2 = 1000

# Sample observations from different beta distributions for each group, given specified sample size
np.random.seed(seed=123)
y1 = beta.rvs(.5, .5, size=n1)
np.random.seed(seed=321)
y2 = beta.rvs(3, 1, size=n2)

pads = numpy.ones((n1-n2)) * -1
y2 = np.concatenate([y2, pads])

# Combine samples
y = np.array([y1, y2], order='K').T


sample_sizes =  np.array([n1, n2])

def model(sample_sizes, y = None):
    
    # Define the number of groups from which we've attained samples
    n_groups =  sample_sizes.shape[0]
    
    # Initialize a plate for each group
    with numpyro.plate("n_groups", n_groups):
        
        # Set priors on a and b within each group
        a = numpyro.sample('a', dist.Gamma(.5, .5))
        b =  numpyro.sample('b', dist.Gamma(3, 3))
        
        # Calculate maximum group size
        I = sample_sizes.max().item()
        # Create a range of 0:I
        i = torch.arange(I).unsqueeze(-1).numpy()
        
        # Initialize a plate for each observation
        with numpyro.plate('data', I):
            # Mask observations that exceed sample size in the respective plate
            with numpyro.handlers.mask(mask = i < sample_sizes): 
                # Estimate y_hat
                y_hat = numpyro.sample('y_hat', dist.Beta(a, b), obs=y)
                
        # Example code, doesn't work         
        cdf_50 = numpyro.deterministic("cdf_50", dist.Beta(a,b).cdf(.5))

rng_key = random.PRNGKey(0)
rng_key, rng_key_ = random.split(rng_key)

num_warmup, num_samples = 500, 2000

# Run NUTS.
kernel = NUTS(model)
mcmc = MCMC(kernel, num_warmup, num_samples, num_chains = 3)

start = time.time()
mcmc.run(rng_key_, sample_sizes=sample_sizes, y = y)
end = time.time()
mcmc.print_summary()

Could you make a github issue or contribute? This method can be computed using betainc function.

Yep, I’d be happy to open an issue and make a contribution; I just wanted to make sure I wasn’t missing an implementation. Thanks!

1 Like

For reference,

dist.Beta(a,b).cdf(.5)

can be computed as

jax.scipy.special.betainc(.5, a, b)

Depending on what you’re trying to accomplish, one issue here is that betainc function does not have all gradients implemented. There’s a relevant github issue on the jax repo.