Sampling from an arbitrary model fitted with GPyTorch

Hi,
I’ve been working on a project similar to this paper (https://arxiv.org/pdf/1905.09299.pdf): basically, you fit a Gaussian process using posterior samples and can then generate more samples by running MCMC, taking the Gaussian process to be the likelihood (+ prior). This is advantageous, for example, if your likelihood is slow to evaluate (among lots of other reasons).
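
For context, the surrogate GP is fitted roughly like this (a simplified sketch; SurrogateGP, theta and logl are stand-ins for my actual model and training data, i.e. parameter draws and their expensive log-likelihood values):

import torch
import gpytorch

# theta: (N, 2) tensor of parameter draws; logl: (N,) tensor of their
# (expensive) log-likelihood evaluations, both computed offline.
class SurrogateGP(gpytorch.models.ExactGP):
    def __init__(self, train_x, train_y, likelihood):
        super().__init__(train_x, train_y, likelihood)
        self.mean_module = gpytorch.means.ConstantMean()
        self.covar_module = gpytorch.kernels.ScaleKernel(
            gpytorch.kernels.RBFKernel())

    def forward(self, x):
        return gpytorch.distributions.MultivariateNormal(
            self.mean_module(x), self.covar_module(x))

likelihood = gpytorch.likelihoods.GaussianLikelihood()
model = SurrogateGP(theta, logl, likelihood)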

My question is: if I’ve managed to fit a GP (using GPyTorch), how can I pass this to the model used by Pyro or NumPyro in MCMC? This would be easy to do in something like emcee, but I can’t quite see what the parallel would be in Pyro.

I think the main problem is that I don’t fully understand how the sample handler would work in a situation where I have an arbitrary likelihood function rather than a standard distribution. I also don’t understand how this would work given there is no observed data, i.e. I’m not fitting anything to data; I just want to draw samples according to the fitted GP posterior surface.

Thanks
Rhys

If I understand your question, you should just be able to use the potential_fn interface, as is done here for example.
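
Roughly, the pattern is something like this (an untested sketch; gp_log_prob is a placeholder for your fitted GP’s log-density and has to be torch-differentiable):

import torch
from pyro.infer.mcmc import MCMC, NUTS

def potential_fn(params):
    # params is a dict mapping site names to (unconstrained) tensors.
    # The potential energy is the *negative* log joint density.
    return -gp_log_prob(params["a"], params["b"])

kernel = NUTS(potential_fn=potential_fn)
mcmc = MCMC(kernel,
            num_samples=500,
            warmup_steps=200,
            initial_params={"a": torch.tensor(0.), "b": torch.tensor(0.)})
mcmc.run()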

Hi, thanks for your reply, it’s been really helpful. I’m able to make the code run now, but it doesn’t seem to be converging towards the correct model. The correct posterior should be a 2D Gaussian, but currently it just returns a 2D uniform, which would appear to be just sampling from the priors I’ve given. My code is the following:

def gp_prob(x, y):
    # Scale the input point into the space the GP was trained in.
    a = np.array([x, y]).T
    a_scaled = x_scaler.transform(a.reshape(1, -1)).astype(np.float32)
    model.eval()
    likelihood.eval()
    with gpytorch.settings.max_preconditioner_size(10), torch.no_grad():
        with gpytorch.settings.use_toeplitz(False), \
                gpytorch.settings.max_root_decomposition_size(30), \
                gpytorch.settings.fast_pred_var():
            preds = model(to_torch(a_scaled))
    # Undo the output scaling to recover the predicted log density.
    pred_mean = y_scaler.inverse_transform(preds.mean.cpu().numpy())
    return pred_mean

def sample_gp():
    a_prior = Uniform(0, 10)
    b_prior = Uniform(0, 10)
    a = pyro.sample("a", a_prior)
    b = pyro.sample("b", b_prior)
    return gp_prob(a, b)
 
init_params, potential_fn, transforms, _ = initialize_model(sample_gp, model_args=(),
                                                            num_chains=3)

nuts_kernel = NUTS(potential_fn=potential_fn)
mcmc = MCMC(nuts_kernel,
            num_samples=8 * 10**2,
            warmup_steps=5 * 10**2,
            num_chains=3,
            initial_params=init_params,
            transforms=transforms)
mcmc.run()

Am I not passing the model correctly?

You’re missing a sample statement in your GP; Pyro only sees the things explicitly marked with pyro.sample.

Okay, so changing this to

def sample_gp():
    a_prior = Uniform(0,10)
    b_prior = Uniform(0,10)
    a = pyro.sample("a", a_prior)
    b = pyro.sample("b", b_prior)
    return pyro.sample("post", gp_prob(a,b))

returns the error

TypeError                                 Traceback (most recent call last)
<ipython-input-151-16b808c5480b> in <module>()
----> 1 pyro.poutine.trace(sample_gp).get_trace()

~/anaconda3/lib/python3.6/site-packages/pyro/poutine/trace_messenger.py in get_trace(self, *args, **kwargs)
    161         Calls this poutine and returns its trace instead of the function's return value.
    162         """
--> 163         self(*args, **kwargs)
    164         return self.msngr.get_trace()

~/anaconda3/lib/python3.6/site-packages/pyro/poutine/trace_messenger.py in __call__(self, *args, **kwargs)
    141                                       args=args, kwargs=kwargs)
    142             try:
--> 143                 ret = self.fn(*args, **kwargs)
    144             except (ValueError, RuntimeError):
    145                 exc_type, exc_value, traceback = sys.exc_info()

<ipython-input-146-65b780a93e28> in sample_gp()
      4     a = pyro.sample("a", a_prior)
      5     b = pyro.sample("b", b_prior)
----> 6     return pyro.sample("post", gp_prob(a,b))

~/anaconda3/lib/python3.6/site-packages/pyro/primitives.py in sample(name, fn, *args, **kwargs)
    108             msg["is_observed"] = True
    109         # apply the stack and return its return value
--> 110         apply_stack(msg)
    111         return msg["value"]
    112 

~/anaconda3/lib/python3.6/site-packages/pyro/poutine/runtime.py in apply_stack(initial_msg)
    193             break
    194 
--> 195     default_process_message(msg)
    196 
    197     for frame in stack[-pointer:]:  # reversed(stack[0:pointer])

~/anaconda3/lib/python3.6/site-packages/pyro/poutine/runtime.py in default_process_message(msg)
    154         return msg
    155 
--> 156     msg["value"] = msg["fn"](*msg["args"], **msg["kwargs"])
    157 
    158     # after fn has been called, update msg to prevent it from being called again.

TypeError: 'numpy.ndarray' object is not callable


where the original problem stems from

pyro.poutine.trace(sample_gp).get_trace()

Thanks again for your help!

gp_prob(a,b)

is a numpy array, not a Pyro distribution.

Hi @rgreen1995,
pyro.sample takes a distribution object as its second parameter and, when called, returns a sample from that distribution. In your case gp_prob(a, b) is just a regular numpy array, and that’s what Pyro is complaining about.

With the current interface in Pyro’s dev branch, I’m not sure you even need to define a model, but if you don’t use Pyro’s modeling semantics you will need to appropriately transform x and y from the constrained domain (the [0, 10] interval, that is) to the unconstrained domain in your potential function. With your example, too, you can probably inject a log density term as follows (not tested):

def sample_gp():
    a_prior = Uniform(0, 10)
    b_prior = Uniform(0, 10)
    a = pyro.sample("a", a_prior)
    b = pyro.sample("b", b_prior)
    return pyro.sample("post",
                       dist.Delta(torch.tensor(0.), log_density=gp_prob(a, b)),
                       obs=torch.tensor(0.))

This is basically just using a delta function to inject the likelihood term into your model. Let me know if that works.
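
To see why this works: Delta(v, log_density=q).log_prob(v) evaluates to q, so observing v at that site adds exactly your GP term to the joint log density. A quick sanity check:

import torch
import pyro.distributions as dist

# The log_prob at the Delta's location is just the injected log_density.
d = dist.Delta(torch.tensor(0.), log_density=torch.tensor(-3.2))
print(d.log_prob(torch.tensor(0.)))  # tensor(-3.2000)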

Hi, thanks again everyone for the help. @neerajprad, that seems to work! However, the output is a bit strange: it seems to find the correct posterior in the high-density areas, but the contrast between high- and low-probability regions doesn’t seem to be reflected by the samples, so even after 20000 iterations the sampler is still drawing points where there should be almost zero support. The plots below show this point clearly.

I’m not sure if it just needs more samples, but that seems like quite a large burn-in. Any ideas why this might be the case?

Thanks again
Rhys

It’s also very dependent on the prior. For example, changing to

def sample_gp():
    a_prior = Normal(0, 10)
    b_prior = Normal(0, 10)
    a = pyro.sample("a", a_prior)
    b = pyro.sample("b", b_prior)
    return pyro.sample("post",
                       Delta(torch.tensor(0.), log_density=gp_prob(a, b)),
                       obs=torch.tensor(0.))

returns drastically different samples.

Of course it’s bound to be dependent on the prior, but I’m surprised the results change so drastically.

It is hard to say without examining the model run. I have some questions / tips for further debugging:

  • All the operations on a and b in gp_prob should be torch operations so that they are visible to the autograd engine; the conversion to numpy is therefore problematic. Pyro’s HMC needs to be able to compute the derivative of your potential-energy function w.r.t. all the sample parameters (see the sketch after this list).
  • What are the acceptance rates that you are seeing in your sampling? And what are the n_eff and r_hat statistics from mcmc.summary()?
  • For b, why do we see values like -1? That should clearly be outside the support. Are you plotting the constrained or unconstrained values?
  • Could you also try other priors, such as a half-Cauchy?
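
On the first point, here is a rough sketch of a fully-torch gp_prob (untested; x_mean, x_std, y_mean and y_std are tensor stand-ins for your sklearn scalers, and the torch.no_grad() block has to go so gradients can flow back to a and b):

def gp_prob(a, b):
    # Build the input with torch ops only, so autograd can trace through it.
    x = torch.stack([a, b]).reshape(1, -1)
    x_scaled = (x - x_mean) / x_std          # replaces x_scaler.transform
    with gpytorch.settings.fast_pred_var():  # note: no torch.no_grad() here
        pred = model(x_scaled)
    # Undo the output scaling in torch (replaces y_scaler.inverse_transform).
    return pred.mean * y_std + y_mean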