Two quick questions related to pyro.deterministic:
Using deterministic primitives leads to lots of instances of RuntimeWarning when sampling from the model outside of an SVI instance (e.g. below). Is there a way to avoid these (aside from warnings.filterwarnings())?
Are there side effects from including deterministic primitives with the same name in the model and guide? I couldn’t quite get my head around what the implications of replaying multiple observed sample statements against one another are.
Using deterministic primitives leads to lots of instances of RuntimeWarning
Thanks for reporting; this is just a bug. pyro.deterministic is fairly new and we haven't worked out all the edge cases. We'll remove the RuntimeWarning before the next release.
Are there side effects from including deterministic primitives with the same name in the model and guide? I couldn’t quite get my head around … the implications
Hmm I think it should be unnecessary to include pyro.deterministic statements in the guide, and I’d expect Pyro to error in such a case. Can you give an example of when you’d like to include a pyro.deterministic statement in a guide?
Re the use of pyro.deterministic in the guide, it came from my other slightly confused question about ways to reuse code in model(). As you say, I think it is unnecessary, though Pyro doesn’t error.
Edit: OK, it seems to work. You have to call pyro.sample on the final observable variable in the guide, which is normally not needed (or even not recommended).
Update: It turns out that deterministic does not work when passed as the mean for Normal (I did not check other distributions). For that purpose I replaced it with a pyro.sample from a Delta distribution (needed in both the model and the guide). @fritzo can you comment on this?
I think deterministic should work in recent releases. There are some problems with your model/guide:
I think obs should not appear in guide
If you want to get values of sigma, which is transformed from multiplier, you should replace multiplier=2 by multiplier = pyro.param('multiplier', ...) in your model. You can use Predictive to get values of any sites that you want, including deterministic sites.
Without obs in guide, multiplier does not optimize.
I don't want to get the values of sigma - it is obvious how to obtain them. I want to pass pyro.deterministic to Normal and optimize the parameters. And it does not work.
Input:
import torch
import pyro
import pyro.distributions as dist
from pyro.infer import SVI, Trace_ELBO

def model(x, y=None):
    multiplier = 2
    sigma = pyro.sample("sigma", dist.Exponential(1))
    mean = pyro.deterministic("mean", torch.abs(x * multiplier) + 1)
    with pyro.plate("data", x.shape[0]):
        return pyro.sample("obs", dist.Normal(mean, sigma), obs=y)

def guide(x, y=None):
    sigma_rate = pyro.param("sigma_rate", torch.Tensor([3]))
    multiplier = pyro.param('multiplier', torch.Tensor([7]))
    mean = torch.abs(x * multiplier) + 1
    sigma = pyro.sample("sigma", dist.Exponential(sigma_rate))
    with pyro.plate("data", x.shape[0]):
        return pyro.sample("obs", dist.Normal(mean, sigma))

x = torch.distributions.Bernoulli(0.6).sample((100,))
y = model(x)

pyro.clear_param_store()
svi = SVI(model, guide, pyro.optim.Adam({"lr": 1e-3}), loss=Trace_ELBO())
for _ in range(1000):
    elbo = svi.step(x, y)
dict(pyro.get_param_store())
Could you try that? Currently, multiplier is the constant 2 in your model, hence the parameter multiplier is not optimized. If you want to use 2 to generate data and 7 as init value, you can define a global variable mval:
def model(x, y=None):
    multiplier = pyro.param('multiplier', mval)
    ...

mval = torch.tensor(2.)
y = model(x)
pyro.clear_param_store()
mval = torch.tensor(7.)
... # svi
Another way is
multiplier = pyro.param('multiplier',
                        lambda: torch.tensor(2.) if y is None else torch.tensor(7.))
Can I put pyro.param in a model? I don’t think so.
I get the following error in such a setting:
RuntimeError: Trying to backward through the graph a second time, but the buffers have already been freed. Specify retain_graph=True when calling backward the first time.
I do not care much about the initial value in the guide. In a real-world scenario I would initialize it randomly. Here I wanted to initialize it with a specific value to track the optimization process.
So how is it possible that it worked previously, when multiplier was applied to the scale parameter? The answer is not crucial for further applications of deterministic.