Boolean constraints NotImplementedError


#1

Hi, I want to set Boolean constraints for parameters, like

drop = pyro.param('drop',t.tensor([0,1]),constraint=constraints.boolean)

But I got an error saying NotImplementedError: Cannot transform _Boolean constraints.

It seems constraints.boolean has no transform implemented.

Is there any other way to constrain parameter values to {0, 1}?

Thanks!


#2

Hi, pyro.param is for things you want to optimize, typically with PyTorch’s gradient-based optimizers. You can’t directly perform gradient-based optimization on discrete parameters, and there’s no smooth transformation from continuous to discrete values, hence the error. Do you really need to optimize over discrete parameters? If so, see the Bayesian optimization tutorial. Otherwise, there’s no need to use pyro.param at all.
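
To make the point about transformations concrete, here is a minimal sketch using only PyTorch's constraint registry (torch.distributions.transform_to). I'm assuming this is the same registry lookup that produces the error in the first post. The tensor values are made up for illustration.

```python
import torch
from torch.distributions import constraints, transform_to

# A continuous constraint has a registered smooth map from unconstrained reals.
to_unit = transform_to(constraints.unit_interval)
unconstrained = torch.tensor([-3.0, 0.0, 3.0], requires_grad=True)
probs = to_unit(unconstrained)  # values strictly inside (0, 1)
probs.sum().backward()          # gradients flow back to the unconstrained tensor

# A discrete constraint has no registered map, so the lookup itself fails.
try:
    transform_to(constraints.boolean)
    boolean_error = None
except NotImplementedError as err:
    boolean_error = str(err)

print(probs.detach())
print(boolean_error)
```

So a probability in [0, 1] can be a pyro.param (via constraints.unit_interval), but a hard {0, 1} value cannot.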


#3

Thanks a lot. Actually, I don’t need to optimize over discrete parameters. I’m implementing a zero-inflated model as below, but the SVI loss is always inf.

I followed the suggestion in

Is there any bug in my code?

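import torch as t
import pyro
import pyro.distributions as dist
from pyro.distributions import constraints
from pyro.infer import SVI, Trace_ELBO

pyro.clear_param_store()
N, D = Expr.shape  # Expr: observed data tensor, samples x features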
def model(Expr=None):
    with pyro.plate('sample_dim',N) as idx:
        real_expr = pyro.sample('real_expr',dist.MultivariateNormal(t.ones(D),t.diag(t.ones(D))))
        mask = pyro.sample('mask',dist.Bernoulli(t.rand(N,D)).to_event(1))
        obs_value = real_expr*mask
        obs = pyro.sample('obs',dist.Delta(obs_value).to_event(1),obs=Expr)

def guide(Expr=None):
    p = pyro.param('p',t.rand(N,D),constraint=constraints.unit_interval)
    with pyro.plate('sample_dim',N) as idx:
        real_expr = pyro.sample('real_expr',dist.MultivariateNormal(t.ones(D),t.diag(t.ones(D))))
        mask = pyro.sample('mask',dist.Bernoulli(p).to_event(1))

optim = pyro.optim.Adam({'lr': 0.2, 'betas': [0.8, 0.99]})
elbo = Trace_ELBO()
svi = SVI(model, guide, optim, loss=elbo)
svi.loss(model, guide, Expr)   # loss is inf

I appreciate any suggestion.


#4

I’d suggest starting with a MaskedMixture distribution rather than hand-coding the masking logic. I think your two component distributions would be a MultivariateNormal and a Delta(...).to_event(1). See these tests for example usage.

I’m not sure about the model you shared, but observing a Delta distribution usually doesn’t work (it usually results in a NaN loss).
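
To illustrate the Delta issue, here is a rough sketch in plain PyTorch (not the MaskedMixture approach, and not Pyro's own Delta). A point mass gives log-probability -inf to any value that does not match exactly, which is my guess at where the infinite loss comes from. Summing the mask out by hand keeps the log-likelihood finite. The helper names, the data, and the values 0.3, 1.0 and 1.0 are all hypothetical.

```python
import math
import torch
from torch.distributions import Normal

def point_mass_log_prob(value, location):
    # Log-probability of a point mass: 0 where value == location, -inf elsewhere.
    return torch.where(value == location,
                       torch.zeros_like(value),
                       torch.full_like(value, -math.inf))

def zero_inflated_log_prob(value, drop_prob, loc, scale):
    # Mask summed out by hand: exact zeros are scored by the dropout branch,
    # every other entry by the Normal branch (assumption: a continuous value
    # is never exactly zero, so zeros are attributed to dropout).
    normal_lp = Normal(loc, scale).log_prob(value) + torch.log1p(-drop_prob)
    return torch.where(value == 0, torch.log(drop_prob), normal_lp)

# Hypothetical observed data with some exact zeros.
expr = torch.tensor([[0.0, 1.2, 0.7], [2.1, 0.0, 0.0]])
# Stands in for one draw of real_expr * mask; it cannot match expr exactly.
sampled = torch.tensor([[0.0, 0.9, 0.0], [1.5, 0.3, 0.0]])

delta_lp = point_mass_log_prob(expr, sampled).sum()
marginal_lp = zero_inflated_log_prob(
    expr, torch.tensor(0.3), torch.tensor(1.0), torch.tensor(1.0)
).sum()

print(delta_lp)     # -inf, because at least one entry does not match
print(marginal_lp)  # a finite number
```

In a real model, drop_prob, loc and scale would be parameters or latent variables rather than constants.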