# Building a Gaussian mixture with Relaxed Bernoulli/Categorical

Hi, I am trying to implement a 1-D Gaussian mixture of two components with the Relaxed Bernoulli (the binary case of the Concrete/Gumbel-Softmax distribution) as the variational posterior.
However, the model does not converge to the correct result:

```python
import torch
from torch.distributions import constraints
from tqdm import tqdm

import pyro
import pyro.distributions as dist
from pyro.infer import SVI, Trace_ELBO
from pyro.optim import Adam

p = 0.6
n_sample = 1000
loc1, loc2 = -6.0, 3.0
scale = 0.5
# draw each point from component 1 with probability p, else component 2
which = dist.Bernoulli(p).sample((n_sample,)).bool()
data = torch.where(which,
                   dist.Normal(loc1, scale).sample((n_sample,)),
                   dist.Normal(loc2, scale).sample((n_sample,)))

def model(data):
    weights = pyro.param('weights', torch.tensor(0.5))
    locs = pyro.param('locs', torch.randn(2,))
    with pyro.plate('data', len(data)):
        assignment = pyro.sample('assignment', dist.Bernoulli(weights)).long()
        pyro.sample('obs', dist.Normal(locs[assignment], 1.0), obs=data)

T = 0.5

def guide(data):
    with pyro.plate('data', len(data)):
        alpha = pyro.param('alpha', torch.rand(len(data)),
                           constraint=constraints.unit_interval)
        pyro.sample('assignment',
                    dist.RelaxedBernoulliStraightThrough(torch.tensor(T), probs=alpha))

def train(data, svi, num_iterations):
    losses = []
    pyro.clear_param_store()
    for j in tqdm(range(num_iterations)):
        loss = svi.step(data)
        losses.append(loss)
    return losses

def initialize(seed, data, model, guide, optim):
    pyro.set_rng_seed(seed)
    pyro.clear_param_store()
    svi = SVI(model, guide, optim, Trace_ELBO(num_particles=50))
    return svi.loss(model, guide, data)

n_iter = 500
pyro.clear_param_store()
optim = Adam({'lr': 0.1, 'betas': [0.9, 0.99]})
# pick the seed with the lowest initial ELBO loss
loss, seed = min(
    [(initialize(seed, data, model, guide, optim), seed) for seed in range(100)]
)
pyro.set_rng_seed(seed)
svi = SVI(model, guide, optim, loss=Trace_ELBO(num_particles=50))
losses = train(data, svi, n_iter)
```
```
pyro.param('locs')
Out[50]:
```

Is there a good way to debug this model or fix this issue? (If my implementation is correct, I believe this is caused by a local-minimum problem.)

Have you tried a lower temperature `T`?

Yes, I tried that, but it did not help. (To reduce the gradient variance caused by the low temperature, I also increased `num_particles` to 100.)
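For reference, the effect of `T` can be seen directly from the sampling formula of the binary Concrete / Relaxed Bernoulli, z = sigmoid((logit(p) + L) / T) with L standard logistic noise. A pure-Python sketch (no Pyro; the helper names are mine) showing how a lower temperature pushes samples toward the endpoints:

```python
import math
import random

def stable_sigmoid(x):
    # numerically safe logistic function
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    return math.exp(x) / (1.0 + math.exp(x))

def relaxed_bernoulli_sample(p, temperature, rng):
    """One binary Concrete (Relaxed Bernoulli) sample:
    sigmoid((logit(p) + logistic noise) / temperature)."""
    u = rng.random()
    noise = math.log(u) - math.log(1.0 - u)   # standard logistic noise
    logit_p = math.log(p) - math.log(1.0 - p)
    return stable_sigmoid((logit_p + noise) / temperature)

rng = random.Random(0)
hot = [relaxed_bernoulli_sample(0.6, 2.0, rng) for _ in range(1000)]   # T = 2.0
cold = [relaxed_bernoulli_sample(0.6, 0.1, rng) for _ in range(1000)]  # T = 0.1

# lower temperature concentrates samples near the endpoints {0, 1}
frac_extreme_hot = sum(z < 0.05 or z > 0.95 for z in hot) / len(hot)
frac_extreme_cold = sum(z < 0.05 or z > 0.95 for z in cold) / len(cold)
print(frac_extreme_hot, frac_extreme_cold)
```

As T shrinks the samples become nearly discrete, but the pathwise gradient of the sigmoid blows up like 1/T, which is exactly the variance trade-off above.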

I also tried initializing the `locs` parameter to the ground-truth means [-6.0, 3.0]. However, they both shrink to somewhere around -0.5 after 1000 iterations:

P.S. Very interestingly, I cannot find any Gumbel-Softmax-based implementation of a mixture model on the Internet.

it may be that `RelaxedBernoulliStraightThrough` is buggy and/or numerically unstable. this distribution hasn’t seen much usage afaik. have you looked at the implementation?

The implementation looks good to me.

In addition, following the test case for the one-hot categorical distribution (https://github.com/pyro-ppl/pyro/blob/b31963692e176a5099027dd4837c8a4cfe673a75/tests/distributions/test_relaxed_straight_through.py#L60), I ran the code below:

```python
pyro.clear_param_store()

def model():
    p = torch.tensor([0.8])
    pyro.sample('z', Bernoulli(probs=p))

def guide():
    q = pyro.param('q', torch.tensor([0.4]), constraint=constraints.unit_interval)
    temp = torch.tensor(0.05)
    pyro.sample('z', RelaxedBernoulliStraightThrough(temperature=temp, probs=q))

adam = Adam({'lr': 0.05})  # optimizer not shown above; this learning rate is a guess
svi = SVI(model, guide, adam, loss=Trace_ELBO(num_particles=100, vectorize_particles=True))

losses = []
for k in range(6000):
    loss = svi.step()
    losses.append(loss)

print(pyro.param('q'))  # should converge to ~0.8
```

Clearly, this “test case” failed.
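For context, the straight-through trick that `RelaxedBernoulliStraightThrough` is meant to implement returns the hard 0/1 quantization of the relaxed sample in the forward pass, while backpropagating the gradient of the soft sample. A plain-Python sketch of a single draw (the function name is mine, not Pyro API):

```python
import math

def st_relaxed_bernoulli(logit_q, temperature, u):
    """One straight-through Relaxed Bernoulli sample (sketch).
    u is a uniform(0, 1) draw; returns (hard 0/1 forward value,
    gradient passed backward w.r.t. logit_q)."""
    noise = math.log(u) - math.log(1.0 - u)                          # logistic noise
    soft = 1.0 / (1.0 + math.exp(-(logit_q + noise) / temperature))  # relaxed sample
    hard = 1.0 if soft > 0.5 else 0.0                                # forward: quantized
    grad = soft * (1.0 - soft) / temperature                         # backward: d soft / d logit_q
    return hard, grad

hard, grad = st_relaxed_bernoulli(logit_q=0.4, temperature=0.5, u=0.7)
# the forward value is exactly 0 or 1, yet the gradient is nonzero
```

The backward pass pretends the hard sample were the soft one, so the estimator is biased; the bias grows as the soft sample saturates, which may matter at very low temperatures.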

```python
pyro.clear_param_store()

def model(T):
    p = torch.tensor([0.8])
    temp = torch.tensor(T)
    pyro.sample('z', RelaxedBernoulli(temp, p))

def guide(T):
    q = pyro.param('q', torch.tensor([0.4]), constraint=constraints.unit_interval)
    temp = torch.tensor(T)
    pyro.sample('z', RelaxedBernoulli(temperature=temp, probs=q))

svi = SVI(model, guide, adam, loss=Trace_ELBO(num_particles=100, vectorize_particles=True))

losses = []
T = 1.0
for k in range(6000):
    loss = svi.step(T)
    # anneal the temperature, floored at 0.5; note the factor 0.999 ** k
    # compounds per step, so T hits the floor within roughly 40 iterations
    T = max(0.5, T * (0.999 ** k))
    losses.append(loss)

print(pyro.param('q'))
```