# Why does my inferenced probability larger than 1 in categorical distribution?

I am creating my first pyro program.
The task is I have a couple of categorical data and want to estimate the distribution on them.
There are four categories A, B, C, D. I assume the categorical distribution with the prior distribution:
P(A) = 0.25, P(B) =0.25, P(C ) = 0.25, P(D) = 0.25.

I define the model as:

``````    def data_model(prior_prob):
obs = np.array([0 if random.random() >0.2 else 1 for i in range(1000)]) # This is the synthesized data for test purpose
obs = torch.from_numpy(obs)
data = pyro.sample("data",dist.Categorical(probs=prior_prob),obs=obs)
return data
``````

I define the guide function:

``````def param_guide(prior_prob):
posterior = pyro.param("probs", torch.tensor(prior_prob),constraint=constraints.positive)
data = pyro.sample("data", dist.Categorical(posterior))
return data
``````

And I use SVI to inference the posterior probability distribution:

``````    prior_prob = np.array([0.25, 0.25,0.25,0.25])
prior_prob = torch.from_numpy(prior_prob)
pyro.clear_param_store()

svi = pyro.infer.SVI(model=data_model,
guide=param_guide,
optim=pyro.optim.SGD({"lr": 0.001, "momentum":0.1}),
loss=pyro.infer.Trace_ELBO())

losses, probs  = [], []
num_steps = 2500
for t in range(num_steps):
losses.append(svi.step(prior_prob))
probs.append(pyro.param("probs"))
``````

Finally, I found my probs = [7.5012e-03, 3.6106e-02, 1.0009e-02, 1.4410e+03]
Two problems here:

1. The sum of the probabilities is larger than 1.
2. I generate the data by the probabilities [0.8, 0.2, 0, 0], why do not the estimated value approach it?

there are many issues here. i suggest you read more of the tutorials, especially svi part i and parts ii and iii.

for example (there may be other issues):

• you can’t put numpy generated randomness in your model like this. you’re going to resample the data every time the model is run. instead generate your data once, outside the model.

`obs = np.array([0 if random.random() >0.2 else 1 for i in range(1000)]) # This is the synthesized data for test purpose`

• your `probs` need a `simplex` constraint not a `positive` constraint

• you probably don’t want to use `SGD` for `SVI`. try `Adam`

`````` for i in range(len(data)):
` pyro.sample("obs", dist.Bernoulli(f), obs=data)`