I am trying to learn about Bayesian networks and am having a hard time figuring out how to set up some simple models.

Say I have a model like this:

```
A -> C <- B
```

i.e., C has A and B as parents. A and B are discrete, with A ∈ {0, 1, 2} and B ∈ {0, 1, 2}, so there are 9 possible combinations of the states of A and B.

In the Bayesian network, I can model the prior distributions over A and B as Dirichlet distributions. Using Pyro syntax, I have:

```
A = pyro.sample("A", dist.Dirichlet(torch.ones(3)))
B = pyro.sample("B", dist.Dirichlet(torch.ones(3)))
```

Now, if I want to model the child node C as a Gaussian conditioned on its parents, is it correct that I need to specify/estimate the parameters of 9 conditional Gaussians, i.e.

```
P(C | A=0, B=0), P(C | A=1, B=0), ..., P(C | A=2, B=2)
```
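As a quick check on the bookkeeping, the count of conditionals and parameters can be enumerated directly. This is only a sketch; the names `states_a`, `states_b`, `combos`, and `n_params` are illustrative and not part of any library.

```python
from itertools import product

states_a = [0, 1, 2]
states_b = [0, 1, 2]

# One Gaussian P(C | A=a, B=b) per joint parent state,
# each with its own mean and standard deviation.
combos = list(product(states_a, states_b))
n_params = 2 * len(combos)

print(len(combos), n_params)  # 9 conditionals, 18 parameters
```

With 10 states per parent, the same count gives 100 conditionals and 200 parameters.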

So, if I want to give it the full Bayesian treatment, I need to define priors over the mean and standard deviation of each of these distributions:

```
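# One pair of priors per (A, B) combination (pseudocode; hyperparameters not defined here)
C_mean_A0_B0 = pyro.sample("C_mean_A0_B0", dist.Normal(loc=mean_prior, scale=mean_std))
C_std_A0_B0 = pyro.sample("C_std_A0_B0", dist.Gamma(concentration, rate))
...
C_mean_A2_B2 = ...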
```

My first questions are whether this setup is correct and whether there is a succinct way to express these conditional distributions in Pyro (in reality my discrete parents have about 10 states each).

Additionally, I have some data of the form:

```
A   B   C
--------------
0   1   -21.76
1   1   50.5
....
```

I would like to be able to infer the posterior distribution over each of the parameters above with Pyro. Could someone comment on what inference/optimization setup I should use? Is it right that I have no latent variables, since A, B, and C are all observed in this case?