Model Based Machine Learning Book Chapter 2 Skills example in Pyro - tensor dimension issue

Thanks, @fritzo. Actually, I had not made that particular connection. Good to know though. So with to_event(1) the variables are considered dependent, while with plate, they are independent. Is it that simple?

So the only time neither plate nor to_event is required is when we are dealing with a scalar global variable?

So let me ask both @fritzo and Jeffmax:

If it is a choice between to_event(1) and plate, why not choose plate, since the individual questions are independent of each other? Would the code not run faster? Could you please explain? Thanks.

Gordon

I am still confused about how to do inference in this model. I have inferred guess_probs using SVI and retrieved the skill sets using infer_discrete on the model. Given the model, each skill can only be 0 or 1, correct? However, in Chapter 2 of Winn & Bishop, Figure 2.7 suggests that the skills are real numbers between 0 and 1. I do not understand how to achieve that with the model they are using, or with the model and guide below. Does this sound confusing?

import operator
from functools import reduce

import torch
import pyro
import pyro.distributions as dist
from torch.distributions import constraints


def complete_model_tensor():
    alpha0 = torch.ones(48) * 2.5
    beta0 = torch.ones(48) * 7.5

    guess_probs = pyro.sample('guess_prob',
                              dist.Beta(alpha0, beta0).to_event(1))

    with pyro.plate("participants", 22):
        skills = []
        for i in pyro.plate("skills", 7):
            skills.append(pyro.sample("skill_{}".format(i),
                                      dist.Bernoulli(0.5),
                                      infer={"enumerate": "parallel"}))

        for q in pyro.plate("questions", 48):
            # skills_needed and prob_mistake come from the notebook's setup
            has_skills = reduce(operator.mul,
                                [skills[i] for i in skills_needed[q]]).float()
            prob_correct = has_skills * (1 - prob_mistake) + (1 - has_skills) * guess_probs[q]
            pyro.sample("isCorrect{}".format(q),
                        dist.Bernoulli(prob_correct))

def complete_model_tensor_guide():
    # The guide only covers the continuous guess_prob; the Bernoulli skills
    # are handled by enumeration rather than by a variational distribution.
    guess_prob_a = pyro.param('guess_prob_a', torch.ones(48) * 4,
                              constraint=constraints.positive)
    guess_prob_b = pyro.param('guess_prob_b', torch.ones(48) * 4,
                              constraint=constraints.positive)
    guess_probs = pyro.sample('guess_prob',
                              dist.Beta(guess_prob_a, guess_prob_b).to_event(1))
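
Concretely, what I did looks roughly like this (a sketch: the observed-data dict, learning rate, and step count are placeholders, and skills_needed / prob_mistake come from earlier in the notebook):

import torch
import pyro
from pyro import poutine
from pyro.infer import SVI, TraceEnum_ELBO, infer_discrete
from pyro.optim import Adam

# Placeholder: observed answers, one (22,)-shaped 0/1 tensor per question site.
observed = {"isCorrect{}".format(q): torch.zeros(22) for q in range(48)}
conditioned_model = pyro.condition(complete_model_tensor, data=observed)

# SVI for the continuous guess_prob; the Bernoulli skills are enumerated out.
svi = SVI(conditioned_model, complete_model_tensor_guide,
          Adam({"lr": 0.05}), loss=TraceEnum_ELBO(max_plate_nesting=1))
for step in range(1000):
    svi.step()

# Fill in the discrete skills given the trained guess_prob guide.
guide_trace = poutine.trace(complete_model_tensor_guide).get_trace()
trained_model = poutine.replay(conditioned_model, trace=guide_trace)
sampled_model = infer_discrete(trained_model, temperature=1, first_available_dim=-2)
skill_trace = poutine.trace(sampled_model).get_trace()
skill_0 = skill_trace.nodes["skill_0"]["value"]  # shape (22,), each entry 0. or 1.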

My understanding of this (which could be wrong) is the following. I welcome any corrections!

Some algorithms that can take advantage of independence assumptions could run slower if .to_event is used where there is actual independence, but the answers shouldn't be wrong. Sometimes I suspect .to_event is used simply because it is less typing and, in a particular situation, may not be meaningfully different from using plate. I think that, while not called out that often in the docs, Pyro wants every sample site with non-trivial batch shape to be annotated with either plate or to_event (especially if you turn on validation), so you have to use one of them.
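
To make that concrete, here is a minimal sketch of the two annotations on a toy 48-dimensional site (names are made up; both pass validation):

import torch
import pyro
import pyro.distributions as dist

def model_with_to_event():
    # One site whose 48 components are folded into the event shape;
    # Pyro makes no independence claim about them.
    pyro.sample("g", dist.Beta(torch.ones(48) * 2.5, torch.ones(48) * 7.5).to_event(1))

def model_with_plate():
    # The same 48 components declared conditionally independent, which
    # independence-aware machinery (enumeration, subsampling) can exploit.
    with pyro.plate("questions", 48):
        pyro.sample("g", dist.Beta(2.5, 7.5))

Both give the same total log density here; the difference is only in what Pyro is told about the dependence structure.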

In my case, I use .to_event() on a variable x to specify that the event_shape of the distribution which generates x is x.shape. I use plate to declare independence.

Sometimes, when they are equivalent and x.dim() >= 2, I'll use .to_event() to simplify the code, as @jeffmax pointed out. But I don't recommend doing it that way. Two plate statements are more explicit IMO.
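
For example, with a (22, 7)-shaped variable (shapes made up for illustration), the two equivalent spellings would be:

import torch
import pyro
import pyro.distributions as dist

loc = torch.zeros(22, 7)

def with_to_event():
    # event_shape becomes (22, 7): less typing, but no independence declared
    pyro.sample("x", dist.Normal(loc, 1.0).to_event(2))

def with_two_plates():
    # the same shape with explicit independence along both dimensions
    with pyro.plate("rows", 22, dim=-2):
        with pyro.plate("cols", 7, dim=-1):
            pyro.sample("x", dist.Normal(loc, 1.0))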

In the following model, the skills are enumerated. Naively I count 2^(22*7) terms, which is huge, and yet it works. What is the real count? Thanks.

with pyro.plate("participants", 22):
    skills = []
    # Enumerate over skills (2 values for each skill)
    for i in pyro.plate("skills", 7):
        # skills: 0 or 1
        skills.append(pyro.sample("skill_{}".format(i),
                                  dist.Bernoulli(0.5),
                                  infer={"enumerate": "parallel"}))

    for q in pyro.plate("questions", 48):
        # skills_needed[q] is a list of the skills needed for the qth question
        has_skills = reduce(operator.mul,
                            [skills[i] for i in skills_needed[q]]).float()
        prob_correct = has_skills * (1 - prob_mistake) + (1 - has_skills) * guess_probs[q]
        # Conditioned on the data (done outside this routine)
        is_correct = pyro.sample("isCorrect{}".format(q),
                                 dist.Bernoulli(prob_correct))
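
One way I can think of to check what enumeration actually allocates (a sketch, assuming the complete_model_tensor and the skills_needed / prob_mistake globals from the earlier post):

from pyro import poutine

# Each enumerated skill site gets its own size-2 tensor dimension to the left
# of the participants plate; printing the trace shapes shows what is materialized.
trace = poutine.trace(
    poutine.enum(complete_model_tensor, first_available_dim=-2)).get_trace()
trace.compute_log_prob()
print(trace.format_shapes())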

I am currently trying to wrap my head around the Pyro plate system and to get my Pyro notebook for chapter 2 up and running.

Therefore, I wondered if you can recommend resources besides the official tutorials?
Maybe you also have your latest notebooks online somewhere?
I also wonder if a "more vectorized" approach could be used for the model in this chapter?

Any help would be greatly appreciated. :slight_smile:

@MicPie did you try using the code in the post above? That mostly worked for me; however, I ended up needing to use MCMC to get answers close to the book's in a reasonable amount of time.
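
In case it helps, the MCMC route I mean is roughly this (a sketch: the sample counts are arbitrary, and conditioned_model is the model conditioned on the chapter's answer data, e.g. via pyro.condition as in the earlier post; as far as I understand, NUTS marginalizes out the enumerated Bernoulli skills):

from pyro.infer import MCMC, NUTS

# NUTS samples the continuous guess_prob; the discrete skills are enumerated out.
kernel = NUTS(conditioned_model, max_plate_nesting=1)
mcmc = MCMC(kernel, num_samples=500, warmup_steps=500)
mcmc.run()
guess_prob_samples = mcmc.get_samples()["guess_prob"]  # shape (500, 48)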