Text generation Markov Model

Hi everyone,
I am trying to implement a basic Markov model for next-character generation (English letters), where each character is predicted from the previous one (a pairwise model).
The data is an array of ordinal values (0-25), one per character, and I need a probabilistic way to generate the next character:
P(c2|c1), where c1 is the given character and c2 is the following character.
I understand this can be done by counting and computing frequencies, but I am learning Pyro, so I want to understand how to set up this model in a way I can build on to create more complex models.

  1. Is it necessary to convert the data to a tensor of shape (26,26) holding the counts to get things set up, or can the model be designed to learn from one row at a time?
  2. Assuming the count matrix is set up, does the code below make sense?
num_characters = 26
def model(counts):
    next_ch_probs = pyro.sample('next_ch_probs', dist.Dirichlet(torch.ones(num_characters,num_characters)/num_characters))
    pyro.sample('counts', dist.Multinomial(26*26, next_ch_probs), obs=counts)
  3. If you have a better way of framing the problem or an example to share, please do. I am trying to learn.
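
For reference, the (26,26) count matrix from question 1 could be built from the ordinal array roughly as follows. This is only a sketch, assuming the data is a 1-D integer tensor; the helper name bigram_counts and the toy input "banana" are made up for illustration and are not part of the original data.

```python
import torch

num_characters = 26

def bigram_counts(chars: torch.Tensor, num_characters: int = 26) -> torch.Tensor:
    """Count how often character c2 directly follows character c1.

    `chars` is a 1-D integer tensor of ordinals in [0, num_characters).
    Returns a (num_characters, num_characters) float tensor whose entry
    [c1, c2] is the number of times c2 came right after c1.
    """
    prev_ch, next_ch = chars[:-1], chars[1:]
    # Flatten each (c1, c2) pair into a single index, then histogram.
    flat = prev_ch * num_characters + next_ch
    flat_counts = torch.bincount(flat, minlength=num_characters * num_characters)
    return flat_counts.reshape(num_characters, num_characters).float()

# Hypothetical toy input: "banana" as ordinals (a=0, b=1, ..., z=25).
text = "banana"
chars = torch.tensor([ord(c) - ord("a") for c in text])
counts = bigram_counts(chars)
```

Row c1 of counts then holds the observed next-character counts for c1, which is the shape the model in question 2 expects as its observation.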

Hmm, I think you’ll want to fit a plate full of multinomials:

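import torch
import pyro
import pyro.distributions as dist
num_characters = 26
def model(counts):
    # to_event(1) folds the row dimension into the event, since this site sits outside the plate
    next_ch_probs = pyro.sample(
        "next_ch_probs",
        dist.Dirichlet(torch.ones(num_characters, num_characters) / num_characters).to_event(1),
    )
    with pyro.plate("characters", num_characters):
        pyro.sample("counts", dist.Multinomial(probs=next_ch_probs, validate_args=False), obs=counts)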

where validate_args=False works around Multinomial's lack of support for heterogeneous counts (each row of counts has a different total).