Dynamic Deep Markov Model

I am trying to use the DMM code and modify it to my requirements. Specifically, I am trying to do the following:

  1. The number of time steps is different for each sample (sequence).
  2. At each time step, z_{t} generates not a single vector but a sequence of vectors, and the number of vectors that the latent code z_{t} generates is not fixed.

I came up with the following generative model that matches my requirements. I was wondering if anyone could comment on inference in such a model. Would the DMM guide work in this case as well? Could I write my model differently so as to make inference possible?

import torch
import torch.nn as nn
from torch.autograd import Variable
import pyro
import pyro.distributions as dist

# GatedTransition and Emitter_RNN (transition network and RNN emitter) are defined elsewhere

def next_state(t, z_prev, transition_function):
    # sample z_t ~ p(z_t | z_{t-1}) using the transition function
    z_mu, z_sigma = transition_function(z_prev)
    z_t = pyro.sample('z_{}'.format(t), dist.normal, z_mu, z_sigma)
    return z_t

def generate_sequence(z_t, emit, pointer, seq_len, bias_coin):
    # unroll the emitter RNN from the latent state z_t until a coin flip says
    # stop or the observed sequence runs out
    hidden = z_t
    data = emit.initInput()
    ps = []
    flip = Variable(torch.Tensor([1]))
    while flip.data[0] == 1 and pointer <= seq_len - 1:
        hidden, out = emit(data, hidden)
        ps.append(out.data)
        data = out
        # pyro.sample('flip_{}'.format(pointer), dist.bernoulli, bias_coin)?
        flip = dist.bernoulli(bias_coin)
        pointer = pointer + 1

    ps = Variable(torch.stack(ps, dim=1))
    return ps, pointer
    

def model(sequences):
    z_dim = 100
    transition_dim = 100
    data_dim = 39
    emission_dim = 100
    trans = GatedTransition(z_dim, transition_dim)
    emit = Emitter_RNN(data_dim, z_dim, emission_dim)
    z_0 = nn.Parameter(torch.zeros(z_dim))
    z_prev = z_0
    bias_coin = Variable(torch.Tensor([0.5]))

    for i in range(len(sequences)):
        sequence = sequences[i]
        pointer = 0
        t = 0
        while pointer <= len(sequence) - 1:
            # sample the next latent state, then emit a variable-length chunk
            # of observations from it
            t = t + 1
            z_t = next_state(t, z_prev, trans)
            ps, pointer = generate_sequence(z_t, emit, pointer, len(sequence), bias_coin)
            pyro.sample('obs_{}'.format(t), dist.bernoulli, ps)
            z_prev = z_t

PS: the flip statement in generate_sequence is what makes the while loop stochastic; I keep generating latent states for as long as the sequence lasts.

Thanks for any insights regarding inference in this model.

hello,

i’m not exactly clear on what you’re doing, but statements like

flip = dist.bernoulli(bias_coin)

without an explicitly named random variable will prevent you from being able to do correct SVI. some generalization of the DMM guide (take a look at the AIR example) should in principle work, although as always, the stochastic optimization problem might be quite difficult.
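To make that concrete, here is a rough sketch of what naming the flip site and mirroring it in the guide could look like, using the same pyro.sample signature as the code above. The site names, the flip_net module, and the use of the sequence index i in the names (so that sites stay unique when all sequences are scored in a single model call) are my own assumptions, not something taken from the DMM example.

import pyro
import pyro.distributions as dist

# model side: register the coin flip as a named sample site so that SVI can
# account for it (instead of flip = dist.bernoulli(bias_coin))
def model_flip(i, pointer, bias_coin):
    return pyro.sample('flip_{}_{}'.format(i, pointer), dist.bernoulli, bias_coin)

# guide side: the guide needs a sample statement with the *same* name, here
# with a flip probability computed from the guide's hidden state by flip_net,
# a hypothetical nn.Module (analogous to the Combiner in the DMM guide)
def guide_flip(i, pointer, flip_net, hidden):
    pyro.module('flip_net', flip_net)
    flip_ps = flip_net(hidden)
    return pyro.sample('flip_{}_{}'.format(i, pointer), dist.bernoulli, flip_ps)

the rest of the guide would then follow the DMM structure (an RNN over the observed sequence plus a combiner network for the z sites), with one such flip statement per inner step so that every sample site in the model has a matching site in the guide.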