TraceTailAdaptive_ELBO settings


First, thank you for your time :slight_smile:

I am trying to use TraceTailAdaptive_ELBO with SVI for a VAE and I have some questions:

  1. I can see it does not support subsampling… so no batch training? :frowning: Will it be implemented at some point?

  2. I am not sure if this is expected, but if I run, for example, data with these settings:

Training size: 300 sequences
Validation size: 100 sequences
Hidden dimensions of NN in model: 50
Number of particles: 10

On the first iteration it runs properly and gives a shape of [300, 50], but the next iteration adds a dimension, as in [10, 300, 50], so it raises an error and cannot continue.

I have used the "number of particles" flag in other ELBO implementations and I did not have to reshape my data or anything; it was 'automatically' sorted out. So I am guessing it is associated with the compulsory vectorize_particles=True and the attempt to parallelize the computations. Is this expected behavior? Otherwise it means I have to change data shapes depending on the iteration… I am using a DataLoader (only 1 batch though) and I am not sure how to combine both scenarios.

I am confused; are there any examples? (I cannot find any on GitHub.)

Thank you very much for your time and attention,

Best wishes


currently there are no plans to extend TraceTailAdaptive_ELBO. note, however, that if you have global latent variables this method (i.e. the math) does not support data subsampling (i.e. this is not specific to this particular implementation). if you have only local latent variables (like in a vanilla VAE) then you are free to do data subsampling, and this interface should work as is.

as for your other question, i'm afraid i can't be of much help unless you provide more details (preferably code snippets).

Thanks for your quick reply :slight_smile: !

I am afraid my model contains plenty of global variables. I am using a Seq2Seq configuration. I do need to use several batches, but I wanted to see if it was even possible to run it using only 1 batch… In my head that would 'equal' no batching, but it does not seem so… or it might be a plate problem… I get confused sometimes with this, hehe

The pseudo-model is:

def guide(sequences):
    with pyro.plate("data", batch_size):
        gru_output, _ = GRU(sequences, h_0)  # bidirectional!
        z_mean, z_scale = NN1(gru_output)  # NN1 is a non-recurrent NN
        pyro.sample("latent", dist.Normal(z_mean, z_scale))

def model(sequences):
    with pyro.plate("data", batch_size):
        z = pyro.sample("latent", dist.Normal(z_loc, z_scale))
        h_0 = NN2(z)  # shape is [300, 2, 50] = [batch_size, directions, gruHiddenDim]
        gru_output, _ = GRU(sequences, h_0)  # bidirectional!
        means, scales = NN3(gru_output)
        pyro.sample("words", dist.StudentT(means, scales))

svi = SVI(vae.model, vae.guide, optimizer,
          loss=TraceTailAdaptive_ELBO(num_particles=NUM_particles, vectorize_particles=True))

The shape problem appears in the second iteration:

line 401, in model
h_0 = h_0.reshape(h_0.shape[0],2,gru_hidden_dim_Model)
RuntimeError: shape '[10, 2, 50]' is invalid for input of size 30000
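For reference, the reshape on line 401 can be made robust to a prepended particle dimension by splitting only the last axis. A sketch with dummy tensors (my guess at the shapes involved):

```python
import torch

gru_hidden_dim = 50

# without vectorized particles, h_0 is [batch, 2 * hidden] = [300, 100]
h_0 = torch.zeros(300, 2 * gru_hidden_dim)
ok = h_0.reshape(h_0.shape[0], 2, gru_hidden_dim)        # [300, 2, 50]: fine

# with vectorize_particles=True a particle dimension is prepended: [10, 300, 100]
h_0 = torch.zeros(10, 300, 2 * gru_hidden_dim)
# h_0.reshape(h_0.shape[0], 2, gru_hidden_dim) now fails, because
# h_0.shape[0] is the particle count (10), not the batch size (300).
# splitting only the last dimension works in both cases:
fixed = h_0.reshape(*h_0.shape[:-1], 2, gru_hidden_dim)  # [10, 300, 2, 50]
```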

I hope it's enough information :slight_smile:, I will provide more if necessary,


generally speaking, if your model has global latent variables and you want to use data subsampling the only variational objective you can use (i.e. the only thing allowed by the math) is a vanilla ELBO objective. in Pyro that would mean for example Trace_ELBO and TraceMeanField_ELBO and not objectives like TraceTailAdaptive_ELBO. this is basically because vanilla ELBOs decompose into sums over many log terms, which thus easily support data subsampling (depending on the precise model structure).

however, looking at your model, it appears (?) you do not in fact have global latent variables. perhaps going through this tutorial would be helpful in clarifying your modeling setup.


Yes, thanks for the help. I need to read up on it again because I am definitely confused about this :slight_smile: