Hello! I’m playing around with a toy example using the Normalizing Flows package, and I really have two questions. First, how do I properly observe samples that I want treated as coming from the same distribution? Second, does my current code using normalizing flows make any sense?
Basically, I create a dataset drawn from the distribution I want my flow model to learn (I think it’s called the Moon dataset), but I’m unsure exactly how to observe those samples so that I’m optimizing my flow’s parameters correctly. Here are the code snippets:
```python
import torch
import torch.nn as nn
import matplotlib.pyplot as plt

import pyro
import pyro.distributions as distr
from pyro.distributions import InverseAutoregressiveFlow, MultivariateNormal, TransformedDistribution
from pyro.nn import AutoRegressiveNN

# Create the dataset, stolen from a tutorial on normalizing flows
batch_size = 512
x2_dist = distr.Normal(loc=0.0, scale=4.)
x2_samples = x2_dist.sample((batch_size,))
x1 = distr.Normal(loc=.25 * x2_samples.pow(2), scale=torch.ones(batch_size))
x1_samples = x1.sample()
x_samples = torch.stack([x1_samples, x2_samples], dim=1)
print(x_samples.size())
plt.scatter(x_samples[:, 0].numpy(), x_samples[:, 1].numpy())

# Defining my flow: a stack of 6 IAF transforms over a 2-d input
# (hidden layer size of the autoregressive network is arbitrary here)
nf = [InverseAutoregressiveFlow(AutoRegressiveNN(2, [40])) for _ in range(6)]
nf_module = nn.ModuleList(nf)


def guide(samples, train=True):
    # guide is empty because I'm not trying to approximate the posterior...
    return 0.0


mu = torch.zeros(2)
sigma = torch.eye(2)


def model(samples, train=True):
    # let pyro know about my flow parameters, because that's all I want to have updated
    pyro.module('nf', nf_module)
    # shape I assume is... event_shape = 2 because this is a multivariate Gaussian
    dist = TransformedDistribution(MultivariateNormal(mu, sigma), nf)
    # how do I properly observe all of the samples as being from the same distribution?
    # every tutorial seems to say "throw a plate in there" to declare independence...
    # but what does that mean, really?
    with pyro.plate('batch'):
        if train:
            z = pyro.sample('z', dist, obs=samples)  # are these observed as being from the same distribution?
        else:
            z = pyro.sample('z', dist)  # a less elegant way to sample later for plotting
    return z
```
Not sure if this is clear enough, but basically I have a 512 x 2 matrix (batch_size x event shape), and I’m unsure how to have every row observed as a sample from the same TransformedDistribution. I only have a model function because I just want to maximize the model’s log-probability with respect to the normalizing flow parameters (the only parameters being learned), and I’m confused about whether my code presently does either of those things.
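In case it helps, here is roughly the training loop I’m using with the model/guide above; the Adam settings and step count are just placeholders I’m experimenting with, nothing is tuned:

```python
from pyro.infer import SVI, Trace_ELBO
from pyro.optim import Adam

# Placeholder optimizer settings
optimizer = Adam({"lr": 1e-3})
svi = SVI(model, guide, optimizer, loss=Trace_ELBO())

# Since the model has only the observed site and the guide is empty,
# my understanding is that the ELBO reduces to the log-probability of
# x_samples under the flow, so minimizing the loss should maximize it.
for step in range(1000):
    loss = svi.step(x_samples)
    if step % 100 == 0:
        print(f"step {step}: loss = {loss:.2f}")
```

Afterwards I call `model(x_samples, train=False)` to draw from the trained flow and scatter-plot it against the data.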