Adding nonlinearity to the AutoIAFNormal guide

Hi!
Is there a way to add a nonlinearity such as ELU to AutoIAFNormal?

Thanks!

Hello, can you provide (significantly) more detail about what you mean?

Hi @martinjankowiak

Sorry about that. I would like to add an ELU activation between the IAF flows, as done in Hoffman et al., 2019 ([1903.03704] NeuTra-lizing Bad Geometry in Hamiltonian Monte Carlo Using Neural Transport). I looked through the source code and couldn't find where to add it.
And maybe this is also a good place to ask whether
guide = AutoIAFNormal(model, hidden_dim=hidden_dim, num_transforms=num_transforms)
is comparable to the following TensorFlow code (taken from probability/neutra_kernel.py at main · tensorflow/probability · GitHub):


import tensorflow as tf
import tensorflow_probability as tfp

tfb = tfp.bijectors


def make_iaf_stack(total_event_size,
                   num_hidden_layers=2,
                   seed=None,
                   dtype=tf.float32):
  """Creates an stacked IAF bijector.

  This bijector operates on vector-valued events.

  Args:
    total_event_size: Number of dimensions to operate over.
    num_hidden_layers: How many hidden layers to use in each IAF.
    seed: Random seed for the initializers.
    dtype: DType for the variables.

  Returns:
    bijector: The created bijector.
  """

  seed = tfp.util.SeedStream(seed, 'make_iaf_stack')

  def make_iaf():
    """Create an IAF."""
    initializer = tf.compat.v2.keras.initializers.VarianceScaling(
        2 * 0.01, seed=seed() % (2**31 - 1))

    made = tfb.AutoregressiveNetwork(
        params=2,
        event_shape=[total_event_size],
        hidden_units=[total_event_size] * num_hidden_layers,
        activation=tf.nn.elu,
        kernel_initializer=initializer,
        dtype=dtype)

    def shift_and_scale(x):
      # TODO(siege): Something is losing the static shape.
      x.set_shape(
          x.shape.merge_with([None] * (x.shape.ndims - 1) + [total_event_size]))
      return tf.unstack(made(x), num=2, axis=-1)

    return tfb.Invert(tfb.MaskedAutoregressiveFlow(shift_and_scale))

  def make_swap():
    """Create an swap."""
    permutation = list(reversed(range(total_event_size)))
    return tfb.Permute(permutation)

  bijector = make_iaf()
  bijector = make_swap()(bijector)
  bijector = make_iaf()(bijector)
  bijector = make_swap()(bijector)
  bijector = make_iaf()(bijector)
  bijector = make_swap()(bijector)

  return bijector

Basically, I am trying to implement the TensorFlow code above in Pyro.

Assuming your goal is to do variational inference, I think you'll want to use AutoNormalizingFlow and pass a custom transform to it. In particular, take a look at AffineAutoregressive in distributions/transforms/affine_autoregressive.py. The way this API was set up allows some amount of flexibility, but not infinite flexibility, so I don't think you'll be able to get what you want with, e.g., 3 lines of code.
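For concreteness, here is a minimal sketch of what that might look like, mirroring the TensorFlow snippet above (IAFs with ELU activations, interleaved with order-reversing permutations). The factory make_iaf_stack and its arguments are illustrative names rather than Pyro API, and constructor signatures may differ slightly across Pyro versions:

from functools import partial

import torch

import pyro
import pyro.distributions as dist
from pyro.distributions import ComposeTransformModule
from pyro.distributions.transforms import AffineAutoregressive, Permute
from pyro.infer.autoguide import AutoNormalizingFlow
from pyro.nn import AutoRegressiveNN


def model():
    # Stand-in model with a 5-dimensional latent, just to make the sketch runnable.
    pyro.sample("x", dist.Normal(torch.zeros(5), torch.ones(5)).to_event(1))


def make_iaf_stack(input_dim, num_hidden_layers=2, num_flows=3):
    """Stacks IAFs with ELU activations, interleaved with order-reversing
    permutations, analogous to make_iaf_stack in the TensorFlow code above."""
    parts = []
    reverse = torch.arange(input_dim - 1, -1, -1)  # analogue of tfb.Permute above
    for _ in range(num_flows):
        arn = AutoRegressiveNN(
            input_dim,
            [input_dim] * num_hidden_layers,
            nonlinearity=torch.nn.ELU(),  # the activation between hidden layers
        )
        parts.append(AffineAutoregressive(arn))
        parts.append(Permute(reverse))
    return ComposeTransformModule(parts)


# AutoNormalizingFlow calls the factory with the total latent dimension,
# so the remaining arguments can be bound with partial.
guide = AutoNormalizingFlow(model, partial(make_iaf_stack, num_flows=3))

If the end goal is NeuTra HMC as in Hoffman et al., 2019, a guide trained this way can then be handed to pyro.infer.reparam.neutra.NeuTraReparam.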
