SVI with a singular cov matrix for the model: does it fit?

campagne · November 13, 2021, 2:51pm

Hi,
I have a model of the following form:

# in the following, 
# the_obs is a (2250,) 1D-vector  and P & C are (2250,2250) matrices
# they are all computes before the following model definition.

def model(the_obs=None):
  # define priors with Uniform and Normal distribution
  var0 = numpyro.sample('var0',dist.Uniform(-5,5))  # this is an example
 ...
  var20=  numpyro.sample('var20',dist.Norm(0,5))   # this is an example

 signal = my_func(var0,...,var20)  # a complex external function

# then the likelihood
return numpyro.sample('obs', dist.MultivariateNormal(signal, 
                                      precision_matrix=P,
                                      covariance_matrix=C), obs=the_obs)

Then, whiling to perform a SVI with a MultivariateNormal approx., I do

guide = autoguide.AutoMultivariateNormal(model_spl, init_loc_fn=numpyro.infer.init_to_median())
optimizer = numpyro.optim.Adam(5e-3)
svi = SVI(model, guide,optimizer,loss=Trace_ELBO())
svi_result = svi.run(jax.random.PRNGKey(0), 1000, the_obs) 
samples = guide.sample_posterior(jax.random.PRNGKey(1), svi_result.params, sample_shape=(8000,))

The problem is that I was not satisfied of the result with the sampling leading to means of var0 to var20 deconnected to the true values that was used to compute the_obs and Cand P. Digging, to tackle a bug, I realize that the covariant matrix C is singular: ie. its rank computed thanks to

np.linalg.matrix_rank(C)
2157

So rank=2157 is smaller than the dimension =2250. So, the determinant of C is 0.0 (this is the result of np.linalg.det), although one can compute its inverse (ie. pseudo-) leading to P-matrix, and also the Cholesky decomposition trough np.linalg.Cholesky leads to a matrix those determinant is also 0.0.

My questions:

does the singularity of C may be the source of SVI MVN convergence behaviour?
is there a way to encompass this singularity problem?

fehiepsi · November 13, 2021, 3:16pm

It is due to your function that calculates C. Guide is not much relevant here I think. Your function generates bad C given some valid parameters generated by the guide - so you need to look into that. Typically you can add some small positive number to the diagonal of C to make it positive definite.

campagne · November 14, 2021, 1:43pm

Hi @fehiepsi

I have reduced the observation modelling and adjust the numpyro model conscequently. Now the Covariant matrix of the observation is full rank. But, it is a bloc diagonal matrix (500x500) with elements in the range 1e-21 - 1e-17. So, the determinant is float64 precision the determinant is 0.0, while the rank is 500. How, the init_scale of AutoMultivariateNormal may influence the SVI optimisation?