CorrMatrixCholeskyTransform Not Bijective? Or bugged?

student_12 · March 10, 2024, 8:44am

I’m running into a linalg.cholesky error (due to non-positive definite input) caused by issues with the CorrMatrixCholeskyTransform seemingly not being bijective (even though it has bijective = True?).

For context… I’m trying to transform a vector (a sample from a multivariate normal) into a correlation matrix in a custom guide. The corr_matrix constraint is enforced by (internally in Pyro through biject_to registration) using the transforms:

ComposeTransform[CorrCholeskyTransform, Inverse(CorrMatrixCholeskyTransform)]

where the input to output chain goes like:

real vector → lower cholesky factor → correlation matrix

The inverse of this (invoked when using the log_abs_det_jacobian() method) is thus just the reverse direction with inverses:

ComposeTransform[CorrMatrixCholeskyTransform, Inverse(CorrCholesky)]

where the input to output goes like:

correlation matrix → lower cholesky factor → real vector

The problem I’m running into is that the original transform chain produces a correlation matrix, but when I then use the inverse transform chain (which it does when using log_abs_det_jacobian()), this correlation matrix is input into CorrMatrixCholeskyTransform first, which is basically doing cholesky(correlation matrix) and this can produce the error below (that the correlation matrix isn’t positive definite, so it can’t do a cholesky decomp on it):

torch._C._LinAlgError: linalg.cholesky: The factorization could not be completed because the input is not positive-definite

Thus, does this mean CorrMatrixCholeskyTransform isn’t actually bijective, since the inverse of this transform can output correlation matrices that the original transform can’t always accept as input? And is there some way to get around this?

As a code example and proof, I did some logging during optimization, and this recreates the error:

from torch.distributions.transforms import CorrCholeskyTransform
from pyro.distributions.transforms import CorrMatrixCholeskyTransform

trans1 = CorrCholeskyTransform()
trans2 = CorrMatrixCholeskyTransform()
real_vect = torch.tensor([-0.2894, -1.0836, -0.4998,  1.6825,  0.1669,  1.0653, -1.2977,  1.1127,
        -2.8873,  2.3409,  3.4971,  2.0659, -0.2017,  2.7351,  1.2209, -0.6655,
         1.5769,  2.1106, -0.8692, -1.5110, -0.4753, -0.4464,  1.0486,  1.8616,
        -0.3846, -0.4477, -0.0422, -1.5995, -0.2185, -3.7241, -0.1611, -3.0001,
        -1.0150,  2.5573,  3.0970, -1.6133])

# Forward
lower_cholesky = trans1(real_vect)  # from real vector to lower triangular cholesky correlation factor
corr_mat = trans2.inv(lower_cholesky)  # from corr factor to correlation matrix

# Inverse.
lower_cholesky = trans2(corr_mat)  # <-- error (corr_mat isn't positive definite... must be positive semidefinite?)
trans1.inv(lower_cholesky)

student_12 · March 10, 2024, 9:42am

After looking at the codebase/documentation, it seems like CorrMatrixCholeskyTransform expects CholeksyTransform to be using safe_cholesky() but it’s instead using torch.linalg.cholesky(). I found a function for safe_cholesky (which adds jitter in case of error) in another codebase and tried that function… I didn’t get this same error yet when running optimization but got a different error first instead…

The new error is that the below correlation matrix seems to be a valid correlation matrix to me however, it fails at the CorrMatrix constraint check() method (this constraint is because I’m using the correlation matrix as a variational distribution for an LKJ distribution that has CorrMatrix as its support) because it’s not positive definite:

torch.tensor([
        [ 1.0000, -0.2816, -0.7945,  0.9332, -0.8611,  0.9982, -0.5820, -0.4189, -0.2151],
        [-0.2816,  1.0000, -0.0454, -0.2057,  0.6352, -0.2248,  0.8803,  0.7987, -0.8754],
        [-0.7945, -0.0454,  1.0000, -0.6078,  0.4080, -0.8111,  0.4216,  0.4247, 0.4405],
        [ 0.9332, -0.2057, -0.6078,  1.0000, -0.8558,  0.9373, -0.4232, -0.2118, -0.2709],
        [-0.8611,  0.6352,  0.4080, -0.8558,  1.0000, -0.8342,  0.7108,  0.4867, -0.2133],
        [ 0.9982, -0.2248, -0.8111,  0.9373, -0.8342,  1.0000, -0.5390, -0.3793, -0.2725],
        [-0.5820,  0.8803,  0.4216, -0.4232,  0.7108, -0.5390,  1.0000,  0.9465, -0.6028],
        [-0.4189,  0.7987,  0.4247, -0.2118,  0.4867, -0.3793,  0.9465,  1.0000, -0.6029],
        [-0.2151, -0.8754,  0.4405, -0.2709, -0.2133, -0.2725, -0.6028, -0.6029, 1.0000]])

However, valid correlation matrices can be positive semi-definite right? The above correlation matrix seems valid to me (it’s symmetric, 1s on the diagonal, and all off-diagonals between -1 and 1), is something wrong with it that it shouldn’t pass the CorrMatrix check? So shouldn’t the check be to verify the matrix is positive semidefinite instead of positive definite? If not, is there a workaround for this?

student_12 · March 10, 2024, 11:34pm

After doing a little more playing around, I think I can summarize the problem below…

In summary:

corr_matrix constraint is enforced by the transform (verified in the biject_to registry):

ComposeTransform[CorrCholeskyTransform, Inverse(CorrMatrixCholeskyTransform)]

which transforms a real vector into a correlation matrix.

However, CorrCholeskyTransform’s _call() can sometimes transform a real vector input into the cholesky factor of a matrix that isn’t positive definite (or even semidefinite) probably due to eigenvalues near 0 getting rounded to e.g., -4.83e-09. Then when we apply the CorrMatrixCholeskyTransform.inv(), it uses this cholesky factor to produce a correlation matrix that isn’t positive definite.

But the constraint.corr_matrix requires that matrices be positive definite. Thus, the transform registered in biject_to to enforce this constraint can itself produce outputs that don’t obey the constraint. This issue can be reproduced with the example I had in the previous post:

from torch.distributions.transforms import CorrCholeskyTransform
from pyro.distributions.transforms import CorrMatrixCholeskyTransform

trans1 = CorrCholeskyTransform()
trans2 = CorrMatrixCholeskyTransform()
real_vect = torch.tensor([-0.2894, -1.0836, -0.4998,  1.6825,  0.1669,  1.0653, -1.2977,  1.1127,
        -2.8873,  2.3409,  3.4971,  2.0659, -0.2017,  2.7351,  1.2209, -0.6655,
         1.5769,  2.1106, -0.8692, -1.5110, -0.4753, -0.4464,  1.0486,  1.8616,
        -0.3846, -0.4477, -0.0422, -1.5995, -0.2185, -3.7241, -0.1611, -3.0001,
        -1.0150,  2.5573,  3.0970, -1.6133])

# Forward
lower_cholesky = trans1(real_vect)  # from real vector to lower triangular cholesky correlation factor
corr_mat = trans2.inv(lower_cholesky)  # from corr factor to correlation matrix

# Inverse.
lower_cholesky = trans2(corr_mat)  # <-- error (corr_mat isn't positive definite)
trans1.inv(lower_cholesky)

Not sure what the best fix for this is, or if there’s something I’m missing?

fehiepsi · March 12, 2024, 12:15am

Could you try using float64 in those computations?

student_12 · March 16, 2024, 8:37pm

@fehiepsi This did fix the issue, thanks!