Probabilistic model with large matrix multiplication inside

Hi all,

I’m trying to see if pyro would be good for solving inverse models degraded by noise. Specifically problems of the form

F = Hg + n

Where f is a recorded signal, g is the true object n is noise and H is is some corrupting measurement matrix.

As I think I understand it, using pyro I can build a forward model of HG + n and it’ll use pytorch backpropogation magic to iterate through to maximize likihood say. My question is will pyro handle this properly if H is a large matrix with only pseudo inverses available? Is large matrix inversion something that pytorch itself handles nicely?



pytorch has a pinverse method but i’ve personally never used it

why do you need to invert H?

I think I’d misunderstood quite how pyro works and that I assumed that pytorch would invert H rather than optimise a loss akin to Hg-f=0.

i’m assuming H is known?

pyro supports various inference algorithms. for the simplest ones to try on this sort of problem, namely MAP/MLE and SVI you wouldn’t need to compute H’s (pseudo)inverse.

roughly speaking your model would look like

def model(F, H):
    g = pyro.sample("g", GPriorDistribution(...))
    Hg = torch.matmul(H, g)
    pyro.sample("F", NoiseDistribution(Hg, ...), obs=F)

you’d just need to fill in distributions appropriate to your problem

H is known yes, and each entry in the g vector is an independent Poisson distribution.

that is potentially a difficult problem since g is discrete (will depend on dimensionality etc.) but you could certainly give SVI a try. if the dimension is relatively moderate it could work

H is a rectangular sparse matrix (mostly zeros usually) up to 400000 square and g is a 1d vector up to 400000 in length

yeah that’s not an easy problem to solve, especially with black box methods. what is the magnitude of typical counts g you expect to encounter? can you replace the poisson distribution with a continuous relaxation? that’d make inference much easier

The problem I’m inverting is image formation specifically for microscopy. I was hoping to build the model up in complexity to include a time component (using markov chains) aswell rather than simplify it.

H is (most of the time) a toeplitz matrix, and so under most conditions the H*g bit is a convolution with a small kernel (which would help with computation I guess?)

I was also broadly assuming that the sparsity of H would help as it usually masks out contributions from all but a subset of nearby pixels to the pixel currently being observed.

In this case we use a Poisson distribution because thats physically representative of photons being emited and received at the detector. There are known iterative image reconstruction techniques that do maximum likelihood estimation (Richardson lucy deconvolution), but they mostly fall down when you add the time element into the model.

I’m definitely going to play around and figure out where the limits of computational viability come in, but it is accepted that large image deconvolution can be very slow.

Thanks for your guidance so far, you’ve been super helpful.

one thing you could do is

  • truncate the poisson distribution at some N_max (and renormalize)
  • then use something like RelaxedOneHotCategorical in conjunction with your truncated Poisson distribution to define a variational distribution
  • then do variational inference

basically you’d be using a continuous relaxation to compute approximate gradients but you’d still be using the desired poisson distribution in the model (though it’d be truncated).

still, this would be an approximation and it’s not clear a priori how well it’d work. perhaps poorly, hard to say.