Stein Variational Gradient Descent: User Experience?


I have a general question. I saw that the Stein Variational Gradient Descent (SVGD) algorithm was implemented in Pyro. I’m familiar with it, but only from a theoretical perspective; I’ve never used it in practice.

So I wanted to know whether this optimizer has had an impact, positive or negative, on the algorithms people have used in this library. Is there perhaps an example of a successful use case? Also, kernel methods tend to be difficult when there is no clear way to choose the kernel parameters. Has that affected the results as well?

Perhaps this question may be better directed at @martinjankowiak or @fritzo because I believe they were the ones who implemented it.


i haven’t had much luck using it except in toy examples. in my experience the particles have a tendency to get stuck in bad optima. i suspect performance will be pretty problem specific, and that we need better algorithms for optimizing interacting particle systems.
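
For anyone who wants to poke at the kernel-parameter question outside of Pyro, here is a minimal sketch of a plain SVGD update in one dimension, using only the Python standard library. It is not Pyro's implementation. It assumes an RBF kernel whose bandwidth is set by the median heuristic commonly used with SVGD, a fixed step size rather than an adaptive optimizer, and a Gaussian toy target; the function names (`median_bandwidth`, `svgd_step`, `run_svgd`) are made up for illustration.

```python
import math
import random
import statistics


def median_bandwidth(xs):
    """Median heuristic (assumed here): h = med^2 / log(n + 1),
    where med is the median pairwise distance between particles."""
    n = len(xs)
    dists = [abs(xs[i] - xs[j]) for i in range(n) for j in range(i + 1, n)]
    med = statistics.median(dists)
    # Guard against a zero bandwidth if the particles collapse.
    return max(med * med / math.log(n + 1), 1e-8)


def svgd_step(xs, grad_log_p, step_size):
    """One SVGD update with the RBF kernel k(x, y) = exp(-(x - y)^2 / h)."""
    n = len(xs)
    h = median_bandwidth(xs)
    scores = [grad_log_p(x) for x in xs]
    new_xs = []
    for xi in xs:
        phi = 0.0
        for xj, sj in zip(xs, scores):
            k = math.exp(-((xj - xi) ** 2) / h)
            # Driving term: kernel-weighted score of particle j.
            phi += k * sj
            # Repulsive term: gradient of k with respect to xj,
            # which pushes xi away from xj.
            phi += k * 2.0 * (xi - xj) / h
        new_xs.append(xi + step_size * phi / n)
    return new_xs


def run_svgd(grad_log_p, n_particles=30, n_steps=500, step_size=0.1, seed=0):
    """Run SVGD from a deliberately poor initialization around -5."""
    rng = random.Random(seed)
    xs = [rng.gauss(-5.0, 1.0) for _ in range(n_particles)]
    for _ in range(n_steps):
        xs = svgd_step(xs, grad_log_p, step_size)
    return xs


# Toy target: N(2, 1), so grad log p(x) = -(x - 2).
particles = run_svgd(lambda x: -(x - 2.0))
print("mean:", sum(particles) / len(particles))
print("std: ", statistics.pstdev(particles))
```

Swapping `median_bandwidth` for a fixed constant, or replacing the toy target with a multimodal one, is a cheap way to see how sensitive the particles are to the kernel choice and how easily they settle into a single mode.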
