Sharing params between model & guide (i.e. tied weights)?

msegado · March 7, 2023, 12:50am

Is there an established way to share parameter values between the model and guide when using SVI in numpyro? (This is often called “weight tying” in the context of autoencoders; see e.g. Building an Autoencoder with Tied Weights in Keras | by Laurence Mayrand-Provencher | Medium)

I can imagine a roundabout method which involves sampling a parametrized delta distribution in the guide and using a matching sample site to access the value in the model, but that feels like a bit of a hack…

Thanks!

ordabayev · March 7, 2023, 3:07am

I believe you can access the same parameter from both the model and the guide using numpyro.param("name").

msegado · March 7, 2023, 4:29pm

Oh wow, that was simpler than I thought! I had the impression that only sample site values from the guide were replayed onto the model, but it seems that params somehow make it across as well. (Not sure I understand where that actually happens in the SVI source but I’m glad to see it “just works”…)

Thanks for the help =)

PS: For anyone trying to use this: you do still need to provide an initial value or initializer function for the param statement in the model (otherwise it defaults to None during tracing and breaks any downstream code that uses it), but it seems that the param statement used during the actual inference procedure is the one in the guide.

fehiepsi · March 7, 2023, 5:18pm

> otherwise it defaults to None during tracing and breaks any downstream code that uses it

Could you create a feature request (FR) for this? Or make a small pull request to allow replacing params in the replay handler. Currently, SVI does not replay parameters. Thanks!

msegado · March 7, 2023, 6:10pm

Can definitely create a FR, but I’m not quite sure what the desired behavior should be here. If we want to be able to run the model without a guide, it’ll need an initializer as it currently does - unless you want to think of more explicit ways to handle shared params?

Yes, I saw that parameters aren’t replayed, but somehow the model ends up sharing the value from the guide anyway, as long as the param has the same name. I’m guessing it works its way over via the messenger stack, but I haven’t dug into the source enough to see where that happens…

fehiepsi · March 7, 2023, 6:37pm

The issue only occurs in the initialization step. During SVI updates, the logic is fine, I think.

> If we want to be able to run the model without a guide, it’ll need an initializer as it currently does

You are right. It’s needed if we want to draw samples with init parameters. But I guess in practice, we always want to draw samples with optimized parameters. The desired behavior in SVI is to avoid initializing model’s params when users already specify them in the guide.

msegado · March 7, 2023, 8:15pm

> The desired behavior in SVI is to avoid initializing model’s params when users already specify them in the guide.

Got it. I’ll take a look later this afternoon and try to put something together.

msegado · March 8, 2023, 1:47am

github.com/pyro-ppl/numpyro

[FR] SVI: avoid initializing model params when already specified in guide

opened 01:47AM - 08 Mar 23 UTC

msegado

(Context: https://forum.pyro.ai/t/sharing-params-between-model-guide-i-e-tied-we…ights/5033/)

Sharing parameter values between a model and its guide is helpful in some applications, e.g. symmetric autoencoders. This can *almost* be accomplished by defining a param as usual in the guide and calling `numpyro.param("shared_site_name")` in the model, but [since `SVI` only replays sample values](https://github.com/pyro-ppl/numpyro/blob/737d7d9ec0dc5d6649f85bbedddf28ca07908e20/numpyro/infer/svi.py#L186-L187) and not parameter values before tracing the model, the return value of this `param()` call defaults to `None`, which breaks any downstream code that uses it.

The [desired behavior in SVI is to avoid initializing any model parameters already specified in the guide](https://forum.pyro.ai/t/sharing-params-between-model-guide-i-e-tied-weights/5033/6). I can see a couple of ways of accomplishing this:

1. Explicitly substitute initialized parameter values from the guide before tracing the model. Here's a strawman implementation which uses `substitute()` to avoid changing the semantics of `replay()`: https://github.com/pyro-ppl/numpyro/compare/master...msegado:numpyro:feat-shared-param-init.
2. Change the `replay()` handler to support replaying param values. This should probably be opt-in via an `include_params` kwarg or similar to avoid breaking existing user code, and would also require a change to the docs.

The first seems like the more natural approach to me, but I figured it would be good to ask before opening a PR 🙂 Thoughts?