SVI for Rational Speech Act

Among the many excellent Pyro RSA examples and tutorials, none appears to use Pyro’s distinctive feature: SVI.

I would be interested in applying SVI to RSA. However, I’m unsure which problems involve models computationally complex enough to require a learned guide. Could anyone please suggest an RSA problem that would be a good candidate for SVI?

Can you be a bit more specific about what you mean by this, and about your ultimate goal? Do you have a particular problem or research question in mind for which RSA is an appropriate formalism?

At the moment I’m trying to better understand the capabilities of both RSA and Pyro. I think the current example implementations of RSA do not take full advantage of Pyro’s stochastic optimization functionality, and I am curious how RSA models could make use of it.

Hmm, sorry, I’m still not sure I understand what you mean by “capabilities” of RSA. I would also point out that all of the Pyro RSA examples except the semantic parsing one are extremely simple and use exact inference, so I’m not sure why you would want to use approximate inference like SVI in them instead.

RSA is a theoretical model of how people understand language as the product of goal-directed, parsimonious agents reasoning about other agents’ communicative goals and mental states. It’s primarily been successful as a unifying theoretical explanation for a number of seemingly disparate linguistic phenomena that don’t fit neatly into compositional semantics, like hyperbole or scalar implicature.
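To make the recursive reasoning concrete, here is a minimal sketch of scalar implicature using exact enumeration in plain Python rather than Pyro. The toy domain (how many of 3 objects have some property), the uniform prior, the three utterances, and the ALPHA value are all assumptions chosen for illustration; they are not taken from the Pyro examples.

```python
# Toy RSA model of scalar implicature, solved by exact enumeration.
# All names and values below are illustrative assumptions.
STATES = [0, 1, 2, 3]            # how many of 3 objects have the property
UTTERANCES = ["none", "some", "all"]
ALPHA = 1.0                      # speaker rationality (assumed value)


def meaning(utterance, state):
    """Literal truth conditions of each utterance."""
    if utterance == "none":
        return state == 0
    if utterance == "some":
        return state > 0
    return state == 3  # "all"


def normalize(weights):
    """Turn non-negative weights into a probability distribution."""
    total = sum(weights.values())
    return {k: v / total for k, v in weights.items()}


def literal_listener(utterance):
    # L0(s | u) is proportional to truth(u, s) * P(s), with a uniform prior.
    return normalize({s: float(meaning(utterance, s)) for s in STATES})


def speaker(state):
    # S1(u | s) is proportional to L0(s | u) ** ALPHA.
    return normalize(
        {u: literal_listener(u)[state] ** ALPHA for u in UTTERANCES}
    )


def pragmatic_listener(utterance):
    # L1(s | u) is proportional to S1(u | s) * P(s), with a uniform prior.
    return normalize({s: speaker(s)[utterance] for s in STATES})


if __name__ == "__main__":
    print("L0('some'):", literal_listener("some"))
    print("L1('some'):", pragmatic_listener("some"))
```

Under these assumptions, the pragmatic listener who hears “some” gives the “all 3” state a lower probability than the literal listener does, which is the implicature. The whole computation is a handful of finite sums, which is why exact inference is enough for models of this size.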

The best way to understand that success is by reading cognitive science review papers like this one. This recent CogSci paper is one of the few instances I am aware of that examines the interaction of RSA and approximate inference.

You might also be interested in reading about closely related models of theory of mind, which have a similar flavor of Bayesian agents reasoning recursively about the intentions of others. This paper discusses an interesting example of this kind of model in more standard machine learning terminology.