Thanks @fehiepsi , that makes sense!
Do you have a recommendation for how to wrap a linear regression model in a plate messenger so that predictive samples can be generated in parallel? In the Bayesian Regression tutorial, it says that
> We generate 800 samples from our trained model. Internally, this is done by first generating samples for the unobserved sites in the `guide`, and then running the model forward by conditioning the sites to values sampled from the `guide`. Refer to the Model Serving section for insight on how the `Predictive` class works.
For my Bayesian linear regression model, I’d like to generate 1000 predictive samples from the trained model for each new input example. I’m not sure whether serving the model via TorchScript would reduce the time it takes to generate these samples. I’ve been trying to follow the recommendation in this forum post: “When `parallel=False`, `Predictive` has to run your model once per sample, which as you are seeing will be very slow for large numbers of samples.” But I can’t seem to get this working with a simple linear model.
I’m just trying to find an example of how to wrap a linear regression model in an outermost `plate` messenger so I can take advantage of the `parallel=True` functionality in the `Predictive` class. If you have any other suggestions or examples, I’d greatly appreciate them. For now, I might just try writing my own linear model (as in the second part of the tutorial) to see if that works with `parallel=True`.