Hello,
So I have two observations (matrices X and Y) in the model that are both generated from an underlying latent variable, let's call it sigma. The model looks something like this:
import pyro

def model(X, Y):
    # shared latent that generates both observations
    sigma = pyro.sample('sigma', some_dist(...))
    with pyro.plate('data_X', X.shape[0]):
        # observed site: condition on the data via obs=
        pyro.sample('X', some_dist(somehow related to sigma), obs=X)
    with pyro.plate('data_Y', Y.shape[0]):
        pyro.sample('Y', some_dist(somehow related to sigma), obs=Y)
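For context, inference is just standard SVI; a minimal sketch of my training loop, assuming an AutoNormal guide and Adam (these particular choices are placeholders, not the point of the question):

from pyro.infer import SVI, Trace_ELBO
from pyro.infer.autoguide import AutoNormal
from pyro.optim import Adam

guide = AutoNormal(model)
svi = SVI(model, guide, Adam({'lr': 1e-2}), loss=Trace_ELBO())
for step in range(2000):
    loss = svi.step(X, Y)  # ELBO loss sums over both likelihood terms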
When training the model using only X or only Y, the inferred sigma makes sense in both cases, which I think suggests the way I define the model is decent. However, when using both X and Y together, the inferred sigma is dominated by X; the observation Y doesn't seem to affect the inference much.
Looking further at the scale of the loss, I found that the loss contribution from observation X is about three orders of magnitude larger than the contribution from observation Y.
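For reference, this is roughly how I compared the per-site contributions (a sketch using poutine.trace; site names match the model above):

from pyro import poutine

# run the model once against the data and compute per-site log-densities
tr = poutine.trace(model).get_trace(X, Y)
tr.compute_log_prob()
print('log p(X | sigma):', tr.nodes['X']['log_prob'].sum().item())
print('log p(Y | sigma):', tr.nodes['Y']['log_prob'].sum().item())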
So I rescaled my observation Y by a factor of 10,000, and now the effect of Y becomes visible and I am happy with the results. But choosing an arbitrary 10,000 feels too ad hoc, so I wonder: are there any recommendations for assigning weights to different observations in a more principled (less subjective) way?
I understand there's a poutine.scale handler, but it seems to operate on the whole model; would it be possible to apply it to certain sample sites instead?
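Something like the following is what I have in mind: a minimal sketch with Normal likelihoods standing in for some_dist, a hypothetical weight w_Y, and 1-D observations for brevity:

import pyro
import pyro.distributions as dist
from pyro import poutine

def weighted_model(X, Y, w_Y=1.0):
    # placeholder prior/likelihoods; the real some_dist choices go here
    sigma = pyro.sample('sigma', dist.LogNormal(0., 1.))
    with pyro.plate('data_X', X.shape[0]):
        pyro.sample('X', dist.Normal(0., sigma), obs=X)
    # every sample statement inside this context has its log-density
    # multiplied by w_Y in the ELBO, upweighting only the Y likelihood
    with poutine.scale(scale=w_Y):
        with pyro.plate('data_Y', Y.shape[0]):
            pyro.sample('Y', dist.Normal(0., sigma), obs=Y)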
Thanks a lot,
Frank