Uncertainty in Bayesian regression


#1

Hi All,

I’ve just started using Pyro and would like to know how to get the uncertainty for a prediction in regression.
Using the tutorial http://pyro.ai/examples/bayesian_regression.html, with the final snippet of code as follows, how can you get the uncertainty (std/probability) for each prediction?

for i in range(20):
    # guide does not require the data
    sampled_reg_model = guide(None)
    # run the regression model and add prediction to total
    y_preds = y_preds + sampled_reg_model(x_data)

# take the average of the predictions
y_preds = y_preds / 20

Thanks.


#2

running the model forward takes one sample from the posterior. since the distribution is parameterized, you can just look in the param store. something like: pyro.param('name_of_std_param')


#3

Hi JP,

The following values are in the param store (Boston Housing data - 13 features). How do you calculate the uncertainty for each prediction based on this?

guide_mean_weight
[[-0.38087404 0.5563026 0.3670732 -0.4159602 -2.1658132 -0.96075886
-0.42535275 -0.1035834 0.40828663 1.8544632 -0.74433875 0.14294197
0.19255933]]
guide_log_scale_weight
[[-2.9694195 -3.0218732 -2.9446466 -3.020102 -3.0388255 -2.978962
-2.9183261 -3.0018811 -3.0467377 -3.0938332 -2.8980458 -2.9239552
-3.0365248]]
guide_mean_bias
[0.33985868]
guide_log_scale_bias
[-2.9375718]

Thanks,

Ronan
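As a rough sketch of how the stored values relate to a standard deviation: the guide_log_scale_* entries are unconstrained parameters, and the guide maps them to a positive scale before sampling. Which mapping is used depends on the guide code, so treat both options below (exp and softplus) as assumptions to check against your own guide. The numbers are copied from this post; the helper name softplus is just a local function defined here.

```python
import math

# guide_log_scale_weight and guide_log_scale_bias, copied from the post above.
log_scale_weight = [
    -2.9694195, -3.0218732, -2.9446466, -3.020102, -3.0388255, -2.978962,
    -2.9183261, -3.0018811, -3.0467377, -3.0938332, -2.8980458, -2.9239552,
    -3.0365248,
]
log_scale_bias = -2.9375718


def softplus(v):
    """log(1 + exp(v)), a common way to force a parameter to be positive."""
    return math.log1p(math.exp(v))


# Assumption A: the guide uses exp(param) as the scale.
std_if_exp = [math.exp(v) for v in log_scale_weight]

# Assumption B: the guide uses softplus(param) as the scale.
std_if_softplus = [softplus(v) for v in log_scale_weight]

for name, e, s in zip(range(1, 14), std_if_exp, std_if_softplus):
    print(f"weight {name:2d}: exp -> {e:.4f}, softplus -> {s:.4f}")
print(f"bias: exp -> {math.exp(log_scale_bias):.4f}, "
      f"softplus -> {softplus(log_scale_bias):.4f}")
```

For values around -3 the two mappings give nearly the same result (roughly 0.05), so either way these are the posterior standard deviations of the weights, not of the predictions.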


#4

i think you might be confusing different things. the uncertainty is over parameters, not outputs. a bayesian nn is a distribution over nns, so you sample an nn, which gives a point estimate given data. it seems like you’re looking for something like a GP?


#5

[quote=“jpchen, post:4, topic:425”]
give a point estimate given data. it seems like you’re looking for something like a GP?
[/quote] The uncertainty related to each point prediction is given by the standard deviation (or entropy) of the 20 different predictions you got from sampling the weights of your neural network, isn’t it?
Yes, I agree that a GP is another way to get this uncertainty information.


#6

oh sure, you can calculate the empirical std dev by sampling a bunch of nns and running on the same data point. but then at that point i think there are better models to use to get what you want.
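A minimal sketch of that empirical approach, in plain Python rather than Pyro so it runs on its own. The posterior means and scales below are made-up illustrative numbers, and sample_model is a hypothetical stand-in for calling the guide to get one sampled regressor. It only captures uncertainty coming from the weights, not observation noise.

```python
import random
import statistics

random.seed(0)

# Hypothetical mean-field Gaussian posterior over a 3-feature linear model.
# These numbers are illustrative only; they are not from any fit in this thread.
weight_mean = [0.5, -1.2, 0.3]
weight_std = [0.05, 0.05, 0.05]
bias_mean, bias_std = 0.3, 0.05


def sample_model():
    """Draw one set of weights and return the resulting point predictor."""
    w = [random.gauss(m, s) for m, s in zip(weight_mean, weight_std)]
    b = random.gauss(bias_mean, bias_std)

    def predict(x):
        return sum(wi * xi for wi, xi in zip(w, x)) + b

    return predict


def predictive_mean_std(x, num_samples=1000):
    """Run many sampled models on the same input and summarize the spread."""
    preds = [sample_model()(x) for _ in range(num_samples)]
    return statistics.mean(preds), statistics.stdev(preds)


x_new = [1.0, 2.0, -1.0]
mean, std = predictive_mean_std(x_new)
print(f"prediction: {mean:.3f} +/- {std:.3f}")
```

In the tutorial snippet this would amount to keeping the individual predictions from each loop iteration instead of only accumulating their sum, then taking the standard deviation across samples. 20 samples gives a fairly noisy estimate of the spread, so more draws are probably worth it.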


#7

Thank you for the answer. By better models, are you talking about GP models?


#8

Hi Guys,

Thanks for the comments and sorry for the late reply. Yeah perhaps a Gaussian Process is a better fit for measuring the uncertainty in each new prediction.

Excuse my ignorance of Bayesian methods, but the Bayesian regression tutorial http://pyro.ai/examples/bayesian_regression.html uses one linear layer (linear regression), so training results in one set of weights, each with its own distribution.
For a new test input, is it possible to multiply the inputs by each weight distribution to output a prediction with uncertainty?


#9

sorry i don’t understand what you’re asking. what does multiply the inputs by each weight mean? the uncertainty learned in the tutorial is the global uncertainty, not local to each data point.


#10

Hi @jpchen, re my comment “multiply the inputs by each weight / distribution”.

sample test data

feature_1, feature_2, feature_3
12, 0, 19
2, 15, 0

Least squares regression
In a regular regression problem, once the model is trained, you make a prediction on new inputs by multiplying each feature by the corresponding weight and summing the values to output the final prediction for a given sample input.

weights (after training a model): w1=7, w2=18, w3=17

Multiplying the weights by the features for each input data sample will output a prediction (without any uncertainty information).

Probabilistic Linear Regression
weights (after training a model - linear model so only one set of weights):
w1_mean=7, w1_std=1
w2_mean=18, w2_std=4
w3_mean=17, w3_std=1

Now that you have the mean and standard deviation of each weight, can you use this information to calculate the uncertainty of a prediction on the test data? I.e., w2 has a large standard deviation, so any input data with high values for this feature would result in high uncertainty in the prediction.

The second sample test input above has a value of 15 for feature_2 so maybe this would result in high uncertainty in the test prediction.

The first sample has a value of 0 for feature_2, so maybe this would result in a more confident prediction, as the other two features affecting the prediction have a low standard deviation?

This is what I mean by using the mean and std of each weight to calculate the uncertainty specific to each new test input.

Does this make sense? :slight_smile:
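The intuition in this post can be checked with a small calculation. This is only a sketch under stated assumptions: the weights are independent Gaussians (as in a diagonal, mean-field guide), there is no bias term, and observation noise is ignored. Under those assumptions the prediction for input x is Gaussian with mean sum(w_i_mean * x_i) and variance sum(x_i^2 * w_i_std^2). The function name prediction_mean_std is made up for this example; the numbers are the ones from the post.

```python
import math

# Weight posterior from the post: (mean, std) for w1, w2, w3.
weight_mean = [7.0, 18.0, 17.0]
weight_std = [1.0, 4.0, 1.0]

# The two sample test inputs from the post (feature_1, feature_2, feature_3).
x1 = [12.0, 0.0, 19.0]
x2 = [2.0, 15.0, 0.0]


def prediction_mean_std(x):
    """Mean and std of w . x when the weights are independent Gaussians.

    Assumes no bias term and no observation noise.
    """
    mean = sum(m * xi for m, xi in zip(weight_mean, x))
    variance = sum((s * xi) ** 2 for s, xi in zip(weight_std, x))
    return mean, math.sqrt(variance)


mean1, std1 = prediction_mean_std(x1)
mean2, std2 = prediction_mean_std(x2)
print(f"sample 1: {mean1:.0f} +/- {std1:.2f}")  # 407 +/- 22.47
print(f"sample 2: {mean2:.0f} +/- {std2:.2f}")  # 284 +/- 60.03
```

So the second input does come out with the larger spread, driven by the 15 on feature_2. Note that the first input is not especially confident either: large feature values amplify even a small weight std, since each term scales with x_i squared.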