Uniform Distribution with AutoGuide shows loc, scale?

pumplerod · September 7, 2023, 9:07pm

I’m a little confused by the parameters an AutoGuide generates when my model uses pyro.distributions.Uniform(low_val, high_val) priors.

The guide seems to create loc and scale parameters, which is what I would expect (and do see) when using pyro.distributions.Normal(mu, sigma).

Here is my example with pyro.distributions.Uniform(low_val, high_val):

The Linear NN Layer Class used…

import numpy as np
import torch
import pyro
import pyro.distributions
import pyro.infer.autoguide
import pyro.nn
import pyro.optim

class PyroLinear(torch.nn.Linear, pyro.nn.PyroModule):  # used as a mixin
    def __init__(self, in_features, out_features, device='cpu', **kwargs):
        super().__init__(in_features, out_features, **kwargs)

        # Kaiming-uniform bound sqrt(6 / in_features), used as the half-width of the prior
        val = torch.tensor(np.sqrt(6 / in_features), dtype=torch.float32, device=device)
        w_val = torch.ones_like(self.weight, device=device) * val
        # Replace the deterministic weight with a Uniform(-val, val) prior
        self.weight = pyro.nn.PyroSample(pyro.distributions.Uniform(-w_val, w_val).to_event(2))
        if self.bias is not None:
            b_val = torch.ones_like(self.bias, device=device) * val
            self.bias = pyro.nn.PyroSample(pyro.distributions.Uniform(-b_val, b_val).to_event(1))
            
    @property
    def device(self):
        if len( list(self.parameters())) > 0:
            return next(self.parameters()).device
        else:
            return 'cpu'

The Bayesian NN model…

class BNNRegressor(pyro.nn.PyroModule):
    def __init__(self, input_size: int, latent_size: int, norm_scale=0.01, device='cpu'):
        super().__init__()
        self.latent_size = latent_size
        # Pass the constructor's device argument through to both layers
        self.bl1 = PyroLinear(input_size, self.latent_size, device=device)
        self.bl2 = PyroLinear(latent_size, 2, device=device)
        self.norm_scale = norm_scale

    def forward(self, x: torch.Tensor, y_true=None):
        hidden = torch.tanh(self.bl1(x))
        # Note: this normalized_shape covers the batch dimension as well as the features
        hidden = torch.nn.functional.layer_norm(hidden, (x.size(0), self.latent_size))
        probs = torch.nn.functional.softmax(self.bl2(hidden), dim=1)
        preds = (probs[:,1] - probs[:,0]) * 0.5 + 0.5
        
        with pyro.plate('batch', x.shape[0]):
            if self.training:
                pyro.sample('pred_sample', pyro.distributions.Normal( loc=preds, scale=torch.tensor(self.norm_scale).to( x.device)), obs=y_true)
            else:
                pyro.deterministic( 'preds', preds)

Creating Model and Guide…

pyro.clear_param_store()
pyro.set_rng_seed(1618)
bnn_model = BNNRegressor( input_size=9, latent_size=500, norm_scale=0.01, device='cuda:1')
bnn_guide = pyro.infer.autoguide.AutoNormal( bnn_model)

For some reason I have to create an SVI instance and run a step before I can see any of the parameters. Is there another way to see the guide?

features = torch.randn( [100,9], device='cuda:1')
targets = torch.rand([100], device='cuda:1')

learn_rate = 5e-3
optimizer = torch.optim.Adam
scheduler = pyro.optim.ReduceLROnPlateau( {"optimizer":optimizer, "optim_args": {"lr": learn_rate}, "factor":0.9, "patience":250, "verbose":False})

svi = pyro.infer.SVI(bnn_model, bnn_guide, scheduler, loss=pyro.infer.Trace_ELBO())

Run one SVI step…

svi.step( features, targets)

Look at which parameters were created…

list(pyro.get_param_store().keys())

This yields:

['AutoNormal.locs.bl1.weight',
 'AutoNormal.scales.bl1.weight',
 'AutoNormal.locs.bl1.bias',
 'AutoNormal.scales.bl1.bias',
 'AutoNormal.locs.bl2.weight',
 'AutoNormal.scales.bl2.weight',
 'AutoNormal.locs.bl2.bias',
 'AutoNormal.scales.bl2.bias']

Shouldn’t AutoNormal be creating something like AutoNormal.lows.bl1.weight and AutoNormal.highs.bl1.weight?

What is it actually optimizing?
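
For context, the AutoNormal documentation describes the guide as a Normal in unconstrained space that is then transformed to each site's constrained support. Under that reading, the locs and scales would not be bounds of a Uniform; they would be the mean and standard deviation of a Normal that gets squashed into (low, high). A rough sketch of that mapping using only torch.distributions, with made-up values for `low`, `high`, `loc`, and `scale` (none of these numbers come from the model above):

```python
import torch
from torch.distributions import biject_to, constraints

torch.manual_seed(0)

# Illustrative bounds for a Uniform(low, high) prior (assumed values).
low, high = -0.8, 0.8

# Bijection from the unconstrained reals onto the interval support.
to_support = biject_to(constraints.interval(low, high))

# Hypothetical learned variational parameters, living in unconstrained space.
loc, scale = torch.tensor(0.5), torch.tensor(2.0)

# Sample the Normal in unconstrained space, then map into the support.
z = loc + scale * torch.randn(10_000)
w = to_support(z)

# Every transformed sample lands inside the prior's bounds.
inside = bool(((w >= low) & (w <= high)).all())

# An unconstrained value of 0 maps to the midpoint of the interval.
midpoint = float(to_support(torch.tensor(0.0)))

print(inside, midpoint)
```

So the posterior approximation over each weight would be a transformed Normal restricted to the prior's support, and the optimizer would be adjusting the unconstrained loc and scale rather than the interval endpoints.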