Custom distribution for SVI and sampling

I am trying to model some events that are asymptotically Poisson/Laplace distributed (i.e. the pmf/pdf has an exponential tail), but that over short time periods show an interesting weekly periodic structure.

I have had difficulty modeling this using standard distributions because of its unique shape, and so I was considering implementing a custom distribution that could capture these peaks on short time scales, while also fitting the long tail.

I have a number of different ideas for how to formulate a pdf/pmf that might do the trick (exponential baseline with a finite number of Gaussian bumps, or a categorical distribution below some threshold and Poisson above it, etc.), and they are all relatively easy to implement `log_prob` for. But then I realized that I might also need to implement a sampler, which for a distribution of this complexity is probably very involved and nontrivial! I know that for HMC the `sample` method of a `Distribution` is only called to initialize the first point, so it is not needed for inference, but I want to use SVI because my dataset has millions of examples and HMC would be very slow.

So my question is: do I need to implement a sampler to do inference with SVI? On the face of it I would expect the answer to be no, because this site will be observed, so I think I would only need to implement `log_prob`, but I'm not 100% sure. Also, does the `log_prob` need to be normalized, or can I use an unnormalized distribution?

Bonus question: is there any way I can use standard torch/pyro distributions to fit this distribution? Perhaps some kind of mixture model?

if you're using a distribution in `SVI` on the model side, all that's really required is `log_prob`, since all samples come from the guide. the `log_prob` doesn't need to be normalized, but it does need to reflect any dependence on the parameters in the log normalizer. e.g. the two pi in the `log_prob` of a normal distribution doesn't matter for the purposes of ELBO training, but the log scale certainly does.

Thanks for the quick response! So if I understand correctly: if I wanted to use this distribution in the guide, I would need to implement `sample`; but since it is an observed site it doesn't need to be present in the guide, and therefore `sample` is not needed, only `log_prob`. And that makes sense about the normalization, thank you. I'm pretty sure my normalization "constant" will end up having parameters in it.

Hi @physicswizard, one way you could model this distribution is with a more complex probabilistic model. Here's an example that splits `days` into week and day-of-week components.

```python
import torch
from torch.distributions import constraints
import pyro
import pyro.distributions as dist


def model(days):
    # Split data into weekly and day-of-week components.
    assert days.dtype == torch.float
    weeks = days // 7
    day_of_week = days % 7

    # Model these two as a conditional distribution.
    prob = pyro.param("prob", torch.tensor(0.5),  # init inside (0, 1) so log_prob is finite
                      constraint=constraints.unit_interval)
    pyro.sample("weeks", dist.Geometric(prob),  # or maybe something richer
                obs=weeks)

    probs0 = pyro.param("probs0", torch.ones(7) / 7,
                        constraint=constraints.simplex)
    probs1 = pyro.param("probs1", torch.ones(7) / 7,
                        constraint=constraints.simplex)
    timescale = pyro.param("timescale", torch.tensor(7.),
                           constraint=constraints.positive)
    # Interpolate between the two profiles; unsqueeze so probs has shape [N, 7].
    probs = probs0 + (probs1 - probs0) * (days.unsqueeze(-1) / timescale).neg().exp()
    pyro.sample("day_of_week", dist.Categorical(probs),
                obs=day_of_week)
```

@fritzo that is a great idea. I have found in my discussions with others that these events do tend to occur on specific days of the week for different subgroups, so this sounds like a good fit.