Survival analysis

haoyangz · August 15, 2018, 3:30am

I would like to implement the survival model described here. In short, samples are modeled differently depending on whether they are censored. Uncensored samples are modeled with a Gumbel distribution. For censored samples, we instead compute p(Y>y), the survival function, under the same Gumbel distribution. The model needs to optimize both terms jointly: the Gumbel density for the uncensored samples and the p(Y>y) likelihood for the censored ones.
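To make the objective concrete, here is a minimal plain-Python sketch of the combined log-likelihood I have in mind. The function names and the toy data are my own, made up for illustration. It assumes the max-type Gumbel with location mu and scale sigma, whose survival function is 1 - exp(-exp(-(y - mu) / sigma)).

```python
import math


def gumbel_logpdf(y, mu, sigma):
    """Log density of the max-type Gumbel distribution at y."""
    z = (y - mu) / sigma
    return -math.log(sigma) - z - math.exp(-z)


def gumbel_logsf(y, mu, sigma):
    """Log of the survival function p(Y > y) = 1 - exp(-exp(-z))."""
    z = (y - mu) / sigma
    # -expm1(-x) evaluates 1 - exp(-x) without cancellation when x is small
    return math.log(-math.expm1(-math.exp(-z)))


def censored_loglik(ys, censored, mu, sigma):
    """Sum log-density terms for uncensored samples and
    log-survival terms for censored samples."""
    total = 0.0
    for y, is_censored in zip(ys, censored):
        if is_censored:
            total += gumbel_logsf(y, mu, sigma)
        else:
            total += gumbel_logpdf(y, mu, sigma)
    return total


# Toy data (hypothetical): the second observation is censored.
ys = [0.0, 1.0, 2.0]
censored = [False, True, False]
loglik = censored_loglik(ys, censored, mu=0.0, sigma=1.0)
```

Each sample contributes exactly one term, chosen by its censoring flag, so the two "distributions" are really two branches of a single log-likelihood.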

In PyMC3, the model is implemented as follows:

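import pymc3 as pm
from theano import shared, tensor as tt

# df, y_std, η, s and weibull_model are assumed to be defined as in the linked example
# cens is True where the observation is censored (event == 0)
cens = df.event.values == 0.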
cens_ = shared(cens)

def gumbel_sf(y, μ, σ):
    return 1. - tt.exp(-tt.exp(-(y - μ) / σ))


with weibull_model:

    # for uncensored samples
    y_obs = pm.Gumbel(
        'y_obs', η[~cens_], s,
        observed=y_std[~cens]
    )

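    # for censored samples: pm.Potential adds its argument to the model's
    # log-probability, so the survival function is passed on the log scale
    y_cens = pm.Potential(
        'y_cens', tt.log(gumbel_sf(y_std[cens], η[cens_], s))
    )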

My questions are:

To model the data with more than one distribution, what should we do in model() and guide()?
Is there a way to fit a custom likelihood function, like pm.Potential in PyMC3?