Survival Models


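Survival models belong to the broader family of generalized linear models (GLMs) and are widely used to model censored data, for example in customer churn prediction. This post introduces the most basic survival model and maximum likelihood estimation (MLE) on censored data.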

Survival Function

Let $T$ be a continuous random variable denoting the time at which death occurs, with probability density function $f(t)$. Its cumulative distribution function is:

$$
F(t) = P\lbrace T < t\rbrace = \int_0^t f(x) dx
\tag{1.1}
$$

The survival function is then the probability of surviving beyond time $t$:

$$
S(t) = P\lbrace T > t\rbrace = 1 - F(t) = \int_t^\infty f(x) dx
\tag{1.2}
$$

Hazard Function

An alternative way to characterize the distribution is the hazard function, the instantaneous rate of occurrence of the event given survival up to time $t$:

$$
\begin{align}
\lambda(t) &= \lim_{dt \to 0} \frac{P\lbrace t \le T < t + dt | T \ge t\rbrace}{dt} \\
&= \lim_{dt \to 0} \frac{P\lbrace t \le T < t + dt \rbrace}{P \lbrace T \ge t\rbrace dt} \\
&= \lim_{dt \to 0} \frac{f(t)dt}{S(t) dt} \\
&= \frac{f(t)}{S(t)}
\end{align}
\tag{2.1}
$$

Given $(1.2)$ we have $\frac{d}{dt} S(t) = -f(t)$, so $\lambda(t) = -S'(t)/S(t)$ and $(2.1)$ can be rewritten as:

$$
\lambda(t) = -\frac{d}{dt} \log S(t)
\tag{2.2}
$$

Conversely, integrating $(2.2)$ from $0$ to $t$ with the boundary condition $S(0) = 1$ recovers the survival function from the hazard function:

$$
S(t) = \exp\lbrace - \int_0^t \lambda(x)dx \rbrace = \exp\lbrace -\Lambda(t) \rbrace
\tag{2.3}
$$

where $\Lambda(t) = \int_0^t \lambda(x)dx$ is called the cumulative hazard.
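
The relation $(2.3)$ can be checked numerically. The following is a minimal sketch, assuming a hypothetical linearly increasing hazard $\lambda(t) = 0.5t$ (not from the text above) and a simple trapezoidal rule for the integral; the function names are illustrative only.

```python
import math


def cumulative_hazard(hazard, t, steps=10_000):
    """Approximate Lambda(t), the integral of the hazard over [0, t], by the trapezoidal rule."""
    if t == 0:
        return 0.0
    h = t / steps
    total = 0.5 * (hazard(0.0) + hazard(t))
    for k in range(1, steps):
        total += hazard(k * h)
    return total * h


def survival_from_hazard(hazard, t, steps=10_000):
    """Apply equation (2.3): S(t) = exp(-Lambda(t))."""
    return math.exp(-cumulative_hazard(hazard, t, steps))


def linear_hazard(t):
    # Hypothetical hazard lambda(t) = 0.5 * t, so Lambda(t) = 0.25 * t**2 in closed form.
    return 0.5 * t


t = 2.0
numeric = survival_from_hazard(linear_hazard, t)
exact = math.exp(-0.25 * t ** 2)
print(numeric, exact)
```

The two printed values should agree closely, since the trapezoidal rule integrates a linear hazard essentially without error.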


Example 2.1

Here we’re modeling a constant risk over time:
$$
\lambda(t) = \lambda
$$
From $(2.3)$, together with $f(t) = \lambda(t)S(t)$ from $(2.1)$, we can solve for the corresponding survival function and pdf:
$$
\begin{align}
S(t) &= \exp\lbrace - \int_0^t \lambda(x)dx \rbrace = e^{-\lambda t} \\
f(t) &= \lambda e^{-\lambda t}
\end{align}
$$
This is exactly the exponential distribution with rate $\lambda$.
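
As a quick sanity check of Example 2.1, the ratio $f(t)/S(t)$ from $(2.1)$ should return the same constant at every $t$. This sketch assumes an arbitrary rate of 0.3, chosen only for illustration.

```python
import math

lam = 0.3  # hypothetical constant hazard rate


def pdf(t):
    """Exponential density f(t) = lam * exp(-lam * t)."""
    return lam * math.exp(-lam * t)


def survival(t):
    """Exponential survival function S(t) = exp(-lam * t)."""
    return math.exp(-lam * t)


def hazard(t):
    """Equation (2.1): hazard = f(t) / S(t)."""
    return pdf(t) / survival(t)


hazards = [hazard(t) for t in (0.0, 1.0, 5.0, 20.0)]
print(hazards)
```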


Expectation of Life

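Given $S(t)$ or $\lambda(t)$, the expected value of $T$ can be written in two ways. The second equality follows from integration by parts, assuming $tS(t) \to 0$ as $t \to \infty$: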
$$
\mu = \int_0^\infty tf(t)dt =\int_0^\infty S(t)dt
$$
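
Both integrals can be compared numerically. The sketch below assumes an exponential lifetime with a hypothetical rate of 0.5, for which the mean is $1/\lambda = 2$, and truncates the infinite integral at a point where the remaining tail is negligible.

```python
import math


def trapezoid(g, a, b, steps=200_000):
    """Trapezoidal approximation of the integral of g over [a, b]."""
    h = (b - a) / steps
    total = 0.5 * (g(a) + g(b))
    for k in range(1, steps):
        total += g(a + k * h)
    return total * h


lam = 0.5     # hypothetical rate, so the true mean is 1 / lam = 2
upper = 60.0  # truncation point; the tail beyond it is of order exp(-30)

# mu as the integral of t * f(t)
mean_from_pdf = trapezoid(lambda t: t * lam * math.exp(-lam * t), 0.0, upper)
# mu as the integral of S(t)
mean_from_survival = trapezoid(lambda t: math.exp(-lam * t), 0.0, upper)

print(mean_from_pdf, mean_from_survival, 1 / lam)
```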

Censoring and the likelihood function

Censoring Type

  1. Type I
    Two typical observation schemes:

    • A sample of $n$ units is followed for a fixed time $\tau$
    • Generalization, fixed censoring: each unit has a fixed time $\tau_i$

    In both cases the observation time is fixed in advance and the number of deaths is a random variable.

  2. Type II

    • A sample of $n$ units is followed as long as necessary until $d$ units have experienced the event
    • A more general scheme, random censoring: each unit has
      • Censoring time $C_i$
      • Potential lifetime $T_i$
      • Observed time $Y_i = \min\lbrace C_i, T_i\rbrace$
      • Indicator $d_i$ (also written $\delta_i$), equal to 1 if the observation ended in death and 0 if it ended in censoring
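
The random censoring scheme above can be simulated in a few lines. This is a sketch under stated assumptions: both lifetimes and censoring times are drawn from exponential distributions, which the text does not require, and the function name and rates are hypothetical.

```python
import random


def simulate_random_censoring(n, death_rate, censor_rate, seed=0):
    """Return a list of (Y_i, d_i) pairs with Y_i = min(C_i, T_i).

    Assumption: T_i and C_i are independent exponential variables.
    """
    rng = random.Random(seed)
    observations = []
    for _ in range(n):
        lifetime = rng.expovariate(death_rate)      # potential lifetime T_i
        censor_time = rng.expovariate(censor_rate)  # censoring time C_i
        observed = min(lifetime, censor_time)       # observed time Y_i
        died = 1 if lifetime <= censor_time else 0  # indicator d_i
        observations.append((observed, died))
    return observations


data = simulate_random_censoring(n=5000, death_rate=1.0, censor_rate=0.5)
death_fraction = sum(d for _, d in data) / len(data)
print(death_fraction)
```

With these rates a death is observed before censoring with probability $1.0 / (1.0 + 0.5) = 2/3$, so the printed fraction should land near that value.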

Likelihood of censoring model

  1. Unit died at $t_i$. The unit survived up to $t_i$ and then died at that instant, so it contributes the density at $t_i$:
    $$
    L_i = f(t_i) = S(t_i)\lambda(t_i)
    \tag{3.1}
    $$

  2. Unit still alive at $t_i$ (censored). We only know that its lifetime exceeds $t_i$, so it contributes the survival probability:
    $$
    L_i = S(t_i)
    \tag{3.2}
    $$

Combining the two cases with the death indicator $d_i$ (1 for a death, 0 for a censored unit), the full likelihood is:
$$
L = \prod\limits_{i=1}^{n}L_i = \prod\limits_{i=1}^{n} \lambda(t_i)^{d_i}S(t_i)
\tag{3.3}
$$
Taking logs and using $\log S(t) = -\Lambda(t)$ from $(2.3)$, we have:
$$
\log L = \sum\limits_{i=1}^{n} \lbrace d_i\log\lambda(t_i) - \Lambda(t_i) \rbrace
\tag{3.4}
$$
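
Equation $(3.4)$ translates directly into code. The sketch below is illustrative: the function name, the four observations, and the rate are all hypothetical, and the hazard is assumed to be strictly positive so that its logarithm is defined.

```python
import math


def censored_log_likelihood(times, events, hazard, cumulative_hazard):
    """Equation (3.4): sum over units of d_i * log(hazard(t_i)) - Lambda(t_i).

    Assumption: hazard(t) > 0 for every observed t.
    """
    return sum(
        d * math.log(hazard(t)) - cumulative_hazard(t)
        for t, d in zip(times, events)
    )


# Hypothetical data: observed times and death indicators (1 = death, 0 = censored).
times = [2.0, 3.5, 1.0, 4.0]
events = [1, 0, 1, 0]

# Constant hazard, so Lambda(t) = lam * t.
lam = 0.2
log_lik = censored_log_likelihood(
    times,
    events,
    hazard=lambda t: lam,
    cumulative_hazard=lambda t: lam * t,
)
print(log_lik)
```

Passing the hazard and cumulative hazard as separate callables keeps the function independent of any particular distribution.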


Example 3.1

Consider the exponential distribution with $\lambda(t) = \lambda$ and $\Lambda(t) = \lambda t$. From $(3.4)$, we have
$$
\log L = \sum\limits_{i=1}^{n} \lbrace d_i\log\lambda - \lambda t_i \rbrace
$$

We could estimate $\lambda$ using MLE:

Let $D=\sum d_i$ denote the total number of deaths and $T = \sum t_i$ the total observation time (here $T$ is a sum over units, not the random lifetime defined earlier):

$$
\begin{align}
\log L &= D\log\lambda - T\lambda \\
\frac{\partial}{\partial \lambda} \log L &= \frac{D}{\lambda} - T
\end{align}
$$

Setting $\frac{\partial}{\partial \lambda} \log L = 0$ gives the estimate of $\lambda$:

$$
\hat \lambda = \frac{D}{T}
$$
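
The estimator is the number of deaths divided by the total observation time, which makes it easy to try on simulated data. The sketch below assumes Type I censoring at a hypothetical fixed time of 3.0 and a true rate of 0.5; the names and values are illustrative only.

```python
import random


def exponential_mle(times, events):
    """MLE of a constant hazard under censoring: total deaths D over total time T."""
    total_deaths = sum(events)
    total_time = sum(times)
    return total_deaths / total_time


rng = random.Random(42)
true_rate = 0.5  # hypothetical true hazard
tau = 3.0        # hypothetical fixed follow-up time (Type I censoring)

times, events = [], []
for _ in range(20_000):
    lifetime = rng.expovariate(true_rate)
    times.append(min(lifetime, tau))           # observed time
    events.append(1 if lifetime <= tau else 0)  # 1 = death observed, 0 = censored

lam_hat = exponential_mle(times, events)
print(lam_hat)
```

Censored units add to the denominator without adding to the numerator, which is how the estimator accounts for lifetimes that were only partially observed.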


Reference