Objectives
In this exercise, we run the Simulated Annealing ABC (SABC), for a Bayesian inference of the parameter of the model “survival”, with a flat prior. Note, in practice, we would not want to use an ABC algorithm for this problem. Why?
For ABC, we need to define the prior density and a function that generates random outputs from our model \(p(y|\theta)\). That means we are sampling from the probabilistic model. Note however, that we do not need a function which evaluates the associated density, i.e., the likelihood.
We use the Python package SimulatedAnnealingABC,
which implements the SABC algorithm.
from pathlib import Path
import os
import matplotlib
matplotlib.use("Agg")
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from scipy.special import gammaln
from simulated_annealing_abc import (
DifferentialEvolution,
SABCConfig,
make_f_dist,
sabc,
)
As a next step, let’s read the data that we observed in the form of
death counts per observation interval, and which we try to explain by
the model survival. The experiment was started with N=100
individuals.
obs = pd.read_csv( "../../data/model_survival.csv", sep=r"\s+")
times = obs["time"].to_numpy(dtype=float)
deaths_obs = obs["deaths"].to_numpy(dtype=int)
For ABC, we always need a function that evaluates the prior density, a function that samples from the prior, and a function that generates random model outputs (our model). The functions that we need are defined in the following:
SEED = 20260525
N_INDIVIDUALS = 100
class Prior:
def __init__(self, lower: float = 0.01, upper: float = 0.6):
self.lower = lower
self.upper = upper
def rvs(self, rng: np.random.Generator, size: int = 1) -> np.ndarray:
return rng.uniform(self.lower, self.upper, size=(size, 1))
def logpdf(self, theta: np.ndarray) -> np.ndarray:
theta = np.atleast_2d(theta)
lam = theta[:, 0]
in_bounds = (self.lower <= lam) & (lam <= self.upper)
lp = np.full(theta.shape[0], -np.inf)
lp[in_bounds] = -np.log(self.upper - self.lower)
return lp
prior = Prior()
def death_probabilities(lam: np.ndarray) -> np.ndarray:
lam = np.asarray(lam, dtype=float).reshape(-1, 1)
survival = np.exp(-lam * times.reshape(1, -1))
survival = np.column_stack([np.ones(lam.shape[0]), survival, np.zeros(lam.shape[0])])
return -np.diff(survival, axis=1)
def simulator(theta: np.ndarray, y: np.ndarray, rng: np.random.Generator) -> None:
probs = death_probabilities(theta[:, 0])
for i, p in enumerate(probs):
y[i, :] = rng.multinomial(N_INDIVIDUALS, p)[:-1]
We use the Julia package SimulatedAnnealingABC.jl,
which implements the SABC algorithm. We also need packages for
distributions and data handling.
using Random
using SimulatedAnnealingABC
using Distributions
using CSV
using DataFrames
using AdaptiveMCMC
using MCMCChains
ENV["GKSwstype"] = "100"
## "100"
using Plots
using StatsPlots
Next, we load the observed death counts. The experiment started with N=100 individuals.
obs = CSV.read("../../data/model_survival.csv", DataFrame)
## 30×2 DataFrame
## Row │ time deaths
## │ Int64 Int64
## ─────┼───────────────
## 1 │ 1 14
## 2 │ 2 10
## 3 │ 3 11
## 4 │ 4 12
## 5 │ 5 3
## 6 │ 6 6
## 7 │ 7 4
## 8 │ 8 7
## ⋮ │ ⋮ ⋮
## 24 │ 24 0
## 25 │ 25 0
## 26 │ 26 0
## 27 │ 27 0
## 28 │ 28 1
## 29 │ 29 0
## 30 │ 30 0
## 15 rows omitted
For ABC, we need a prior distribution, a function to sample from it, and a model function that generates random outputs.
prior = Uniform(0.01, 0.6)
## Uniform{Float64}(a=0.01, b=0.6)
d_prior(x) = pdf(prior, x)
## d_prior (generic function with 1 method)
r_prior() = rand(prior)
## r_prior (generic function with 1 method)
function survival_sampler(λ; N=100, t=obs.time)
S = exp.(-λ .* t)
S = [1; S; 0]
p = -diff(S)
rand(Multinomial(N, p))[1:(end-1)]
end
## survival_sampler (generic function with 1 method)
For ABC we need a distance measure to compare the model output to the observations, in order to judge the plausibility of the parameters that generated this model output. Here we do not apply summary statistics, we compare the model output and the observations directly.
Use the sum of the absolute distances as distance function.
def raw_stats(y: np.ndarray, ss_out: np.ndarray) -> None:
ss_out[:, :] = y
ss_obs_raw = deaths_obs.astype(float)
f_dist_raw = make_f_dist(
n_samples=len(deaths_obs),
ss_obs=ss_obs_raw,
simulator=simulator,
stats_fn=raw_stats,
seed=123,
distance="abs",
)
Now we can run SABC using f_dist_raw as distance
function:
raw_config = SABCConfig(
f_dist=f_dist_raw,
prior=prior,
n_particles = 2000,
v=1.2,
algorithm="single_eps",
proposal=DifferentialEvolution(n_para=1, rng=np.random.default_rng(22)),
rng=np.random.default_rng(18),
show_checkpoint=100,
)
raw_result = sabc(raw_config, n_simulation=1_000_000)
## Initialization done, starting population updates
## Update 100/499 avg_u=0.1278 eps=[0.0508] ETA (DD:HH:MM)=00:00:00
## Update 200/499 avg_u=0.1191 eps=[0.0465] ETA (DD:HH:MM)=00:00:00
## Update 300/499 avg_u=0.1159 eps=[0.0449] ETA (DD:HH:MM)=00:00:00
## Update 400/499 avg_u=0.1139 eps=[0.0439] ETA (DD:HH:MM)=00:00:00
## Update 499/499 avg_u=0.1123 eps=[0.0431] ETA (DD:HH:MM)=00:00:00
We define a distance function that calculates the sum of absolute differences between the model output and the observations.
f_dist(par, obs) = Float64.(abs.(survival_sampler(par) .- obs.deaths))
## f_dist (generic function with 1 method)
Now, we can run the SABC algorithm using this distance function.
sabc_res = sabc(f_dist, prior, obs;
n_simulation = 1_000_000,
n_particles = 2000)
## Approximate posterior sample with 2000 particles:
## - algorithm: :single_eps
## - simulations used: 1000000
## - number of population updates: 499
## - average transformed distance: 0.0628
## - ϵ: [0.02273]
## - number of population resamplings: 3
## - acceptance rate: 0.01225
## The sample can be accessed with the field `population`.
## The history of ϵ can be accessed with the field `state.ϵ_history`.
## The history of ρ can be accessed with the field `state.ρ_history`.
## The history of u can be accessed with the field `state.u_history`.
This simple model has a set of two sufficient statistics (even though there is just one parameter!). You can try to derive them!
Definition of the summery statistics and the distance function:
def sufficient_stats(y: np.ndarray, ss_out: np.ndarray) -> None:
weights = np.arange(1, y.shape[1] + 1)
ss_out[:, 0] = np.sum(y, axis=1)
ss_out[:, 1] = y @ weights
ss_obs_suff = np.array([np.sum(deaths_obs), deaths_obs @ np.arange(1, len(deaths_obs) + 1)])
f_dist_suff = make_f_dist(
n_samples=len(deaths_obs),
ss_obs=ss_obs_suff,
simulator=simulator,
stats_fn=sufficient_stats,
seed=456,
distance="abs",
)
suff_config = SABCConfig(
f_dist=f_dist_suff,
prior=prior,
n_particles = 2000,
v=1.2,
algorithm="single_eps",
proposal=DifferentialEvolution(n_para=1, rng=np.random.default_rng(23)),
rng=np.random.default_rng(19),
show_checkpoint=100,
)
suff_result = sabc(suff_config, n_simulation=1_000_000)
## Initialization done, starting population updates
## Update 100/499 avg_u=0.01652 eps=[0.0036] ETA (DD:HH:MM)=00:00:00
## Update 200/499 avg_u=0.01131 eps=[0.0022] ETA (DD:HH:MM)=00:00:00
## Update 300/499 avg_u=0.008787 eps=[0.0016] ETA (DD:HH:MM)=00:00:00
## Update 400/499 avg_u=0.006785 eps=[0.0011] ETA (DD:HH:MM)=00:00:00
## Update 499/499 avg_u=0.005877 eps=[0.0009] ETA (DD:HH:MM)=00:00:00
Definition of the summery statistics and the distance function:
ss(y) = (sum(y), sum(y .* (1:length(y))))
## ss (generic function with 1 method)
f_dist_suff(par, obs) = Float64.(abs.(ss(survival_sampler(par)) .- ss(obs.deaths)))
## f_dist_suff (generic function with 1 method)
sabc_suff_res = sabc(f_dist_suff, prior, obs;
n_simulation = 1_000_000,
n_particles = 2000)
## Approximate posterior sample with 2000 particles:
## - algorithm: :single_eps
## - simulations used: 1000000
## - number of population updates: 499
## - average transformed distance: 0.001411
## - ϵ: [0.000157]
## - number of population resamplings: 4
## - acceptance rate: 0.01989
## The sample can be accessed with the field `population`.
## The history of ϵ can be accessed with the field `state.ϵ_history`.
## The history of ρ can be accessed with the field `state.ρ_history`.
## The history of u can be accessed with the field `state.u_history`.
For this model we do not need to use ABC (and should not!) because we can evaluate the likelihood function, hence we can sample from the posterior with standard MCMC methods. This allows us to compare the two samples generated in the previous tasks to a sample of the “true posterior”.
Let us now compare the posterior samples generated with ABC with and without summary statistics, and the sample of the true posterior. In the first figure below, you can see that not using summary statistics can lead to slow convergence: many proposals will be rejected since in high dimensions it is very unlikely to match the data with sufficient accuracy. However, the sample will eventually converge to the true distribution. The figure also shows that using two sufficient statistics leads to a fast convergence to an unbiased result.
To generate a sample of the true posterior, we first define the posterior function (see Monday excerises):
def multinomial_logpmf(y: np.ndarray, p: np.ndarray) -> float:
return float(
gammaln(N_INDIVIDUALS + 1)
- np.sum(gammaln(y + 1))
+ np.sum(y * np.log(p))
)
def logposterior(lam: float) -> float:
if not (prior.lower <= lam <= prior.upper):
return -np.inf
probs = death_probabilities(np.array([lam]))[0]
y = np.append(deaths_obs, N_INDIVIDUALS - np.sum(deaths_obs))
return multinomial_logpmf(y, probs) - np.log(prior.upper - prior.lower)
In the following, we use an adaptive Metropolis algorithm to generate a sample from the true posterior:
def run_metropolis(start: float = 0.3,
proposal_sd: float = 0.01,
n_iter: int = 20_000,
burn_in: int = 2_000,
seed: int = SEED):
rng = np.random.default_rng(seed)
chain = np.empty(n_iter)
current = start
current_lp = logposterior(current)
for i in range(n_iter):
proposed = current + rng.normal(0.0, proposal_sd)
proposed_lp = logposterior(proposed)
if np.log(rng.uniform()) < proposed_lp - current_lp:
current = proposed
current_lp = proposed_lp
chain[i] = current
return chain[burn_in:]
Plotting the samples:
reference = run_metropolis()
raw_pop = raw_result.population[:, 0]
suff_pop = suff_result.population[:, 0]
plt.figure(figsize=(9, 6))
for sample, label, color, linewidth in [
(reference, "Reference MCMC", "black", 2.5),
(raw_pop, "ABC raw data", "#1f77b4", 1.8),
(suff_pop, "ABC sufficient summary", "#2ca02c", 1.8),
]:
density = np.histogram(sample, bins=15, density=True)
centers = 0.5 * (density[1][1:] + density[1][:-1])
plt.plot(centers, density[0], label=label, color=color, linewidth=linewidth)
plt.xlabel("lambda")
plt.ylabel("Density")
plt.title("Python: ABC posterior comparison")
plt.legend(frameon=False)
plt.tight_layout()
plt.show()
To get a sample from the “true” posterior, we first define the log-posterior function.
using AdaptiveMCMC
using MCMCChains
using Plots
using StatsPlots
function loglikelihood_survival(N::Int, t, y, λ)
S = exp.(-λ .* t)
S = [1; S; 0]
p = -diff(S)
y = [y; N - sum(y)]
logpdf(Multinomial(N, p), y)
end
## loglikelihood_survival (generic function with 1 method)
function logposterior(par)
ll = logpdf(prior, par[1])
if isfinite(ll)
ll += loglikelihood_survival(100, obs.time, obs.deaths, par[1])
end
ll
end
## logposterior (generic function with 1 method)
Then we use the AdaptiveMCMC package to run a MCMC
sampler to get a reference solution.
mcmc_res = adaptive_rwm([0.3], logposterior,
10_000;
algorithm = :ram,
b = 1000)
## (X = [0.14176944290985666 0.14176944290985666 … 0.12060954341005534 0.12060954341005534], allX = [[0.14176944290985666 0.14176944290985666 … 0.12060954341005534 0.12060954341005534]], D = [[-46.96961673073684, -46.96961673073684, -46.96961673073684, -46.96961673073684, -46.880997284759864, -46.880997284759864, -46.880997284759864, -46.880997284759864, -46.880997284759864, -46.880997284759864 … -49.19465912198601, -46.99940517685436, -46.99940517685436, -46.99940517685436, -46.99940517685436, -46.99940517685436, -46.99940517685436, -46.99940517685436, -46.99940517685436, -46.99940517685436]], R = RWMState{1, Float64, Vector{Float64}, typeof(randn!), TaskLocalRNG}[RWMState{1, Float64, Vector{Float64}, typeof(randn!), TaskLocalRNG}(TaskLocalRNG(), Random.randn!, [0.12060954341005534], [-0.17914569545551848], [0.10796055605948822])], S = RobustAdaptiveMetropolis{1, Float64, Vector{Float64}, RAMStepSize{Float64}, Cholesky{Float64, Matrix{Float64}}}[RobustAdaptiveMetropolis{1, Float64, Vector{Float64}, RAMStepSize{Float64}, Cholesky{Float64, Matrix{Float64}}}(Cholesky{Float64, Matrix{Float64}}([0.0706087681526011;;], 'L', 0), 0.234, [-0.0004609339032577557], [-1.0], RAMStepSize{Float64}(PolynomialStepSize{Float64}(0.66, 1.0)))], Rhos = AdaptiveScalingMetropolis{PolynomialStepSize{Float64}, Float64}[], accRWM = [0.2005], accSW = Float64[], args = ([0.3], Main.logposterior, 10000), params = (algorithm = :ram, thin = 1, b = 1000, fulladapt = true, q = Random.randn!, L = 1, log_pr = AdaptiveMCMC.var"#adaptive_rwm##2#adaptive_rwm##3"{Float64}(), all_levels = false, acc_sw = 0.234, swaps = :single, rng = Xoshiro(0xcad878ab8dec2704, 0x6f179dde9cb395d8, 0xf1b6bfb676d6995c, 0xc0b9b300feaa62f5, 0x89a1b02b58068c5f)))
reference = vec(mcmc_res.X')
## 9001-element reshape(adjoint(::Matrix{Float64}), 9001) with eltype Float64:
## 0.14176944290985666
## 0.14176944290985666
## 0.14176944290985666
## 0.14176944290985666
## 0.14001428247842068
## 0.14001428247842068
## 0.14001428247842068
## 0.14001428247842068
## 0.14001428247842068
## 0.14001428247842068
## ⋮
## 0.12060954341005534
## 0.12060954341005534
## 0.12060954341005534
## 0.12060954341005534
## 0.12060954341005534
## 0.12060954341005534
## 0.12060954341005534
## 0.12060954341005534
## 0.12060954341005534
raw_pop = vec(sabc_res.population)
## 2000-element Vector{Float64}:
## 0.1998205844307822
## 0.25078438055253255
## 0.19803733746856148
## 0.19160402963705525
## 0.13309148439926782
## 0.1951078132320949
## 0.2144867448385722
## 0.21286569275937042
## 0.11826315237214481
## 0.2268378545292552
## ⋮
## 0.1673821570547553
## 0.25897116080558413
## 0.14921190963968584
## 0.18213801257693232
## 0.23831708576445715
## 0.16277633018826487
## 0.12910285080123418
## 0.1833203460647028
## 0.13825982632131723
suff_pop = vec(sabc_suff_res.population)
## 2000-element Vector{Float64}:
## 0.14125815939605937
## 0.15261320027926473
## 0.12351958225343332
## 0.11200323971319773
## 0.12394955422472846
## 0.16116795206415754
## 0.11491620722196183
## 0.16907528997063234
## 0.1401540752103395
## 0.16131984295800667
## ⋮
## 0.1435774105363064
## 0.13754663638229428
## 0.14158695549069425
## 0.1484768928927801
## 0.11841286633674755
## 0.1453647645056818
## 0.12313448251779335
## 0.15135857235392497
## 0.1195039502258659
plt = StatsPlots.density(reference, label = "Reference MCMC", xlab = "lambda",
title = "Julia: ABC posterior comparison",
linewidth = 3, color = :black);
StatsPlots.density!(plt, raw_pop, label = "ABC raw data", linewidth = 2);
StatsPlots.density!(plt, suff_pop, label = "ABC sufficient summary", linewidth = 2);
plt