Example 6: pricing the tail, with error bars¶

One block of claims, two defensible severity models, and an 82% disagreement on the pooling charge — this page is about making that disagreement visible instead of discovering it in a rate filing. Fit and compare severities with lossmodels, carry parameter uncertainty into the limits factors, scan the tail with extremeloss, and price the same excess layer both ways through the ratingmodels pooling seam. Every number on this page is the output of this exact fixed-seed run, pinned by a regression test in the extremeloss suite.

The claims, and the model contest¶

import numpy as np
import lossmodels as lm
import extremeloss as el
import ratingmodels as rm

rng = np.random.default_rng(2026)
claims = rng.lognormal(mean=9.4, sigma=1.15, size=2_400)

lm.compare_fits({"lognormal": lm.fit_lognormal(claims),
                 "gamma": lm.fit_gamma(claims),
                 "weibull": lm.fit_weibull(claims)}, claims)

model	aic	ks	ad
lognormal	52,458.394	0.009	0.201
weibull	52,846.879	0.067	27.722
gamma	52,930.563	0.086	36.741

Not a close contest: the lognormal wins the body (K–S) and the tail (Anderson–Darling) simultaneously. Note the shape of the losses for the runners-up — a gamma that misses by 0.086 on K–S misses by 36.7 on A–D, because A–D weights exactly the region an excess layer lives in.

Parameters with error bars, factors with error bars¶

best = lm.fit_lognormal(claims)
unc = lm.fit_uncertainty(best, claims)
unc.summary()

parameter	estimate	se	ci_low	ci_high
mu	9.3652	0.0236	9.3190	9.4114
sigma	1.1547	0.0167	1.1220	1.1874

The delta method carries that covariance to the factor actually filed:

lm.increased_limits_table(best, limits=[250_000, 500_000, 1_000_000],
                          base_limit=250_000, uncertainty=unc)

limit	ilf	ilf_se	ci_low	ci_high
250,000	1.0000	0.0000	1.0000	1.0000
500,000	1.0182	0.0023	1.0138	1.0227
1,000,000	1.0228	0.0031	1.0167	1.0288

An ILF of 1.0182 ± 0.0023 off 2,400 claims: the fourth decimal was never real, and now the table says so.

The tail, scanned with bands¶

grid = np.quantile(claims, [0.85, 0.90, 0.93, 0.95])
scan = el.threshold_diagnostic_table(claims, grid)

threshold quantile	xi	xi_se
0.85	0.365	0.070
0.90	0.450	0.097
0.93	0.403	0.113
0.95	0.380	0.133

Within their bands the shape estimates are indistinguishable — which is itself the finding. A lognormal has no true Pareto tail; what a GPD sees at any finite threshold is a local effective shape around 0.4 that never stabilizes to a limit. The scan cannot rule the GPD out, but nothing here demands it either.

u = float(np.quantile(claims, 0.90))
fit = el.fit_pot(claims, threshold=u)        # u = 51,810
# xi = 0.4502 (se 0.0972), beta = 29,152 (se 3,294)

el.gpd_return_level(fit, [10, 50], observations_per_period=2_400.0)

return period	return level	se
10 years	2,140,226	1,051,790
50 years	4,430,985	2,838,106

A 10-year loss of 2.1m ± 1.1m: the tail is estimable, barely, and the band says exactly how barely.

The same layer, priced both ways¶

Both models satisfy the two-method protocol (sf + mean_excess), so both feed the pooling seam directly:

for sev in (best, fit):
    rm.pooling_charge_from_severity(sev, pooling_point=250_000,
                                    expected_frequency=0.9)

model	S(250k)	mean excess	charge per exposure
lognormal body	0.00398	132,302	474.28
GPD tail (u = q90)	0.00445	215,323	862.02

Same claims, same attachment, an 82% difference — almost entirely from the mean excess, because a fitted ξ of 0.45 grows the excess linearly in the attachment while the lognormal’s effective shape keeps decaying. Here the evidence adjudicates: the generator is lognormal and every diagnostic above said so. On real data it usually will not be this clean, and then this two-row table is the risk-margin conversation — the model choice priced in currency, with the compare-fits scorecard and the ξ bands as the evidence on each side.