Diagnosing loss ratio convergence

Motivation

find_ata_maturity() answers the question “from which development period are link factors $f_k$ reproducible across cohorts?” That is necessary for chain-ladder projection but not sufficient for declaring a portfolio’s projected loss ratio converged: in long-duration health insurance both $f_k \to 1$ and $g_k \to 0$ arise mechanically as cumulative denominators grow, regardless of whether the underlying experience has actually settled. A criterion built on those quantities passes automatically as $k$ increases, not because of true convergence. This is what we have called the inertia failure mode.

find_lr_convergence() detects the convergence point $k^{**}$: the first valuation $v \ge k^*$ at which the projected loss ratio has predictively converged. It is the natural counterpart to $k^*$ (maturity point, from find_ata_maturity()): $k^*$ marks where link factors $f_k$ become reproducible, while $k^{**}$ marks where the projection itself stops moving with new data. Long-duration health portfolios may cross $k^*$ early yet remain far from $k^{**}$.

The detector combines two orthogonal conditions, both required to hold for $M$ consecutive valuations:

Predictive revision is small relative to its parameter SE: $R_v < c \cdot \hat{SE}^{\mathrm{param}}_v$, where $R_v = |\hat{LR}^{\mathrm{proj}}_v - \hat{LR}^{\mathrm{proj}}_{v-1}|$ is the change in the projected portfolio LR caused by adding one new calendar diagonal (the projection from data through valuation $v$ versus the projection from data through $v-1$).
Cross-cohort dispersion of incremental LR is small: $\hat{D}_v < \tau$ , where $\hat{D}_v = 1.4826 \cdot \mathrm{MAD}_i(\hat{lr}_{i,v}) / |\mathrm{median}_i(\hat{lr}_{i,v})|$ .

Operating $\hat{D}_v$ on incremental rather than cumulative loss ratio keeps it inertia-free: per-period quantities have no cumulative denominator to dampen them. The two clauses guard against complementary failure modes: $R_v$ checks that the model output has stopped revising, while $\hat{D}_v$ checks that the raw period-by-period experience is genuinely consistent across cohorts at that development period. Either alone can be fooled. In chain-ladder projection the mechanical drift $\hat{f}_k \to 1$ collapses $R_v$ regardless of true convergence, and cross-cohort agreement on a single period’s level need not imply that the projection has settled. Requiring both closes both paths.
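The joint rule described above can be sketched in a few lines of base R. This is an illustration only, not the package's implementation: the helper name first_joint_run() and the diagnostic values are hypothetical, NA diagnostics are assumed to count as a failure, and reporting the first valuation of the first qualifying run is an assumption about how $k^{**}$ is indexed.

```r
# Illustrative diagnostics for valuations v = 10..19 (made-up numbers).
v        <- 10:19
R_v      <- c(0.030, 0.022, 0.009, 0.012, 0.004, 0.003, 0.004, 0.002, 0.003, 0.002)
SE_param <- c(0.020, 0.020, 0.019, 0.018, 0.018, 0.017, 0.017, 0.016, 0.016, 0.015)
D_v      <- c(0.40, 0.31, 0.22, 0.12, 0.16, 0.13, 0.11, 0.10, 0.12, 0.09)

# Hypothetical helper: first valuation that starts a run of M joint passes.
first_joint_run <- function(v, R_v, SE_param, D_v, c = 0.5, tau = 0.15, M = 3L) {
  # A valuation passes only if both clauses hold (assumption: NA counts as a fail).
  pass <- (R_v < c * SE_param) & (D_v < tau)
  pass[is.na(pass)] <- FALSE

  # Length of the passing run that ends at each position.
  run_len <- integer(length(pass))
  for (j in seq_along(pass)) {
    prev <- if (j > 1L) run_len[j - 1L] else 0L
    run_len[j] <- if (pass[j]) prev + 1L else 0L
  }

  hit <- which(run_len >= M)
  if (length(hit) == 0L) return(NA_integer_)
  # Assumption: report the first valuation of the first qualifying run.
  v[hit[1L] - M + 1L]
}

k_conv <- first_joint_run(v, R_v, SE_param, D_v)
k_conv
```

In this toy series the revision clause alone passes at $v = 12$ and the dispersion clause alone passes at $v = 13$, but neither valuation passes both, so the run only starts once the two clauses agree.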

Why two conditions

A denominator effect disables any single-criterion diagnostic.

In long-duration health insurance, cumulative LR = cumulative loss / cumulative risk premium. As dev grows, the denominator grows alongside the numerator, so a new calendar diagonal’s contribution to the overall ratio shrinks automatically — regardless of whether the underlying experience has actually changed. This is the inertia effect.
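A small deterministic example makes the inertia effect concrete. The numbers are invented for illustration: one cohort with level premium whose incremental loss ratio keeps swinging between 0.6 and 1.0, so the per-period experience never settles.

```r
# Deterministic toy cohort: level premium, incremental LR alternating 0.6 / 1.0.
n_dev   <- 30
premium <- rep(100, n_dev)
lr_inc  <- rep(c(0.6, 1.0), length.out = n_dev)  # experience never settles
loss    <- premium * lr_inc

clr <- cumsum(loss) / cumsum(premium)            # cumulative loss ratio

step_clr <- abs(diff(clr))     # period-to-period move in the cumulative LR
step_inc <- abs(diff(lr_inc))  # period-to-period move in the incremental LR

round(step_clr[c(1, 10, 29)], 4)  # early, middle, and last cumulative moves
range(step_inc)                   # incremental moves over the same periods
```

The cumulative move shrinks roughly like $0.2 / k$ purely because the denominator grows, while the incremental move stays at 0.4 throughout. A stability check on the cumulative series would eventually pass even though nothing has converged.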

What each criterion guards against:

Scenario	$R_v$	$\hat{D}_v$	Result
True convergence (model + experience both stable)	small	small	PASS ✓
Chain-ladder $\hat{f}_k \to 1$ drift (inertia)	small (spurious)	large	FAIL — dispersion catches it
Coincidental cohort agreement at one period	large	small (snapshot)	FAIL — projection revision catches it

$R_v$ alone is fooled by chain ladder’s mechanical drift ( $\hat{f}_k \to 1$ ): the cumulative product barely moves, the projection barely moves, and $R_v$ collapses to zero — a false convergence.
$\hat{D}_v$ uses incremental LR, so it has no cumulative denominator and is immune to the inertia effect.
Requiring both criteria simultaneously closes the principal inertia-leakage paths. That is the design intent of the dual criterion.

Notation

Symbol	Meaning
$i$	cohort index (UY)
$v$	valuation index — the calendar diagonal; “ $v$ diagonals observed”
$V$	maximum observed valuation (max dev in the triangle)
$k^*$	maturity point (from `find_ata_maturity()`); lower bound on candidate $v$
$k^{**}$	convergence point — the value `find_lr_convergence()` returns
$\hat{LR}^{\mathrm{proj}}_v$	projected ultimate LR using data through valuation $v$
$R_v$	revision: $\lvert\hat{LR}^{\mathrm{proj}}_v - \hat{LR}^{\mathrm{proj}}_{v-1}\rvert$
$\hat{SE}^{\mathrm{param}}_v$	parameter-uncertainty SE of $\hat{LR}^{\mathrm{proj}}_v$ (Mack-style)
$\hat{lr}_{i,v}$	incremental loss ratio of cohort $i$ at dev $v$
$\hat{D}_v$	robust scale-invariant dispersion of $\hat{lr}_{i,v}$ across cohorts
$c$	multiplier on $\hat{SE}^{\mathrm{param}}_v$ for the revision gate (default `0.5`)
$\tau$	upper bound on $\hat{D}_v$ for the dispersion gate (default `0.15`)
$M$	required run length of consecutive passing valuations (default `3L`)

The constant $1.4826 \approx 1 / \Phi^{-1}(0.75)$ inside $\hat{D}_v$ is the standard MAD $\to\sigma$ correction: with this scaling $\mathrm{MAD}_i$ becomes a consistent estimator of the cross-cohort standard deviation under normality, so $\hat{D}_v$ reads as a robust, outlier-resistant coefficient of variation of incremental LR.
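As a sketch of the formula, $\hat{D}_v$ can be computed with stats::mad(), whose default constant is the same 1.4826. The helper name dispersion() and the cohort values are hypothetical; returning NA below min_n_cohorts mirrors the behaviour described for the package but is not taken from its source.

```r
# Incremental loss ratios of five cohorts at one development period
# (illustrative values, not taken from the package data).
lr_iv <- c(0.50, 0.55, 0.60, 0.65, 0.70)

# Hypothetical helper: scaled MAD divided by the absolute median.
dispersion <- function(lr, min_n_cohorts = 5L) {
  lr <- lr[!is.na(lr)]
  # Too few cohorts: return NA rather than an unstable estimate.
  if (length(lr) < min_n_cohorts) return(NA_real_)
  # stats::mad() applies the 1.4826 constant by default; it is spelled out here.
  stats::mad(lr, constant = 1.4826) / abs(stats::median(lr))
}

D_v <- dispersion(lr_iv)

# Replace one cohort with an extreme value: median and MAD are unchanged.
D_outlier  <- dispersion(c(0.50, 0.55, 0.60, 0.65, 3.00))
cv_outlier <- sd(c(0.50, 0.55, 0.60, 0.65, 3.00)) / mean(c(0.50, 0.55, 0.60, 0.65, 3.00))

c(D_v = D_v, D_outlier = D_outlier, cv_outlier = cv_outlier)
```

The single outlying cohort leaves the robust measure untouched while the classical coefficient of variation inflates, which is the reason for preferring MAD over the standard deviation here.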

Basic usage

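# Load the package and its bundled example data.
library(lossratio)
data(experience)

# Coerce the raw data, keep the rows with cv_nm == "SUR", and build the triangle,
# passing cv_nm as the second argument.
exp <- as_experience(experience)
tri <- build_triangle(exp[cv_nm == "SUR"], cv_nm)

# Run the convergence detector with default settings and print the result.
res <- find_lr_convergence(tri)
print(res)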

Mock output:

#> <LRConvergence>
#> k_conv       : NA
#> k_star       : 9
#> V (max dev)  : 30
#> criterion    : R_v < 0.5 * SE_param_v  AND  D_v < 0.15  (run M = 3)
#> fit_fn       : fit_lr
#> v candidates : 19 ( 0  pass both clauses)

The returned LRConvergence object reports:

k_conv — the detected $k^{**}$ , or NA if no run of $M$ consecutive passing valuations is found.
k_star — the maturity point used as the lower bound (computed internally via find_ata_maturity() on a clr-based ATA, or supplied by the caller).
V — the maximum observable dev in the triangle.
v, R_v, SE_param_v, D_v, pass_v — per-valuation diagnostic sequences indexed by $v$ .
c, tau, M, holdout_max, min_n_cohorts — settings used.
Attributes: group_var, value_var, fit_fn_name, dev_var.

summary(res) returns a data.table with one row per candidate valuation and an extra R_over_SE = R_v / SE_param_v column for inspection:

head(summary(res), 6)

#>        v    R_v   SE_param_v  R_over_SE   D_v     pass
#> 1:     9     NA           NA         NA  0.90   FALSE
#> 2:    10     NA           NA         NA  0.76   FALSE
#> 3:    11     NA           NA         NA  0.56   FALSE
#> 4:    12     NA           NA         NA  0.58   FALSE
#> 5:    13     NA           NA         NA  0.81   FALSE
#> 6:    14     NA           NA         NA  0.43   FALSE

How it works: multiple holdout refits

find_lr_convergence() refits the model at each candidate valuation and tracks how the projection changes.

Example: with $V = 30$, $k^* = 18$, and holdout_max = 6, the refit candidates are $v \in \{24, 25, \dots, 30\}$ (7 in total).

$v$	holdout depth ( $V - v$ )	$R_v$ available?
30	0	✓
28	2	✓
24 (cutoff)	6	✓
22	8	`NA`
18	12	`NA`

One refit per candidate $v$ (7 total), plus $R_v$ computed between adjacent $v$. The cutoff holdout_max defaults to floor((V - k_star) / 2); once the holdout depth exceeds it, the refit data becomes too thin and $R_v$ and $\hat{SE}^{\mathrm{param}}_v$ are masked to NA.
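The arithmetic of the example can be reproduced in base R. This only enumerates holdout depths under the default rule quoted above; it does not call the package, and how the package treats $R_v$ exactly at the cutoff is not verified here.

```r
V      <- 30L
k_star <- 18L

# Default cutoff as described in the text: floor((V - k_star) / 2).
holdout_max <- floor((V - k_star) / 2)

v        <- k_star:V
depth    <- V - v                # diagonals held out at each valuation
refit_ok <- depth <= holdout_max # within the cutoff, so a refit is kept

schedule <- data.frame(v = v, holdout_depth = depth, refit = refit_ok)

# Rows matching the table above, and the number of refits.
schedule[schedule$v %in% c(18, 22, 24, 28, 30), ]
sum(refit_ok)
```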

Increase holdout_max to diagnose deeper into the past — at the cost of less data behind each refit, hence lower confidence.

Plot

plot(res)

The diagnostic is two stacked panels: the upper panel shows $R_v / \hat{SE}^{\mathrm{param}}_v$ against $v$ with a horizontal guide at $c$; the lower panel shows $\hat{D}_v$ against $v$ with a horizontal guide at $\tau$. A vertical dotted line marks $k^*$, and a vertical solid line marks $k^{**}$ when one is detected. A point falling below both threshold lines passes both clauses at that valuation; $k^{**}$ additionally requires $M$ such valuations in a row.

This view is also a quick way to see which clause is binding. If the top panel sits below its threshold but the bottom panel is far above its own, the binding issue is cross-cohort heterogeneity; if the bottom panel is fine but the top is high, the model is still revising.
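The same reading can be done numerically. The sketch below uses a hypothetical data frame whose column names follow the summary(res) output shown earlier, filled with made-up values; NA diagnostics are assumed to count as not passing.

```r
# Hypothetical per-valuation diagnostics (illustrative values only).
diag_tbl <- data.frame(
  v         = 24:29,
  R_over_SE = c(0.90, 0.70, 0.45, 0.40, 0.30, NA),
  D_v       = c(0.30, 0.12, 0.20, 0.10, 0.09, 0.08)
)
c_thr <- 0.5   # revision threshold (default c)
tau   <- 0.15  # dispersion threshold (default tau)

# Assumption: a missing diagnostic is treated as a failed clause.
rev_ok  <- !is.na(diag_tbl$R_over_SE) & diag_tbl$R_over_SE < c_thr
disp_ok <- !is.na(diag_tbl$D_v)       & diag_tbl$D_v < tau

# Label which clause, if any, blocks each valuation.
diag_tbl$binding <- ifelse(
  rev_ok & disp_ok, "none (passes)",
  ifelse(rev_ok, "dispersion",
         ifelse(disp_ok, "revision", "both"))
)
diag_tbl
```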

Threshold tuning

The defaults are deliberately conservative:

Argument	Default	Meaning
`c`	`0.5`	Revision must be smaller than half the parameter SE.
`tau`	`0.15`	Cross-cohort dispersion must be below 15% of the median lr.
`M`	`3L`	Both clauses must hold for at least 3 consecutive valuations.
`min_n_cohorts`	`5L`	Below this cohort count, $\hat{D}_v$ is `NA` (insufficient sample).

Tighter thresholds yield later (or no) $k^{**}$ ; sweep a range to inspect sensitivity:

sapply(
  c(0.25, 0.5, 0.75, 1.0),
  function(cc) find_lr_convergence(tri, c = cc)$k_conv
)

Thresholds below $\tau \approx 0.05$ are difficult to satisfy in real portfolios because of single-period claim noise; values of $\hat{D}_v$ above $0.20$ usually indicate genuine cohort heterogeneity that warrants detect_cohort_regime() before further modelling.

Relation to $k^*$ and `detect_cohort_regime()`

The three diagnostics answer different questions and operate on different axes:

Tool	Question	Result	Axis
`detect_cohort_regime()`	Are cohorts homogeneous?	cohort groups	underwriting period
`find_ata_maturity()` ( $k^*$ )	When are link factors reproducible?	a dev value	development period
`find_lr_convergence()` ( $k^{**}$ )	When does the LR estimate stop revising?	a dev value	development period

A defensible workflow is:

Run detect_cohort_regime(). If multiple regimes exist, fit each group separately.
For each homogeneous group, compute $k^*$ via find_ata_maturity().
Run find_lr_convergence() to obtain $k^{**} \ge k^*$ . The reported $\hat{clr}^{\mathrm{stable}}$ is the LR averaged over $k \ge k^{**}$ (or projected via fit_lr()).

The sequence separates cohort homogeneity, link reproducibility, and level convergence — three properties that coincide in P&C run-off but must be verified independently in long-duration health insurance.

Limitations

find_lr_convergence() is a thin layer over repeated backtest() calls and inherits their constraints:

Identifiability: $k^{**}$ can be declared only when $V \ge k^* + M$ ; short observation windows return NA.
Model conditioning: $\hat{LR}^{\mathrm{proj}}_v$ is computed by fit_fn (default fit_lr). Different fitters yield different $k^{**}$ . Reporting under multiple fit_fn is recommended for robustness.
Portfolio aggregation: $R_v$ and $\hat{SE}^{\mathrm{param}}_v$ are exposure-weighted across cohorts assuming inter-cohort independence. Calendar-year shocks (regulatory, healthcare cost trend) violate this assumption; both clauses can move together for non-cohort reasons.
Multi-group triangles: the helper currently collapses $\hat{D}_v$ across groups by median; running each group separately is recommended when groups behave differently.
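To see why the independence assumption in the portfolio-aggregation point matters, the sketch below compares the SE of an exposure-weighted average under independence with the SE under a common positive correlation. The weights, per-cohort SEs, the equicorrelation structure, and the helper name portfolio_se() are all illustrative assumptions; the package's actual aggregation may differ in detail.

```r
w  <- c(0.5, 0.3, 0.2)     # exposure weights, summing to 1 (illustrative)
se <- c(0.04, 0.05, 0.10)  # per-cohort parameter SEs (illustrative)

# Hypothetical helper: SE of sum(w * LR_i) with a common correlation rho.
portfolio_se <- function(w, se, rho = 0) {
  # Equicorrelated correlation matrix; rho = 0 is the independence case.
  corr <- matrix(rho, length(se), length(se))
  diag(corr) <- 1
  sigma <- corr * tcrossprod(se)          # covariance matrix
  sqrt(drop(crossprod(w, sigma %*% w)))   # sqrt(w' Sigma w)
}

se_indep <- portfolio_se(w, se)             # cohorts independent
se_shock <- portfolio_se(w, se, rho = 0.5)  # shared calendar-year shock

c(independent = se_indep, correlated = se_shock)
```

When cohorts share a calendar-year shock, the independence formula understates the portfolio SE, so the revision gate $R_v < c \cdot \hat{SE}^{\mathrm{param}}_v$ is evaluated against a yardstick that is too small.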