Changelog • lossratio

lossratio (development version)

BREAKING: Regime treatment redesign – two full-development treatments. The treatment enum is now c("segment_bridged", "segment_bridged_borrowed") (default "segment_bridged"). The previous values "latest_only", "segment_wise", and "segment_wise_bridged" are removed: each either failed to project the newest cohorts to full development ("latest_only" shortens the horizon to the surviving subset; "segment_wise" leaves the newest segment’s late-dev cells unreachable) or was a half-step toward the redesign. Both new treatments mask the triangle to a bridged development band – the per-segment mini-triangle wall (dev >= max_cal - seg_last + 1) widened by a calendar-diagonal bridge to the next segment’s first-cohort midpoint dev – which closes the factor gaps at the segment boundaries so a continuous run of age-to-age factors covers every development period and every cohort projects to the full development length.
- "segment_bridged" (default) pools the whole band into a single factor set (the development pattern is shared across regimes; only the band’s lower boundary is regime-aware). The masked band drops its segment_id tag so downstream estimation is pooled.
- "segment_bridged_borrowed" estimates factors per segment (early-dev factors stay regime-specific) and borrows the late-dev factors a segment cannot reach from a donor segment that can (the most recent segment that developed that far). New helper .borrow_segment_factors() performs the projection-time augmentation in fit_cl() / fit_ed().
The band mask is applied to the Triangle (cohort x dev grid) before the Link is built, because the Link omits dev-1-only cohorts and would corrupt each segment’s last-cohort rank. The cohort-cut mechanism that backed "latest_only" survives internally for the stage-adaptive (fit_sa()) hybrid filter, which is not a user-facing treatment. Helper .compute_segment_mini_tri_bounds() still computes the per-cell band dev_min shared by .apply_regime_filter() and .compute_triangle_usage().
BREAKING: identifier rename exposure -> premium. The denominator slot reverts to premium, the natural domain word for loss-ratio analytics (loss ratio = loss / premium). The earlier framework-generic choice exposure (still unreleased) is superseded: “cumulative exposure” / “incremental exposure” read as non-standard in an insurance reserving context, whereas “cumulative premium” / “incremental premium” are immediately clear. The derived ratio column / fit family is unchanged. Prose noun phrases (“loss ratio”, “premium”, “risk premium”, “premium measure”) are unchanged, and the exposure-driven (ED) method keeps its name.

Migration (find-and-replace at call sites):
- as_triangle(..., exposure = "incr_exposure") -> as_triangle(..., premium = "incr_premium")
- fit_exposure(...) -> fit_premium(...)
- fit_loss(..., exposure_method = ..., exposure_alpha = ..., exposure_fit = ...) -> fit_loss(..., premium_method = ..., premium_alpha = ..., premium_fit = ...)
- fit_ratio(..., exposure_method = ..., exposure_alpha = ..., exposure_regime = ...) -> fit_ratio(..., premium_method = ..., premium_alpha = ..., premium_regime = ...)
- backtest(..., target = "exposure", exposure_method = ...) -> backtest(..., target = "premium", premium_method = ...)
- bootstrap(tri, target = "exposure") -> bootstrap(tri, target = "premium")
- RatioFit$exposure_alpha, $exposure_regime -> $premium_alpha, $premium_regime
- LossFit$exposure_fit -> $premium_fit
- S3 class "ExposureFit" -> "PremiumFit" (incl. print.ExposureFit() / summary.ExposureFit() -> print.PremiumFit() / summary.PremiumFit())
- attr(ExposureFit_obj, "exposure_method") -> attr(PremiumFit_obj, "premium_method")
- Triangle / Calendar / Total columns: exposure, incr_exposure, exposure_share, incr_exposure_share -> premium, incr_premium, premium_share, incr_premium_share
- Fit output columns: exposure_proj, exposure_obs, exposure_proc_se, exposure_param_se, exposure_total_se, exposure_total_cv, exposure_ci_lo, exposure_ci_hi, incr_exposure_proj -> premium_*
- R source file: R/exposure.R -> R/premium.R
- Raw dataset (data/experience.rda): column incr_exposure -> incr_premium (regenerated)
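For data frames saved before the rename, the column part of this migration can be sketched in base R. This is a minimal sketch, assuming the stored columns follow the old names listed above; rename_exposure_cols() is a hypothetical helper for illustration and is not part of lossratio.

```r
# Hypothetical helper (not part of lossratio): rename legacy exposure* columns
# to their premium* equivalents on a plain data.frame.
rename_exposure_cols <- function(df) {
  # Match "exposure" only as a whole underscore-delimited token, so
  # incr_exposure -> incr_premium and exposure_total_se -> premium_total_se.
  names(df) <- gsub("(^|_)exposure(_|$)", "\\1premium\\2", names(df))
  df
}

# Made-up example table using the old column names.
old <- data.frame(
  cohort        = c("2023-01", "2023-02"),
  loss          = c(120, 80),
  exposure      = c(400, 300),
  incr_exposure = c(400, 300),
  exposure_proj = c(410, 320)
)

new <- rename_exposure_cols(old)
names(new)
```

Call-site arguments (exposure_method =, target = "exposure", and so on) still need the find-and-replace described above; the sketch only covers stored column names.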
The maturity_from column reported for method = "sa" fits (in summary() and the underlying $full table) now carries the maturity link’s to-index (ata_to, the first CL-phase dev) instead of its from-index, matching the package convention that a maturity point is a link target dev. Projections, standard errors, and confidence intervals are unchanged – only the reported column value shifts up by one.
API consistency pass. Several entry points were aligned so the same concept behaves the same way regardless of entry point:
- premium_method now defaults to "ed" (was "cl") in fit_sa() / fit_loss() / fit_ratio() / backtest(), matching fit_premium(). This changes the default premium-side variance recursion; point projections are unaffected. Pass premium_method = "cl" to keep the old behaviour.
- bootstrap()’s type now defaults to "parametric" (was "analytical"), matching the fit_sa() / fit_bf() / fit_cc() workers. Pass type = "analytical" for the closed-form path.
- fit_ratio() and backtest() accept "bf" and "cc" as loss-side methods (forwarded to fit_loss()); supply the prior arguments through ....
- fit_ratio() gained a tail argument, forwarded to fit_loss().
Buehlmann-Straub credibility blend for fit_bf() / fit_cc(). A new credibility argument switches the BF / CC blend weight from the emergence fraction q to a Buehlmann-Straub credibility factor Z = K / (K + s^2), where s^2 is the variance of the cohort’s own CL loss-ratio estimate and K is the between-cohort variance of the true loss ratios (estimated per group, or supplied). credibility = NULL (default) keeps the classical blend. The credibility weight protects rare-event and very green cohorts: a CL estimate built on almost no data has a large s^2, so Z shrinks toward 0 and the cohort is pulled to the prior even when its q is high. The fit carries a $credibility slot with the per-cohort Z / K.
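The effect of the credibility weight can be illustrated with a few lines of base R. This is a sketch under stated assumptions, not the package's implementation: the numbers are made up, K is taken as supplied, and the blend is assumed to put weight Z on the cohort's own CL loss ratio and 1 - Z on the prior ELR.

```r
# Illustrative inputs for three cohorts (made-up numbers).
lr_cl <- c(0.62, 0.95, 1.40)     # cohort's own CL loss-ratio estimate
s2    <- c(0.001, 0.010, 0.250)  # variance of that estimate (large = little data)
elr   <- 0.70                    # a priori expected loss ratio
K     <- 0.010                   # between-cohort variance of true loss ratios

# Credibility factor as stated above: Z = K / (K + s^2).
Z <- K / (K + s2)

# Assumed blend: weight Z on the cohort's own estimate, 1 - Z on the prior.
lr_blend <- Z * lr_cl + (1 - Z) * elr

round(data.frame(lr_cl, s2, Z, lr_blend), 3)
```

The third cohort has the largest s^2, so its Z is close to 0 and its blended loss ratio sits near the prior even though its own estimate is far from it.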
Analytical prediction error for fit_bf() / fit_cc(). type = "analytical" is now implemented (previously a stub error). It computes the closed-form mean squared error of prediction via the Mack (2008) Bornhuetter-Ferguson MSEP decomposition – process error plus development-pattern and prior estimation error – without simulation. $summary carries loss_total_se / loss_total_cv / loss_ci_lo / loss_ci_hi; fit_cc() additionally reports elr_cc_se / elr_cc_cv / elr_cc_ci_lo / elr_cc_ci_hi for the data-estimated pooled ELR. The analytical path is also used whenever no bootstrap is requested, so every fit now reports an SE.
Distribution prior for fit_bf(). A data.frame prior may carry an optional elr_se column – the standard error of the a priori ELR. The bootstrap path then draws a per-replicate ELR from Normal(elr, elr_se), and the analytical path feeds it into the Var(ELR) term. A deterministic prior (no elr_se) is unchanged.
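A per-replicate ELR draw of this kind can be sketched in base R. The sketch assumes elr_se is used as the standard deviation of the Normal draw. The prior table and the replicate count B are made up for illustration, and a zero elr_se stands in here for a deterministic prior.

```r
set.seed(1)
B <- 1000  # number of bootstrap replicates (illustrative)

# Made-up prior: one cohort with an uncertain ELR, one with a fixed ELR.
prior <- data.frame(
  cohort = c("2023-01", "2023-02"),
  elr    = c(0.70, 0.75),
  elr_se = c(0.05, 0.00)
)

# One column per cohort, one row per replicate.
# rnorm() with sd = 0 returns the mean, so the second column is constant.
elr_draws <- mapply(
  function(m, s) rnorm(B, mean = m, sd = s),
  prior$elr, prior$elr_se
)

colMeans(elr_draws)      # close to the prior ELRs
apply(elr_draws, 2, sd)  # close to elr_se
```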
Per-group prior for fit_bf(). A prior data.frame may carry the grouping columns plus elr without a cohort column; the group’s ELR is then broadcast to every cohort in that group.
Worker layer fix + bootstrap arg on fit_cl / fit_ed. fit_bf() / fit_cc() / fit_sa() now build their internal premium fit by calling fit_cl(loss = "premium", ...) directly instead of routing through fit_premium(), so the dependency runs downward only (Tier 3 -> Tier 4). fit_cl() and fit_ed() both gain bootstrap, B, seed, and conf_level arguments for symmetry with the SA / BF / CC workers; bootstrap = NULL (default) preserves the analytical Mack SE. All fit-result classes are standardised to c("XFit", "list") (or prepended forms for the dispatchers).
fit_bf() + fit_cc() promoted to peer workers with bootstrap composition. fit_bf takes an external prior ELR (Bornhuetter-Ferguson 1972); fit_cc derives ELR from data via payout weighting (Stanard 1985, Cape Cod). Both expose bootstrap = "auto" for cell / link / parametric simulation and analytical fallback. Promotion makes them available through fit_loss(method = "bf" | "cc").
fit_sa worker + fit_loss true dispatcher. Phase 4 split the stage-adaptive composition (R/sa.R, class "SAFit") out of R/loss.R, and fit_loss() now thin-dispatches by method to the worker functions (fit_ed / fit_cl / fit_sa / fit_bf / fit_cc) and augments their output to the LossFit-uniform $full schema via .lossfit_augment(). fit_premium() follows the same pattern with .premiumfit_augment() + .premiumfit_bootstrap() (Phase 4c).
BREAKING: bootstrap type = "parametric" -> "analytical" rename. The Mack closed-form SE option was previously mislabelled as "parametric". The textbook-parametric kernels (cell-distribution sampling + refit, England-Verrall 1999) now use type = "parametric", and the analytical Mack closed-form lives at type = "analytical". bootstrap.Triangle() default is "analytical"; worker-side fit_sa / fit_bf / fit_cc type = defaults to "parametric" (cell-distribution simulation).
SA nonparametric bootstrap proper kernel. bootstrap(method = "sa") previously silently dispatched to the CL cell kernel. Phase 1 introduced the dedicated bootstrap_kernel_sa_cell (and link variants) that respect the stage transition at maturity k^*, with ED-stage cells using additive g_k refit and CL-stage cells using multiplicative f_k refit.
ED bootstrap (Phase 1, fixed premium). bootstrap() now supports method = "ed" with residual = "cell": each replicate refits g*_k and projects forward additively (Delta loss = g_k * P_{from} + noise) instead of using the multiplicative chain ladder. Premium stays fixed across replicates (projected once via CL on the premium column). New native helpers bootstrap_refit_gstar / bootstrap_fwd_proj_ed_and_clip / bootstrap_fwd_sim_ed_cell parallel the CL kernel triple; the C entry point is C_bootstrap_kernel_ed_cell (17 args). Phase 2 / 3 (joint loss + premium bootstrap) is deferred. method = "ed" requires residual = "cell"; the combination of ED and link residuals is not implemented.
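The difference between the additive ED step and the multiplicative chain-ladder step can be sketched in base R. This is an illustration with made-up numbers and the noise term left out (mean projection only). project_ed() and project_cl() are hypothetical helpers, not the native kernels named above.

```r
# Made-up inputs: one cohort observed through dev 2, projected to dev 5.
loss_latest <- 150
premium <- c(1000, 1050, 1080, 1100, 1110)  # cumulative premium at dev 1..5 (held fixed)
g <- c(0.10, 0.06, 0.03, 0.01)              # ED intensities for links 1->2, ..., 4->5
f <- c(1.80, 1.35, 1.15, 1.05)              # CL factors for the same links

# Additive ED step: Delta loss = g_k * P_from (noise omitted).
project_ed <- function(loss_from, premium, g, from_dev) {
  loss <- loss_from
  for (k in from_dev:length(g)) {
    loss <- loss + g[k] * premium[k]  # premium[k] is the premium at the link's from-dev
  }
  loss
}

# Multiplicative CL step: loss_to = f_k * loss_from.
project_cl <- function(loss_from, f, from_dev) {
  loss_from * prod(f[from_dev:length(f)])
}

ed_ult <- project_ed(loss_latest, premium, g, from_dev = 2)
cl_ult <- project_cl(loss_latest, f, from_dev = 2)
c(ed = ed_ult, cl = cl_ult)
```

In the ED projection the increments depend on premium rather than on the latest loss, so a noisy early loss value shifts the result by a constant instead of being multiplied through every remaining link.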
bootstrap() method-enum reorder — c("sa", "cl", "ed") -> c("ed", "cl", "sa"). Matches the fit_loss() / fit_ratio() default flip. Default is now "ed". Users relying on "sa" as the bootstrap method must pass it explicitly.
BREAKING: method default flip — "sa" -> "ed". fit_loss(), fit_ratio(), and backtest() (via loss_method) now default to method = "ed" (exposure-driven) instead of "sa" (stage-adaptive). Method-enum order is c("ed", "cl", "sa") — simple -> classical -> composition. Rationale: ED is the unconditional safe baseline (additive, no maturity-detection dependency, robust under early-dev age-to-age volatility); CL is the classical Mack 1993 alternative; SA is the composition of ED + CL requiring 2-pass maturity detection. Users relying on stage-adaptive behaviour must now pass method = "sa" explicitly. Migration:
- fit_loss(tri) -> still works (now defaults to ED)
- fit_loss(tri, method = "sa") -> explicit (no change)
- Want previous SA default behaviour back? -> add method = "sa"
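Why the enum order decides the default can be shown with base R's match.arg(), which returns the first element of the choices vector when the argument is not supplied. This is a sketch that assumes the dispatchers resolve method in this way; fit_demo() is a hypothetical stand-in, not a lossratio function.

```r
# Hypothetical stand-in for a dispatcher with a method enum.
fit_demo <- function(method = c("ed", "cl", "sa")) {
  # With no argument supplied, match.arg() picks the first choice.
  method <- match.arg(method)
  method
}

fit_demo()               # "ed": the first enum value is the default
fit_demo(method = "sa")  # "sa": explicit choice, unaffected by the reorder
```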
Variance helper rename — .mack_g_var -> .ed_g_var. Internal factor-level variance helper renamed for paradigm clarity. .mack_* is reserved for the CL/Mack 1993 paradigm (f-factor variance); ED intensity variance follows the Buehlmann-Straub 1970 lineage and now lives at .ed_g_var(). The two natural analytical variance helpers in the package are now: .mack_f_var() (CL paradigm, f) and .ed_g_var() (ED paradigm, g). Cross-paradigm pairs are not provided as separate functions — they are algebraically derivable via g_k = f_k - 1. Both helpers now carry @references blocks citing Mack (1993) and Buehlmann-Straub (1970) respectively.
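The identity g_k = f_k - 1 can be checked numerically for a single link. The sketch below uses made-up numbers and assumes volume-weighted estimators with the same series serving as both the from-value and the denominator; it does not call the internal helpers.

```r
# Made-up from / to values for one link across three cohorts.
loss_from <- c(100, 150, 90)
loss_to   <- c(140, 195, 130)

# Volume-weighted age-to-age factor (CL paradigm).
f_k <- sum(loss_to) / sum(loss_from)

# Volume-weighted intensity with the same series as denominator (ED paradigm).
g_k <- sum(loss_to - loss_from) / sum(loss_from)

all.equal(g_k, f_k - 1)
```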
BREAKING: worker-arg rename target -> loss. Worker-layer functions (fit_cl, fit_ed, fit_ata, fit_intensity, detect_maturity, detect_regime) and as_link() now take a loss argument in place of target. Worker-output columns rename accordingly (target_obs -> loss_obs, target_proj -> loss_proj, target_*_se -> loss_*_se, target_from / target_to / target_delta -> loss_from / loss_to / loss_delta). Fit-object attribute key attr(., "target") becomes attr(., "loss") on Link, ATAFit, EDFit, CLFit, IntensityFit, Maturity, Regime, and Convergence objects. The target arg on backtest() (a dispatcher enum selecting "ratio" / "loss" / "premium") is unchanged — that is a different semantic (which metric to backtest) and stays as-is. Bootstrap’s target = c("loss", "premium") enum is also unchanged.

Migration:
- fit_cl(tri, target = "loss") -> fit_cl(tri, loss = "loss")
- fit_ed(tri, target = "loss", premium = ...) -> fit_ed(tri, loss = "loss", premium = ...)
- as_link(tri, target = "loss") -> as_link(tri, loss = "loss")
- detect_maturity(tri, target = "loss") -> detect_maturity(tri, loss = "loss")
- detect_regime(tri, target = "ratio") -> detect_regime(tri, loss = "ratio")
- Reading cl_fit$full$target_proj -> cl_fit$full$loss_proj
- Reading attr(link, "target") -> attr(link, "loss")
BREAKING: identifier rename prem -> exposure, lr -> ratio. Framework-generic naming for the denominator slot (exposure) and the derived ratio column / fit family (ratio). The previous in-progress sweep premium -> prem (still unreleased) is superseded — final target is exposure, the framework-generic word that covers loss reserving (risk premium = exposure), frequency (insureds = exposure), and severity (claim count = Bühlmann natural weight = exposure) uniformly. Prose noun phrases (“loss ratio”, “premium”, “risk premium”, “exposure measure”) are unchanged.

Migration (find-and-replace at call sites):
- as_triangle(..., premium = "incr_prem") / as_triangle(..., prem = "incr_prem") -> as_triangle(..., exposure = "incr_exposure")
- as_triangle(..., development = "dev_m") -> as_triangle(..., dev = "dev_m")
- validate_triangle(..., development = "dev_m") -> validate_triangle(..., dev = "dev_m")
- fit_premium(...) / fit_prem(...) -> fit_exposure(...)
- fit_lr(...) -> fit_ratio(...)
- fit_loss(..., prem_method = ..., prem_alpha = ..., prem_fit = ...) -> fit_loss(..., exposure_method = ..., exposure_alpha = ..., exposure_fit = ...)
- fit_lr(..., prem_method = ..., prem_alpha = ..., prem_regime = ...) -> fit_ratio(..., exposure_method = ..., exposure_alpha = ..., exposure_regime = ...)
- backtest(..., target = "lr", prem_method = ..., prem_alpha = ...) -> backtest(..., target = "ratio", exposure_method = ..., exposure_alpha = ...)
- backtest(..., target = "prem") -> backtest(..., target = "exposure")
- bootstrap(tri, target = "prem") -> bootstrap(tri, target = "exposure")
- LRFit$prem_alpha, LRFit$prem_regime -> RatioFit$exposure_alpha, $exposure_regime
- LossFit$prem_fit -> $exposure_fit
- S3 class "PremFit" -> "ExposureFit" (incl. print.PremFit() / summary.PremFit() -> print.ExposureFit() / summary.ExposureFit())
- S3 class "LRFit" -> "RatioFit" (incl. print.LRFit() / summary.LRFit() / plot.LRFit() / plot_triangle.LRFit() -> *.RatioFit())
- attr(PremFit_obj, "prem_method") -> attr(ExposureFit_obj, "exposure_method")
- Triangle / Calendar / Total columns: prem, incr_prem, prem_share, incr_prem_share, lr, incr_lr -> exposure, incr_exposure, exposure_share, incr_exposure_share, ratio, incr_ratio
- Fit output columns: prem_proj, prem_obs, prem_proc_se, prem_param_se, prem_total_se, prem_total_cv, prem_ci_lo, prem_ci_hi, incr_prem_proj -> exposure_*; lr_proj, lr_se, lr_cv, lr_ci_lo, lr_ci_hi, lr_ult, lr_latest, incr_lr_proj -> ratio_*
- R source files: R/prem.R, R/lr.R, R/lr-vis.R -> R/exposure.R, R/ratio.R, R/ratio-vis.R
- Raw dataset (data/experience.rda): column incr_prem -> incr_exposure (regenerated)
The package name lossratio is unchanged; this sweep is purely a code-identifier refactor. The conceptual “loss ratio” framework is preserved — ratio is the column / fit name for the loss-to-exposure ratio, and the package continues to specialise in long-term health insurance reserving on developing exposure (risk premium) triangles.
Default flip: bootstrap()’s keep_pseudo default changes from TRUE to FALSE. The long-format $pseudo_triangles data.table is no longer built on every call; the precomputed $summary (Pythagorean SE decomposition + optional percentile CI) is still always present. Skipping the long-format reshape saves roughly 250-300 ms and ~200 MB on a typical 4-group monthly triangle at B = 999. Users who inspect $pseudo_triangles directly should pass keep_pseudo = TRUE explicitly; the argument itself is unchanged.
BREAKING — the four constructor functions are renamed from build_* to as_* to align with the tidyverse coercion idiom and the Python sibling’s lr.Triangle(df) mental model:
- build_triangle() -> as_triangle()
- build_calendar() -> as_calendar()
- build_total() -> as_total()
- build_link() -> as_link()
No signature change, only the verb. Migration is a global find-and-replace. The functions still validate, coerce, and aggregate substantively – the as_* name reflects that the returned object is the canonical lossratio shape derived from the raw experience data, not just a thin type cast. The PascalCase classes (Triangle, Calendar, Total, Link) remain unchanged.
BREAKING — plot_triangle.Triangle() argument type renamed to view for parity with plot_triangle.CLFit(), plot_triangle.LRFit(), and plot_triangle.Backtest(), which already used view = c("value", "usage"). The type = slot is left free for plot-method-specific semantics (plot.Backtest(type = "col"/"diag"/"cell"), plot.CLFit(type = "projection"/"reserve"), etc.). Migration: plot_triangle(tri, type = "usage") -> plot_triangle(tri, view = "usage").
BREAKING — backtest() cell-level columns renamed from target_actual / target_proj (and _incr siblings) to actual / expected (and actual_incr / expected_incr). The new names match the actuarial A/E convention (aeg = actual - expected, ae_err = actual / expected - 1) and self-document the role of each column. Worker-layer column names (target_proj etc. on CLFit$full / EDFit$full) are unchanged – the rename is scoped to backtest$ae_err. Migration: replace bt$ae_err$target_actual with bt$ae_err$actual, bt$ae_err$target_proj with bt$ae_err$expected.
backtest() result slot fit_fn_name renamed to dispatcher for clarity (the value is still the dispatcher name — fit_ratio / fit_loss / fit_premium — selected by target=). print() / summary() labels updated accordingly.
fit_loss(), fit_premium(), fit_ratio(), and backtest() now attach a $usage data.table to the result: one row per (group, cohort, dev) cell of the pre-filter triangle with a status factor (used / unused / holdout / future). plot_triangle(fit, view = "usage") reads this directly instead of re-deriving the filter logic at plot time, so the heatmap always matches the cells the fit actually saw. New internal helper .build_usage() packages the 2-pass maturity detection plus .compute_triangle_usage() and attaches filter metadata (regime / recent / holdout / m_k / m_k_dt) as data.table attributes for the renderer.
BREAKING — build_triangle(), build_total(), and validate_triangle() rename their dev = argument to development =. The new name is more explicit about the development-period axis (matching the coh <- cohort symmetry inside the function bodies). Migration: replace build_triangle(..., dev = "dev_m") with build_triangle(..., development = "dev_m").
BREAKING — backtest() result columns renamed from value_proj / value_actual (and _incr variants) to target_proj / target_actual (and _incr variants), matching the worker-layer target_* generic convention.
BREAKING — summary column renames for <metric>_<stat> consistency:
- summary.LossFit / summary.PremiumFit: se_ultimate / cv_ultimate -> ultimate_se / ultimate_cv.
- .compute_dv() output: median_lr / mad_lr -> lr_median / lr_mad.
- Internal var_lr (LR variance scratch column) -> lr_var.
- detect_regime_optimal_window() diagnostics column: mean_magnitude -> magnitude_mean.
plot_triangle() now derives the axis grain via attr(tri, "grain") when the raw column name (uy_m, cy_q, …) is not one of the package-standard forms, so user-supplied names like uym or elap_m still render tick labels in the abbreviated format (23.04, 23.1Q, …).
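The abbreviated tick-label formats mentioned above can be reproduced in base R. This is a sketch only: fmt_month() and fmt_quarter() are hypothetical formatters written for illustration, and the package's own label code may differ.

```r
# Hypothetical formatters for the abbreviated tick labels.
fmt_month <- function(d) {
  format(d, "%y.%m")  # two-digit year, dot, two-digit month
}

fmt_quarter <- function(d) {
  q <- (as.integer(format(d, "%m")) - 1L) %/% 3L + 1L  # months 1-3 -> 1, 4-6 -> 2, ...
  paste0(format(d, "%y"), ".", q, "Q")
}

d <- as.Date(c("2023-04-01", "2023-01-01"))
fmt_month(d[1])    # "23.04"
fmt_quarter(d[2])  # "23.1Q"
```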
plot_triangle(fit, view = "usage") regime / segment_wise routing fixed: per-group dispatch now honours regime$groups even when multi_group = FALSE, so regime_at(coverage = "SUR", ...) scopes the hline / cohort cut to the SUR facet only. Filtered-out cells render as unused (gray) instead of future (white).
.datatable.aware <- TRUE declared in R/zzz.R; data.table NSE NOTEs suppressed via mlr3-style (".col") := LHS pattern + a reorganised globalVariables() list. Internal temp markers (.col prefix) use function-local NULL bindings per the data.table official recommendation.
BREAKING — as_experience(), check_experience(), and is_experience() removed along with the Experience S3 class. build_triangle() already validates required columns, coerces dates and numerics, and aggregates inline, so the explicit coercion step is no longer needed (and the class itself was never required by any downstream function). Migration: replace exp <- as_experience(df); build_triangle(exp, ...) with build_triangle(df, ...). Matches Python sibling 0.0.1.dev7.
New fit_intensity() + IntensityFit S3 class (R/intensity.R) — factor-level ED diagnostic, parallel to fit_ata() for the multiplicative side. Returns per-link WLS-estimated intensities g_k with standard errors and diagnostic stats; no projection. ED has no maturity concept, so fit_intensity deliberately omits maturity_args.
Link cell-level column g renamed to intensity (concept-based, parallels ATA’s ata). Summary / fit per-link output columns (g, g_se, g_var, g_selected) keep Mack-style symbol naming for parallelism with ATA summary’s f, f_se. Layered naming: cell layer uses concept (intensity), summary layer uses symbol (g).
backtest(): cell-level metric and aggregation columns renamed from aeg to ae_err (column ae_err, aggregations ae_err_mean / ae_err_med / ae_err_wt). Print and plot labels updated to “A/E Error”. Formula unchanged: (actual - pred) / pred.
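The metric and its three aggregations can be sketched in base R on made-up cells. The weighting used for ae_err_wt is an assumption (weights proportional to pred); the package may weight differently.

```r
# Made-up hold-out cells.
actual <- c(105, 98, 120, 60)
pred   <- c(100, 100, 100, 50)

# Cell-level A/E error, as in the formula above.
ae_err <- (actual - pred) / pred

ae_err_mean <- mean(ae_err)
ae_err_med  <- median(ae_err)

# Assumed weighting: each cell weighted by its predicted amount.
ae_err_wt <- sum(ae_err * pred) / sum(pred)

c(mean = ae_err_mean, med = ae_err_med, wt = ae_err_wt)
```

With weights proportional to pred, the weighted figure equals sum(actual) / sum(pred) - 1, the portfolio-level A/E error.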

lossratio 0.0.0.9000

Core API

Aggregation: build_triangle() (cohort × dev), build_calendar() (calendar period), build_total() (portfolio total). build_triangle() validates schema and coerces required columns inline.
Link table: build_link() returns a Link object covering both single-variable (ATA-style) and dual-variable (ED-style) workflows. summary.Link(model = "ata"|"ed") dispatches to the matching diagnostic.
Estimation: fit_ata() (per-link factors only); fit_ed(), fit_cl(), and fit_lr() (factors + projection). fit_lr supports three methods — "sa" (default), "ed", "cl".
Cell-selection diagnostics: detect_maturity() (dev axis — link beyond which ATA factors are stable), detect_regime() (cohort axis — structural breaks across underwriting cohorts).
Projection diagnostic: detect_convergence() (operates on a fitted LRFit; valuation depth at which projected ultimate loss ratio stops revising).
Backtest: backtest() (calendar-diagonal hold-out, supports fit_cl, fit_ed, and fit_lr).
Visualisation: S3 plot() and plot_triangle() methods on every fit class.

Dataset

experience — 2,664-row synthetic example data, generated by data-raw/make_experience.R.

Documentation

Seven vignettes covering getting started, aggregation frameworks, chain ladder, loss-ratio projection methods, triangle / ata diagnostics, regime detection, and backtesting.