Source code for actuarialpy.adjustments

"""General factor application -- the restatement spine.

Most of experience rating is "take a base amount and carry it through a chain of
factors": completion to ultimate, trend, benefit relativity, area, age/sex or other
demographic loads, network discounts. :func:`adjust` is that move, once: join a factor
to each row by a key (a column already in the frame, optionally within a segment), then
multiply or divide the value by it. :func:`completion <actuarialpy.apply_completion>` and
:func:`deseasonalize <actuarialpy.deseasonalize>` are the same move with the key *derived*
from a date (a development period, a season); ``adjust`` is the general case where the key
is an ordinary column.

The library deliberately does not encode any particular method here: it takes the factors
as input -- a credibility table, an externally-sourced trend, a filed relativity -- and
applies them mechanically, with the same validated join (unique-key / fan-out guard,
surfaced gaps, index-independent) used everywhere else.
"""

from __future__ import annotations


import numpy as np
import pandas as pd

from actuarialpy.columns import as_list, factor_lookup, validate_columns


def adjust(
    df: pd.DataFrame,
    factors: float | int | pd.Series | pd.DataFrame,
    *,
    value_col: str,
    on: str | list[str] | None = None,
    by: str | list[str] | None = None,
    how: str = "multiply",
    factor_col: str = "factor",
    out_col: str | None = None,
    audit_col: str | None = None,
    default: float | None = None,
    copy: bool = True,
) -> pd.DataFrame:
    """Multiply or divide a column by a factor joined on a key.

    The general factor-application primitive behind trend, benefit / area / demographic
    relativities, network discounts -- any per-key multiplier. The factor for each row is
    taken from one of:

    - a **scalar** ``factors`` -- one factor for every row (e.g. a single trend factor);
    - a **Series** indexed by ``on`` -- one key column (e.g. an area factor by region);
    - a tidy **DataFrame** keyed by ``by + on`` with ``factor_col`` -- per-segment factors
      (the shape the ``*_by`` estimators return).

    and applied to ``value_col``: ``how="multiply"`` gives ``value * factor`` (loads, trend),
    ``how="divide"`` gives ``value / factor`` (backing a factor out).

    The join is by value (the frame's index never participates); the factor table must be
    unique on its keys -- a duplicate would fan out the data -- which is enforced. An absent
    key gives ``default`` (``NaN`` when ``default`` is ``None`` -- a surfaced gap, never
    silently filled); pass ``default=1.0`` when a key missing from the table should mean
    "no adjustment".

    With ``audit_col``, the cumulative *net multiplier* applied to ``value_col`` is
    accumulated there (``factor`` for multiply, ``1 / factor`` for divide), so a chain of
    adjustments leaves a per-row record of total restatement.
    """
    if how not in ("multiply", "divide"):
        raise ValueError("how must be 'multiply' or 'divide'")
    on_cols = as_list(on)
    by_cols = as_list(by)
    validate_columns(df, [value_col] + on_cols + by_cols)
    result = df.copy() if copy else df

    if isinstance(factors, pd.DataFrame):
        keys = by_cols + on_cols
        if not keys:
            raise ValueError(
                "Pass on=... (and optionally by=...) naming the key column(s) for a factor table."
            )
        factor = factor_lookup(result, factors, keys, factor_col=factor_col, default=default)
    elif isinstance(factors, pd.Series):
        if len(on_cols) != 1:
            raise ValueError(
                "Pass on=<column> (one key) when factors is a Series indexed by that key."
            )
        if by_cols:
            raise ValueError("by= needs a tidy DataFrame of per-segment factors, not a Series.")
        factor = np.array(result[on_cols[0]].map(factors), dtype="float64")
        if default is not None:
            factor = np.where(np.isnan(factor), float(default), factor)
    elif isinstance(factors, bool):
        raise TypeError("factors must be a number, a Series keyed by `on`, or a tidy DataFrame.")
    elif isinstance(factors, (int, float)):
        factor = np.full(len(result), float(factors))
    else:
        raise TypeError("factors must be a number, a Series keyed by `on`, or a tidy DataFrame.")

    applied = factor if how == "multiply" else 1.0 / factor
    result[out_col or value_col] = result[value_col].to_numpy() * applied
    if audit_col is not None:
        prior = (
            result[audit_col].to_numpy() if audit_col in result.columns else np.ones(len(result))
        )
        result[audit_col] = prior * applied
    return result
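
The per-row semantics described in the docstring (key lookup, the ``NaN`` gap for an absent key, ``default=1.0`` as "no adjustment", and the accumulating audit multiplier) can be illustrated with a small dependency-free sketch. This is not the library's implementation: ``apply_factor``, its parameters, and the sample data are hypothetical, and plain dicts stand in for the DataFrame and the factor Series.

```python
import math


def apply_factor(rows, factors, *, value_key, on, how="multiply",
                 default=None, audit_key=None):
    """Hypothetical stand-in for the key-lookup branch, using plain dicts."""
    if how not in ("multiply", "divide"):
        raise ValueError("how must be 'multiply' or 'divide'")
    # An absent key surfaces as NaN unless the caller supplies a default.
    missing = math.nan if default is None else float(default)
    out = []
    for row in rows:
        factor = factors.get(row[on], missing)
        # Net multiplier: the factor itself, or its reciprocal when backing it out.
        applied = factor if how == "multiply" else 1.0 / factor
        new_row = dict(row)  # leave the input rows untouched
        new_row[value_key] = row[value_key] * applied
        if audit_key is not None:
            # Accumulate on top of any multiplier recorded by earlier steps.
            new_row[audit_key] = row.get(audit_key, 1.0) * applied
        out.append(new_row)
    return out


claims = [
    {"region": "north", "paid": 100.0},
    {"region": "south", "paid": 200.0},
    {"region": "west", "paid": 50.0},  # no area factor on file
]
area = {"north": 1.25, "south": 0.5}
discount = {"north": 0.5, "south": 0.5, "west": 0.5}

# Step 1: load by area; "west" has no factor, so its value becomes NaN.
loaded = apply_factor(claims, area, value_key="paid", on="region",
                      audit_key="net_factor")
# Step 2: back out a discount; the audit column now holds 1.25 * (1 / 0.5) for "north".
restated = apply_factor(loaded, discount, value_key="paid", on="region",
                        how="divide", audit_key="net_factor")
# Alternative step 1: treat a missing key as "no adjustment".
filled = apply_factor(claims, area, value_key="paid", on="region", default=1.0)

print(restated[0])  # {'region': 'north', 'paid': 250.0, 'net_factor': 2.5}
print(filled[2])    # {'region': 'west', 'paid': 50.0}
```

Leaving the gap as ``NaN`` by default makes an incomplete factor table visible in the output, whereas ``default=1.0`` is an explicit statement that a missing key means the value passes through unchanged.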