ExpStats

exp_stats.ExpStats(self, expo, target_status=None, expected=None, wt=None, conf_int=False, credibility=False, conf_level=0.95, cred_r=0.05, col_exposure='exposure')

Experience study summary class

Create a summary of termination experience for a given target status (an ExpStats object).

Typically, the ExpStats class constructor should not be called directly. The preferred method for creating an ExpStats object is to call the exp_stats() method on an ExposedDF object.

Parameters

Name Type Description Default
expo actxps.expose.ExposedDF An exposed data frame class required
target_status str | list | numpy.ndarray A single string, list, or array of target status values None
expected str | list | numpy.ndarray Single string, list, or array of column names in the data property of expo with expected values None
wt str Name of the column in the data property of expo containing weights to use in the calculation of claims, exposures, and partial credibility. None
conf_int bool If True, the output will include confidence intervals around the observed termination rates and any actual-to-expected ratios. False
credibility bool Whether the output should include partial credibility weights and credibility-weighted decrement rates. False
conf_level float Confidence level under the Limited Fluctuation credibility method and confidence intervals 0.95
cred_r float Error tolerance under the Limited Fluctuation credibility method 0.05
col_exposure str Name of the column in data containing exposures. 'exposure'

Attributes

Name Type Description
data polars.DataFrame A data frame containing experience study summary results that includes columns for any grouping variables, claims, exposures, and observed decrement rates (q_obs). If any values are passed to expected, additional columns will be added for expected decrements and actual-to-expected ratios. If credibility is set to True, additional columns are added for partial credibility and credibility-weighted decrement rates (assuming values are passed to expected).
target_status, groups, start_date, end_date, expected, wt, xp_params Metadata about the experience study inferred from the ExposedDF object (expo) or passed directly to ExpStats.

Notes

If expo is grouped (see the ExposedDF.group_by() method), the returned ExpStats object’s data will contain one row per group.

If nothing is passed to target_status, the target_status property of expo will be used. If that property is None, all status values except the first level will be assumed. This will produce a warning message.

Expected values

The expected argument is optional. If provided, this argument must be a string, list, or array with values corresponding to columns in expo.data containing expected experience. More than one expected basis can be provided.
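
As a rough illustration of how an expected basis feeds the output, the sketch below computes an observed rate and an actual-to-expected ratio for a single group. The function name actual_to_expected is hypothetical, and dividing q_obs by a single expected rate is an assumption about the calculation, not a copy of the library's code. The inputs are the policy year 1 figures from the table() example on this page.

```python
def actual_to_expected(claims: float, exposure: float, q_expected: float) -> tuple[float, float]:
    """Return (q_obs, A/E) for one group. Hypothetical helper, not part of actxps."""
    q_obs = claims / exposure          # observed termination rate
    return q_obs, q_obs / q_expected   # assumed A/E definition: observed over expected


# Policy year 1: 102 claims, about 19,252 exposures, expected_1 = 0.5%
q_obs, ae = actual_to_expected(claims=102, exposure=19_252, q_expected=0.005)
print(f"q_obs = {q_obs:.1%}, A/E = {ae:.1%}")  # q_obs = 0.5%, A/E = 106.0%
```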

Confidence intervals

If conf_int is set to True, the output will contain lower and upper confidence interval limits for the observed termination rate and any actual-to-expected ratios. The confidence level is dictated by conf_level. If no weighting variable is passed to wt, confidence intervals will be constructed assuming a binomial distribution of claims. Otherwise, confidence intervals will be calculated assuming that the aggregate claims distribution is normal with a mean equal to observed claims and a variance equal to:

Var(S) = E(N) * Var(X) + E(X)**2 * Var(N),

where S is the aggregate claim random variable, X is the weighting variable assumed to follow a normal distribution, and N is a binomial random variable for the number of claims.

If credibility is True and expected values are passed to expected, the output will also contain confidence intervals for any credibility-weighted termination rates.
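
The variance formula above can be sketched numerically for the weighted case. The moment estimates used here are assumptions: E(X) and Var(X) are taken from the raw weight sums, E(N) is the claim count, and Var(N) is n_claims * (1 - q_obs). The function name weighted_conf_int is hypothetical. With the rounded policy year 1 inputs from the weighted from_DataFrame() example on this page, the upper limit comes out close to the q_obs_upper value of 0.00465 shown there.

```python
from math import sqrt
from statistics import NormalDist


def weighted_conf_int(claims, exposure, n_claims, weight, weight_sq, weight_n,
                      conf_level=0.95):
    """Normal-approximation interval for a weighted termination rate (sketch)."""
    q_obs = claims / exposure
    # Assumed moment estimates for the weighting variable X
    ex_mean = weight / weight_n
    var_x = weight_sq / weight_n - ex_mean ** 2
    # Assumed binomial-style moments for the claim count N
    e_n = n_claims
    var_n = n_claims * (1 - q_obs)
    # Var(S) = E(N) * Var(X) + E(X)**2 * Var(N)
    sd_s = sqrt(e_n * var_x + ex_mean ** 2 * var_n)
    z = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)
    # Aggregate claims are assumed normal with mean equal to observed claims
    return (claims - z * sd_s) / exposure, (claims + z * sd_s) / exposure


lower, upper = weighted_conf_int(
    claims=83_223.0, exposure=2.5313e7, n_claims=102,
    weight=2.6301746e7, weight_sq=6.0743e10, weight_n=19_995)
print(f"{lower:.6f} to {upper:.6f}")
```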

Credibility

If credibility is set to True, the output will contain a credibility column equal to the partial credibility estimate under the Limited Fluctuation credibility method (also known as Classical Credibility) assuming a binomial distribution of claims.
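
A minimal sketch of a Limited Fluctuation partial credibility factor under a binomial claim count follows. The helper names are hypothetical, and the full-credibility standard (z / cred_r)**2 * (1 - q_obs) is an assumption about the calculation. It does reproduce the credibility column of the table() example on this page for the rounded policy year 1 and 11 inputs. The blended rate uses the usual Z * q_obs + (1 - Z) * q_expected form, which is likewise assumed here.

```python
from math import sqrt
from statistics import NormalDist


def partial_credibility(claims, exposure, conf_level=0.95, cred_r=0.05):
    """Partial credibility Z, capped at 1 (hypothetical helper)."""
    q_obs = claims / exposure
    z = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)
    # Assumed number of claims needed for full credibility (binomial claims)
    full_cred_claims = (z / cred_r) ** 2 * (1 - q_obs)
    return min(1.0, sqrt(claims / full_cred_claims))


def credibility_weighted_rate(q_obs, q_expected, cred):
    """Blend observed and expected rates using the credibility factor."""
    return cred * q_obs + (1 - cred) * q_expected


print(f"{partial_credibility(102, 19_252):.1%}")  # policy year 1: 25.8%
print(f"{partial_credibility(804, 4_390):.1%}")   # policy year 11: 80.0%
```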

Alternative class constructor

ExpStats.from_DataFrame() can be used to coerce a data frame containing pre-aggregated experience into an ExpStats object. This is most useful for working with industry study data where individual exposure records are not available.

See Also

Herzog, Thomas (1999). Introduction to Credibility Theory

Methods

Name Description
from_DataFrame Convert a data frame containing aggregate termination experience study results to the ExpStats class
plot Plot experience study results
plot_actual_to_expected Plot actual-to-expected termination rates for any expected termination rates found in the expected property
plot_termination_rates Plot observed termination rates and any expected termination rates
summary Re-summarize termination experience data
table Tabular experience study summary

from_DataFrame

exp_stats.ExpStats.from_DataFrame(data, target_status=None, expected=None, wt=None, conf_int=False, credibility=False, conf_level=0.95, cred_r=0.05, col_claims='claims', col_exposure='exposure', col_n_claims='n_claims', col_weight_sq='weight_sq', col_weight_n='weight_n', start_date=date(1900, 1, 1), end_date=None)

Convert a data frame containing aggregate termination experience study results to the ExpStats class.

from_DataFrame() is most useful for working with aggregate summaries of experience that were not created by actxps where individual policy information is not available. After converting the data to the ExpStats class, summary() can be used to summarize data by any grouping variables, and plot() and table() are available for reporting.

Parameters

Name Type Description Default
data polars.DataFrame | pandas.DataFrame A DataFrame containing aggregate experience study results. See the Notes section for required columns that must be present. required
target_status str | list | numpy.ndarray Target status values None
expected str | list | numpy.ndarray Column names in data with expected values. None
wt str Name of the column in data containing weights to use in the calculation of claims, exposures, partial credibility, and confidence intervals. None
conf_int bool If True, future calls to summary() will include confidence intervals around the observed termination rates and any actual-to-expected ratios. False
credibility bool If True, future calls to summary() will include partial credibility weights and credibility-weighted termination rates. False
conf_level float Confidence level used for the Limited Fluctuation credibility method and confidence intervals. 0.95
cred_r float Error tolerance under the Limited Fluctuation credibility method. 0.05
col_claims str Name of the column in data containing claims. 'claims'
col_exposure str Name of the column in data containing exposures. 'exposure'
col_n_claims str Only used when wt is passed. Name of the column in data containing the number of claims. 'n_claims'
col_weight_sq str Only used when wt is passed. Name of the column in data containing the sum of squared weights. 'weight_sq'
col_weight_n str Only used when wt is passed. Name of the column in data containing exposure record counts. 'weight_n'
start_date datetime.date | str Experience study start date date(1900, 1, 1)
end_date datetime.date | str Experience study end date None

Returns

Type Description
actxps.exp_stats.ExpStats An ExpStats object

Notes

If nothing is passed to wt, the data frame data must include columns containing:

  • Exposures (exposure)
  • Claim counts (claims)

If wt is passed, the data must include columns containing:

  • Weighted exposures (exposure)
  • Weighted claims (claims)
  • Claim counts (n_claims)
  • The raw sum of weights NOT multiplied by exposures (the column named in wt)
  • Exposure record counts (weight_n)
  • The raw sum of squared weights (weight_sq)

The names in parentheses above are expected column names. If the data frame passed to from_DataFrame() uses different column names, these can be specified using the col_* arguments.

When a column name is passed to wt, the columns weight, weight_n, and weight_sq are used to calculate credibility and confidence intervals. If credibility and confidence intervals aren’t required, then it is not necessary to pass anything to wt. The resulting ExpStats class and any downstream summaries will still be weighted as long as the exposures and claims are pre-weighted.
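
To make the column requirements concrete, the sketch below builds the six weighted-basis aggregates from a handful of made-up exposure records. Everything here is hypothetical: the records, the aggregate helper, and the assumption that weighted exposure is the record's exposure multiplied by its weight. The dictionary keys follow the default column names, with "weight" standing in for the raw sum of weights.

```python
# Hypothetical exposure records: (exposure, weight, is_claim)
records = [
    (1.0, 1000.0, False),
    (0.5, 2000.0, True),
    (1.0, 1500.0, False),
    (1.0, 500.0, True),
]


def aggregate(records):
    """Collapse record-level data into one pre-aggregated row (sketch)."""
    return {
        "exposure": sum(e * w for e, w, _ in records),    # weighted exposures
        "claims": sum(w for _, w, c in records if c),     # weighted claims
        "n_claims": sum(1 for _, _, c in records if c),   # claim counts
        "weight": sum(w for _, w, _ in records),          # raw sum of weights
        "weight_sq": sum(w ** 2 for _, w, _ in records),  # raw sum of squared weights
        "weight_n": len(records),                         # exposure record counts
    }


agg = aggregate(records)
print(agg)
```

Note that "weight" sums the weights as they are, while "exposure" scales each weight by the record's exposure. This is the distinction the list above draws with "NOT multiplied by exposures".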

target_status, start_date, and end_date are optional arguments that are only used for printing the resulting ExpStats object.

Examples

import actxps as xp

# convert pre-aggregated experience into an ExpStats object
agg_sim_dat = xp.load_agg_sim_dat()
dat = xp.ExpStats.from_DataFrame(
    agg_sim_dat,
    col_exposure="exposure_n",
    col_claims="claims_n",
    target_status="Surrender",
    start_date=2005,
    end_date=2019,
    conf_int=True)

# summary by policy year
dat.summary('pol_yr')

# repeat the prior exercise on a weighted basis
dat_wt = xp.ExpStats.from_DataFrame(
    agg_sim_dat, wt="av",
    col_exposure="exposure_amt",
    col_claims="claims_amt",
    col_n_claims="claims_n",
    col_weight_sq="av_sq",
    col_weight_n="n",
    target_status="Surrender",
    start_date=2005, end_date=2019,
    conf_int=True)

# summary by policy year
dat_wt.summary('pol_yr')
Experience study results

Groups: pol_yr
Target status: Surrender
Study range: 2005 to 2019
Weighted by: av

shape: (15, 10)
┌────────┬──────────┬────────────┬────────────┬───┬────────────┬────────────┬───────────┬──────────┐
│ pol_yr ┆ n_claims ┆ claims     ┆ exposure   ┆ … ┆ q_obs_uppe ┆ weight     ┆ weight_sq ┆ weight_n │
│ ---    ┆ ---      ┆ ---        ┆ ---        ┆   ┆ r          ┆ ---        ┆ ---       ┆ ---      │
│ i64    ┆ i64      ┆ f64        ┆ f64        ┆   ┆ ---        ┆ f64        ┆ f64       ┆ i64      │
│        ┆          ┆            ┆            ┆   ┆ f64        ┆            ┆           ┆          │
╞════════╪══════════╪════════════╪════════════╪═══╪════════════╪════════════╪═══════════╪══════════╡
│ 1      ┆ 102      ┆ 83223.0    ┆ 2.5313e7   ┆ … ┆ 0.00465    ┆ 2.6301746e ┆ 6.0743e10 ┆ 19995    │
│        ┆          ┆            ┆            ┆   ┆            ┆ 7          ┆           ┆          │
│ 2      ┆ 160      ┆ 175428.0   ┆ 2.4019e7   ┆ … ┆ 0.009156   ┆ 2.4966449e ┆ 5.9628e10 ┆ 18434    │
│        ┆          ┆            ┆            ┆   ┆            ┆ 7          ┆           ┆          │
│ 3      ┆ 124      ┆ 132261.0   ┆ 2.2435e7   ┆ … ┆ 0.007699   ┆ 2.3442831e ┆ 5.7966e10 ┆ 16806    │
│        ┆          ┆            ┆            ┆   ┆            ┆ 7          ┆           ┆          │
│ 4      ┆ 168      ┆ 192473.0   ┆ 2.0860e7   ┆ … ┆ 0.011547   ┆ 2.1861723e ┆ 5.5707e10 ┆ 15266    │
│        ┆          ┆            ┆            ┆   ┆            ┆ 7          ┆           ┆          │
│ 5      ┆ 164      ┆ 197240.0   ┆ 1.9115e7   ┆ … ┆ 0.012911   ┆ 2.010467e7 ┆ 5.3381e10 ┆ 13618    │
│ …      ┆ …        ┆ …          ┆ …          ┆ … ┆ …          ┆ …          ┆ …         ┆ …        │
│ 11     ┆ 804      ┆ 1.153832e6 ┆ 7.6546e6   ┆ … ┆ 0.167253   ┆ 8.501473e6 ┆ 2.7564e10 ┆ 4897     │
│ 12     ┆ 330      ┆ 517875.0   ┆ 4.8959e6   ┆ … ┆ 0.123667   ┆ 5.713464e6 ┆ 1.9831e10 ┆ 3093     │
│ 13     ┆ 99       ┆ 179358.0   ┆ 3.1348e6   ┆ … ┆ 0.073083   ┆ 3.678776e6 ┆ 1.3003e10 ┆ 1937     │
│ 14     ┆ 62       ┆ 114390.0   ┆ 1.7270e6   ┆ … ┆ 0.090199   ┆ 2.343501e6 ┆ 8.8084e9  ┆ 1182     │
│ 15     ┆ 17       ┆ 26265.0    ┆ 520945.847 ┆ … ┆ 0.092475   ┆ 1.01939e6  ┆ 3.8514e9  ┆ 510      │
│        ┆          ┆            ┆ 496        ┆   ┆            ┆            ┆           ┆          │
└────────┴──────────┴────────────┴────────────┴───┴────────────┴────────────┴───────────┴──────────┘

See Also

ExposedDF.exp_stats() for information on how ExpStats objects are typically created from individual exposure records.

plot

exp_stats.ExpStats.plot(x=None, y='q_obs', color=None, facets=None, mapping=None, scales='fixed', geoms='lines', y_labels=lambda l: [f'{v * 100:.1f}%' for v in l], y_log10=False, conf_int_bars=False)

Plot experience study results

Parameters

Name Type Description Default
x str A column name in data to use as the x variable. If None, x will default to the first grouping variable. If there are no grouping variables, x will be set to “All”. None
y str A column name in data to use as the y variable. 'q_obs'
color str A column name in data to use as the color and fill variables. If None, color will default to the second grouping variable. If there are fewer than two grouping variables, the plot will not use a color aesthetic. None
facets list | str Faceting variables in data passed to plotnine.facet_wrap(). If None, grouping variables 3+ will be used (assuming there are more than two grouping variables). None
mapping plotnine.aes Aesthetic mapping added to plotnine.ggplot(). NOTE: If mapping is supplied, the x, y, and color arguments will be ignored. None
scales str The scales argument passed to plotnine.facet_wrap(). 'fixed'
geoms (lines, bars, points) Type of geometry. If “lines” is passed, the plot will display lines and points. If “bars”, the plot will display bars. If “points”, the plot will display points only. 'lines'
y_labels callable Label function passed to plotnine.scale_y_continuous(). lambda l: [f"{v * 100:.1f}%" for v in l]
y_log10 bool If True, the y-axes are plotted on a log-10 scale. False
conf_int_bars bool If True, confidence interval error bars are included in the plot. This option is only available for termination rates and actual-to-expected ratios. False
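
A label function for y_labels takes a list of axis breaks and returns a list of strings. The sketch below mirrors the documented default and adds a digits option. The name pct_labels and the digits parameter are hypothetical.

```python
def pct_labels(breaks, digits=1):
    """Format axis breaks as percentages with a fixed number of decimals."""
    return [f"{v * 100:.{digits}f}%" for v in breaks]


print(pct_labels([0.005, 0.02, 0.183]))  # ['0.5%', '2.0%', '18.3%']
```

A function like this could then be supplied as y_labels=pct_labels, or wrapped in a lambda to change the number of decimals.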

Notes

If no aesthetic map is supplied, the plot will use the first grouping variable in the groups property on the x axis and q_obs on the y axis. In addition, the second grouping variable in groups will be used for color and fill.

If no faceting variables are supplied, the plot will use grouping variables 3 and up as facets. These variables are passed into plotnine.facet_wrap().
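
The default rules described above can be summarized in a short sketch. The function default_aesthetics is hypothetical and only mirrors the documented behavior; it is not the library's implementation.

```python
def default_aesthetics(groups, x=None, y="q_obs", color=None, facets=None):
    """Resolve plot defaults from a list of grouping variables (sketch)."""
    groups = list(groups)
    if x is None:
        # First grouping variable, or the literal "All" when ungrouped
        x = groups[0] if groups else "All"
    if color is None and len(groups) >= 2:
        color = groups[1]      # second grouping variable
    if facets is None and len(groups) > 2:
        facets = groups[2:]    # grouping variables 3 and up
    return {"x": x, "y": y, "color": color, "facets": facets}


# "product" is a made-up third grouping variable
print(default_aesthetics(["pol_yr", "inc_guar", "product"]))
print(default_aesthetics([]))
```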

Examples

import actxps as xp

exp_res = (xp.ExposedDF(xp.load_census_dat(),
                        "2019-12-31", 
                        target_status="Surrender").
           group_by('pol_yr').
           exp_stats())

exp_res.plot()

<Figure Size: (640 x 480)>

plot_actual_to_expected

exp_stats.ExpStats.plot_actual_to_expected(add_hline=True, **kwargs)

Plot actual-to-expected termination rates for any expected termination rates found in the expected property.

Parameters

Name Type Description Default
add_hline bool If True, a blue dashed horizontal line will be drawn at 100%. True
**kwargs Additional arguments passed to plot() {}

Examples

import actxps as xp
import numpy as np
import polars as pl        

expo = xp.ExposedDF(xp.load_census_dat(),
                    "2019-12-31", 
                    target_status="Surrender")

expected_table = np.concatenate((np.linspace(0.005, 0.03, 10), 
                                 np.array([0.2, 0.15]), 
                                 np.repeat(0.05, 3)))
expo.data = expo.data.with_columns(
    expected_1=expected_table[expo.data['pol_yr'] - 1],
    expected_2=pl.when(pl.col('inc_guar')).then(0.015).otherwise(0.03)
)

exp_res = (expo.
           group_by('pol_yr').
           exp_stats(expected=['expected_1', 'expected_2']))

exp_res.plot_actual_to_expected()

<Figure Size: (640 x 480)>

plot_termination_rates

exp_stats.ExpStats.plot_termination_rates(include_cred_adj=False, **kwargs)

Plot observed termination rates and any expected termination rates found in the expected property.

Parameters

Name Type Description Default
include_cred_adj bool If True, credibility-weighted termination rates will be plotted as well. False
**kwargs Additional arguments passed to plot() {}

Examples

import actxps as xp
import numpy as np
import polars as pl

expo = xp.ExposedDF(xp.load_census_dat(),
                    "2019-12-31", 
                    target_status="Surrender")

expected_table = np.concatenate((np.linspace(0.005, 0.03, 10), 
                                 np.array([0.2, 0.15]), 
                                 np.repeat(0.05, 3)))
expo.data = expo.data.with_columns(
    expected_1=expected_table[expo.data['pol_yr'] - 1],
    expected_2=pl.when(pl.col('inc_guar')).then(0.015).otherwise(0.03)
)

exp_res = (expo.
           group_by('pol_yr').
           exp_stats(expected=['expected_1', 'expected_2']))

exp_res.plot_termination_rates()

<Figure Size: (640 x 480)>

summary

exp_stats.ExpStats.summary(*by)

Re-summarize termination experience data

Re-summarize the data while retaining any grouping variables passed to the *by argument.

Parameters

Name Type Description Default
*by tuple Quoted column names in data that will be used as grouping variables in the re-summarized object. Passing nothing is acceptable and will produce a 1-row experience summary. ()

Returns

Type Description
actxps.exp_stats.ExpStats A new ExpStats object with rows for all the unique groups in *by
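
Conceptually, re-summarizing adds up claims and exposures within each new group and then recomputes the rate, instead of averaging the existing rates. The sketch below illustrates that idea on made-up rows. The resummarize helper and its inputs are hypothetical, and the approach is an assumption consistent with q_obs equalling claims divided by exposure in the example output below.

```python
from collections import defaultdict

# Hypothetical summary rows: (pol_yr, inc_guar, claims, exposure)
rows = [
    (1, False, 40, 2000.0),
    (1, True, 20, 3000.0),
    (2, False, 60, 1500.0),
    (2, True, 30, 2500.0),
]


def resummarize(rows, key_index):
    """Sum claims and exposure by one key column, then recompute q_obs."""
    totals = defaultdict(lambda: [0, 0.0])
    for row in rows:
        totals[row[key_index]][0] += row[2]  # claims
        totals[row[key_index]][1] += row[3]  # exposure
    return {key: {"claims": c, "exposure": e, "q_obs": c / e}
            for key, (c, e) in totals.items()}


by_guar = resummarize(rows, key_index=1)
for key, stats in by_guar.items():
    print(key, stats)
```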

Examples

import actxps as xp

exp_res = (xp.ExposedDF(xp.load_census_dat(),
                        "2019-12-31", 
                        target_status="Surrender").
           group_by('pol_yr', 'inc_guar').
           exp_stats())

exp_res.summary('inc_guar')
Experience study results

Groups: inc_guar
Target status: Surrender
Study range: 1900-01-01 to 2019-12-31

shape: (2, 5)
┌──────────┬──────────┬────────┬──────────────┬──────────┐
│ inc_guar ┆ n_claims ┆ claims ┆ exposure     ┆ q_obs    │
│ ---      ┆ ---      ┆ ---    ┆ ---          ┆ ---      │
│ bool     ┆ u32      ┆ u32    ┆ f64          ┆ f64      │
╞══════════╪══════════╪════════╪══════════════╪══════════╡
│ false    ┆ 1601     ┆ 1601   ┆ 52123.215884 ┆ 0.030716 │
│ true     ┆ 1268     ┆ 1268   ┆ 80510.684872 ┆ 0.015749 │
└──────────┴──────────┴────────┴──────────────┴──────────┘

table

exp_stats.ExpStats.table(fontsize=100, decimals=1, colorful=True, color_q_obs='GnBu', color_ae_='RdBu', show_conf_int=False, show_cred_adj=False, decimals_amt=0, suffix_amt=False, **rename_cols)

Tabular experience study summary

Convert experience study results to a presentation-friendly format.

Parameters

Name Type Description Default
fontsize int Font size percentage multiplier 100
decimals int Number of decimals to display for percentages 1
colorful bool If True, color will be added to the observed decrement rate and actual-to-expected columns. True
color_q_obs str or colormap ColorBrewer palette used for the observed decrement rate. 'GnBu'
color_ae_ str or colormap ColorBrewer palette used for actual-to-expected rates. 'RdBu'
show_conf_int bool If True any confidence intervals will be displayed. False
show_cred_adj bool If True any credibility-weighted termination rates will be displayed. False
decimals_amt int Number of decimals to display for amount columns (number of claims, claim amounts, and exposures). 0
suffix_amt bool This argument has the same meaning as the compact argument in great_tables.gt.GT.fmt_number() for amount columns. If False, no scaling or suffixing are applied to amount columns. If True, all amount columns are automatically scaled and suffixed by “K” (thousands), “M” (millions), “B” (billions), or “T” (trillions). False
**rename_cols str Key-value pairs where keys are column names and values are labels that will appear on the output table. This parameter is useful for renaming grouping variables that will appear under their original variable names if left unchanged. None

Notes

Further customizations can be added using great_tables.gt.GT methods. See the great_tables package documentation for more information.
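
To illustrate the kind of scaling that suffix_amt=True refers to, the sketch below scales an amount by powers of one thousand and appends the matching suffix. The compact_amount helper is hypothetical and does not reproduce the exact rounding or formatting rules of great_tables.

```python
def compact_amount(value, decimals=0):
    """Scale an amount and append K, M, B, or T (illustrative only)."""
    for threshold, suffix in ((1e12, "T"), (1e9, "B"), (1e6, "M"), (1e3, "K")):
        if abs(value) >= threshold:
            return f"{value / threshold:.{decimals}f}{suffix}"
    # Values under one thousand are left unscaled
    return f"{value:.{decimals}f}"


print(compact_amount(19_252))        # 19K
print(compact_amount(1_153_832, 1))  # 1.2M
print(compact_amount(268))           # 268
```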

Returns

Type Description
great_tables.gt.GT A formatted HTML table

Examples

import actxps as xp
import numpy as np
import polars as pl

expo = xp.ExposedDF(xp.load_census_dat(),
                    "2019-12-31", 
                    target_status="Surrender")

expected_table = np.concatenate((np.linspace(0.005, 0.03, 10), 
                                 np.array([0.2, 0.15]), 
                                 np.repeat(0.05, 3)))
expo.data = expo.data.with_columns(
    expected_1=expected_table[expo.data['pol_yr'] - 1],
    expected_2=pl.when(pl.col('inc_guar')).then(0.015).otherwise(0.03)
)

exp_res = (expo.
           group_by('pol_yr').
           exp_stats(expected=['expected_1', 'expected_2'],
                     credibility=True))

exp_res.table()
Experience Study Results
Target status: Surrender

                                   expected_1       expected_2
pol_yr  Claims  Exposures  q_obs   q_exp   A/E      q_exp   A/E      Z_cred
1       102     19,252     0.5%    0.5%    106.0%   2.1%    25.2%    25.8%
2       160     17,715     0.9%    0.8%    116.1%   2.1%    43.0%    32.4%
3       124     16,097     0.8%    1.1%    73.0%    2.1%    36.7%    28.5%
4       168     14,536     1.2%    1.3%    86.7%    2.1%    55.1%    33.3%
5       164     12,916     1.3%    1.6%    78.8%    2.1%    60.7%    32.9%
6       152     11,376     1.3%    1.9%    70.7%    2.1%    63.9%    31.7%
7       164     9,917      1.7%    2.2%    76.3%    2.1%    79.1%    32.9%
8       190     8,448      2.2%    2.4%    92.0%    2.1%    107.9%   35.6%
9       181     6,960      2.6%    2.7%    95.5%    2.1%    125.1%   34.8%
10      152     5,604      2.7%    3.0%    90.4%    2.1%    130.6%   31.9%
11      804     4,390      18.3%   20.0%   91.6%    2.1%    881.0%   80.0%
12      330     2,663      12.4%   15.0%   82.6%    2.0%    618.2%   49.5%
13      99      1,620      6.1%    5.0%    122.2%   2.0%    310.9%   26.2%
14      62      872        7.1%    5.0%    142.2%   2.0%    364.3%   20.8%
15      17      268        6.3%    5.0%    126.8%   1.9%    331.2%   10.9%

Study range: 1900-01-01 to 2019-12-31