ExposedDF

expose.ExposedDF(self, data, end_date, start_date=date(1900, 1, 1), target_status=None, cal_expo=False, expo_length='year', col_pol_num='pol_num', col_status='status', col_issue_date='issue_date', col_term_date='term_date', default_status=None)

Exposed data frame class

Convert a data frame of census-level records into an object with exposure-level records.

Parameters

Name Type Description Default
data polars.DataFrame | pandas.DataFrame A data frame with census-level records required
end_date datetime.date | str Experience study end date. If a string is passed, it must be in %Y-%m-%d format. required
start_date datetime.date | str Experience study start date. If a string is passed, it must be in %Y-%m-%d format. date(1900, 1, 1)
target_status str | list | numpy.ndarray Target status values None
cal_expo bool Set to True for calendar year exposures. Otherwise policy year exposures are assumed. False
expo_length (year, quarter, month, week) Exposure period length 'year'
col_pol_num str Name of the column in data containing the policy number 'pol_num'
col_status str Name of the column in data containing the policy status 'status'
col_issue_date str Name of the column in data containing the issue date 'issue_date'
col_term_date str Name of the column in data containing the termination date 'term_date'
default_status str Default active status code. If None, the most common status is assumed. None

Attributes

Name Type Description
data polars.DataFrame A Polars data frame with exposure-level records. The results include all existing columns in the original input data plus new columns for exposures and observation periods. Observation periods include counters for policy exposures, start dates, and end dates. Both start dates and end dates are inclusive bounds. For policy year exposures, two observation period columns are returned. Columns beginning with (pol_) are integer policy periods. Columns beginning with (pol_date_) are calendar dates representing anniversary dates, monthiversary dates, etc.
end_date, start_date, target_status, cal_expo, expo_length, default_status Values passed on class instantiation. See Parameters for definitions.
exposure_type str A description of the exposure type that combines the cal_expo and expo_length properties
date_cols tuple Names of the start and end date columns in data for each exposure period
trx_types list List of transaction types that have been attached to data using the add_transactions() method.

Notes

Census-level data refers to a data set wherein there is one row per unique policy. Exposure-level data expands census-level data such that there is one record per policy per observation period. Observation periods could be any meaningful period of time such as a policy year, policy month, calendar year, calendar quarter, calendar month, etc.

target_status is used in the calculation of exposures. The annual exposure method is applied, which allocates a full period of exposure for any statuses in target_status. For all other statuses, new entrants and exits are partially exposed based on the time elapsed in the observation period. This method is consistent with the Balducci Hypothesis, which assumes that the probability of termination is proportionate to the time elapsed in the observation period. If the annual exposure method isn’t desired, target_status can be ignored. In this case, partial exposures are always applied regardless of status.

default_status is used to indicate the default active status that should be used when exposure records are created. If None, then the most common status will be assumed.

Alternative class constructors

  • expose_py(), expose_pq(), expose_pm(), expose_pw(), expose_cy(), expose_cq(), expose_cm(), expose_cw()

    Convenience constructor functions for specific exposure calculations. The two characters after the underscore describe the exposure type and exposure period, respectively.

    For exposure types:

      • p refers to policy years
      • c refers to calendar years

    For exposure periods:

      • y = years
      • q = quarters
      • m = months
      • w = weeks

    Each constructor has the same inputs as the __init__ method except that the expo_length and cal_expo arguments are prepopulated.

  • from_DataFrame() Convert a data frame that already has exposure-level records into an ExposedDF object.
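The naming convention for the convenience constructors can be read mechanically. The sketch below is only an illustration of that convention: decode_constructor() is a hypothetical helper, not part of the package, and it assumes that the letters p and c map to cal_expo=False and cal_expo=True, respectively.

```python
# Hypothetical helper (not part of the package): decode the two-character
# suffix of a convenience constructor name into the arguments it prepopulates.
EXPOSURE_TYPES = {"p": False, "c": True}  # value is the implied cal_expo flag
EXPOSURE_PERIODS = {"y": "year", "q": "quarter", "m": "month", "w": "week"}


def decode_constructor(name: str) -> dict:
    """Return the cal_expo / expo_length pair implied by a name like 'expose_cq'."""
    prefix, _, suffix = name.partition("_")
    if prefix != "expose" or len(suffix) != 2:
        raise ValueError(f"not an exposure constructor name: {name!r}")
    type_code, period_code = suffix
    return {
        "cal_expo": EXPOSURE_TYPES[type_code],
        "expo_length": EXPOSURE_PERIODS[period_code],
    }


print(decode_constructor("expose_cq"))
print(decode_constructor("expose_py"))
```

Under this reading, expose_cq(data, end_date) would be shorthand for passing cal_expo=True and expo_length='quarter' to the main constructor.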

References

Atkinson and McGarry (2016). Experience Study Calculations

https://www.soa.org/49378a/globalassets/assets/files/research/experience-study-calculations.pdf

Examples

import actxps as xp

xp.ExposedDF(xp.load_toy_census(), "2020-12-31", 
             target_status='Surrender')
Exposure data

Exposure type: policy_year
Target status: Surrender
Study range: 1900-01-01 to 2020-12-31

shape: (33, 8)
┌─────────┬────────┬────────────┬───────────┬────────┬─────────────┬─────────────────┬──────────┐
│ pol_num ┆ status ┆ issue_date ┆ term_date ┆ pol_yr ┆ pol_date_yr ┆ pol_date_yr_end ┆ exposure │
│ ---     ┆ ---    ┆ ---        ┆ ---       ┆ ---    ┆ ---         ┆ ---             ┆ ---      │
│ i64     ┆ enum   ┆ date       ┆ date      ┆ u32    ┆ date        ┆ date            ┆ f64      │
╞═════════╪════════╪════════════╪═══════════╪════════╪═════════════╪═════════════════╪══════════╡
│ 1       ┆ Active ┆ 2010-01-01 ┆ null      ┆ 1      ┆ 2010-01-01  ┆ 2010-12-31      ┆ 1.0      │
│ 1       ┆ Active ┆ 2010-01-01 ┆ null      ┆ 2      ┆ 2011-01-01  ┆ 2011-12-31      ┆ 1.0      │
│ 1       ┆ Active ┆ 2010-01-01 ┆ null      ┆ 3      ┆ 2012-01-01  ┆ 2012-12-31      ┆ 1.0      │
│ 1       ┆ Active ┆ 2010-01-01 ┆ null      ┆ 4      ┆ 2013-01-01  ┆ 2013-12-31      ┆ 1.0      │
│ 1       ┆ Active ┆ 2010-01-01 ┆ null      ┆ 5      ┆ 2014-01-01  ┆ 2014-12-31      ┆ 1.0      │
│ …       ┆ …      ┆ …          ┆ …         ┆ …      ┆ …           ┆ …               ┆ …        │
│ 3       ┆ Active ┆ 2009-11-10 ┆ null      ┆ 8      ┆ 2016-11-10  ┆ 2017-11-09      ┆ 1.0      │
│ 3       ┆ Active ┆ 2009-11-10 ┆ null      ┆ 9      ┆ 2017-11-10  ┆ 2018-11-09      ┆ 1.0      │
│ 3       ┆ Active ┆ 2009-11-10 ┆ null      ┆ 10     ┆ 2018-11-10  ┆ 2019-11-09      ┆ 1.0      │
│ 3       ┆ Active ┆ 2009-11-10 ┆ null      ┆ 11     ┆ 2019-11-10  ┆ 2020-11-09      ┆ 1.0      │
│ 3       ┆ Active ┆ 2009-11-10 ┆ null      ┆ 12     ┆ 2020-11-10  ┆ 2021-11-09      ┆ 0.142466 │
└─────────┴────────┴────────────┴───────────┴────────┴─────────────┴─────────────────┴──────────┘
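The partial exposure of 0.142466 in the final row above can be reproduced by hand. The sketch below assumes that a partial exposure is the number of days exposed divided by the number of days in the observation period, with inclusive bounds on both ends. It is an illustration that matches the displayed value, not necessarily the package's internal code, and partial_exposure() is a hypothetical helper.

```python
from datetime import date


def partial_exposure(period_start: date, period_end: date, study_end: date) -> float:
    """Fraction of an observation period falling on or before study_end.

    Both period bounds are treated as inclusive.
    """
    days_in_period = (period_end - period_start).days + 1
    last_exposed = min(period_end, study_end)
    days_exposed = max((last_exposed - period_start).days + 1, 0)
    return days_exposed / days_in_period


# Policy 3, policy year 12: 2020-11-10 through 2021-11-09, study ends 2020-12-31.
exposure = partial_exposure(date(2020, 11, 10), date(2021, 11, 9), date(2020, 12, 31))
print(round(exposure, 6))  # 52 exposed days / 365 days in the period
```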

Methods

Name Description
add_transactions Add transactions to an experience study
exp_stats Summarize experience study records
expose_cm Create an ExposedDF with calendar month exposures
expose_cq Create an ExposedDF with calendar quarter exposures
expose_cw Create an ExposedDF with calendar week exposures
expose_cy Create an ExposedDF with calendar year exposures
expose_pm Create an ExposedDF with policy month exposures
expose_pq Create an ExposedDF with policy quarter exposures
expose_pw Create an ExposedDF with policy week exposures
expose_py Create an ExposedDF with policy year exposures
expose_split Split calendar exposures by policy year
from_DataFrame Coerce a data frame to an ExposedDF object
group_by Set grouping variables for summary methods like exp_stats() and trx_stats().
trx_stats Summarize transactions and utilization rates
ungroup Remove all grouping variables for summary methods like exp_stats() and trx_stats().

add_transactions

expose.ExposedDF.add_transactions(trx_data, col_pol_num='pol_num', col_trx_date='trx_date', col_trx_type='trx_type', col_trx_amt='trx_amt')

Add transactions to an experience study

Parameters

Name Type Description Default
trx_data polars.DataFrame | pandas.DataFrame A data frame containing transaction details. This data frame must have columns for policy numbers, transaction dates, transaction types, and transaction amounts. required
col_pol_num str Name of the column in trx_data containing the policy number 'pol_num'
col_trx_date str Name of the column in trx_data containing the transaction date 'trx_date'
col_trx_type str Name of the column in trx_data containing the transaction type 'trx_type'
col_trx_amt str Name of the column in trx_data containing the transaction amount 'trx_amt'

Notes

This function attaches transactions to an ExposedDF object. Transactions are grouped and summarized such that the number of rows in the data does not change. Two columns are added to the output for each transaction type. These columns have names of the pattern trx_n_{*} (transaction counts) and trx_amt_{*} (transaction amounts). The trx_types property is updated to include the new transaction types found in trx_data.

Transactions are associated with the data object by matching transaction dates with the exposure date ranges found in the ExposedDF.
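The matching logic described in these notes can be sketched in plain Python. The records below are hypothetical, and the loop is a simplified illustration of the idea (match on policy number and date range, then summarize by type), not the package's implementation.

```python
from datetime import date

# Hypothetical exposure records: (pol_num, period_start, period_end), inclusive bounds
exposures = [
    (1, date(2019, 1, 1), date(2019, 12, 31)),
    (1, date(2020, 1, 1), date(2020, 12, 31)),
]
# Hypothetical transactions: (pol_num, trx_date, trx_type, trx_amt)
transactions = [
    (1, date(2019, 3, 15), "Base", 100.0),
    (1, date(2019, 9, 1), "Base", 50.0),
    (1, date(2020, 6, 30), "Rider", 25.0),
]

trx_types = sorted({t[2] for t in transactions})
rows = []
for pol_num, start, end in exposures:
    row = {"pol_num": pol_num, "start": start, "end": end}
    # Two new columns per transaction type, defaulting to zero
    for trx_type in trx_types:
        row[f"trx_n_{trx_type}"] = 0
        row[f"trx_amt_{trx_type}"] = 0.0
    # A transaction belongs to the record whose date range contains its date
    for t_pol, t_date, t_type, t_amt in transactions:
        if t_pol == pol_num and start <= t_date <= end:
            row[f"trx_n_{t_type}"] += 1
            row[f"trx_amt_{t_type}"] += t_amt
    rows.append(row)

for row in rows:
    print(row)
```

Because transactions are summarized onto existing records, the number of rows is the same before and after, regardless of how many transactions fall within a period.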

Examples

import actxps as xp
census = xp.load_census_dat()
withdrawals = xp.load_withdrawals()
expo = xp.ExposedDF.expose_py(census, "2019-12-31",
                              target_status="Surrender")
expo.add_transactions(withdrawals)
Exposure data

Exposure type: policy_year
Target status: Surrender
Study range: 1900-01-01 to 2019-12-31
Transaction types: Base, Rider

shape: (141_252, 19)
┌─────────┬────────┬────────────┬──────────┬───┬────────────┬────────────┬────────────┬────────────┐
│ pol_num ┆ status ┆ issue_date ┆ inc_guar ┆ … ┆ trx_n_Base ┆ trx_n_Ride ┆ trx_amt_Ba ┆ trx_amt_Ri │
│ ---     ┆ ---    ┆ ---        ┆ ---      ┆   ┆ ---        ┆ r          ┆ se         ┆ der        │
│ i64     ┆ enum   ┆ date       ┆ bool     ┆   ┆ i32        ┆ ---        ┆ ---        ┆ ---        │
│         ┆        ┆            ┆          ┆   ┆            ┆ i32        ┆ f64        ┆ f64        │
╞═════════╪════════╪════════════╪══════════╪═══╪════════════╪════════════╪════════════╪════════════╡
│ 1       ┆ Active ┆ 2014-12-17 ┆ true     ┆ … ┆ 0          ┆ 0          ┆ 0.0        ┆ 0.0        │
│ 1       ┆ Active ┆ 2014-12-17 ┆ true     ┆ … ┆ 0          ┆ 0          ┆ 0.0        ┆ 0.0        │
│ 1       ┆ Active ┆ 2014-12-17 ┆ true     ┆ … ┆ 0          ┆ 0          ┆ 0.0        ┆ 0.0        │
│ 1       ┆ Active ┆ 2014-12-17 ┆ true     ┆ … ┆ 0          ┆ 0          ┆ 0.0        ┆ 0.0        │
│ 1       ┆ Active ┆ 2014-12-17 ┆ true     ┆ … ┆ 0          ┆ 0          ┆ 0.0        ┆ 0.0        │
│ …       ┆ …      ┆ …          ┆ …        ┆ … ┆ …          ┆ …          ┆ …          ┆ …          │
│ 20000   ┆ Active ┆ 2009-04-29 ┆ true     ┆ … ┆ 0          ┆ 1          ┆ 0.0        ┆ 547.0      │
│ 20000   ┆ Active ┆ 2009-04-29 ┆ true     ┆ … ┆ 0          ┆ 1          ┆ 0.0        ┆ 106.0      │
│ 20000   ┆ Active ┆ 2009-04-29 ┆ true     ┆ … ┆ 0          ┆ 1          ┆ 0.0        ┆ 31.0       │
│ 20000   ┆ Active ┆ 2009-04-29 ┆ true     ┆ … ┆ 0          ┆ 1          ┆ 0.0        ┆ 75.0       │
│ 20000   ┆ Active ┆ 2009-04-29 ┆ true     ┆ … ┆ 0          ┆ 1          ┆ 0.0        ┆ 466.0      │
└─────────┴────────┴────────────┴──────────┴───┴────────────┴────────────┴────────────┴────────────┘

exp_stats

expose.ExposedDF.exp_stats(target_status=None, expected=None, wt=None, conf_int=False, credibility=False, conf_level=0.95, cred_r=0.05, col_exposure='exposure')

Summarize experience study records

Create a summary of termination experience for a given target status (an ExpStats object).

Parameters

Name Type Description Default
target_status str | list | numpy.ndarray A single string, list, or array of target status values None
expected str | list | numpy.ndarray A single string, list, or array of column names in the data property with expected values None
wt str Name of the column in the data property containing weights to use in the calculation of claims, exposures, and partial credibility. None
conf_int bool If True, the output will include confidence intervals around the observed termination rates and any actual-to-expected ratios. False
credibility bool Whether the output should include partial credibility weights and credibility-weighted decrement rates. False
conf_level float Confidence level under the Limited Fluctuation credibility method 0.95
cred_r float Error tolerance under the Limited Fluctuation credibility method 0.05
col_exposure str Name of the column in data containing exposures. Only necessary for SplitExposedDF objects. 'exposure'

Notes

If the ExposedDF object is grouped (see the group_by() method), the returned ExpStats object’s data will contain one row per group.

If nothing is passed to target_status, the target_status property of the ExposedDF object will be used. If that property is None, all status values except the first level will be assumed. This will produce a warning message.

Expected values

The expected argument is optional. If provided, this argument must be a string, list, or array with values corresponding to columns in the data property containing expected experience. More than one expected basis can be provided.

Confidence intervals

If conf_int is set to True, the output will contain lower and upper confidence interval limits for the observed termination rate and any actual-to-expected ratios. The confidence level is dictated by conf_level. If no weighting variable is passed to wt, confidence intervals will be constructed assuming a binomial distribution of claims. Otherwise, confidence intervals will be calculated assuming that the aggregate claims distribution is normal with a mean equal to observed claims and a variance equal to:

Var(S) = E(N) * Var(X) + E(X)**2 * Var(N),

Where S is the aggregate claim random variable, X is the weighting variable assumed to follow a normal distribution, and N is a binomial random variable for the number of claims.

If credibility is True and expected values are passed to expected, the output will also contain confidence intervals for any credibility-weighted termination rates.
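As a rough illustration of the weighted case, the variance formula above can be evaluated directly. The function names and inputs below are hypothetical, and how E(X) and Var(X) are estimated from the weights is left as an input, since that detail is not stated here.

```python
from statistics import NormalDist


def aggregate_variance(n: float, q: float, e_x: float, var_x: float) -> float:
    """Var(S) = E(N) * Var(X) + E(X)**2 * Var(N), with N ~ Binomial(n, q)."""
    e_n = n * q
    var_n = n * q * (1 - q)
    return e_n * var_x + e_x ** 2 * var_n


def aggregate_claims_interval(observed_claims, n, q, e_x, var_x, conf_level=0.95):
    """Normal-approximation interval centered on observed aggregate claims."""
    z = NormalDist().inv_cdf((1 + conf_level) / 2)
    half_width = z * aggregate_variance(n, q, e_x, var_x) ** 0.5
    return observed_claims - half_width, observed_claims + half_width


# Hypothetical inputs: 1,000 records, a 5% claim rate, weights with mean 100 and variance 400
lower, upper = aggregate_claims_interval(5000.0, n=1000, q=0.05, e_x=100.0, var_x=400.0)
print(lower, upper)
```

The interval here is expressed in units of weighted claims. Dividing both limits by weighted exposure would put it on a rate basis.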

Credibility

If credibility is set to True, the output will contain a credibility column equal to the partial credibility estimate under the Limited Fluctuation credibility method (also known as Classical Credibility) assuming a binomial distribution of claims.
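For orientation, a textbook form of the Limited Fluctuation partial credibility factor under a binomial claims assumption is sketched below. This is an assumption about the general method, not a confirmed copy of the package's calculation: the full-credibility standard is taken as (z / cred_r)**2 * (1 - q) claims, and partial credibility follows the square root rule.

```python
from statistics import NormalDist


def partial_credibility(n_claims: float, q_obs: float,
                        conf_level: float = 0.95, cred_r: float = 0.05) -> float:
    """Square-root-rule partial credibility, capped at 1 (textbook sketch)."""
    z = NormalDist().inv_cdf((1 + conf_level) / 2)
    # Number of claims needed for full credibility under a binomial assumption
    full_standard = (z / cred_r) ** 2 * (1 - q_obs)
    return min(1.0, (n_claims / full_standard) ** 0.5)


print(partial_credibility(56, 0.007254))     # a small cell: well below 1
print(partial_credibility(1_000_000, 0.01))  # a very large cell: capped at 1.0
```

Tightening cred_r or raising conf_level increases the number of claims needed for full credibility, which lowers the partial credibility of any given cell.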

Returns

Type Description
ExpStats An ExpStats object with a data property that includes columns for any grouping variables, claims, exposures, and observed decrement rates (q_obs). If any values are passed to expected, additional columns will be added for expected decrements and actual-to-expected ratios. If credibility is set to True, additional columns are added for partial credibility and credibility-weighted decrement rates (assuming values are passed to expected). If conf_int is set to True, additional columns are added for lower and upper confidence interval limits around the observed termination rates and any actual-to-expected ratios. Additionally, if credibility is True and expected values are passed to expected, the output will contain confidence intervals around credibility-weighted termination rates. Confidence interval columns include the name of the original output column suffixed by either _lower or _upper. If a value is passed to wt, additional columns are created containing the sum of weights (.weight), the sum of squared weights (.weight_sq), and the number of records (.weight_n).

References

Herzog, Thomas (1999). Introduction to Credibility Theory

Examples

import actxps as xp

(xp.ExposedDF(xp.load_census_dat(),
              "2019-12-31", 
              target_status="Surrender").
    group_by('pol_yr', 'inc_guar').
    exp_stats(conf_int=True))
Experience study results

Groups: pol_yr, inc_guar
Target status: Surrender
Study range: 1900-01-01 to 2019-12-31

shape: (30, 8)
┌────────┬──────────┬──────────┬────────┬──────────────┬──────────┬─────────────┬─────────────┐
│ pol_yr ┆ inc_guar ┆ n_claims ┆ claims ┆ exposure     ┆ q_obs    ┆ q_obs_lower ┆ q_obs_upper │
│ ---    ┆ ---      ┆ ---      ┆ ---    ┆ ---          ┆ ---      ┆ ---         ┆ ---         │
│ u32    ┆ bool     ┆ u32      ┆ u32    ┆ f64          ┆ f64      ┆ f64         ┆ f64         │
╞════════╪══════════╪══════════╪════════╪══════════════╪══════════╪═════════════╪═════════════╡
│ 1      ┆ false    ┆ 56       ┆ 56     ┆ 7719.80545   ┆ 0.007254 ┆ 0.005441    ┆ 0.009197    │
│ 1      ┆ true     ┆ 46       ┆ 46     ┆ 11532.402336 ┆ 0.003989 ┆ 0.002862    ┆ 0.005203    │
│ 2      ┆ false    ┆ 92       ┆ 92     ┆ 7102.810869  ┆ 0.012953 ┆ 0.010418    ┆ 0.015628    │
│ 2      ┆ true     ┆ 68       ┆ 68     ┆ 10611.955805 ┆ 0.006408 ┆ 0.0049      ┆ 0.00801     │
│ 3      ┆ false    ┆ 67       ┆ 67     ┆ 6446.913856  ┆ 0.010393 ┆ 0.008066    ┆ 0.012874    │
│ …      ┆ …        ┆ …        ┆ …      ┆ …            ┆ …        ┆ …           ┆ …           │
│ 13     ┆ true     ┆ 49       ┆ 49     ┆ 1117.137361  ┆ 0.043862 ┆ 0.032225    ┆ 0.056394    │
│ 14     ┆ false    ┆ 33       ┆ 33     ┆ 262.622262   ┆ 0.125656 ┆ 0.087578    ┆ 0.167541    │
│ 14     ┆ true     ┆ 29       ┆ 29     ┆ 609.216476   ┆ 0.047602 ┆ 0.031188    ┆ 0.065658    │
│ 15     ┆ false    ┆ 8        ┆ 8      ┆ 74.046456    ┆ 0.10804  ┆ 0.040515    ┆ 0.18907     │
│ 15     ┆ true     ┆ 9        ┆ 9      ┆ 194.128602   ┆ 0.046361 ┆ 0.020605    ┆ 0.077268    │
└────────┴──────────┴──────────┴────────┴──────────────┴──────────┴─────────────┴─────────────┘

expose_cm

expose.ExposedDF.expose_cm(data, end_date, **kwargs)

Create an ExposedDF with calendar month exposures

expose_cq

expose.ExposedDF.expose_cq(data, end_date, **kwargs)

Create an ExposedDF with calendar quarter exposures

expose_cw

expose.ExposedDF.expose_cw(data, end_date, **kwargs)

Create an ExposedDF with calendar week exposures

expose_cy

expose.ExposedDF.expose_cy(data, end_date, **kwargs)

Create an ExposedDF with calendar year exposures

expose_pm

expose.ExposedDF.expose_pm(data, end_date, **kwargs)

Create an ExposedDF with policy month exposures

expose_pq

expose.ExposedDF.expose_pq(data, end_date, **kwargs)

Create an ExposedDF with policy quarter exposures

expose_pw

expose.ExposedDF.expose_pw(data, end_date, **kwargs)

Create an ExposedDF with policy week exposures

expose_py

expose.ExposedDF.expose_py(data, end_date, **kwargs)

Create an ExposedDF with policy year exposures

expose_split

expose.ExposedDF.expose_split()

Split calendar exposures by policy year

Split calendar period exposures that cross a policy anniversary into a pre-anniversary record and a post-anniversary record.

Returns

Type Description
SplitExposedDF A subclass of ExposedDF with calendar period exposures split by policy year.

Notes

The ExposedDF must have calendar year, quarter, month, or week exposure records. Calendar year exposures are created by passing cal_expo=True to ExposedDF (or alternatively, with the class methods ExposedDF.expose_cy(), ExposedDF.expose_cq(), ExposedDF.expose_cm(), and ExposedDF.expose_cw()).

After splitting, the resulting data will contain both calendar exposures and policy year exposures. These columns will be named ‘exposure_cal’ and ‘exposure_pol’, respectively. Calendar exposures will be in the original units passed to SplitExposedDF(). Policy exposures will always be expressed in years. Downstream functions like exp_stats() and exp_shiny() will require clarification as to which exposure basis should be used to summarize results.

After splitting, the column ‘pol_yr’ will contain policy years.

Examples

import actxps as xp
toy_census = xp.load_toy_census()
expo = xp.ExposedDF.expose_cy(toy_census, "2022-12-31")
expo.expose_split()
Exposure data

Exposure type: split_year
Target status: None
Study range: 1900-01-01 to 2022-12-31

shape: (58, 9)
┌─────────┬───────────┬────────────┬────────────┬───┬────────────┬────────┬────────────┬───────────┐
│ pol_num ┆ status    ┆ issue_date ┆ term_date  ┆ … ┆ cal_yr_end ┆ pol_yr ┆ exposure_p ┆ exposure_ │
│ ---     ┆ ---       ┆ ---        ┆ ---        ┆   ┆ ---        ┆ ---    ┆ ol         ┆ cal       │
│ i64     ┆ enum      ┆ date       ┆ date       ┆   ┆ date       ┆ i32    ┆ ---        ┆ ---       │
│         ┆           ┆            ┆            ┆   ┆            ┆        ┆ f64        ┆ f64       │
╞═════════╪═══════════╪════════════╪════════════╪═══╪════════════╪════════╪════════════╪═══════════╡
│ 1       ┆ Active    ┆ 2010-01-01 ┆ null       ┆ … ┆ 2010-12-31 ┆ 1      ┆ 1.0        ┆ 1.0       │
│ 1       ┆ Active    ┆ 2010-01-01 ┆ null       ┆ … ┆ 2011-12-31 ┆ 2      ┆ 1.0        ┆ 1.0       │
│ 1       ┆ Active    ┆ 2010-01-01 ┆ null       ┆ … ┆ 2012-12-31 ┆ 3      ┆ 1.0        ┆ 1.0       │
│ 1       ┆ Active    ┆ 2010-01-01 ┆ null       ┆ … ┆ 2013-12-31 ┆ 4      ┆ 1.0        ┆ 1.0       │
│ 1       ┆ Active    ┆ 2010-01-01 ┆ null       ┆ … ┆ 2014-12-31 ┆ 5      ┆ 1.0        ┆ 1.0       │
│ …       ┆ …         ┆ …          ┆ …          ┆ … ┆ …          ┆ …      ┆ …          ┆ …         │
│ 3       ┆ Active    ┆ 2009-11-10 ┆ null       ┆ … ┆ 2020-11-09 ┆ 11     ┆ 0.857923   ┆ 0.857923  │
│ 3       ┆ Active    ┆ 2009-11-10 ┆ null       ┆ … ┆ 2020-12-31 ┆ 12     ┆ 0.142466   ┆ 0.142077  │
│ 3       ┆ Active    ┆ 2009-11-10 ┆ null       ┆ … ┆ 2021-11-09 ┆ 12     ┆ 0.857534   ┆ 0.857534  │
│ 3       ┆ Active    ┆ 2009-11-10 ┆ null       ┆ … ┆ 2021-12-31 ┆ 13     ┆ 0.142466   ┆ 0.142466  │
│ 3       ┆ Surrender ┆ 2009-11-10 ┆ 2022-02-25 ┆ … ┆ 2022-11-09 ┆ 13     ┆ 0.153425   ┆ 0.153425  │
└─────────┴───────────┴────────────┴────────────┴───┴────────────┴────────┴────────────┴───────────┘
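The two split rows for calendar year 2020 in the output above (ending 2020-11-09 and 2020-12-31) can be checked by hand. The sketch below assumes exposures are day counts divided by the length of the relevant period, with inclusive bounds. It reproduces the displayed values but is not necessarily the package's internal code.

```python
from datetime import date, timedelta


def days_inclusive(start: date, end: date) -> int:
    """Number of days from start through end, counting both endpoints."""
    return (end - start).days + 1


cal_start, cal_end = date(2020, 1, 1), date(2020, 12, 31)
anniversary = date(2020, 11, 10)  # policy 3 was issued on 2009-11-10

# Split the calendar year into pre- and post-anniversary pieces
pre_days = days_inclusive(cal_start, anniversary - timedelta(days=1))
post_days = days_inclusive(anniversary, cal_end)

cal_days = days_inclusive(cal_start, cal_end)                          # 2020 is a leap year
pol_yr_11_days = days_inclusive(date(2019, 11, 10), date(2020, 11, 9))
pol_yr_12_days = days_inclusive(date(2020, 11, 10), date(2021, 11, 9))

exposure_cal = (pre_days / cal_days, post_days / cal_days)
exposure_pol = (pre_days / pol_yr_11_days, post_days / pol_yr_12_days)

print([round(x, 6) for x in exposure_cal])  # [0.857923, 0.142077]
print([round(x, 6) for x in exposure_pol])  # [0.857923, 0.142466]
```

The post-anniversary piece shows why the two bases can differ: the same 52 days are divided by a 366-day calendar year on one basis and by a 365-day policy year on the other.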

See Also

SplitExposedDF() for full information on SplitExposedDF class.

from_DataFrame

expose.ExposedDF.from_DataFrame(data, end_date, start_date=date(1900, 1, 1), target_status=None, cal_expo=False, expo_length='year', trx_types=None, col_pol_num='pol_num', col_status='status', col_exposure='exposure', col_pol_per=None, cols_dates=None, col_trx_n_='trx_n_', col_trx_amt_='trx_amt_', default_status=None)

Coerce a data frame to an ExposedDF object

The input data frame must have columns for policy numbers, statuses, exposures, policy periods (for policy exposures only), and exposure start / end dates. Optionally, if data has transaction counts and amounts by type, these can be specified without calling add_transactions().

Parameters

Name Type Description Default
data polars.DataFrame | pandas.DataFrame A data frame with exposure-level records required
end_date datetime.date | str Experience study end date required
start_date datetime.date | str Experience study start date date(1900, 1, 1)
target_status str | list | numpy.ndarray Target status values None
cal_expo bool Set to True for calendar year exposures. Otherwise policy year exposures are assumed. False
expo_length str Exposure period length. Must be ‘year’, ‘quarter’, ‘month’, or ‘week’ 'year'
trx_types list | str List containing unique transaction types that have been attached to data. For each value in trx_types, from_DataFrame requires that columns exist in data named trx_n_{*} and trx_amt_{*} containing transaction counts and amounts, respectively. The prefixes “trx_n_” and “trx_amt_” can be overridden using the col_trx_n_ and col_trx_amt_ arguments. None
col_pol_num str Name of the column in data containing the policy number 'pol_num'
col_status str Name of the column in data containing the policy status 'status'
col_exposure str Name of the column in data containing exposures. 'exposure'
col_pol_per str Name of the column in data containing policy exposure periods. Only necessary if cal_expo is False. The assumed default is either “pol_yr”, “pol_qtr”, “pol_mth”, or “pol_wk” depending on the value of expo_length. None
cols_dates str Names of the columns in data containing exposure start and end dates. Both date ranges are assumed to be exclusive. The assumed default is of the form A_B. A is “cal” if cal_expo is True or “pol” otherwise. B is either “yr”, “qtr”, “mth”, or “wk” depending on the value of expo_length. None
col_trx_n_ str Prefix to use for columns containing transaction counts. "trx_n_"
col_trx_amt_ str Prefix to use for columns containing transaction amounts. "trx_amt_"
default_status str Default active status code None

Returns

Type Description
actxps.expose.ExposedDF An ExposedDF object.

group_by

expose.ExposedDF.group_by(*by)

Set grouping variables for summary methods like exp_stats() and trx_stats().

Parameters

Name Type Description Default
*by Column names in data that will be used as grouping variables ()

Notes

This function will not directly apply the DataFrame.group_by() method to the data property. Instead, it will set the groups property of the ExposedDF object. The groups property is subsequently used to group data within summary methods like exp_stats() and trx_stats().
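The deferred-grouping pattern described here can be illustrated with a toy class. GroupedRecords below is a hypothetical stand-in, not the ExposedDF implementation: group_by() only records the grouping variables, and a summary method applies them later.

```python
class GroupedRecords:
    """Toy stand-in for an object that stores grouping variables for later use."""

    def __init__(self, rows):
        self.rows = rows    # list of dicts; never modified by group_by()
        self.groups = None

    def group_by(self, *by):
        # Only record the grouping variables; return self to allow chaining
        self.groups = list(by)
        return self

    def ungroup(self):
        self.groups = None
        return self

    def total(self, col):
        """Summary method: the stored groups are applied only at this point."""
        if not self.groups:
            return {(): sum(r[col] for r in self.rows)}
        out = {}
        for r in self.rows:
            key = tuple(r[g] for g in self.groups)
            out[key] = out.get(key, 0) + r[col]
        return out


records = [
    {"inc_guar": True, "claims": 1},
    {"inc_guar": False, "claims": 5},
    {"inc_guar": True, "claims": 2},
]
study = GroupedRecords(records)
print(study.group_by("inc_guar").total("claims"))
print(study.ungroup().total("claims"))
```

Storing the groups on the object is what allows chained calls such as group_by(...).exp_stats(...) while leaving the underlying data untouched.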

trx_stats

expose.ExposedDF.trx_stats(trx_types=None, percent_of=None, combine_trx=False, full_exposures_only=True, conf_int=False, conf_level=0.95, col_exposure='exposure')

Summarize transactions and utilization rates

Create a summary of transaction counts, amounts, and utilization rates (a TrxStats object).

Parameters

Name Type Description Default
trx_types list or str A list of transaction types to include in the output. If None is provided, all available transaction types in the trx_types property will be used. None
percent_of list or str A list containing column names in the data property to use as denominators in the calculation of utilization rates or actual-to-expected ratios. None
combine_trx bool If False (default), the results will contain output rows for each transaction type. If True, the results will contain aggregated results across all transaction types. False
full_exposures_only bool If True (default), partially exposed records will be ignored in the results. True
conf_int bool If True, the output will include confidence intervals around the observed utilization rate and any percent_of output columns. False
conf_level float Confidence level for confidence intervals 0.95
col_exposure str Name of the column in the data property containing exposures. Only necessary for SplitExposedDF objects. 'exposure'

Notes

If the ExposedDF object is grouped (see the group_by() method), the returned TrxStats object’s data will contain one row per group.

Any number of transaction types can be passed to the trx_types argument; however, each transaction type must appear in the trx_types property of the ExposedDF object. In addition, trx_stats() expects to see columns named trx_n_{*} (for transaction counts) and trx_amt_{*} (for transaction amounts) for each transaction type. To ensure data is in the appropriate format, use the class method ExposedDF.from_DataFrame() to convert an existing data frame with transactions or use add_transactions() to attach transactions to an existing ExposedDF object.

“Percentage of” calculations

The percent_of argument is optional. If provided, this argument must be a list with values corresponding to columns in the data property containing values to use as denominators in the calculation of utilization rates or actual-to-expected ratios. Example usage:

  • In a study of partial withdrawal transactions, if percent_of refers to account values, observed withdrawal rates can be determined.
  • In a study of recurring claims, if percent_of refers to a column containing a maximum benefit amount, utilization rates can be determined.

Confidence intervals

If conf_int is set to True, the output will contain lower and upper confidence interval limits for the observed utilization rate and any percent_of output columns. The confidence level is dictated by conf_level.

  • Intervals for the utilization rate (trx_util) assume a binomial distribution.

  • Intervals for transactions as a percentage of another column with non-zero transactions (pct_of_{*}_w_trx) are constructed using a normal distribution.

  • Intervals for transactions as a percentage of another column regardless of transaction utilization (pct_of_{*}_all) are calculated assuming that the aggregate distribution is normal with a mean equal to observed transactions and a variance equal to:

    Var(S) = E(N) * Var(X) + E(X)**2 * Var(N),

Where S is the aggregate transactions random variable, X is an individual transaction amount assumed to follow a normal distribution, and N is a binomial random variable for transaction utilization.

Default removal of partial exposures

As a default, partial exposures are removed from data before summarizing results. This is done to avoid complexity associated with a lopsided skew in the timing of transactions. For example, if transactions can occur on a monthly basis or annually at the beginning of each policy year, partial exposures may not be appropriate. If a policy had an exposure of 0.5 years and was taking withdrawals annually at the beginning of the year, an argument could be made that the exposure should instead be 1 complete year. If the same policy was expected to take withdrawals 9 months into the year, it’s not clear if the exposure should be 0.5 years or 0.5 / 0.75 years. To override this treatment, set full_exposures_only to False.
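A minimal sketch of this default filter follows. It assumes that a "full" exposure means an exposure equal to 1 within floating-point tolerance; the package's actual test may differ. keep_full_exposures() and the sample records are hypothetical.

```python
import math


def keep_full_exposures(records, col_exposure="exposure"):
    """Drop records whose exposure is not (approximately) one full period."""
    return [r for r in records if math.isclose(r[col_exposure], 1.0)]


# Hypothetical exposure records with attached transaction amounts
records = [
    {"pol_yr": 11, "exposure": 1.0, "trx_amt": 500.0},
    {"pol_yr": 12, "exposure": 0.142466, "trx_amt": 75.0},
]

full = keep_full_exposures(records)
print(full)  # only the policy year 11 record remains
```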

Returns

Type Description
TrxStats A TrxStats object with a data property that includes columns for any grouping variables and transaction types, plus the following:

  • trx_n: the number of unique transactions
  • trx_amt: total transaction amount
  • trx_flag: the number of observation periods with non-zero transaction amounts
  • exposure: total exposures
  • avg_trx: mean transaction amount (trx_amt / trx_flag)
  • avg_all: mean transaction amount over all records (trx_amt / exposure)
  • trx_freq: transaction frequency when a transaction occurs (trx_n / trx_flag)
  • trx_util: transaction utilization per observation period (trx_flag / exposure)

If percent_of is provided, the results will also include:

  • The sum of any columns passed to percent_of with non-zero transactions. These columns include the suffix _w_trx.
  • The sum of any columns passed to percent_of
  • pct_of_{*}_w_trx: total transactions as a percentage of column {*}_w_trx
  • pct_of_{*}_all: total transactions as a percentage of column {*}

If conf_int is set to True, additional columns are added for lower and upper confidence interval limits around the observed utilization rate and any percent_of output columns. Confidence interval columns include the name of the original output column suffixed by either _lower or _upper. If values are passed to percent_of, an additional column is created containing the sum of squared transaction amounts (trx_amt_sq).
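The ratio columns described above follow directly from the four totals. The arithmetic below uses hypothetical totals for a single group; the variable names are local to this illustration.

```python
# Hypothetical totals for one group
trx_n = 120          # number of transactions
trx_amt = 60_000.0   # total transaction amount
trx_flag = 80        # observation periods with non-zero transaction amounts
exposure = 400.0     # total exposures

avg_trx = trx_amt / trx_flag    # mean amount where a transaction occurred: 750.0
avg_all = trx_amt / exposure    # mean amount over all records: 150.0
trx_freq = trx_n / trx_flag     # transactions per period with a transaction: 1.5
trx_util = trx_flag / exposure  # share of exposed periods with a transaction: 0.2

print(avg_trx, avg_all, trx_freq, trx_util)
```

Note that avg_all equals avg_trx multiplied by the utilization rate, which is a quick consistency check on any summary.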

Examples

import actxps as xp
census = xp.load_census_dat()
withdrawals = xp.load_withdrawals()
expo = xp.ExposedDF.expose_py(census, "2019-12-31",
                              target_status="Surrender")
expo.add_transactions(withdrawals)

expo.group_by('inc_guar').trx_stats(percent_of="premium",
                                    combine_trx=True,
                                    conf_int=True)
Transaction study results

Groups: inc_guar
Study range: 1900-01-01 to 2019-12-31
Transaction types: Base, Rider
Transactions as % of: premium

shape: (2, 21)
┌──────────┬──────────┬─────────┬──────────┬───┬────────────┬────────────┬────────────┬────────────┐
│ inc_guar ┆ trx_type ┆ trx_n   ┆ trx_flag ┆ … ┆ pct_of_pre ┆ pct_of_pre ┆ pct_of_pre ┆ pct_of_pre │
│ ---      ┆ ---      ┆ ---     ┆ ---      ┆   ┆ mium_w_trx ┆ mium_w_trx ┆ mium_all_l ┆ mium_all_u │
│ bool     ┆ str      ┆ f64     ┆ u32      ┆   ┆ _lower     ┆ _upper     ┆ ower       ┆ pper       │
│          ┆          ┆         ┆          ┆   ┆ ---        ┆ ---        ┆ ---        ┆ ---        │
│          ┆          ┆         ┆          ┆   ┆ f64        ┆ f64        ┆ f64        ┆ f64        │
╞══════════╪══════════╪═════════╪══════════╪═══╪════════════╪════════════╪════════════╪════════════╡
│ false    ┆ All      ┆ 52939.0 ┆ 24703    ┆ … ┆ 0.027557   ┆ 0.028621   ┆ 0.014253   ┆ 0.014861   │
│ true     ┆ All      ┆ 84882.0 ┆ 39462    ┆ … ┆ 0.055607   ┆ 0.057363   ┆ 0.029064   ┆ 0.030067   │
└──────────┴──────────┴─────────┴──────────┴───┴────────────┴────────────┴────────────┴────────────┘

ungroup

expose.ExposedDF.ungroup()

Remove all grouping variables for summary methods like exp_stats() and trx_stats().