Transaction studies

This article walks through an example of creating a transaction study using the actxps package. Unlike a termination study, transaction studies track events that can occur multiple times over the life of a policy. Often, transactions are expected to reoccur; for example, the utilization of a guaranteed income stream.

Key questions to answer in a transaction study are:

The example below walks through preparing data by adding transaction information to an ExposedDF object using the add_transactions() method. Next, study results are summarized using the trx_stats() method.

Simulated transaction and account value data

In this example, we’ll be using the census_dat, withdrawals, and account_vals data sets. Each data set is based on a theoretical block of deferred annuity business with a guaranteed lifetime income benefit.

  • census_dat contains census-level information with one row per policy
  • withdrawals contains withdrawal transactions. There are 2 types of transactions in the data: “Base” (ordinary withdrawals) and “Rider” (guaranteed income payments).
  • account_vals contains historical account values on policy anniversaries. This data will be used to calculate withdrawal rates as a percentage of account values.

The add_transactions() method

The add_transactions() method attaches transactions to an ExposedDF object, which contains a data frame with exposure-level records. For our example, we first need to convert census_dat into exposure records using ExposedDF().1 This example will use policy year exposures.

import actxps as xp
import polars as pl

census_dat = xp.load_census_dat()
exposed_data = xp.ExposedDF(census_dat, "2019-12-31",
                            target_status="Surrender")
exposed_data
Exposure data

Exposure type: policy_year
Target status: Surrender
Study range: 1900-01-01 to 2019-12-31

shape: (141_252, 15)
┌─────────┬────────┬────────────┬──────────┬───┬────────┬─────────────┬─────────────────┬──────────┐
│ pol_num ┆ status ┆ issue_date ┆ inc_guar ┆ … ┆ pol_yr ┆ pol_date_yr ┆ pol_date_yr_end ┆ exposure │
│ ---     ┆ ---    ┆ ---        ┆ ---      ┆   ┆ ---    ┆ ---         ┆ ---             ┆ ---      │
│ i64     ┆ enum   ┆ date       ┆ bool     ┆   ┆ u32    ┆ date        ┆ date            ┆ f64      │
╞═════════╪════════╪════════════╪══════════╪═══╪════════╪═════════════╪═════════════════╪══════════╡
│ 1       ┆ Active ┆ 2014-12-17 ┆ true     ┆ … ┆ 1      ┆ 2014-12-17  ┆ 2015-12-16      ┆ 1.0      │
│ 1       ┆ Active ┆ 2014-12-17 ┆ true     ┆ … ┆ 2      ┆ 2015-12-17  ┆ 2016-12-16      ┆ 1.0      │
│ 1       ┆ Active ┆ 2014-12-17 ┆ true     ┆ … ┆ 3      ┆ 2016-12-17  ┆ 2017-12-16      ┆ 1.0      │
│ 1       ┆ Active ┆ 2014-12-17 ┆ true     ┆ … ┆ 4      ┆ 2017-12-17  ┆ 2018-12-16      ┆ 1.0      │
│ 1       ┆ Active ┆ 2014-12-17 ┆ true     ┆ … ┆ 5      ┆ 2018-12-17  ┆ 2019-12-16      ┆ 1.0      │
│ …       ┆ …      ┆ …          ┆ …        ┆ … ┆ …      ┆ …           ┆ …               ┆ …        │
│ 20000   ┆ Active ┆ 2009-04-29 ┆ true     ┆ … ┆ 7      ┆ 2015-04-29  ┆ 2016-04-28      ┆ 1.0      │
│ 20000   ┆ Active ┆ 2009-04-29 ┆ true     ┆ … ┆ 8      ┆ 2016-04-29  ┆ 2017-04-28      ┆ 1.0      │
│ 20000   ┆ Active ┆ 2009-04-29 ┆ true     ┆ … ┆ 9      ┆ 2017-04-29  ┆ 2018-04-28      ┆ 1.0      │
│ 20000   ┆ Active ┆ 2009-04-29 ┆ true     ┆ … ┆ 10     ┆ 2018-04-29  ┆ 2019-04-28      ┆ 1.0      │
│ 20000   ┆ Active ┆ 2009-04-29 ┆ true     ┆ … ┆ 11     ┆ 2019-04-29  ┆ 2020-04-28      ┆ 0.674863 │
└─────────┴────────┴────────────┴──────────┴───┴────────┴─────────────┴─────────────────┴──────────┘

The withdrawals data has 4 columns that are required for attaching transactions:

  • pol_num: policy number
  • trx_date: transaction date
  • trx_type: transaction type
  • trx_amt: transaction amount
withdrawals = xp.load_withdrawals()
withdrawals
shape: (160_130, 4)
pol_num trx_date trx_type trx_amt
i64 date cat f64
2 2007-10-05 "Base" 25.0
2 2009-07-30 "Base" 12.0
2 2010-02-22 "Base" 7.0
2 2010-12-30 "Base" 52.0
2 2012-05-07 "Base" 41.0
20000 2015-08-08 "Rider" 547.0
20000 2016-07-26 "Rider" 106.0
20000 2017-12-29 "Rider" 31.0
20000 2018-06-14 "Rider" 75.0
20000 2019-12-09 "Rider" 466.0

The grain of this data is one row per policy per transaction. The expectation is that the number of records in the transaction data will not match the number of rows in the exposure data. That is because policies could have zero or several transactions in a given exposure period.

The add_transactions() method uses a non-equivalent join to associate each transaction with a policy number and a date interval found in the exposure data. Then, transaction counts and amounts are summarized such that there is one row per exposure period. In the event there are multiple transaction types found in the data, separate columns are created for each transaction type.

Using our example, we pass both the exposure and withdrawals data to add_transactions(). The resulting data has the same number of rows as original exposure data and 4 new columns:

  • trx_amt_Base: the sum of “Base” withdrawal transactions
  • trx_amt_Rider: the sum of “Rider” withdrawal transactions
  • trx_n_Base: the number of “Base” withdrawal transactions
  • trx_n_Rider: the number of “Rider” withdrawal transactions
exposed_data.add_transactions(withdrawals)
exposed_data.data.glimpse()
Rows: 141252
Columns: 19
$ pol_num          <i64> 1, 1, 1, 1, 1, 1, 2, 2, 2, 2
$ status          <enum> Active, Active, Active, Active, Active, Active, Active, Active, Active, Active
$ issue_date      <date> 2014-12-17, 2014-12-17, 2014-12-17, 2014-12-17, 2014-12-17, 2014-12-17, 2007-09-24, 2007-09-24, 2007-09-24, 2007-09-24
$ inc_guar        <bool> True, True, True, True, True, True, False, False, False, False
$ qual            <bool> False, False, False, False, False, False, False, False, False, False
$ age              <i64> 56, 56, 56, 56, 56, 56, 71, 71, 71, 71
$ product          <cat> b, b, b, b, b, b, a, a, a, a
$ gender           <cat> F, F, F, F, F, F, F, F, F, F
$ wd_age           <i64> 77, 77, 77, 77, 77, 77, 71, 71, 71, 71
$ premium          <f64> 370.0, 370.0, 370.0, 370.0, 370.0, 370.0, 708.0, 708.0, 708.0, 708.0
$ term_date       <date> None, None, None, None, None, None, None, None, None, None
$ pol_yr           <u32> 1, 2, 3, 4, 5, 6, 1, 2, 3, 4
$ pol_date_yr     <date> 2014-12-17, 2015-12-17, 2016-12-17, 2017-12-17, 2018-12-17, 2019-12-17, 2007-09-24, 2008-09-24, 2009-09-24, 2010-09-24
$ pol_date_yr_end <date> 2015-12-16, 2016-12-16, 2017-12-16, 2018-12-16, 2019-12-16, 2020-12-16, 2008-09-23, 2009-09-23, 2010-09-23, 2011-09-23
$ exposure         <f64> 1.0, 1.0, 1.0, 1.0, 1.0, 0.040983606557377046, 1.0, 1.0, 1.0, 1.0
$ trx_n_Base       <i32> 0, 0, 0, 0, 0, 0, 1, 1, 1, 1
$ trx_n_Rider      <i32> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
$ trx_amt_Base     <f64> 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 25.0, 12.0, 7.0, 52.0
$ trx_amt_Rider    <f64> 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0

If we print exposed_data, we can see that it has an additional attribute for transaction types that have been attached.

exposed_data

The trx_stats() method

The actxps package’s workhorse function for summarizing transaction experience is the trx_stats() method of the ExposedDF class. This function returns a TrxStats object, which is a type of data frame containing additional attributes about the transaction study.

At a minimum, a TrxStats includes the following for each transaction type (trx_type):

  • The number of transactions (trx_n)
  • The number of exposure periods with a transaction (trx_flag)
  • The sum of transactions (trx_amt)
  • The total exposure (exposure)
  • The average transaction amount when a transaction occurs (avg_trx)
  • The average transaction amount across all records (avg_all)
  • The transaction frequency when a transaction occurs (trx_freq = trx_n / trx_flag)
  • The transaction utilization (trx_util = trx_flag / exposure)

Optionally, a TrxStats can also include:

  • Any grouping variables attached to the input data
  • Transaction amounts as a percentage of another value when a transaction occurs (pct_of_*_w_trx)
  • Transaction amounts as a percentage of another value across all records (pct_of_*_all)

To use trx_stats(), we simply need to call the method from an ExposedDF object with transactions attached.

exposed_data.trx_stats()
Transaction study results

Study range: 1900-01-01 to 2019-12-31
Transaction types: Base, Rider

shape: (2, 9)
┌──────────┬─────────┬──────────┬────────────┬───┬───────────┬───────────┬──────────┬──────────┐
│ trx_type ┆ trx_n   ┆ trx_flag ┆ trx_amt    ┆ … ┆ avg_trx   ┆ avg_all   ┆ trx_freq ┆ trx_util │
│ ---      ┆ ---     ┆ ---      ┆ ---        ┆   ┆ ---       ┆ ---       ┆ ---      ┆ ---      │
│ str      ┆ f64     ┆ u32      ┆ f64        ┆   ┆ f64       ┆ f64       ┆ f64      ┆ f64      │
╞══════════╪═════════╪══════════╪════════════╪═══╪═══════════╪═══════════╪══════════╪══════════╡
│ Base     ┆ 60500.0 ┆ 28224    ┆ 1.093899e6 ┆ … ┆ 38.757759 ┆ 8.809475  ┆ 2.143566 ┆ 0.227296 │
│ Rider    ┆ 77321.0 ┆ 35941    ┆ 2.842729e6 ┆ … ┆ 79.094321 ┆ 22.893294 ┆ 2.151331 ┆ 0.289443 │
└──────────┴─────────┴──────────┴────────────┴───┴───────────┴───────────┴──────────┴──────────┘

The results show us that we specified no groups, which is why the output data contains a single row for each transaction type.

Grouped data

If the data is grouped using the group_by() method, future calls to exp_stats() will contain one record for each unique group.

If the data is grouped using the group_by() method, future calls to trx_stats() will contain one record for each unique group.

In the following, exposed_data is grouped by the presence of an income guarantee (inc_guar) before being passed to trx_stats(). This results in four rows because we have two types of transactions and two distinct values of inc_guar.

(exposed_data.
    group_by('inc_guar').
    trx_stats())
Transaction study results

Groups: inc_guar
Study range: 1900-01-01 to 2019-12-31
Transaction types: Base, Rider

shape: (4, 10)
┌──────────┬──────────┬─────────┬──────────┬───┬───────────┬───────────┬──────────┬──────────┐
│ inc_guar ┆ trx_type ┆ trx_n   ┆ trx_flag ┆ … ┆ avg_trx   ┆ avg_all   ┆ trx_freq ┆ trx_util │
│ ---      ┆ ---      ┆ ---     ┆ ---      ┆   ┆ ---       ┆ ---       ┆ ---      ┆ ---      │
│ bool     ┆ str      ┆ f64     ┆ u32      ┆   ┆ f64       ┆ f64       ┆ f64      ┆ f64      │
╞══════════╪══════════╪═════════╪══════════╪═══╪═══════════╪═══════════╪══════════╪══════════╡
│ false    ┆ Base     ┆ 52939.0 ┆ 24703    ┆ … ┆ 38.563292 ┆ 19.466039 ┆ 2.143019 ┆ 0.504782 │
│ false    ┆ Rider    ┆ 0.0     ┆ 0        ┆ … ┆ NaN       ┆ 0.0       ┆ NaN      ┆ 0.0      │
│ true     ┆ Base     ┆ 7561.0  ┆ 3521     ┆ … ┆ 40.122124 ┆ 1.877716  ┆ 2.147401 ┆ 0.0468   │
│ true     ┆ Rider    ┆ 77321.0 ┆ 35941    ┆ … ┆ 79.094321 ┆ 37.784661 ┆ 2.151331 ┆ 0.477716 │
└──────────┴──────────┴─────────┴──────────┴───┴───────────┴───────────┴──────────┴──────────┘

Multiple grouping variables are allowed. Below, policy year (pol_yr) is added as a second grouping variable.

(exposed_data.
    group_by('inc_guar', 'pol_yr').
    trx_stats())
Transaction study results

Groups: inc_guar, pol_yr
Study range: 1900-01-01 to 2019-12-31
Transaction types: Base, Rider

shape: (60, 11)
┌──────────┬────────┬──────────┬────────┬───┬────────────┬───────────┬──────────┬──────────┐
│ inc_guar ┆ pol_yr ┆ trx_type ┆ trx_n  ┆ … ┆ avg_trx    ┆ avg_all   ┆ trx_freq ┆ trx_util │
│ ---      ┆ ---    ┆ ---      ┆ ---    ┆   ┆ ---        ┆ ---       ┆ ---      ┆ ---      │
│ bool     ┆ u32    ┆ str      ┆ f64    ┆   ┆ f64        ┆ f64       ┆ f64      ┆ f64      │
╞══════════╪════════╪══════════╪════════╪═══╪════════════╪═══════════╪══════════╪══════════╡
│ false    ┆ 1      ┆ Base     ┆ 6077.0 ┆ … ┆ 34.115585  ┆ 13.219502 ┆ 2.109337 ┆ 0.387492 │
│ false    ┆ 1      ┆ Rider    ┆ 0.0    ┆ … ┆ NaN        ┆ 0.0       ┆ NaN      ┆ 0.0      │
│ false    ┆ 2      ┆ Base     ┆ 6091.0 ┆ … ┆ 34.374083  ┆ 14.444885 ┆ 2.127489 ┆ 0.420226 │
│ false    ┆ 2      ┆ Rider    ┆ 0.0    ┆ … ┆ NaN        ┆ 0.0       ┆ NaN      ┆ 0.0      │
│ false    ┆ 3      ┆ Base     ┆ 6016.0 ┆ … ┆ 34.584074  ┆ 15.752105 ┆ 2.138642 ┆ 0.455473 │
│ …        ┆ …      ┆ …        ┆ …      ┆ … ┆ …          ┆ …         ┆ …        ┆ …        │
│ true     ┆ 13     ┆ Rider    ┆ 1486.0 ┆ … ┆ 94.984894  ┆ 71.698974 ┆ 2.244713 ┆ 0.754846 │
│ true     ┆ 14     ┆ Base     ┆ 26.0   ┆ … ┆ 83.8       ┆ 2.100251  ┆ 2.6      ┆ 0.025063 │
│ true     ┆ 14     ┆ Rider    ┆ 736.0  ┆ … ┆ 96.987342  ┆ 76.81203  ┆ 2.329114 ┆ 0.79198  │
│ true     ┆ 15     ┆ Base     ┆ 0.0    ┆ … ┆ NaN        ┆ 0.0       ┆ NaN      ┆ 0.0      │
│ true     ┆ 15     ┆ Rider    ┆ 19.0   ┆ … ┆ 104.888889 ┆ 85.818182 ┆ 2.111111 ┆ 0.818182 │
└──────────┴────────┴──────────┴────────┴───┴────────────┴───────────┴──────────┴──────────┘

Expressing transactions as a percentage of another value

In a transaction study, we often want to express transaction amounts as a percentage of another value. For example, in a withdrawal study, withdrawal amounts divided by account values provides a withdrawal rate. In a study of benefit utilization, transactions can be divided by a maximum benefit amount to derive a benefit utilization rate. In addition, actual-to-expected rates can be calculated by dividing transactions by expected values.

If column names are passed to the percent_of argument of trx_stats(), the output will contain 4 additional columns for each “percent of” variable:

  • The sum of each “percent of” variable
  • The sum of each “percent of” variable when a transaction occurs. These columns include the suffix _w_trx.
  • Transaction amounts divided by each “percent of” variable (pct_of_{*}_all)
  • Transaction amounts divided by each “percent of” variable when a transaction occurs (pct_of_{*}_w_trx)

For our example, let’s assume we’re interested in examining withdrawal transactions as a percentage of account values, which are available in the account_vals data frame in the column av_anniv.

# attach account values data
account_vals = xp.load_account_vals()
exposed_data.data = \
    exposed_data.data.join(account_vals,
                           on=['pol_num', 'pol_date_yr'],
                           how='left')

trx_res = (exposed_data.
           group_by('pol_yr', 'inc_guar').
           trx_stats(percent_of="av_anniv"))

trx_res.data.glimpse()
Rows: 60
Columns: 15
$ pol_yr                 <u32> 1, 1, 1, 1, 2, 2, 2, 2, 3, 3
$ inc_guar              <bool> False, False, True, True, False, False, True, True, False, False
$ trx_type               <str> 'Base', 'Rider', 'Base', 'Rider', 'Base', 'Rider', 'Base', 'Rider', 'Base', 'Rider'
$ trx_n                  <f64> 6077.0, 0.0, 1370.0, 8077.0, 6091.0, 0.0, 1183.0, 8232.0, 6016.0, 0.0
$ trx_flag               <u32> 2881, 0, 633, 3778, 2863, 0, 559, 3834, 2813, 0
$ trx_amt                <f64> 98287.0, 0.0, 21590.0, 265312.0, 98413.0, 0.0, 18554.0, 288114.0, 97285.0, 0.0
$ exposure               <f64> 7435.0, 7435.0, 11106.0, 11106.0, 6813.0, 6813.0, 10158.0, 10158.0, 6176.0, 6176.0
$ av_anniv               <f64> 9686914.0, 9686914.0, 14679001.0, 14679001.0, 9218561.0, 9218561.0, 13795383.0, 13795383.0, 8682868.0, 8682868.0
$ av_anniv_w_trx         <f64> 3875306.0, 0.0, 865046.0, 4982082.0, 3909786.0, 0.0, 797932.0, 5022297.0, 3954457.0, 0.0
$ avg_trx                <f64> 34.11558486636584, nan, 34.107424960505526, 70.22551614610906, 34.37408312958435, nan, 33.19141323792486, 75.1471048513302, 34.584073942410235, nan
$ avg_all                <f64> 13.219502353732347, 0.0, 1.9439942373491805, 23.889068971726996, 14.444884779098782, 0.0, 1.8265406576097658, 28.36326048434731, 15.752104922279793, 0.0
$ trx_freq               <f64> 2.109337035751475, nan, 2.1642969984202214, 2.13790365272631, 2.1274886482710444, nan, 2.116279069767442, 2.1471048513302033, 2.1386420191965874, nan
$ trx_util               <f64> 0.3874915938130464, 0.0, 0.056996218260399786, 0.3401764811813434, 0.4202260384558931, 0.0, 0.055030517818468204, 0.377436503248671, 0.45547279792746115, 0.0
$ pct_of_av_anniv_all    <f64> 0.01014636859581906, 0.0, 0.0014708085379924697, 0.018074254508191667, 0.010675527340980876, 0.0, 0.0013449427246782493, 0.020884813419098258, 0.011204247260237055, 0.0
$ pct_of_av_anniv_w_trx  <f64> 0.025362384286556985, nan, 0.024958210314827187, 0.05325323830478904, 0.025170942859788235, nan, 0.023252607991658437, 0.057366977699646195, 0.024601354876282636, nan

Confidence intervals

If conf_int is set to True, trx_stats() will produce lower and upper confidence interval limits for the observed utilization rate. Confidence intervals are constructed assuming a binomial distribution.

(exposed_data.
    group_by('pol_yr').
    trx_stats(conf_int=True).
    data.select('pol_yr', pl.col('^trx_util.*$')))
shape: (30, 4)
pol_yr trx_util trx_util_lower trx_util_upper
u32 f64 f64 f64
1 0.189526 0.183917 0.195189
1 0.203765 0.197994 0.20959
2 0.201638 0.195628 0.207707
2 0.225915 0.21961 0.23222
3 0.214912 0.208417 0.221407
13 0.516784 0.489461 0.544106
14 0.245645 0.210801 0.280488
14 0.550523 0.510453 0.590592
15 0.380952 0.190476 0.571429
15 0.428571 0.238095 0.619048

The default confidence level is 95%. This can be changed using the conf_level argument. Below, tighter confidence intervals are constructed by decreasing the confidence level to 90%.

(exposed_data.
    group_by('pol_yr').
    trx_stats(conf_int=True, conf_level=0.9).
    data.select('pol_yr', pl.col('^trx_util.*$')))
shape: (30, 4)
pol_yr trx_util trx_util_lower trx_util_upper
u32 f64 f64 f64
1 0.189526 0.18478 0.194272
1 0.203765 0.198911 0.208619
2 0.201638 0.196571 0.206706
2 0.225915 0.220671 0.231218
3 0.214912 0.209456 0.220368
13 0.516784 0.494145 0.539422
14 0.245645 0.216028 0.275261
14 0.550523 0.515679 0.585366
15 0.380952 0.190476 0.571429
15 0.428571 0.238095 0.619048

If any column names are passed to percent_of, trx_stats() will produce additional confidence intervals:

  • Intervals for transactions as a percentage of another column when transactions occur (pct_of_{*}_w_trx) are constructed using a normal distribution.
  • Intervals for transactions as a percentage of another column regardless of transaction utilization (pct_of_{*}_all) are calculated assuming that the aggregate distribution is normal with a mean equal to observed transactions and a variance equal to:

\[ Var(S) = E(N) \times Var(X) + E(X)^2 \times Var(N), \] Where S is the aggregate transactions random variable, X is an individual transaction amount assumed to follow a normal distribution, and N is a binomial random variable for transaction utilization.

(exposed_data.
    group_by('pol_yr').
    trx_stats(conf_int=True, percent_of="av_anniv").
    data.select('pol_yr', pl.col('^pct_of.*$')).
    glimpse())
Rows: 30
Columns: 7
$ pol_yr                      <u32> 1, 1, 2, 2, 3, 3, 4, 4, 5, 5
$ pct_of_av_anniv_all         <f64> 0.004919864491031837, 0.010888653268305336, 0.005082440454361061, 0.012519105808200455, 0.0054184028999354535, 0.01372773518470287, 0.0057982676649823905, 0.014308868197434476, 0.006049953867711983, 0.01458240440835664
$ pct_of_av_anniv_w_trx       <f64> 0.02528862835502511, 0.05325323830478904, 0.024845795776212595, 0.057366977699646195, 0.025090409528046198, 0.058707111268391836, 0.02573072327170688, 0.058632474394021104, 0.025600201973245067, 0.05745971742829165
$ pct_of_av_anniv_w_trx_lower <f64> 0.024019602566381235, 0.05078956041146959, 0.02362582577122162, 0.05443418723085425, 0.02388004600357687, 0.055946842712423055, 0.024241205873009836, 0.05572760267518904, 0.024227755075207042, 0.05431596122849967
$ pct_of_av_anniv_w_trx_upper <f64> 0.02655765414366899, 0.055716916198108486, 0.02606576578120357, 0.06029976816843815, 0.02630077305251553, 0.061467379824360624, 0.02722024067040392, 0.06153734611285317, 0.02697264887128309, 0.06060347362808364
$ pct_of_av_anniv_all_lower   <f64> 0.004632812409391607, 0.010297256498243708, 0.004790158298104324, 0.011790284670673416, 0.00511005224674474, 0.012979911178210493, 0.005417177353585797, 0.013498165909945252, 0.005671678443875981, 0.01368698584634903
$ pct_of_av_anniv_all_upper   <f64> 0.0052069165726720676, 0.011480050038366964, 0.005374722610617797, 0.013247926945727493, 0.0057267535531261675, 0.014475559191195248, 0.006179357976378984, 0.0151195704849237, 0.006428229291547985, 0.01547782297036425

plot() and table()

The plot() and table() methods create visualizations and summary tables from TrxStats objects. See See Visualizations for full details on these functions.

trx_res.plot(y='pct_of_av_anniv_w_trx')

<Figure Size: (640 x 480)>
# first 10 rows showed for brevity
trx_res.table()

Miscellaneous

Selecting and combining transaction types

The trx_types argument of trx_stats() selects a subset of transaction types that will appear in the output.

exposed_data.trx_stats(trx_types="Base")
Transaction study results

Groups: pol_yr
Study range: 1900-01-01 to 2019-12-31
Transaction types: Base

shape: (15, 10)
┌────────┬──────────┬────────┬──────────┬───┬───────────┬───────────┬──────────┬──────────┐
│ pol_yr ┆ trx_type ┆ trx_n  ┆ trx_flag ┆ … ┆ avg_trx   ┆ avg_all   ┆ trx_freq ┆ trx_util │
│ ---    ┆ ---      ┆ ---    ┆ ---      ┆   ┆ ---       ┆ ---       ┆ ---      ┆ ---      │
│ u32    ┆ str      ┆ f64    ┆ u32      ┆   ┆ f64       ┆ f64       ┆ f64      ┆ f64      │
╞════════╪══════════╪════════╪══════════╪═══╪═══════════╪═══════════╪══════════╪══════════╡
│ 1      ┆ Base     ┆ 7447.0 ┆ 3514     ┆ … ┆ 34.114115 ┆ 6.465509  ┆ 2.119237 ┆ 0.189526 │
│ 2      ┆ Base     ┆ 7274.0 ┆ 3422     ┆ … ┆ 34.180888 ┆ 6.892169  ┆ 2.125658 ┆ 0.201638 │
│ 3      ┆ Base     ┆ 7061.0 ┆ 3309     ┆ … ┆ 35.163796 ┆ 7.557122  ┆ 2.133877 ┆ 0.214912 │
│ 4      ┆ Base     ┆ 6596.0 ┆ 3080     ┆ … ┆ 37.333442 ┆ 8.338434  ┆ 2.141558 ┆ 0.22335  │
│ 5      ┆ Base     ┆ 6093.0 ┆ 2847     ┆ … ┆ 38.60836  ┆ 8.984633  ┆ 2.140148 ┆ 0.232712 │
│ …      ┆ …        ┆ …      ┆ …        ┆ … ┆ …         ┆ …         ┆ …        ┆ …        │
│ 11     ┆ Base     ┆ 2428.0 ┆ 1115     ┆ … ┆ 48.259193 ┆ 13.804259 ┆ 2.177578 ┆ 0.286044 │
│ 12     ┆ Base     ┆ 1320.0 ┆ 605      ┆ … ┆ 50.289256 ┆ 13.414903 ┆ 2.181818 ┆ 0.266755 │
│ 13     ┆ Base     ┆ 700.0  ┆ 319      ┆ … ┆ 55.827586 ┆ 13.90242  ┆ 2.194357 ┆ 0.249024 │
│ 14     ┆ Base     ┆ 315.0  ┆ 141      ┆ … ┆ 63.0      ┆ 15.47561  ┆ 2.234043 ┆ 0.245645 │
│ 15     ┆ Base     ┆ 19.0   ┆ 8        ┆ … ┆ 28.125    ┆ 10.714286 ┆ 2.375    ┆ 0.380952 │
└────────┴──────────┴────────┴──────────┴───┴───────────┴───────────┴──────────┴──────────┘

If the combine_trx argument is set to True, all transaction types will be combined in a group called “All” in the output.

exposed_data.trx_stats(combine_trx=True)
Transaction study results

Groups: pol_yr
Study range: 1900-01-01 to 2019-12-31
Transaction types: Base, Rider

shape: (15, 10)
┌────────┬──────────┬─────────┬──────────┬───┬───────────┬───────────┬──────────┬──────────┐
│ pol_yr ┆ trx_type ┆ trx_n   ┆ trx_flag ┆ … ┆ avg_trx   ┆ avg_all   ┆ trx_freq ┆ trx_util │
│ ---    ┆ ---      ┆ ---     ┆ ---      ┆   ┆ ---       ┆ ---       ┆ ---      ┆ ---      │
│ u32    ┆ str      ┆ f64     ┆ u32      ┆   ┆ f64       ┆ f64       ┆ f64      ┆ f64      │
╞════════╪══════════╪═════════╪══════════╪═══╪═══════════╪═══════════╪══════════╪══════════╡
│ 1      ┆ All      ┆ 15524.0 ┆ 7292     ┆ … ┆ 52.823505 ┆ 20.774985 ┆ 2.128908 ┆ 0.393291 │
│ 2      ┆ All      ┆ 15506.0 ┆ 7256     ┆ … ┆ 55.82704  ┆ 23.869012 ┆ 2.13699  ┆ 0.427553 │
│ 3      ┆ All      ┆ 15265.0 ┆ 7126     ┆ … ┆ 57.697446 ┆ 26.703384 ┆ 2.142155 ┆ 0.462817 │
│ 4      ┆ All      ┆ 14556.0 ┆ 6795     ┆ … ┆ 58.682855 ┆ 28.915881 ┆ 2.142163 ┆ 0.492748 │
│ 5      ┆ All      ┆ 13629.0 ┆ 6368     ┆ … ┆ 58.865735 ┆ 30.640592 ┆ 2.140232 ┆ 0.520517 │
│ …      ┆ …        ┆ …       ┆ …        ┆ … ┆ …         ┆ …         ┆ …        ┆ …        │
│ 11     ┆ All      ┆ 5896.0  ┆ 2712     ┆ … ┆ 71.323378 ┆ 49.622627 ┆ 2.174041 ┆ 0.695741 │
│ 12     ┆ All      ┆ 3718.0  ┆ 1681     ┆ … ┆ 71.917906 ┆ 53.304233 ┆ 2.211779 ┆ 0.741182 │
│ 13     ┆ All      ┆ 2186.0  ┆ 981      ┆ … ┆ 82.251784 ┆ 62.989071 ┆ 2.228338 ┆ 0.765808 │
│ 14     ┆ All      ┆ 1051.0  ┆ 457      ┆ … ┆ 86.501094 ┆ 68.869338 ┆ 2.299781 ┆ 0.796167 │
│ 15     ┆ All      ┆ 38.0    ┆ 17       ┆ … ┆ 68.764706 ┆ 55.666667 ┆ 2.235294 ┆ 0.809524 │
└────────┴──────────┴─────────┴──────────┴───┴───────────┴───────────┴──────────┴──────────┘

Partial exposures are removed as a default

As a default, trx_stats() removes partial exposures before summarizing results. This is done to avoid complexity associated with a lopsided skew in the timing of transactions. For example, if transactions can occur on a monthly basis or annually at the beginning of each policy year, partial exposures may not be appropriate. If a policy had an exposure of 0.5 years and was taking withdrawals annually at the beginning of the year, an argument could be made that the exposure should instead be 1 complete year. If the same policy was expected to take withdrawals 9 months into the year, it’s not clear if the exposure should be 0.5 years or 0.5 / 0.75 years. To override this treatment, set the full_exposures_only argument to False.

exposed_data.trx_stats(full_exposures_only=False)
Transaction study results

Groups: pol_yr
Study range: 1900-01-01 to 2019-12-31
Transaction types: Base, Rider

shape: (30, 10)
┌────────┬──────────┬────────┬──────────┬───┬────────────┬────────────┬──────────┬──────────┐
│ pol_yr ┆ trx_type ┆ trx_n  ┆ trx_flag ┆ … ┆ avg_trx    ┆ avg_all    ┆ trx_freq ┆ trx_util │
│ ---    ┆ ---      ┆ ---    ┆ ---      ┆   ┆ ---        ┆ ---        ┆ ---      ┆ ---      │
│ u32    ┆ str      ┆ f64    ┆ u32      ┆   ┆ f64        ┆ f64        ┆ f64      ┆ f64      │
╞════════╪══════════╪════════╪══════════╪═══╪════════════╪════════════╪══════════╪══════════╡
│ 1      ┆ Base     ┆ 8123.0 ┆ 3818     ┆ … ┆ 33.82792   ┆ 6.708581   ┆ 2.127554 ┆ 0.198315 │
│ 1      ┆ Rider    ┆ 8819.0 ┆ 4103     ┆ … ┆ 70.828175  ┆ 15.094788  ┆ 2.149403 ┆ 0.213118 │
│ 2      ┆ Base     ┆ 7924.0 ┆ 3737     ┆ … ┆ 34.321113  ┆ 7.240174   ┆ 2.120417 ┆ 0.210954 │
│ 2      ┆ Rider    ┆ 8976.0 ┆ 4186     ┆ … ┆ 74.692069  ┆ 17.649739  ┆ 2.14429  ┆ 0.2363   │
│ 3      ┆ Base     ┆ 7668.0 ┆ 3603     ┆ … ┆ 35.304468  ┆ 7.902154   ┆ 2.128226 ┆ 0.223829 │
│ …      ┆ …        ┆ …      ┆ …        ┆ … ┆ …          ┆ …          ┆ …        ┆ …        │
│ 13     ┆ Rider    ┆ 2285.0 ┆ 1026     ┆ … ┆ 91.708577  ┆ 58.085349  ┆ 2.227096 ┆ 0.633369 │
│ 14     ┆ Base     ┆ 622.0  ┆ 283      ┆ … ┆ 60.55477   ┆ 19.656158  ┆ 2.19788  ┆ 0.324601 │
│ 14     ┆ Rider    ┆ 1497.0 ┆ 663      ┆ … ┆ 102.467572 ┆ 77.922667  ┆ 2.257919 ┆ 0.760462 │
│ 15     ┆ Base     ┆ 277.0  ┆ 121      ┆ … ┆ 75.280992  ┆ 33.966619  ┆ 2.289256 ┆ 0.451198 │
│ 15     ┆ Rider    ┆ 705.0  ┆ 309      ┆ … ┆ 101.032362 ┆ 116.412765 ┆ 2.281553 ┆ 1.152232 │
└────────┴──────────┴────────┴──────────┴───┴────────────┴────────────┴──────────┴──────────┘

Summary method

As noted above, the result of trx_stats() is a TrxStats object. If the summary() function is applied to a TrxStats object, the data will be summarized again and return a higher level TrxStats object.

If no additional arguments are passed, summary() returns a single row of aggregate results for each transaction type.

trx_res.summary()
Transaction study results

Study range: 1900-01-01 to 2019-12-31
Transaction types: Base, Rider
Transactions as % of: av_anniv

shape: (2, 13)
┌──────────┬─────────┬──────────┬────────────┬───┬──────────┬──────────┬─────────────┬─────────────┐
│ trx_type ┆ trx_n   ┆ trx_flag ┆ trx_amt    ┆ … ┆ trx_freq ┆ trx_util ┆ pct_of_av_a ┆ pct_of_av_a │
│ ---      ┆ ---     ┆ ---      ┆ ---        ┆   ┆ ---      ┆ ---      ┆ nniv_all    ┆ nniv_w_trx  │
│ str      ┆ f64     ┆ u32      ┆ f64        ┆   ┆ f64      ┆ f64      ┆ ---         ┆ ---         │
│          ┆         ┆          ┆            ┆   ┆          ┆          ┆ f64         ┆ f64         │
╞══════════╪═════════╪══════════╪════════════╪═══╪══════════╪══════════╪═════════════╪═════════════╡
│ Base     ┆ 60500.0 ┆ 28224    ┆ 1.093899e6 ┆ … ┆ 2.143566 ┆ 0.227296 ┆ 0.005943    ┆ 0.025139    │
│ Rider    ┆ 77321.0 ┆ 35941    ┆ 2.842729e6 ┆ … ┆ 2.151331 ┆ 0.289443 ┆ 0.015444    ┆ 0.059616    │
└──────────┴─────────┴──────────┴────────────┴───┴──────────┴──────────┴─────────────┴─────────────┘

If additional variable names are passed to the summary() function, then the output will group the data by those variables. In our example, if pol_yr is passed to summary(), the output will contain one row per policy year for each transaction type.

trx_res.summary('pol_yr')
Transaction study results

Groups: pol_yr
Study range: 1900-01-01 to 2019-12-31
Transaction types: Base, Rider
Transactions as % of: av_anniv

shape: (30, 14)
┌────────┬──────────┬────────┬──────────┬───┬──────────┬──────────┬────────────────┬───────────────┐
│ pol_yr ┆ trx_type ┆ trx_n  ┆ trx_flag ┆ … ┆ trx_freq ┆ trx_util ┆ pct_of_av_anni ┆ pct_of_av_ann │
│ ---    ┆ ---      ┆ ---    ┆ ---      ┆   ┆ ---      ┆ ---      ┆ v_all          ┆ iv_w_trx      │
│ u32    ┆ str      ┆ f64    ┆ u32      ┆   ┆ f64      ┆ f64      ┆ ---            ┆ ---           │
│        ┆          ┆        ┆          ┆   ┆          ┆          ┆ f64            ┆ f64           │
╞════════╪══════════╪════════╪══════════╪═══╪══════════╪══════════╪════════════════╪═══════════════╡
│ 1      ┆ Base     ┆ 7447.0 ┆ 3514     ┆ … ┆ 2.119237 ┆ 0.189526 ┆ 0.00492        ┆ 0.025289      │
│ 1      ┆ Rider    ┆ 8077.0 ┆ 3778     ┆ … ┆ 2.137904 ┆ 0.203765 ┆ 0.010889       ┆ 0.053253      │
│ 2      ┆ Base     ┆ 7274.0 ┆ 3422     ┆ … ┆ 2.125658 ┆ 0.201638 ┆ 0.005082       ┆ 0.024846      │
│ 2      ┆ Rider    ┆ 8232.0 ┆ 3834     ┆ … ┆ 2.147105 ┆ 0.225915 ┆ 0.012519       ┆ 0.057367      │
│ 3      ┆ Base     ┆ 7061.0 ┆ 3309     ┆ … ┆ 2.133877 ┆ 0.214912 ┆ 0.005418       ┆ 0.02509       │
│ …      ┆ …        ┆ …      ┆ …        ┆ … ┆ …        ┆ …        ┆ …              ┆ …             │
│ 13     ┆ Rider    ┆ 1486.0 ┆ 662      ┆ … ┆ 2.244713 ┆ 0.516784 ┆ 0.025157       ┆ 0.064411      │
│ 14     ┆ Base     ┆ 315.0  ┆ 141      ┆ … ┆ 2.234043 ┆ 0.245645 ┆ 0.007851       ┆ 0.024588      │
│ 14     ┆ Rider    ┆ 736.0  ┆ 316      ┆ … ┆ 2.329114 ┆ 0.550523 ┆ 0.027088       ┆ 0.068         │
│ 15     ┆ Base     ┆ 19.0   ┆ 8        ┆ … ┆ 2.375    ┆ 0.380952 ┆ 0.00725        ┆ 0.02034       │
│ 15     ┆ Rider    ┆ 19.0   ┆ 9        ┆ … ┆ 2.111111 ┆ 0.428571 ┆ 0.030418       ┆ 0.078347      │
└────────┴──────────┴────────┴──────────┴───┴──────────┴──────────┴────────────────┴───────────────┘

Similarly, if inc_guar is passed to summary(), the output will contain a row for each transaction type and unique value in inc_guar.

trx_res.summary('inc_guar')
Transaction study results

Groups: inc_guar
Study range: 1900-01-01 to 2019-12-31
Transaction types: Base, Rider
Transactions as % of: av_anniv

shape: (4, 14)
┌──────────┬──────────┬─────────┬──────────┬───┬──────────┬──────────┬──────────────┬──────────────┐
│ inc_guar ┆ trx_type ┆ trx_n   ┆ trx_flag ┆ … ┆ trx_freq ┆ trx_util ┆ pct_of_av_an ┆ pct_of_av_an │
│ ---      ┆ ---      ┆ ---     ┆ ---      ┆   ┆ ---      ┆ ---      ┆ niv_all      ┆ niv_w_trx    │
│ bool     ┆ str      ┆ f64     ┆ u32      ┆   ┆ f64      ┆ f64      ┆ ---          ┆ ---          │
│          ┆          ┆         ┆          ┆   ┆          ┆          ┆ f64          ┆ f64          │
╞══════════╪══════════╪═════════╪══════════╪═══╪══════════╪══════════╪══════════════╪══════════════╡
│ false    ┆ Base     ┆ 52939.0 ┆ 24703    ┆ … ┆ 2.143019 ┆ 0.504782 ┆ 0.012757     ┆ 0.02511      │
│ false    ┆ Rider    ┆ 0.0     ┆ 0        ┆ … ┆ NaN      ┆ 0.0      ┆ 0.0          ┆ NaN          │
│ true     ┆ Base     ┆ 7561.0  ┆ 3521     ┆ … ┆ 2.147401 ┆ 0.0468   ┆ 0.001291     ┆ 0.025336     │
│ true     ┆ Rider    ┆ 77321.0 ┆ 35941    ┆ … ┆ 2.151331 ┆ 0.477716 ┆ 0.025987     ┆ 0.059616     │
└──────────┴──────────┴─────────┴──────────┴───┴──────────┴──────────┴──────────────┴──────────────┘

Column names

As a default, add_transactions() assumes the transaction data frame (trx_data) uses the following naming conventions:

  • The policy number column is called pol_num
  • The transaction date column is called trx_date
  • The transaction type column is called trx_type
  • The transaction amount column is called trx_amt

These default names can be overridden using the col_pol_num, col_trx_date, col_trx_type, and col_trx_amt arguments.

For example, if the transaction type column was called transaction_code in our data, we could write:

exposed_data.add_transactions(withdrawals, col_trx_type="transaction_code")

Similarly, trx_stats() assumes the input data uses the name exposure for exposures. This default can be overridden using the argument col_exposure.

Limitations

The trx_stats() function does not produce any calculations related to the persistence of transactions from exposure period to exposure period.

Footnotes

  1. See Exposures for more information on creating ExposedDF objects.↩︎