Title: | Create Actuarial Experience Studies: Prepare Data, Summarize Results, and Create Reports |
---|---|
Description: | Experience studies are used by actuaries to explore historical experience across blocks of business and to inform assumption setting activities. This package provides functions for preparing data, creating studies, visualizing results, and beginning assumption development. Experience study methods, including exposure calculations, are described in: Atkinson & McGarry (2016) "Experience Study Calculations" <https://www.soa.org/49378a/globalassets/assets/files/research/experience-study-calculations.pdf>. The limited fluctuation credibility method used by the 'exp_stats()' function is described in: Herzog (1999, ISBN:1-56698-374-6) "Introduction to Credibility Theory". |
Authors: | Matt Heaphy [aut, cre] |
Maintainer: | Matt Heaphy <[email protected]> |
License: | MIT + file LICENSE |
Version: | 1.5.0 |
Built: | 2024-11-21 03:51:26 UTC |
Source: | https://github.com/mattheaphy/actxps |
Attach predicted values from a model to a data frame with exposure-level records.
add_predictions(.data, model, ..., col_expected = NULL)
.data |
A data frame, preferably with the class |
model |
A model object that has an S3 method for |
... |
Additional arguments passed to |
col_expected |
|
This function attaches predictions from a model to a data frame that preferably has the class exposed_df. The model argument must be a model object that has an S3 method for the predict() function. This method must have new data for predictions as the second argument.
The col_expected argument is optional. If NULL, names from the result of predict() will be used. If there are no names, a default name of "expected" is assumed. In the event that predict() returns multiple values, the default name will be suffixed by "_x", where x = 1 to the number of values returned. If a value is passed, it must be a character vector of the same length as the result of predict().
A data frame or exposed_df object with one or more new columns containing predictions.
expo <- expose_py(census_dat, "2019-12-31") |>
  mutate(surrender = status == "Surrender")
mod <- glm(surrender ~ inc_guar + pol_yr, expo, family = 'binomial')
add_predictions(expo, mod, type = 'response')
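The column naming rule described for col_expected can be sketched in base R. The helper name_predictions() below is hypothetical and is not part of actxps; it also assumes that the "names" of a predict() result are the column names of a matrix or data frame, which the text above does not spell out.

```r
# Hypothetical helper (not an actxps function): resolve output column names
# for predictions following the documented col_expected rule.
name_predictions <- function(preds, col_expected = NULL) {
  preds <- as.matrix(preds)  # a plain vector becomes a one-column matrix
  n <- ncol(preds)
  if (!is.null(col_expected)) {
    # a user-supplied value must match the number of predicted values
    stopifnot(is.character(col_expected), length(col_expected) == n)
    return(col_expected)
  }
  # assumption: "names" means column names of the prediction result
  if (!is.null(colnames(preds))) return(colnames(preds))
  # no names available: fall back to "expected", suffixed when n > 1
  if (n == 1) "expected" else paste0("expected_", seq_len(n))
}

single <- name_predictions(c(0.10, 0.25, 0.05))
multi <- name_predictions(matrix(runif(6), ncol = 2))
named <- name_predictions(cbind(low = 1:3, high = 4:6))
custom <- name_predictions(c(0.10, 0.25), col_expected = "q_model")
```

Under these assumptions, an unnamed single prediction receives the name "expected", an unnamed two-column result receives "expected_1" and "expected_2", and named columns or a user-supplied col_expected are passed through unchanged.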
Attach summarized transactions to a data frame with exposure-level records.
add_transactions( .data, trx_data, col_pol_num = "pol_num", col_trx_date = "trx_date", col_trx_type = "trx_type", col_trx_amt = "trx_amt" )
.data |
A data frame with exposure-level records with the class
|
trx_data |
A data frame containing transaction details. This data frame must have columns for policy numbers, transaction dates, transaction types, and transaction amounts. |
col_pol_num |
Name of the column in |
col_trx_date |
Name of the column in |
col_trx_type |
Name of the column in |
col_trx_amt |
Name of the column in |
This function attaches transactions to an exposed_df object. Transactions are grouped and summarized such that the number of rows in the exposed_df object does not change. Two columns are added to the output for each transaction type. These columns have names of the pattern trx_n_{*} (transaction counts) and trx_amt_{*} (transaction amounts).
Transactions are associated with the exposed_df object by matching transaction dates with exposure date ranges found in exposed_df.
All columns containing dates must be in YYYY-MM-DD format.
An exposed_df object with two new columns containing transaction counts and amounts for each transaction type found in trx_data. The exposed_df's trx_types attribute will be updated to include the new transaction types found in trx_data.
expo <- expose_py(census_dat, "2019-12-31", target_status = "Surrender")
add_transactions(expo, withdrawals)
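The date-range matching described above can be illustrated with a small base R sketch. This is not the actxps implementation: the toy data, the period_start and period_end column names, and the single "Base" transaction type are all assumptions made for illustration.

```r
# Toy exposure records: one row per policy per period (hypothetical columns)
expo <- data.frame(
  pol_num = c(1, 1, 2),
  period_start = as.Date(c("2020-01-01", "2021-01-01", "2020-06-01")),
  period_end = as.Date(c("2020-12-31", "2021-12-31", "2021-05-31"))
)

# Toy transactions with the four columns the text requires
trx <- data.frame(
  pol_num = c(1, 1, 2),
  trx_date = as.Date(c("2020-03-15", "2020-09-01", "2021-02-10")),
  trx_type = "Base",
  trx_amt = c(100, 50, 25)
)

# Flag transactions on the same policy that fall inside exposure row i
in_range <- function(i) {
  trx$pol_num == expo$pol_num[i] &
    trx$trx_date >= expo$period_start[i] &
    trx$trx_date <= expo$period_end[i]
}

# Summarize to one count and one amount per exposure row, so the number of
# exposure rows does not change
rows <- seq_len(nrow(expo))
expo$trx_n_Base <- vapply(rows, function(i) sum(in_range(i)), integer(1))
expo$trx_amt_Base <- vapply(rows, function(i) sum(trx$trx_amt[in_range(i)]),
                            numeric(1))
```

Exposure rows with no matching transactions receive zero counts and amounts, and the row count of expo stays at three.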
A pre-aggregated version of surrender and withdrawal experience from the simulated data sets census_dat, withdrawals, and account_vals. This data is theoretical only and does not represent the experience on any specific product.
agg_sim_dat
A data frame containing summarized experience study results grouped by policy year, income guarantee presence, tax-qualified status, and product.
An object of class tbl_df (inherits from tbl, data.frame) with 180 rows and 16 columns.
Policy year
Indicates whether the policy was issued with an income guarantee
Indicates whether the policy was purchased with tax-qualified funds
Product: a, b, or c
Sum of policy year exposures by count
Sum of claim counts
Sum of account value
Sum of policy year exposures weighted by account value
Sum of claims weighted by account value
Sum of squared account values
Number of exposure records
Sum of partial withdrawal transactions
Count of partial withdrawal transactions
Count of exposure records with partial withdrawal transactions
Sum of squared partial withdrawal transactions
Sum of account value for exposure records with partial withdrawal transactions
Convert aggregate termination experience studies to the exp_df class.
as_exp_df(
  x,
  expected = NULL,
  wt = NULL,
  col_claims,
  col_exposure,
  col_n_claims,
  col_weight_sq,
  col_weight_n,
  target_status = NULL,
  start_date = as.Date("1900-01-01"),
  end_date = NULL,
  credibility = FALSE,
  conf_level = 0.95,
  cred_r = 0.05,
  conf_int = FALSE
)
is_exp_df(x)
x |
An object. For |
expected |
A character vector containing column names in x with expected values |
wt |
Optional. Length 1 character vector. Name of the column in |
col_claims |
Optional. Name of the column in |
col_exposure |
Optional. Name of the column in |
col_n_claims |
Optional and only used when |
col_weight_sq |
Optional and only used when |
col_weight_n |
Optional and only used when |
target_status |
Character vector of target status values. Default value
= |
start_date |
Experience study start date. Default value = 1900-01-01. |
end_date |
Experience study end date |
credibility |
If |
conf_level |
Confidence level used for the Limited Fluctuation credibility method and confidence intervals |
cred_r |
Error tolerance under the Limited Fluctuation credibility method |
conf_int |
If |
is_exp_df() will return TRUE if x is an exp_df object.
as_exp_df() will coerce a data frame to an exp_df object if that data frame has columns for exposures and claims.
as_exp_df() is most useful for working with aggregate summaries of experience that were not created by actxps where individual policy information is not available. After converting the data to the exp_df class, summary() can be used to summarize data by any grouping variables, and autoplot() and autotable() are available for reporting.
If nothing is passed to wt, the data frame x must include columns containing:
Exposures (exposure)
Claim counts (claims)
If wt is passed, the data must include columns containing:
Weighted exposures (exposure)
Weighted claims (claims)
Claim counts (n_claims)
The raw sum of weights NOT multiplied by exposures
Exposure record counts (.weight_n)
The raw sum of squared weights (.weight_sq)
The names in parentheses above are expected column names. If the data frame passed to as_exp_df() uses different column names, these can be specified using the col_* arguments.
When a column name is passed to wt, the columns .weight, .weight_n, and .weight_sq are used to calculate credibility and confidence intervals. If credibility and confidence intervals aren't required, then it is not necessary to pass anything to wt. The results of as_exp_df() and any downstream summaries will still be weighted as long as the exposures and claims are pre-weighted.
target_status, start_date, and end_date are optional arguments that are only used for printing the resulting exp_df object.
For is_exp_df(), a length-1 logical vector. For as_exp_df(), an exp_df object.
See exp_stats() for information on how exp_df objects are typically created from individual exposure records.
# convert pre-aggregated experience into an exp_df object
dat <- as_exp_df(agg_sim_dat, col_exposure = "exposure_n",
                 col_claims = "claims_n",
                 target_status = "Surrender",
                 start_date = 2005, end_date = 2019,
                 conf_int = TRUE)
dat
is_exp_df(dat)
# summary by policy year
summary(dat, pol_yr)
# repeat the prior exercise on a weighted basis
dat_wt <- as_exp_df(agg_sim_dat, wt = "av",
                    col_exposure = "exposure_amt",
                    col_claims = "claims_amt",
                    col_n_claims = "claims_n",
                    col_weight_sq = "av_sq",
                    col_weight_n = "n",
                    target_status = "Surrender",
                    start_date = 2005, end_date = 2019,
                    conf_int = TRUE)
dat_wt
# summary by policy year
summary(dat_wt, pol_yr)
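Re-summarizing aggregate experience by a grouping variable amounts to summing claims and exposures within each group and then taking their ratio. The base R sketch below illustrates that idea only; the toy data frame is invented, and the code is not how summary() is implemented in actxps.

```r
# Toy aggregate experience: claims and exposures by policy year and product
agg <- data.frame(
  pol_yr = c(1, 1, 2, 2),
  product = c("a", "b", "a", "b"),
  claims = c(2, 3, 5, 10),
  exposure = c(100, 150, 100, 200)
)

# Sum claims and exposures within each policy year, dropping product
by_yr <- aggregate(cbind(claims, exposure) ~ pol_yr, data = agg, FUN = sum)

# Observed termination rate = total claims / total exposure
by_yr$q_obs <- by_yr$claims / by_yr$exposure
by_yr
```

Summing first and dividing second matters: averaging the four row-level rates would weight each cell equally regardless of its exposure.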
Convert aggregate transaction experience studies to the trx_df class.
as_trx_df(
  x,
  col_trx_amt = "trx_amt",
  col_trx_n = "trx_n",
  col_trx_flag = "trx_flag",
  col_exposure = "exposure",
  col_percent_of = NULL,
  col_percent_of_w_trx = NULL,
  col_trx_amt_sq = "trx_amt_sq",
  start_date = as.Date("1900-01-01"),
  end_date = NULL,
  conf_int = FALSE,
  conf_level = 0.95
)
is_trx_df(x)
x |
An object. For |
col_trx_amt |
Optional. Name of the column in |
col_trx_n |
Optional. Name of the column in |
col_trx_flag |
Optional. Name of the column in |
col_exposure |
Optional. Name of the column in |
col_percent_of |
Optional. Name of the column in |
col_percent_of_w_trx |
Optional. Name of the column in |
col_trx_amt_sq |
Optional and only required when |
start_date |
Experience study start date. Default value = 1900-01-01. |
end_date |
Experience study end date |
conf_int |
If |
conf_level |
Confidence level for confidence intervals |
is_trx_df() will return TRUE if x is a trx_df object.
as_trx_df() will coerce a data frame to a trx_df object if that data frame has the required columns for transaction studies listed below.
as_trx_df() is most useful for working with aggregate summaries of experience that were not created by actxps where individual policy information is not available. After converting the data to the trx_df class, summary() can be used to summarize data by any grouping variables, and autoplot() and autotable() are available for reporting.
At a minimum, the following columns are required:
Transaction amounts (trx_amt)
Transaction counts (trx_n)
The number of exposure records with transactions (trx_flag). This number is not necessarily equal to transaction counts. If multiple transactions are allowed per exposure period, trx_flag can be less than trx_n.
Exposures (exposure)
If transaction amounts should be expressed as a percentage of another variable (e.g., to calculate utilization rates or actual-to-expected ratios), additional columns are required:
A denominator "percent of" column. For example, the sum of account values.
A denominator "percent of" column for exposure records with transactions. For example, the sum of account values across all records with non-zero transaction amounts.
If confidence intervals are desired and "percent of" columns are passed, an additional column for the sum of squared transaction amounts (trx_amt_sq) is also required.
The names in parentheses above are expected column names. If the data frame passed to as_trx_df() uses different column names, these can be specified using the col_* arguments.
start_date and end_date are optional arguments that are only used for printing the resulting trx_df object.
Unlike trx_stats(), as_trx_df() only permits a single transaction type and a single percent_of column.
For is_trx_df(), a length-1 logical vector. For as_trx_df(), a trx_df object.
See trx_stats() for information on how trx_df objects are typically created from individual exposure records.
# convert pre-aggregated experience into a trx_df object
dat <- as_trx_df(agg_sim_dat,
                 col_exposure = "n",
                 col_trx_amt = "wd",
                 col_trx_n = "wd_n",
                 col_trx_flag = "wd_flag",
                 col_percent_of = "av",
                 col_percent_of_w_trx = "av_w_wd",
                 col_trx_amt_sq = "wd_sq",
                 start_date = 2005, end_date = 2019,
                 conf_int = TRUE)
dat
is_trx_df(dat)
# summary by policy year
summary(dat, pol_yr)
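The relationship between the required columns and the two "percent of" denominators can be sketched with a toy aggregate table. The data values, the av and av_w_trx denominator names, and the two output column names are assumptions for illustration; the ratios shown are one natural reading of "percent of" and are not confirmed here as the package's exact definitions.

```r
# Toy aggregate transaction experience (all values invented)
agg <- data.frame(
  trx_amt = c(500, 1200),       # sum of transaction amounts
  trx_n = c(8, 15),             # count of transactions
  trx_flag = c(5, 10),          # exposure records with at least one transaction
  exposure = c(50, 80),         # exposures
  av = c(100000, 160000),       # "percent of" denominator, all records
  av_w_trx = c(12000, 30000)    # "percent of" denominator, records with trx
)

# trx_flag counts records, trx_n counts transactions, so trx_flag <= trx_n
flag_ok <- all(agg$trx_flag <= agg$trx_n)

# Transactions as a percentage of each denominator (hypothetical names)
agg$pct_of_av_all <- agg$trx_amt / agg$av
agg$pct_of_av_w_trx <- agg$trx_amt / agg$av_w_trx
```

Because the second denominator only covers records with transactions, it is never larger than the first, so the second ratio is at least as large as the first.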
Plot experience study results
## S3 method for class 'exp_df'
autoplot(
  object,
  ...,
  x = NULL,
  y = NULL,
  color = NULL,
  mapping,
  second_axis = FALSE,
  second_y = NULL,
  scales = "fixed",
  geoms = c("lines", "bars", "points"),
  y_labels = scales::label_percent(accuracy = 0.1),
  second_y_labels = scales::label_comma(accuracy = 1),
  y_log10 = FALSE,
  conf_int_bars = FALSE
)
## S3 method for class 'trx_df'
autoplot(
  object,
  ...,
  x = NULL,
  y = NULL,
  color = NULL,
  mapping,
  second_axis = FALSE,
  second_y = NULL,
  scales = "fixed",
  geoms = c("lines", "bars", "points"),
  y_labels = scales::label_percent(accuracy = 0.1),
  second_y_labels = scales::label_comma(accuracy = 1),
  y_log10 = FALSE,
  conf_int_bars = FALSE
)
object |
An object of class |
... |
Faceting variables passed to |
x |
An unquoted column name in |
y |
An unquoted column name in |
color |
An unquoted column name in |
mapping |
Aesthetic mapping passed to |
second_axis |
Logical. If |
second_y |
An unquoted column name in |
scales |
The |
geoms |
Type of geometry. If "lines" is passed, the plot will display lines and points. If "bars", the plot will display bars. If "points", the plot will display points only. |
y_labels |
Label function passed to |
second_y_labels |
Same as |
y_log10 |
If |
conf_int_bars |
If |
If no aesthetic map is supplied, the plot will use the first grouping variable in object on the x axis and q_obs on the y axis. In addition, the second grouping variable in object will be used for color and fill.
If no faceting variables are supplied, the plot will use grouping variables 3 and up as facets. These variables are passed into ggplot2::facet_wrap(). Specific to trx_df objects, transaction type (trx_type) will also be added as a faceting variable.
A ggplot object
plot_termination_rates(), plot_actual_to_expected()
study_py <- expose_py(census_dat, "2019-12-31", target_status = "Surrender")
study_py <- study_py |> add_transactions(withdrawals)
exp_res <- study_py |> group_by(pol_yr) |> exp_stats()
autoplot(exp_res)
trx_res <- study_py |> group_by(pol_yr) |> trx_stats()
autoplot(trx_res)
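The default assignment of grouping variables to plot roles described above can be expressed as a small base R function. default_plot_roles() is a hypothetical helper, not part of actxps, and returning NA when a role has no grouping variable is an assumption; the documentation does not say what happens in that case.

```r
# Hypothetical helper: map grouping variables to default plot roles
default_plot_roles <- function(groups) {
  list(
    # first grouping variable goes on the x axis
    x = if (length(groups) >= 1) groups[1] else NA_character_,
    # second grouping variable drives color and fill
    color = if (length(groups) >= 2) groups[2] else NA_character_,
    # grouping variables 3 and up become facets
    facets = if (length(groups) >= 3) groups[-(1:2)] else character(0)
  )
}

roles <- default_plot_roles(c("pol_yr", "inc_guar", "product", "gender"))
one_group <- default_plot_roles("pol_yr")
```

With four grouping variables, the first two take the x and color roles and the remaining two are left for faceting; with a single grouping variable, only x is filled.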
autotable() is a generic function used to create a table from an object of a particular class. Tables are constructed using the gt package.
autotable.exp_df() is used to convert experience study results to a presentation-friendly format.
autotable.trx_df() is used to convert transaction study results to a presentation-friendly format.
autotable(object, ...)
## S3 method for class 'exp_df'
autotable(
  object,
  fontsize = 100,
  decimals = 1,
  colorful = TRUE,
  color_q_obs = "RColorBrewer::GnBu",
  color_ae_ = "RColorBrewer::RdBu",
  rename_cols = rlang::list2(...),
  show_conf_int = FALSE,
  show_cred_adj = FALSE,
  decimals_amt = 0,
  suffix_amt = FALSE,
  ...
)
## S3 method for class 'trx_df'
autotable(
  object,
  fontsize = 100,
  decimals = 1,
  colorful = TRUE,
  color_util = "RColorBrewer::GnBu",
  color_pct_of = "RColorBrewer::RdBu",
  rename_cols = rlang::list2(...),
  show_conf_int = FALSE,
  decimals_amt = 0,
  suffix_amt = FALSE,
  ...
)
object |
An object of class |
... |
Additional arguments passed to |
fontsize |
Font size percentage multiplier. |
decimals |
Number of decimals to display for percentages |
colorful |
If |
color_q_obs |
Color palette used for the observed termination rate. |
color_ae_ |
Color palette used for actual-to-expected rates. |
rename_cols |
An optional list consisting of key-value pairs. This
can be used to relabel columns on the output table. This parameter is most
useful for renaming grouping variables that will appear under their original
variable names if left unchanged. See |
show_conf_int |
If |
show_cred_adj |
If |
decimals_amt |
Number of decimals to display for amount columns (number of claims, claim amounts, exposures, transaction counts, total transactions, and average transactions) |
suffix_amt |
This argument has the same meaning as the |
color_util |
Color palette used for utilization rates. |
color_pct_of |
Color palette used for "percentage of" columns. |
The color_q_obs, color_ae_, color_util, and color_pct_of arguments must be strings referencing a discrete color palette available in the paletteer package. Palettes must be in the form "package::palette". For a full list of available palettes, see paletteer::palettes_d_names.
A gt object
if (interactive()) {
  study_py <- expose_py(census_dat, "2019-12-31", target_status = "Surrender")
  expected_table <- c(seq(0.005, 0.03, length.out = 10), 0.2, 0.15, rep(0.05, 3))
  study_py <- study_py |>
    mutate(expected_1 = expected_table[pol_yr],
           expected_2 = ifelse(inc_guar, 0.015, 0.03)) |>
    add_transactions(withdrawals) |>
    left_join(account_vals, by = c("pol_num", "pol_date_yr"))
  exp_res <- study_py |> group_by(pol_yr) |>
    exp_stats(expected = c("expected_1", "expected_2"),
              credibility = TRUE, conf_int = TRUE)
  autotable(exp_res)
  trx_res <- study_py |> group_by(pol_yr) |>
    trx_stats(percent_of = "av_anniv", conf_int = TRUE)
  autotable(trx_res)
}
Launch a Shiny application to interactively explore drivers of experience.
dat must be an exposed_df object. An error will be thrown if any other object type is passed. If dat has transactions attached, the app will contain features for both termination and transaction studies. Otherwise, the app will only support termination studies.
If nothing is passed to predictors, all column names in dat will be used (excluding the policy number, status, termination date, exposure, transaction counts, and transaction amounts columns).
The expected argument is optional. As a default, any column names containing the word "expected" are used.
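The default for expected is the expression names(dat)[grepl("expected", names(dat))] from the function signature. A quick base R illustration on a toy data frame (the column names and values here are invented) shows which columns that expression picks up:

```r
# Toy data frame with two columns whose names contain "expected"
dat <- data.frame(
  pol_num = 1:2,
  exposure = c(1, 0.5),
  expected_1 = c(0.01, 0.02),
  expected_2 = c(0.03, 0.03)
)

# Same expression as the default value of the expected argument
default_expected <- names(dat)[grepl("expected", names(dat))]
default_expected
```

Because the match is a plain substring search, any column with "expected" anywhere in its name is selected, so unrelated columns with such names should be excluded by passing expected explicitly.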
exp_shiny( dat, predictors = names(dat), expected = names(dat)[grepl("expected", names(dat))], distinct_max = 25L, title, credibility = TRUE, conf_level = 0.95, cred_r = 0.05, theme = "shiny", col_exposure = "exposure" )
dat |
An |
predictors |
A character vector of independent variables in |
expected |
A character vector of expected values in |
distinct_max |
Maximum number of distinct values allowed for
|
title |
Optional. Title of the Shiny app. If no title is provided,
a descriptive title will be generated based on attributes of |
credibility |
If |
conf_level |
Confidence level used for the Limited Fluctuation credibility method and confidence intervals |
cred_r |
Error tolerance under the Limited Fluctuation credibility method |
theme |
The name of a theme passed to the |
col_exposure |
Name of the column in |
No return value. This function is called for the side effect of launching a Shiny application.
The sidebar contains filtering widgets organized by data type for all variables passed to the predictors argument.
At the top of the sidebar, information is shown on the percentage of records remaining after applying filters. A description of all active filters is also provided.
The top of the sidebar also includes a "play / pause" switch that can pause reactivity of the application. Pausing is a good option when multiple changes are made in quick succession, especially when the underlying data set is large.
This box includes widgets to select grouping variables for summarizing experience. The "x" widget determines the x variable in the plot output. Similarly, the "Color" and "Facets" widgets are used for color and facets. Multiple faceting variable selections are allowed. For the table output, "x", "Color", and "Facets" have no particular meaning beyond the order in which grouping variables are displayed.
This box includes a toggle to switch between termination studies and transaction studies (if available). Different options are available for each study type.
The expected values checkboxes are used to activate and deactivate expected values passed to the expected argument. This impacts the table output directly and the available "y" variables for the plot. If there are no expected values available, this widget will not appear. The "Weight by" widget is used to specify which column, if any, contains weights for summarizing experience.
The transaction types checkboxes are used to activate and deactivate transaction types that appear in the plot and table outputs. The available transaction types are taken from the trx_types attribute of dat. In the plot output, transaction type will always appear as a faceting variable. The "Transactions as % of" selector will expand the list of available "y" variables for the plot and impact the table output directly. Lastly, a toggle exists that allows for all transaction types to be aggregated into a single group.
This tab includes a plot and various options for customization:
y: y variable
Geometry: plotting geometry
Second y-axis: activate to enable a second y-axis
Second axis y: y variable to plot on the second axis
Add Smoothing: activate to plot loess curves
Confidence intervals: If available, add error bars for confidence intervals around the selected y variable
Free y Scales: activate to enable separate y scales in each plot
Log y-axis: activate to plot all y-axes on a log-10 scale
The gear icon above the plot contains a pop-up menu that can be used to change the size of the plot for exporting.
This tab includes a data table.
The gear icon above the table contains a pop-up menu that can be used to change the appearance of the table:
The "Confidence intervals" and "Credibility-weighted termination rates" switches add these outputs to the table. These values are hidden as a default to prevent over-crowding.
The "Include color scales" switch disables or re-enables conditional color formatting.
The "Decimals" slider controls the number of decimals displayed for percentage fields.
The "Font size multiple" slider impacts the table's font size.
This pop-up menu contains options for saving summarized experience data, the plot, or the table. Data is saved as a CSV file. The plot and table are saved as PNG files.
if (interactive()) {
  study_py <- expose_py(census_dat, "2019-12-31", target_status = "Surrender")
  expected_table <- c(seq(0.005, 0.03, length.out = 10), 0.2, 0.15, rep(0.05, 3))
  study_py <- study_py |>
    mutate(expected_1 = expected_table[pol_yr],
           expected_2 = ifelse(inc_guar, 0.015, 0.03)) |>
    add_transactions(withdrawals) |>
    left_join(account_vals, by = c("pol_num", "pol_date_yr"))
  exp_shiny(study_py)
}
Create a summary data frame of termination experience for a given target status.
exp_stats(
  .data,
  target_status = attr(.data, "target_status"),
  expected,
  col_exposure = "exposure",
  col_status = "status",
  wt = NULL,
  credibility = FALSE,
  conf_level = 0.95,
  cred_r = 0.05,
  conf_int = FALSE
)
## S3 method for class 'exp_df'
summary(object, ...)
.data |
A data frame with exposure-level records, ideally of type
|
target_status |
A character vector of target status values |
expected |
A character vector containing column names in |
col_exposure |
Name of the column in |
col_status |
Name of the column in |
wt |
Optional. Length 1 character vector. Name of the column in
|
credibility |
If |
conf_level |
Confidence level used for the Limited Fluctuation credibility method and confidence intervals |
cred_r |
Error tolerance under the Limited Fluctuation credibility method |
conf_int |
If |
object |
An |
... |
Groups to retain after |
If .data is grouped, the resulting data frame will contain one row per group.
If target_status isn't provided, exp_stats() will use the same target status from .data if it has the class exposed_df. Otherwise, all status values except the first level will be assumed. This will produce a warning message.
A tibble with class exp_df, tbl_df, tbl, and data.frame. The results include columns for any grouping variables, claims, exposures, and observed termination rates (q_obs).
If any values are passed to expected, expected termination rates and actual-to-expected ratios are also included.
If credibility is set to TRUE, additional columns are added for partial credibility and credibility-weighted termination rates (assuming values are passed to expected). Credibility-weighted termination rates are prefixed by adj_.
If conf_int is set to TRUE, additional columns are added for lower and upper confidence interval limits around the observed termination rates and any actual-to-expected ratios. Additionally, if credibility is TRUE and expected values are passed to expected, the output will contain confidence intervals around credibility-weighted termination rates. Confidence interval columns include the name of the original output column suffixed by either _lower or _upper.
If a value is passed to wt, additional columns are created containing the sum of weights (.weight), the sum of squared weights (.weight_sq), and the number of records (.weight_n).
The expected argument is optional. If provided, this argument must be a character vector with values corresponding to columns in .data containing expected experience. More than one expected basis can be provided.
If credibility is set to TRUE, the output will contain a credibility column equal to the partial credibility estimate under the Limited Fluctuation credibility method (also known as Classical Credibility) assuming a binomial distribution of claims.
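As a rough illustration of the Limited Fluctuation method with binomial claims, the textbook square-root rule can be written in a few lines of base R. lf_credibility() is a hypothetical helper, not an actxps function, and this sketch is not confirmed to match the package's internal calculation; conf_level and cred_r mirror the argument names used above.

```r
# Hypothetical helper: partial credibility under the square-root rule
lf_credibility <- function(n_claims, q, conf_level = 0.95, cred_r = 0.05) {
  # two-sided standard normal quantile for the chosen confidence level
  z <- qnorm((1 + conf_level) / 2)
  # expected claims needed for full credibility when claims are binomial:
  # P(|N - nq| <= r * nq) >= conf_level  implies  nq >= (z / r)^2 * (1 - q)
  full_standard <- (z / cred_r)^2 * (1 - q)
  # partial credibility is capped at 1
  pmin(1, sqrt(n_claims / full_standard))
}

z_partial <- lf_credibility(n_claims = 365, q = 0.05)
z_full <- lf_credibility(n_claims = 1e6, q = 0.05)
```

With the default settings and q = 0.05, roughly 1,460 claims are needed for full credibility, so 365 claims earn about 50% credibility.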
If conf_int is set to TRUE, the output will contain lower and upper confidence interval limits for the observed termination rate and any actual-to-expected ratios. The confidence level is dictated by conf_level. If no weighting variable is passed to wt, confidence intervals will be constructed assuming a binomial distribution of claims. Otherwise, confidence intervals will be calculated assuming that the aggregate claims distribution is normal with a mean equal to observed claims and a variance equal to:
Var(S) = E(N) * Var(X) + E(X)^2 * Var(N),
where S is the aggregate claim random variable, X is the weighting variable assumed to follow a normal distribution, and N is a binomial random variable for the number of claims.
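The variance formula above can be evaluated from the sum of weights, the sum of squared weights, and the record count. The sketch below is one possible reading, not the package's code: agg_claims_ci() is a hypothetical helper, and estimating E(X) and Var(X) from the raw sums in this way is an assumption.

```r
# Hypothetical helper: normal-approximation interval for aggregate claims
agg_claims_ci <- function(claims, q, n, weight_sum, weight_sq_sum,
                          conf_level = 0.95) {
  ex <- weight_sum / n                 # E(X): mean weight (assumed estimator)
  var_x <- weight_sq_sum / n - ex^2    # Var(X): population variance of weights
  en <- n * q                          # E(N) for a binomial claim count
  var_n <- n * q * (1 - q)             # Var(N) for a binomial claim count
  var_s <- en * var_x + ex^2 * var_n   # Var(S) = E(N)Var(X) + E(X)^2 Var(N)
  z <- qnorm((1 + conf_level) / 2)
  list(
    var_s = var_s,
    lower = claims - z * sqrt(var_s),  # interval centered on observed claims
    upper = claims + z * sqrt(var_s)
  )
}

# 1,000 records that each carry a weight of 100, with a 5% claim probability
res <- agg_claims_ci(claims = 5000, q = 0.05, n = 1000,
                     weight_sum = 1e5, weight_sq_sum = 1e7)
```

When every record has the same weight, Var(X) is zero and the formula collapses to the binomial variance scaled by the squared weight. Dividing the interval limits by the weighted exposure would put them on a termination-rate basis.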
If credibility is TRUE and expected values are passed to expected, the output will also contain confidence intervals for any credibility-weighted termination rates.
summary() Method
Applying summary() to an exp_df object will re-summarize the data while retaining any grouping variables passed to the "dots" (...).
Herzog, Thomas (1999). Introduction to Credibility Theory
toy_census |> expose("2022-12-31", target_status = "Surrender") |>
  exp_stats()
exp_res <- census_dat |>
  expose("2019-12-31", target_status = "Surrender") |>
  group_by(pol_yr, inc_guar) |>
  exp_stats()
exp_res
summary(exp_res)
summary(exp_res, inc_guar)
Convert a data frame of census-level records to exposure-level records.
expose(
  .data,
  end_date,
  start_date = as.Date("1900-01-01"),
  target_status = NULL,
  cal_expo = FALSE,
  expo_length = c("year", "quarter", "month", "week"),
  col_pol_num = "pol_num",
  col_status = "status",
  col_issue_date = "issue_date",
  col_term_date = "term_date",
  default_status
)
expose_py(...)
expose_pq(...)
expose_pm(...)
expose_pw(...)
expose_cy(...)
expose_cq(...)
expose_cm(...)
expose_cw(...)
.data |
A data frame with census-level records |
end_date |
Experience study end date |
start_date |
Experience study start date. Default value = 1900-01-01. |
target_status |
Character vector of target status values. Default value = NULL. |
cal_expo |
Set to TRUE for calendar year exposures. Otherwise policy year exposures are assumed. |
expo_length |
Exposure period length |
col_pol_num |
Name of the column in |
col_status |
Name of the column in |
col_issue_date |
Name of the column in |
col_term_date |
Name of the column in |
default_status |
Optional scalar character representing the default active status code. If not provided, the most common status is assumed. |
... |
Arguments passed to |
Census-level data refers to a data set wherein there is one row per unique policy. Exposure-level data expands census-level data such that there is one record per policy per observation period. Observation periods could be any meaningful period of time such as a policy year, policy month, calendar year, calendar quarter, calendar month, etc.
target_status is used in the calculation of exposures. The annual
exposure method is applied, which allocates a full period of exposure for
any statuses in target_status. For all other statuses, new entrants
and exits are partially exposed based on the time elapsed in the observation
period. This method is consistent with the Balducci Hypothesis, which assumes
that the probability of termination is proportionate to the time elapsed
in the observation period. If the annual exposure method isn't desired,
target_status can be ignored. In this case, partial exposures are
always applied regardless of status.
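The difference between full and partial exposures can be illustrated with a small base R sketch. The helper `period_exposure()` is hypothetical and uses a simple inclusive day count; the package's own day-count conventions may differ.

```r
# Hypothetical helper: exposure for one policy in one observation period.
# `last_day` is the last day the policy was in force (or the study end date).
period_exposure <- function(period_start, period_end, last_day, status,
                            target_status = NULL) {
  # Both period bounds are inclusive, so add one day to each difference.
  days_in_period <- as.numeric(period_end - period_start) + 1
  days_exposed <- as.numeric(min(last_day, period_end) - period_start) + 1
  # Annual exposure method: target statuses receive a full period.
  if (status %in% target_status) 1 else days_exposed / days_in_period
}

start <- as.Date("2021-01-01")
end <- as.Date("2021-12-31")
exit <- as.Date("2021-07-01")

period_exposure(start, end, exit, "Surrender", target_status = "Surrender")
period_exposure(start, end, exit, "Death", target_status = "Surrender")
period_exposure(start, end, exit, "Surrender")  # no target status: partial
```

The first call returns a full exposure because the exit status is a target status; the other two return the fraction of the year elapsed through the exit date.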
default_status is used to indicate the default active status that
should be used when exposure records are created.
A tibble with class exposed_df, tbl_df, tbl, and data.frame. The results
include all existing columns in .data plus new columns for exposures and
observation periods. Observation periods include counters for policy
exposures, start dates, and end dates.
Both start dates and end dates are inclusive bounds.
For policy year exposures, two observation period columns are returned.
Columns beginning with pol_ are integer policy periods. Columns
beginning with pol_date_ are calendar dates representing
anniversary dates, monthiversary dates, etc.
The functions expose_py(), expose_pq(), expose_pm(), expose_pw(),
expose_cy(), expose_cq(), expose_cm(), and expose_cw() are convenience
functions for specific implementations of expose(). The two characters
after the underscore describe the exposure type and exposure period,
respectively.
For exposure types:
p refers to policy years
c refers to calendar years
For exposure periods:
y = years
q = quarters
m = months
w = weeks
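The naming rule above can be expressed as a small lookup. The helper below, `decode_expose_suffix()`, is hypothetical and not part of the package; it only shows which `cal_expo` and `expo_length` values each two-character suffix implies.

```r
# Hypothetical helper: translate a convenience function name into the
# expose() arguments implied by its two-character suffix.
decode_expose_suffix <- function(fn_name) {
  suffix <- sub("^expose_", "", fn_name)
  type <- substr(suffix, 1, 1)    # "p" (policy) or "c" (calendar)
  period <- substr(suffix, 2, 2)  # "y", "q", "m", or "w"
  lengths <- c(y = "year", q = "quarter", m = "month", w = "week")
  list(cal_expo = type == "c", expo_length = lengths[[period]])
}

decode_expose_suffix("expose_cq")
decode_expose_suffix("expose_pm")
```

Under this reading, expose_cq(...) corresponds to expose(..., cal_expo = TRUE, expo_length = "quarter").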
All columns containing dates must be in YYYY-MM-DD format.
Atkinson and McGarry (2016). Experience Study Calculations. https://www.soa.org/49378a/globalassets/assets/files/research/experience-study-calculations.pdf
expose_split() for information on splitting calendar year
exposures by policy year.
toy_census |> expose("2020-12-31")

census_dat |> expose_py("2019-12-31", target_status = "Surrender")
Split calendar period exposures that cross a policy anniversary into a pre-anniversary record and a post-anniversary record.
After splitting the data, the resulting data frame will contain both calendar
exposures and policy year exposures. These columns will be named
exposure_cal and exposure_pol, respectively. Calendar exposures will be
in the original units passed to expose_split(). Policy exposures will
always be expressed in years.
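As a rough illustration of the split, the base R sketch below divides one full calendar-year record at the policy anniversary. The helper `split_calendar_year()` is hypothetical, covers only the calendar exposure column, and assumes the issue date is not February 29 and the anniversary is not January 1; it is not the package's implementation.

```r
# Hypothetical helper (not part of the package). Assumes the issue date is
# not February 29 and that the anniversary does not fall on January 1.
split_calendar_year <- function(year, issue_date) {
  issue_date <- as.Date(issue_date)
  cal_start <- as.Date(sprintf("%d-01-01", year))
  cal_end <- as.Date(sprintf("%d-12-31", year))
  anniv <- as.Date(sprintf("%d-%s", year, format(issue_date, "%m-%d")))
  days_in_year <- as.numeric(cal_end - cal_start) + 1
  yrs <- year - as.integer(format(issue_date, "%Y"))

  data.frame(
    period_start = c(cal_start, anniv),
    period_end = c(anniv - 1, cal_end),   # inclusive end dates
    pol_yr = c(yrs, yrs + 1),             # pre- and post-anniversary years
    # Calendar exposure: share of the calendar year covered by each record
    exposure_cal = c(as.numeric(anniv - cal_start),
                     as.numeric(cal_end - anniv) + 1) / days_in_year
  )
}

res <- split_calendar_year(2022, "2020-05-10")
res
```

The two calendar exposures sum to the original exposure of one calendar year, while each record now belongs to a single policy year.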
After splitting exposures, downstream functions like exp_stats() and
exp_shiny() will require clarification as to which exposure basis should
be used to summarize results.
is_split_exposed_df() will return TRUE if x is a split_exposed_df object.
expose_split(.data)

is_split_exposed_df(x)
.data |
An |
x |
Any object |
.data must be an exposed_df with calendar year, quarter, month,
or week exposure records. Calendar period exposures are created by the
functions expose_cy(), expose_cq(), expose_cm(), or expose_cw() (or
expose() when cal_expo = TRUE).
For expose_split(), a tibble with class split_exposed_df, exposed_df,
tbl_df, tbl, and data.frame. The results include all
columns in .data except that exposure has been renamed to exposure_cal.
Additional columns include:
exposure_pol - policy year exposures
pol_yr - policy year
For is_split_exposed_df(), a length-1 logical vector.
expose() for information on creating exposure records from census
data.
toy_census |>
  expose_cy("2022-12-31") |>
  expose_split()
Test for and coerce to the exposed_df class.
is_exposed_df(x)

as_exposed_df(
  x,
  end_date,
  start_date = as.Date("1900-01-01"),
  target_status = NULL,
  cal_expo = FALSE,
  expo_length = c("year", "quarter", "month", "week"),
  trx_types = NULL,
  col_pol_num,
  col_status,
  col_exposure,
  col_pol_per,
  cols_dates,
  col_trx_n_ = "trx_n_",
  col_trx_amt_ = "trx_amt_",
  default_status
)
x |
An object. For |
end_date |
Experience study end date |
start_date |
Experience study start date. Default value = 1900-01-01. |
target_status |
Character vector of target status values. Default value = NULL. |
cal_expo |
Set to TRUE for calendar year exposures. Otherwise policy year exposures are assumed. |
expo_length |
Exposure period length |
trx_types |
Optional. Character vector containing unique transaction
types that have been attached to |
col_pol_num |
Optional. Name of the column in |
col_status |
Optional. Name of the column in |
col_exposure |
Optional. Name of the column in |
col_pol_per |
Optional. Name of the column in |
cols_dates |
Optional. Names of the columns in |
col_trx_n_ |
Optional. Prefix to use for columns containing transaction counts. |
col_trx_amt_ |
Optional. Prefix to use for columns containing transaction amounts. |
default_status |
Optional scalar character representing the default active status code. If not provided, the most common status is assumed. |
is_exposed_df() will return TRUE if x is an exposed_df object.
as_exposed_df() will coerce a data frame to an exposed_df object if that
data frame has columns for policy numbers, statuses, exposures,
policy periods (for policy exposures only), and exposure start / end dates.
Optionally, if x has transaction counts and amounts by type, these can
be specified without calling add_transactions().
For is_exposed_df(), a length-1 logical vector. For
as_exposed_df(), an exposed_df object.
expose() for information on how exposed_df objects are typically
created from census data.
These functions create additional experience study plots that are either
unavailable from or difficult to produce with the autoplot.exp_df() function.
plot_termination_rates(object, ..., include_cred_adj = FALSE)

plot_actual_to_expected(object, ..., add_hline = TRUE)
object |
An object of class |
... |
Additional arguments passed to |
include_cred_adj |
If |
add_hline |
If |
plot_termination_rates() - Create a plot of observed termination rates
and any expected termination rates attached to an exp_df object.
plot_actual_to_expected() - Create a plot of actual-to-expected termination
rates attached to an exp_df object.
A ggplot object
study_py <- expose_py(census_dat, "2019-12-31", target_status = "Surrender")
expected_table <- c(seq(0.005, 0.03, length.out = 10), 0.2, 0.15, rep(0.05, 3))

study_py <- study_py |>
  mutate(expected_1 = expected_table[pol_yr],
         expected_2 = ifelse(inc_guar, 0.015, 0.03))

exp_res <- study_py |>
  group_by(pol_yr) |>
  exp_stats(expected = c("expected_1", "expected_2"))

plot_termination_rates(exp_res)
plot_actual_to_expected(exp_res)
These functions create additional experience study plots that are either
unavailable from or difficult to produce with the autoplot.trx_df() function.
plot_utilization_rates(object, ...)
object |
An object of class |
... |
Additional arguments passed to |
plot_utilization_rates() - Create a plot of transaction frequency and
severity. Frequency is represented by utilization rates (trx_util).
Severity is represented by transaction amounts as a percentage of one or
more other columns in the data ({*}_w_trx). All severity series begin with
the prefix "pct_of_" and end with the suffix "_w_trx". The suffix refers to
the fact that the denominator only includes records with non-zero
transactions. Severity series are based on column names passed to the
percent_of argument in trx_stats(). If no "percentage of" columns exist
in object, this function will only plot utilization rates.
A ggplot object
study_py <- expose_py(census_dat, "2019-12-31", target_status = "Surrender") |>
  add_transactions(withdrawals) |>
  left_join(account_vals, by = c("pol_num", "pol_date_yr"))

trx_res <- study_py |>
  group_by(pol_yr) |>
  trx_stats(percent_of = "av_anniv", combine_trx = TRUE)

plot_utilization_rates(trx_res)
Given a vector of dates and a vector of issue dates, calculate policy years, quarters, months, or weeks.
pol_yr(x, issue_date)

pol_qtr(x, issue_date)

pol_mth(x, issue_date)

pol_wk(x, issue_date)
x |
A vector of dates |
issue_date |
A vector of issue dates |
These functions assume the first day of each policy year is the anniversary date (or issue date in the first year). The last day of each policy year is the day before the next anniversary date. Analogous rules are used for policy quarters, policy months, and policy weeks.
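The anniversary rule can be sketched in base R as follows. `pol_yr_sketch()` is a hypothetical stand-in written for illustration; it is not the package's implementation and it deliberately ignores leap-day issue dates, which need a rule for where the anniversary falls in non-leap years.

```r
# Hypothetical sketch of the policy year rule: policy year 1 starts on the
# issue date and a new policy year starts on each anniversary.
# Assumption: issue dates on February 29 are out of scope.
pol_yr_sketch <- function(x, issue_date) {
  x <- as.POSIXlt(as.Date(x))
  issue <- as.POSIXlt(as.Date(issue_date))
  years <- x$year - issue$year
  # Has the anniversary in the calendar year of `x` been reached?
  reached <- (x$mon > issue$mon) |
    (x$mon == issue$mon & x$mday >= issue$mday)
  as.integer(years + ifelse(reached, 1, 0))
}

# The day before the first anniversary and the anniversary itself
pol_yr_sketch(as.Date("2021-05-09") + 0:1, "2020-05-10")
```

The last day of policy year 1 is the day before the first anniversary, so the two dates above fall in policy years 1 and 2.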
An integer vector
pol_yr(as.Date("2021-02-28") + 0:2, "2020-02-29")

pol_mth(as.Date("2021-02-28") + 0:2, "2020-02-29")
Mortality rates and mortality improvement rates from the 2012 Individual Annuity Mortality Basic (IAMB) Table and Projection Scale G2.
qx_iamb

scale_g2
For the 2012 IAMB table, a data frame with 242 rows and 3 columns:
Attained age
Mortality rate
Female or Male
For the Projection Scale G2 table, a data frame with 242 rows and 3 columns:
Attained age
Mortality improvement rate
Female or Male
Simulated data for a theoretical deferred annuity product with an optional guaranteed income rider. This data is theoretical only and does not represent the experience on any specific product.
census_dat

withdrawals

account_vals
Three data frames containing census records (census_dat),
withdrawal transactions (withdrawals), and historical account values
(account_vals).
An object of class tbl_df (inherits from tbl, data.frame) with 20000 rows and 11 columns.
An object of class tbl_df (inherits from tbl, data.frame) with 160130 rows and 4 columns.
An object of class tbl_df (inherits from tbl, data.frame) with 141252 rows and 3 columns.
Census records (census_dat)
Policy number
Policy status: Active, Surrender, or Death
Issue date
Indicates whether the policy was issued with an income guarantee
Indicates whether the policy was purchased with tax-qualified funds
Issue age
Product: a, b, or c
M (Male) or F (Female)
Age that withdrawals commence
Single premium deposit
Termination date upon death or surrender
Withdrawal transactions (withdrawals)
Policy number
Withdrawal transaction date
Withdrawal transaction type, either Base or Rider
Withdrawal transaction amount
Historical account values (account_vals)
Policy number
Policy anniversary date (beginning of year)
Account value on the policy anniversary date
recipes step
step_expose() creates a specification of a recipe step that will convert
a data frame of census-level records to exposure-level records.
step_expose(
  recipe,
  ...,
  role = NA,
  trained = FALSE,
  end_date,
  start_date = as.Date("1900-01-01"),
  target_status = NULL,
  options = list(cal_expo = FALSE, expo_length = "year"),
  drop_pol_num = TRUE,
  skip = TRUE,
  id = recipes::rand_id("expose")
)
recipe |
A recipe object. The step will be added to the sequence of operations for this recipe. |
... |
One or more selector functions to choose variables
for this step. See |
role |
Not used by this step since no new variables are created. |
trained |
A logical to indicate if the quantities for preprocessing have been estimated. |
end_date |
Experience study end date |
start_date |
Experience study start date. Default value = 1900-01-01. |
target_status |
Character vector of target status values. Default value = NULL. |
options |
A named list of additional arguments passed to |
drop_pol_num |
Whether the |
skip |
A logical. Should the step be skipped when the
recipe is baked by |
id |
A character string that is unique to this step to identify it. |
Policy year exposures are calculated as a default. To switch to calendar
exposures or another exposure length, pass the appropriate arguments to
the options parameter.
Policy numbers are dropped as a default whenever the recipe is baked. This
is done to prevent unintentional errors when the model formula includes
all variables (y ~ .). If policy numbers are required for any reason
(mixed effect models, identification, etc.), set drop_pol_num to FALSE.
An updated version of recipe with the new expose step added to the
sequence of any existing operations. For the tidy method, a tibble with
the columns exposure_type, target_status, start_date, and end_date.
expo_rec <- recipes::recipe(status ~ ., toy_census) |>
  step_expose(end_date = "2022-12-31", target_status = "Surrender",
              options = list(expo_length = "month")) |>
  recipes::prep()

recipes::juice(expo_rec)
Create a summary data frame of termination experience for a given target status.
## S3 method for class 'exposed_df'
summary(object, ...)
object |
A data frame with exposure-level records |
... |
Additional arguments passed to |
Calling summary() on an exposed_df object will summarize results using
exp_stats(). See exp_stats() for more information.
A tibble with class exp_df, tbl_df, tbl, and data.frame.
toy_census |>
  expose("2022-12-31", target_status = "Surrender") |>
  summary()
A tiny dataset containing 3 policies: one active, one terminated due to death, and one terminated due to surrender.
toy_census
A data frame with 3 rows and 4 columns:
Policy number
Policy status
Issue date
Termination date
Create a summary data frame of transaction counts, amounts, and utilization rates.
trx_stats(
  .data,
  trx_types,
  percent_of = NULL,
  combine_trx = FALSE,
  col_exposure = "exposure",
  full_exposures_only = TRUE,
  conf_int = FALSE,
  conf_level = 0.95
)

## S3 method for class 'trx_df'
summary(object, ...)
.data |
A data frame with exposure-level records of type
|
trx_types |
A character vector of transaction types to include in the
output. If none is provided, all available transaction types in |
percent_of |
An optional character vector containing column names in
|
combine_trx |
If |
col_exposure |
Name of the column in |
full_exposures_only |
If |
conf_int |
If |
conf_level |
Confidence level for confidence intervals |
object |
A |
... |
Groups to retain after |
Unlike exp_stats(), this function requires .data to be an
exposed_df object.
If .data is grouped, the resulting data frame will contain
one row per transaction type per group.
Any number of transaction types can be passed to the trx_types argument;
however, each transaction type must appear in the trx_types attribute of
.data. In addition, trx_stats() expects to see columns named trx_n_{*}
(for transaction counts) and trx_amt_{*} (for transaction amounts) for each
transaction type. To ensure .data is in the appropriate format, use
as_exposed_df() to convert an existing data frame with
transactions or add_transactions() to attach transactions to an existing
exposed_df object.
A tibble with class trx_df, tbl_df, tbl, and data.frame. The results
include columns for any grouping
variables and transaction types, plus the following:
trx_n: the number of unique transactions
trx_amt: total transaction amount
trx_flag: the number of observation periods with non-zero transaction amounts
exposure: total exposures
avg_trx: mean transaction amount (trx_amt / trx_flag)
avg_all: mean transaction amount over all records (trx_amt / exposure)
trx_freq: transaction frequency when a transaction occurs (trx_n / trx_flag)
trx_util: transaction utilization per observation period (trx_flag / exposure)
If percent_of is provided, the results will also include:
The sum of any columns passed to percent_of with non-zero transactions.
These columns include the suffix _w_trx.
The sum of any columns passed to percent_of
pct_of_{*}_w_trx: total transactions as a percentage of column
{*}_w_trx. In other words, total transactions divided by the sum of a
column including only records utilizing transactions.
pct_of_{*}_all: total transactions as a percentage of column {*}. In
other words, total transactions divided by the sum of a column regardless
of whether or not transactions were utilized.
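The ratios listed above can be reproduced by hand on a toy table. The sketch below uses base R and made-up records for a single transaction type; the column av_anniv stands in for a percent_of column, and the utilization column is named trx_util as in the plotting and confidence interval sections. It illustrates the definitions only and is not how the package computes them internally.

```r
# Hypothetical exposure-level records for a single transaction type.
records <- data.frame(
  exposure = c(1, 1, 1, 1),
  trx_n    = c(0, 2, 1, 0),             # number of transactions in the period
  trx_amt  = c(0, 300, 100, 0),         # total transaction amount in the period
  av_anniv = c(5000, 4000, 1000, 2500)  # hypothetical percent_of column
)

has_trx <- records$trx_amt != 0  # periods with non-zero transactions

res <- data.frame(
  trx_n = sum(records$trx_n),
  trx_amt = sum(records$trx_amt),
  trx_flag = sum(has_trx),
  exposure = sum(records$exposure),
  av_anniv = sum(records$av_anniv),
  av_anniv_w_trx = sum(records$av_anniv[has_trx])
)

res$avg_trx <- res$trx_amt / res$trx_flag
res$avg_all <- res$trx_amt / res$exposure
res$trx_freq <- res$trx_n / res$trx_flag
res$trx_util <- res$trx_flag / res$exposure
res$pct_of_av_anniv_w_trx <- res$trx_amt / res$av_anniv_w_trx
res$pct_of_av_anniv_all <- res$trx_amt / res$av_anniv
res
```

Only two of the four records have transactions, so the _w_trx denominator (5000) is much smaller than the total (12500), which is why the two percentage columns differ.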
If conf_int is set to TRUE, additional columns are added for lower and
upper confidence interval limits around the observed utilization rate and any
percent_of output columns. Confidence interval columns include the name
of the original output column suffixed by either _lower or _upper.
If values are passed to percent_of, an additional column is created
containing the sum of squared transaction amounts (trx_amt_sq).
The percent_of argument is optional. If provided, this argument must
be a character vector with values corresponding to columns in .data
containing values to use as denominators in the calculation of utilization
rates or actual-to-expected ratios. Example usage:
In a study of partial withdrawal transactions, if percent_of refers to
account values, observed withdrawal rates can be determined.
In a study of recurring claims, if percent_of refers to a column
containing a maximum benefit amount, utilization rates can be determined.
If conf_int is set to TRUE, the output will contain lower and upper
confidence interval limits for the observed utilization rate and any
percent_of output columns. The confidence level is dictated
by conf_level.
Intervals for the utilization rate (trx_util) assume a binomial
distribution.
Intervals for transactions as a percentage of another column with
non-zero transactions (pct_of_{*}_w_trx) are constructed using a normal
distribution.
Intervals for transactions as a percentage of another column
regardless of transaction utilization (pct_of_{*}_all) are calculated
assuming that the aggregate distribution is normal with a mean equal to
observed transactions and a variance equal to:
Var(S) = E(N) * Var(X) + E(X)^2 * Var(N),
where S is the aggregate transactions random variable, X is an individual
transaction amount assumed to follow a normal distribution, and N is a
binomial random variable for transaction utilization.
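For the first case, one way to build a binomial interval around a utilization rate is to take binomial quantiles and rescale them by exposure. The sketch below takes that approach in base R; `util_ci()` is a hypothetical helper, rounding exposure to a whole number of trials is an assumption made here, and the package may construct its interval differently.

```r
# Hypothetical helper: binomial quantile interval for a utilization rate.
# Assumption: total exposure is rounded to serve as the number of trials.
util_ci <- function(trx_flag, exposure, conf_level = 0.95) {
  p <- c(lower = (1 - conf_level) / 2, upper = 1 - (1 - conf_level) / 2)
  rate <- trx_flag / exposure
  qbinom(p, size = round(exposure), prob = rate) / exposure
}

ci_95 <- util_ci(trx_flag = 200, exposure = 1000)
ci_80 <- util_ci(trx_flag = 200, exposure = 1000, conf_level = 0.80)
ci_95
ci_80
```

Lowering conf_level narrows the interval, and both intervals bracket the observed utilization rate of 0.2.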
As a default, partial exposures are removed from .data before summarizing
results. This is done to avoid complexity associated with a lopsided skew
in the timing of transactions. For example, if transactions can occur on a
monthly basis or annually at the beginning of each policy year, partial
exposures may not be appropriate. If a policy had an exposure of 0.5 years
and was taking withdrawals annually at the beginning of the year, an
argument could be made that the exposure should instead be 1 complete year.
If the same policy was expected to take withdrawals 9 months into the year,
it's not clear if the exposure should be 0.5 years or 0.5 / 0.75 years.
To override this treatment, set full_exposures_only to FALSE.
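The effect of dropping partial exposures can be seen on a toy table. This base R sketch uses made-up records and a tolerance-based comparison to identify full exposures; it illustrates the idea only and makes no claim about how the package performs the filtering.

```r
# Hypothetical exposure records: two full periods and two partial periods.
records <- data.frame(
  pol_num = 1:4,
  exposure = c(1, 0.5, 1, 0.25),
  trx_amt = c(100, 50, 0, 0)
)

# Compare with a tolerance rather than `==` to guard against
# floating-point noise in computed exposures.
is_full <- abs(records$exposure - 1) < 1e-8
full_only <- records[is_full, ]

# Utilization using full exposures only versus all records
util_full <- sum(full_only$trx_amt != 0) / sum(full_only$exposure)
util_all <- sum(records$trx_amt != 0) / sum(records$exposure)
c(full_only = util_full, all_records = util_all)
```

In this example the partial records raise the measured utilization because a transaction is counted in full against only half a year of exposure.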
summary() Method
Applying summary() to a trx_df object will re-summarize the
data while retaining any grouping variables passed to the "dots" (...).
expo <- expose_py(census_dat, "2019-12-31", target_status = "Surrender") |>
  add_transactions(withdrawals)

res <- expo |>
  group_by(inc_guar) |>
  trx_stats(percent_of = "premium")

res
summary(res)

expo |>
  group_by(inc_guar) |>
  trx_stats(percent_of = "premium", combine_trx = TRUE, conf_int = TRUE)