Installing the latest developing version.
library(devtools)
install_github("NSAPH-Software/CRE", ref="develop")
library("CRE")Installing from CRAN.
install.packages("CRE")Input parameters:
y The observed response (outcome)
vector.
z The treatment (exposure, policy)
vector.
X The covariate matrix.
method_params Parameters for estimating
the individual treatment effect, including:
Parameters for Discovery
- ratio_dis The ratio of data delegated to
the discovery sub-sample.
- ite_method_dis The method to estimate
the discovery sub-sample in estimating individual treatment effect
(ITE).
- include_ps_dis Whether or not to include
propensity score estimate as a covariate in discovery ITE estimation,
considered only for BART, or CF.
- ps_method_dis The estimation model for
the propensity score on the discovery sub-sample.
- or_method_dis The estimation model for
the outcome regressions estimate_ite_aipw on the discovery
sub-sample.
Parameters for Inference
- ite_method_inf The method to estimate
the inference sample ITE.
- include_ps_inf Whether or not to include
propensity score estimate as a covariate in inference ITE estimation,
considered only for BART, or CF.
- ps_method_inf The estimation model for
the propensity score on the inference subsample.
- or_method_inf The estimation model for
the outcome regressions in estimate_ite_aipw on the inference
subsample.
Other Parameters -
include_offset Whether or not to include
an offset when estimating the ITE, for Poisson only.
- offset_name The name of the offset, if
it is to be included.
- cate_method The method to estimate the
conditional average treatment effect (CATE) values.
- cate_SL_library The library used if
cate_method is set to DRLearner.
- filter_cate Whether or not to filter
rules with p-value <= 0.05.
hyper_params The list of parameters
required to tune the functions, including:
- effect_modifiers Effect Modifiers for
Rules Generation.
- ntrees_rf The number of decision trees
for randomForest.
- ntrees_gbm The number of decision trees
for gradient boosting.
- node_size The minimum size of the trees’
terminal nodes.
- max_nodes The maximum number of terminal
nodes trees in the forest can have.
- max_depth The number of top levels from
each tree considered to extract conditions.
- replace Boolean variable for replacement
in bootstrapping.
- max_decay Decay Threshold for pruning
the rules.
- type_decay Decay Type for pruning the
rules (1: relative error; 2: error).
- t_anom The threshold to define too
generic or too specific (anomalous) rules.
- t_corr The threshold to define
correlated rules.
- q Number of (unique) selected rules per
subsample in stability selection.
- stability_selection Whether or not using
stability selection for selecting the causal rules.
- pfer_val The Per-Family Error Rate, the
expected number of false discoveries.
(1) Options for the ITE estimation are as follows:
ipw)sipw)aipw)oreg)bart)bcf)cf)poisson)(2) The include_ps_dis and
include_ps_inf arguments will only be considered if the ITE
method selected is bart, cf.
The CRE package can generate synthetic data that can be used to test different features of the package. At the current implementation, the code can generate data with continuous or binary outcomes.
set.seed(9687)
dataset_cont <- generate_cre_dataset(n = 300, rho = 0, n_rules = 2, p = 10,
effect_size = 2, binary = FALSE)
y <- dataset_cont[["y"]]
z <- dataset_cont[["z"]]
X <- as.data.frame(dataset_cont[["X"]])
X_names <- names(as.data.frame(X))
method_params = list(ratio_dis = 0.25,
ite_method_dis="bart",
ps_method_dis = "SL.xgboost",
oreg_method_dis = "SL.xgboost",
include_ps_dis = TRUE,
ite_method_inf = "bart",
ps_method_inf = "SL.xgboost",
oreg_method_inf = "SL.xgboost",
include_ps_inf = TRUE,
include_offset = FALSE,
cate_method = "DRLearner",
cate_SL_library = "SL.xgboost",
filter_cate = FALSE,
offset_name = NA,
random_state = 3591)
hyper_params = list(ntrees_rf = 100,
ntrees_gbm = 50,
node_size = 20,
max_nodes = 5,
max_depth = 15,
max_decay = 0.025,
type_decay = 2,
t_anom = 0.025,
t_corr = 1,
replace = FALSE,
q = 0.8,
stability_selection = TRUE,
pfer_val = 0.1)
cre_results <- cre(y, z, X, method_params, hyper_params)