This article describes creating an ADSL ADaM. Examples are currently presented and tested using the DM, EX, AE, LB and DS SDTM domains. However, other domains could be used.
Note: All examples assume CDISC SDTM and/or ADaM format as input unless otherwise specified.
To start, all data frames needed for the creation of ADSL should be read into the environment. This will be a company-specific process. Some of the data frames needed may be DM, EX, DS, AE, and LB.
For example purposes, the CDISC Pilot SDTM datasets, which are included in {admiral.test}, are used.
library(admiral)
library(dplyr, warn.conflicts = FALSE)
library(admiral.test)
library(lubridate)
library(stringr)
data("admiral_dm")
data("admiral_ds")
data("admiral_ex")
data("admiral_ae")
data("admiral_lb")
dm <- admiral_dm
ds <- admiral_ds
ex <- admiral_ex
ae <- admiral_ae
lb <- admiral_lb

The DM domain is used as the basis for ADSL:
adsl <- dm %>%
  select(-DOMAIN)

| USUBJID | RFSTDTC | COUNTRY | AGE | SEX | RACE | ETHNIC | ARM | ACTARM | 
|---|---|---|---|---|---|---|---|---|
| 01-701-1015 | 2014-01-02 | USA | 63 | F | WHITE | HISPANIC OR LATINO | Placebo | Placebo | 
| 01-701-1023 | 2012-08-05 | USA | 64 | M | WHITE | HISPANIC OR LATINO | Placebo | Placebo | 
| 01-701-1028 | 2013-07-19 | USA | 71 | M | WHITE | NOT HISPANIC OR LATINO | Xanomeline High Dose | Xanomeline High Dose | 
| 01-701-1033 | 2014-03-18 | USA | 74 | M | WHITE | NOT HISPANIC OR LATINO | Xanomeline Low Dose | Xanomeline Low Dose | 
| 01-701-1034 | 2014-07-01 | USA | 77 | F | WHITE | NOT HISPANIC OR LATINO | Xanomeline High Dose | Xanomeline High Dose | 
| 01-701-1047 | 2013-02-12 | USA | 85 | F | WHITE | NOT HISPANIC OR LATINO | Placebo | Placebo | 
| 01-701-1057 | | USA | 59 | F | WHITE | HISPANIC OR LATINO | Screen Failure | Screen Failure | 
| 01-701-1097 | 2014-01-01 | USA | 68 | M | WHITE | NOT HISPANIC OR LATINO | Xanomeline Low Dose | Xanomeline Low Dose | 
| 01-701-1111 | 2012-09-07 | USA | 81 | F | WHITE | NOT HISPANIC OR LATINO | Xanomeline Low Dose | Xanomeline Low Dose | 
| 01-701-1115 | 2012-11-30 | USA | 84 | M | WHITE | NOT HISPANIC OR LATINO | Xanomeline Low Dose | Xanomeline Low Dose | 
Derive Period and Phase Variables (APxxSDT, APxxEDT, …)

See the “Visit and Period Variables” vignette for more information.
If the variables are not derived based on a period reference dataset, they may be derived at a later point of the flow. For example, phases like “Treatment Phase” and “Follow up” could be derived based on treatment start and end date.
Derive Treatment Variables (TRT0xP, TRT0xA)

The mapping of the treatment variables is left to the ADaM programmer. An example mapping for a study without periods may be:
adsl <- dm %>%
  mutate(TRT01P = ARM, TRT01A = ACTARM)

For studies with periods, see the “Visit and Period Variables” vignette.
Derive Treatment Date/Time and Duration (TRTSDTM, TRTEDTM, TRTDURD)

The function derive_vars_merged() can be used to derive the treatment start and end date/times using the ex domain. A pre-processing step for ex is required to convert the variables EXSTDTC and EXENDTC to datetime variables and impute missing date or time components. Conversion and imputation are done by derive_vars_dtm().
Example calls:
# impute start and end time of exposure to first and last respectively, do not impute date
ex_ext <- ex %>%
  derive_vars_dtm(
    dtc = EXSTDTC,
    new_vars_prefix = "EXST"
  ) %>%
  derive_vars_dtm(
    dtc = EXENDTC,
    new_vars_prefix = "EXEN",
    time_imputation = "last"
  )
adsl <- adsl %>%
  derive_vars_merged(
    dataset_add = ex_ext,
    filter_add = (EXDOSE > 0 |
      (EXDOSE == 0 &
        str_detect(EXTRT, "PLACEBO"))) & !is.na(EXSTDTM),
    new_vars = vars(TRTSDTM = EXSTDTM, TRTSTMF = EXSTTMF),
    order = vars(EXSTDTM, EXSEQ),
    mode = "first",
    by_vars = vars(STUDYID, USUBJID)
  ) %>%
  derive_vars_merged(
    dataset_add = ex_ext,
    filter_add = (EXDOSE > 0 |
      (EXDOSE == 0 &
        str_detect(EXTRT, "PLACEBO"))) & !is.na(EXENDTM),
    new_vars = vars(TRTEDTM = EXENDTM, TRTETMF = EXENTMF),
    order = vars(EXENDTM, EXSEQ),
    mode = "last",
    by_vars = vars(STUDYID, USUBJID)
  )

This call returns the original data frame with the columns TRTSDTM, TRTSTMF, TRTEDTM, and TRTETMF added. Exposure observations with an incomplete date, as well as zero doses of non-placebo treatments, are ignored. Missing time parts are imputed as first or last for start and end date respectively.
The datetime variables returned can be converted to dates using the derive_vars_dtm_to_dt() function.
adsl <- adsl %>%
  derive_vars_dtm_to_dt(source_vars = vars(TRTSDTM, TRTEDTM))

Now that TRTSDT and TRTEDT are derived, the function derive_var_trtdurd() can be used to calculate the treatment duration (TRTDURD).
adsl <- adsl %>%
  derive_var_trtdurd()

| USUBJID | RFSTDTC | TRTSDTM | TRTSDT | TRTEDTM | TRTEDT | TRTDURD | 
|---|---|---|---|---|---|---|
| 01-701-1015 | 2014-01-02 | 2014-01-02 | 2014-01-02 | 2014-07-02 23:59:59 | 2014-07-02 | 182 | 
| 01-701-1023 | 2012-08-05 | 2012-08-05 | 2012-08-05 | 2012-09-01 23:59:59 | 2012-09-01 | 28 | 
| 01-701-1028 | 2013-07-19 | 2013-07-19 | 2013-07-19 | 2014-01-14 23:59:59 | 2014-01-14 | 180 | 
| 01-701-1033 | 2014-03-18 | 2014-03-18 | 2014-03-18 | 2014-03-31 23:59:59 | 2014-03-31 | 14 | 
| 01-701-1034 | 2014-07-01 | 2014-07-01 | 2014-07-01 | 2014-12-30 23:59:59 | 2014-12-30 | 183 | 
| 01-701-1047 | 2013-02-12 | 2013-02-12 | 2013-02-12 | 2013-03-09 23:59:59 | 2013-03-09 | 26 | 
| 01-701-1057 | | NA | NA | NA | NA | NA | 
| 01-701-1097 | 2014-01-01 | 2014-01-01 | 2014-01-01 | 2014-07-09 23:59:59 | 2014-07-09 | 190 | 
| 01-701-1111 | 2012-09-07 | 2012-09-07 | 2012-09-07 | 2012-09-16 23:59:59 | 2012-09-16 | 10 | 
| 01-701-1115 | 2012-11-30 | 2012-11-30 | 2012-11-30 | 2013-01-23 23:59:59 | 2013-01-23 | 55 | 
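The arithmetic behind the table above can be reproduced with a minimal base R sketch. It does not call {admiral}; it assumes that the duration is the end date minus the start date plus one day, which agrees with the displayed values, and that the start date is not after the end date. The object names (`start_dtc`, `end_dtc` and so on) are hypothetical; the dates are copied from the first two rows of the table.

```r
# Start and end dates of the first two subjects, taken from the table above.
start_dtc <- c("2014-01-02", "2012-08-05")
end_dtc   <- c("2014-07-02", "2012-09-01")

# Date-only values: impute the missing time to the first second of the day
# for the start and to the last second of the day for the end.
trtsdtm <- as.POSIXct(paste(start_dtc, "00:00:00"), tz = "UTC")
trtedtm <- as.POSIXct(paste(end_dtc, "23:59:59"), tz = "UTC")

# Drop the time part to obtain the date variables.
trtsdt <- as.Date(trtsdtm, tz = "UTC")
trtedt <- as.Date(trtedtm, tz = "UTC")

# Duration in days, counting both the first and the last treatment day
# (assumption: end - start + 1).
trtdurd <- as.integer(trtedt - trtsdt) + 1L
print(trtdurd) # 182 28
```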
Disposition Date (EOSDT)

The functions derive_vars_dt() and derive_vars_merged() can be used to derive a disposition date. First the character disposition date (DS.DSSTDTC) is converted to a numeric date (DSSTDT) calling derive_vars_dt(). Then the relevant disposition date is selected by adjusting the filter_add parameter.
To derive the End of Study date (EOSDT), a call could be:
# convert character date to numeric date without imputation
ds_ext <- derive_vars_dt(
  ds,
  dtc = DSSTDTC,
  new_vars_prefix = "DSST"
)
adsl <- adsl %>%
  derive_vars_merged(
    dataset_add = ds_ext,
    by_vars = vars(STUDYID, USUBJID),
    new_vars = vars(EOSDT = DSSTDT),
    filter_add = DSCAT == "DISPOSITION EVENT" & DSDECOD != "SCREEN FAILURE"
  )

The first DS records used as input look like this:

| USUBJID | DSCAT | DSDECOD | DSTERM | DSSTDTC | 
|---|---|---|---|---|
| 01-701-1015 | PROTOCOL MILESTONE | RANDOMIZED | RANDOMIZED | 2014-01-02 | 
| 01-701-1015 | DISPOSITION EVENT | COMPLETED | PROTOCOL COMPLETED | 2014-07-02 | 
| 01-701-1015 | OTHER EVENT | FINAL LAB VISIT | FINAL LAB VISIT | 2014-07-02 | 
| 01-701-1023 | PROTOCOL MILESTONE | RANDOMIZED | RANDOMIZED | 2012-08-05 | 
| 01-701-1023 | DISPOSITION EVENT | ADVERSE EVENT | ADVERSE EVENT | 2012-09-02 | 
| 01-701-1023 | OTHER EVENT | FINAL LAB VISIT | FINAL LAB VISIT | 2012-09-02 | 
| 01-701-1023 | OTHER EVENT | FINAL RETRIEVAL VISIT | FINAL RETRIEVAL VISIT | 2013-02-18 | 
| 01-701-1028 | PROTOCOL MILESTONE | RANDOMIZED | RANDOMIZED | 2013-07-19 | 
| 01-701-1028 | DISPOSITION EVENT | COMPLETED | PROTOCOL COMPLETED | 2014-01-14 | 
| 01-701-1028 | OTHER EVENT | FINAL LAB VISIT | FINAL LAB VISIT | 2014-01-14 | 
We would get:
| USUBJID | EOSDT | 
|---|---|
| 01-701-1015 | 2014-07-02 | 
| 01-701-1023 | 2012-09-02 | 
| 01-701-1028 | 2014-01-14 | 
| 01-701-1033 | 2014-04-14 | 
| 01-701-1034 | 2014-12-30 | 
| 01-701-1047 | 2013-03-29 | 
| 01-701-1057 | NA | 
| 01-701-1097 | 2014-07-09 | 
| 01-701-1111 | 2012-09-17 | 
| 01-701-1115 | 2013-01-23 | 
This call would return the input dataset with the column EOSDT added. The derive_vars_dt() function also allows the user to impute partial dates. If imputation is needed and the date is to be imputed to the first of the month, then set date_imputation = "first".
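To make the filter-then-merge logic concrete, here is a minimal base R sketch of the same idea. It is not how derive_vars_merged() is implemented; it only mimics the outcome on toy data. The DS rows for subjects 01-701-1015 and 01-701-1023 are copied from the DS listing above; the object names (`adsl_toy`, `ds_toy`, `eos`) are hypothetical, and subject 01-701-1057 is deliberately given no qualifying DS record in this toy input.

```r
# Toy subject-level data: one row per subject.
adsl_toy <- data.frame(
  USUBJID = c("01-701-1015", "01-701-1023", "01-701-1057"),
  stringsAsFactors = FALSE
)

# Toy DS records (values for the first two subjects copied from the listing).
ds_toy <- data.frame(
  USUBJID = c("01-701-1015", "01-701-1015", "01-701-1023", "01-701-1023"),
  DSCAT   = c("PROTOCOL MILESTONE", "DISPOSITION EVENT",
              "PROTOCOL MILESTONE", "DISPOSITION EVENT"),
  DSDECOD = c("RANDOMIZED", "COMPLETED", "RANDOMIZED", "ADVERSE EVENT"),
  DSSTDTC = c("2014-01-02", "2014-07-02", "2012-08-05", "2012-09-02"),
  stringsAsFactors = FALSE
)

# Step 1: convert the character date to a numeric date (no imputation needed).
ds_toy$DSSTDT <- as.Date(ds_toy$DSSTDTC)

# Step 2: keep only the disposition event that is not a screen failure.
keep <- ds_toy$DSCAT == "DISPOSITION EVENT" & ds_toy$DSDECOD != "SCREEN FAILURE"
eos <- data.frame(
  USUBJID = ds_toy$USUBJID[keep],
  EOSDT   = ds_toy$DSSTDT[keep],
  stringsAsFactors = FALSE
)

# Step 3: left join onto the subject-level data; subjects without a
# qualifying record keep a missing EOSDT.
adsl_toy <- merge(adsl_toy, eos, by = "USUBJID", all.x = TRUE)
print(adsl_toy)
```

The same pattern, with a different filter condition, applies to other dates taken from DS.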
Disposition Status (EOSSTT)

The function derive_var_disposition_status() can be used to derive a disposition status at a specific timepoint. The relevant disposition variable (DS.DSDECOD) is selected by adjusting the filter_ds argument and used to derive EOSSTT.
To derive the End of Study status (EOSSTT), a call could be:
adsl <- adsl %>%
  derive_var_disposition_status(
    dataset_ds = ds,
    new_var = EOSSTT,
    status_var = DSDECOD,
    filter_ds = DSCAT == "DISPOSITION EVENT"
  )

| USUBJID | EOSDT | EOSSTT | 
|---|---|---|
| 01-701-1015 | 2014-07-02 | COMPLETED | 
| 01-701-1023 | 2012-09-02 | DISCONTINUED | 
| 01-701-1028 | 2014-01-14 | COMPLETED | 
| 01-701-1033 | 2014-04-14 | DISCONTINUED | 
| 01-701-1034 | 2014-12-30 | COMPLETED | 
| 01-701-1047 | 2013-03-29 | DISCONTINUED | 
| 01-701-1057 | NA | NOT STARTED | 
| 01-701-1097 | 2014-07-09 | COMPLETED | 
| 01-701-1111 | 2012-09-17 | DISCONTINUED | 
| 01-701-1115 | 2013-01-23 | DISCONTINUED | 
See the DS records shown above.
This call would return the input dataset with the column EOSSTT added.
By default, the function will derive EOSSTT as
- "NOT STARTED" if DSDECOD is "SCREEN FAILURE" or "SCREENING NOT COMPLETED"
- "COMPLETED" if DSDECOD == "COMPLETED"
- "DISCONTINUED" if DSDECOD is not "COMPLETED" or NA
- "ONGOING" otherwise

If the default derivation must be changed, the user can create their own function and pass it to the format_new_var argument of the function (format_new_var = new_mapping) to map DSDECOD to a suitable EOSSTT value.
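The default rules listed above can be written out as a small base R function. This is only an illustrative sketch of the documented rules, not the code used inside {admiral}; the name `default_eosstt()` is hypothetical.

```r
# Sketch of the default DSDECOD -> EOSSTT rules described above.
default_eosstt <- function(dsdecod) {
  ifelse(
    dsdecod %in% c("SCREEN FAILURE", "SCREENING NOT COMPLETED"),
    "NOT STARTED",
    ifelse(
      dsdecod %in% "COMPLETED",
      "COMPLETED",
      # any other non-missing value counts as a discontinuation,
      # a missing value means the subject is still ongoing
      ifelse(!is.na(dsdecod), "DISCONTINUED", "ONGOING")
    )
  )
}

status <- default_eosstt(c("COMPLETED", "ADVERSE EVENT", "SCREEN FAILURE", NA))
print(status)
```

Using `%in%` rather than `==` keeps the comparison well defined for missing values, since `NA %in% x` is `FALSE` rather than `NA`.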
Example function format_eosstt():
format_eosstt <- function(DSDECOD) {
  case_when(
    DSDECOD %in% c("COMPLETED") ~ "COMPLETED",
    DSDECOD %in% c("SCREEN FAILURE") ~ NA_character_,
    !is.na(DSDECOD) ~ "DISCONTINUED",
    TRUE ~ "ONGOING"
  )
}

The customized mapping function format_eosstt() can now be passed to the main function:
adsl <- adsl %>%
  derive_var_disposition_status(
    dataset_ds = ds,
    new_var = EOSSTT,
    status_var = DSDECOD,
    format_new_var = format_eosstt,
    filter_ds = DSCAT == "DISPOSITION EVENT"
  )

This call would return the input dataset with the column EOSSTT added.
Disposition Reason(s) (DCSREAS, DCSREASP)

The main reason for discontinuation is usually stored in DSDECOD while DSTERM provides additional details regarding the subject’s discontinuation (e.g., description of "OTHER").
The function derive_vars_disposition_reason() can be used to derive a disposition reason (along with the details, if required) at a specific timepoint. The relevant disposition variable(s) (DS.DSDECOD, DS.DSTERM) are selected by adjusting the filter_ds argument and used to derive the main reason (and details).
To derive the End of Study reason(s) (DCSREAS and DCSREASP), the call would be:
adsl <- adsl %>%
  derive_vars_disposition_reason(
    dataset_ds = ds,
    new_var = DCSREAS,
    reason_var = DSDECOD,
    new_var_spe = DCSREASP,
    reason_var_spe = DSTERM,
    filter_ds = DSCAT == "DISPOSITION EVENT" & DSDECOD != "SCREEN FAILURE"
  )

| USUBJID | EOSDT | EOSSTT | DCSREAS | DCSREASP | 
|---|---|---|---|---|
| 01-701-1015 | 2014-07-02 | COMPLETED | NA | NA | 
| 01-701-1023 | 2012-09-02 | DISCONTINUED | ADVERSE EVENT | NA | 
| 01-701-1028 | 2014-01-14 | COMPLETED | NA | NA | 
| 01-701-1033 | 2014-04-14 | DISCONTINUED | STUDY TERMINATED BY SPONSOR | NA | 
| 01-701-1034 | 2014-12-30 | COMPLETED | NA | NA | 
| 01-701-1047 | 2013-03-29 | DISCONTINUED | ADVERSE EVENT | NA | 
| 01-701-1057 | NA | NOT STARTED | NA | NA | 
| 01-701-1097 | 2014-07-09 | COMPLETED | NA | NA | 
| 01-701-1111 | 2012-09-17 | DISCONTINUED | ADVERSE EVENT | NA | 
| 01-701-1115 | 2013-01-23 | DISCONTINUED | ADVERSE EVENT | NA | 
See the DS records shown above.
This call would return the input dataset with the columns DCSREAS and DCSREASP added.
By default, the function will map
- DCSREAS as DSDECOD if DSDECOD is not "COMPLETED" or NA, NA otherwise
- DCSREASP as DSTERM if DSDECOD is equal to OTHER, NA otherwise

If the default derivation must be changed, the user can create their own function and pass it to the format_new_vars argument of the function (format_new_vars = new_mapping) to map DSDECOD and DSTERM to a suitable DCSREAS/DCSREASP value.
Example function format_dcsreas():
format_dcsreas <- function(dsdecod, dsterm = NULL) {
  if (is.null(dsterm)) {
    if_else(dsdecod %notin% c("COMPLETED", "SCREEN FAILURE") & !is.na(dsdecod), dsdecod, NA_character_)
  } else {
    if_else(dsdecod == "OTHER", dsterm, NA_character_)
  }
}

The customized mapping function format_dcsreas() can now be passed to the main function:
adsl <- adsl %>%
  derive_vars_disposition_reason(
    dataset_ds = ds,
    new_var = DCSREAS,
    reason_var = DSDECOD,
    new_var_spe = DCSREASP,
    reason_var_spe = DSTERM,
    format_new_vars = format_dcsreas,
    filter_ds = DSCAT == "DISPOSITION EVENT"
  )

Randomization Date (RANDDT)

The function derive_vars_merged() can be used to derive the randomization date variable. To map Randomization Date (RANDDT), the call would be:
adsl <- adsl %>%
  derive_vars_merged(
    dataset_add = ds_ext,
    filter_add = DSDECOD == "RANDOMIZED",
    by_vars = vars(STUDYID, USUBJID),
    new_vars = vars(RANDDT = DSSTDT)
  )

This call would return the input dataset with the column RANDDT added.
| USUBJID | RANDDT | 
|---|---|
| 01-701-1015 | 2014-01-02 | 
| 01-701-1023 | 2012-08-05 | 
| 01-701-1028 | 2013-07-19 | 
| 01-701-1033 | 2014-03-18 | 
| 01-701-1034 | 2014-07-01 | 
| 01-701-1047 | 2013-02-12 | 
| 01-701-1057 | NA | 
| 01-701-1097 | 2014-01-01 | 
| 01-701-1111 | 2012-09-07 | 
| 01-701-1115 | 2012-11-30 | 
See the DS records shown above.
Death Date (DTHDT)

The function derive_vars_dt() can be used to derive DTHDT. This function allows the user to impute the date as well.
Example calls:
adsl <- adsl %>%
  derive_vars_dt(
    new_vars_prefix = "DTH",
    dtc = DTHDTC
  )

| USUBJID | TRTEDT | DTHDTC | DTHDT | DTHFL | 
|---|---|---|---|---|
| 01-701-1015 | 2014-07-02 | | NA | | 
| 01-701-1023 | 2012-09-01 | | NA | | 
| 01-701-1028 | 2014-01-14 | | NA | | 
| 01-701-1033 | 2014-03-31 | | NA | | 
| 01-701-1034 | 2014-12-30 | | NA | | 
| 01-701-1047 | 2013-03-09 | | NA | | 
| 01-701-1057 | NA | | NA | | 
| 01-701-1097 | 2014-07-09 | | NA | | 
| 01-701-1111 | 2012-09-16 | | NA | | 
| 01-701-1115 | 2013-01-23 | | NA | | 
This call would return the input dataset with the column DTHDT added and, by default, the associated date imputation flag (DTHDTF) populated with the controlled terminology outlined in the ADaM IG for date imputations. If the imputation flag is not required, the user must set the argument flag_imputation to "none".
If imputation is needed and the date is to be imputed to the first day of the month/year, the call would be:
adsl <- adsl %>%
  derive_vars_dt(
    new_vars_prefix = "DTH",
    dtc = DTHDTC,
    date_imputation = "first"
  )

See also Date and Time Imputation.
Cause of Death (DTHCAUS)

The cause of death DTHCAUS can be derived using the function derive_var_dthcaus().
Since the cause of death could be collected/mapped in different domains (e.g. DS, AE, DD), it is important that the user specifies the right source(s) to derive the cause of death from.
For example, if the date of death is collected in the AE form when the AE is Fatal, the cause of death would be set to the preferred term (AEDECOD) of that Fatal AE, while if the date of death is collected in the DS form, the cause of death would be set to the disposition term (DSTERM). To achieve this, the dthcaus_source() objects must be specified and defined so that they fit the study requirements.
dthcaus_source() specifications:
- dataset_name: the name of the dataset where to search for death information,
- filter: the condition to define death,
- date: the date of death,
- mode: first or last to select the first/last date of death if multiple dates are collected,
- dthcaus: variable or text used to populate DTHCAUS,
- traceability_vars: whether the traceability variables need to be added (e.g. source domain, sequence, variable).

An example call to define the sources would be:
src_ae <- dthcaus_source(
  dataset_name = "ae",
  filter = AEOUT == "FATAL",
  date = AESTDTM,
  mode = "first",
  dthcaus = AEDECOD
)

| USUBJID | AESTDTC | AEENDTC | AEDECOD | AEOUT | 
|---|---|---|---|---|
| 01-701-1211 | 2013-01-14 | 2013-01-14 | SUDDEN DEATH | FATAL | 
| 01-704-1445 | 2014-10-31 | 2014-10-31 | COMPLETED SUICIDE | FATAL | 
| 01-710-1083 | 2013-08-02 | 2013-08-02 | MYOCARDIAL INFARCTION | FATAL | 
src_ds <- dthcaus_source(
  dataset_name = "ds",
  filter = DSDECOD == "DEATH" & grepl("DEATH DUE TO", DSTERM),
  date = DSSTDT,
  mode = "first",
  dthcaus = "Death in DS"
)

| USUBJID | DSDECOD | DSTERM | DSSTDTC | 
|---|---|---|---|
| 01-701-1211 | DEATH | DEATH | 2013-01-14 | 
| 01-704-1445 | DEATH | DEATH | 2014-11-01 | 
| 01-710-1083 | DEATH | DEATH | 2013-08-02 | 
Once the sources are defined, the function
derive_var_dthcaus() can be used to derive
DTHCAUS:
ae_ext <- derive_vars_dtm(
  ae,
  dtc = AESTDTC,
  new_vars_prefix = "AEST",
  highest_imputation = "M",
  flag_imputation = "none"
)
adsl <- adsl %>%
  derive_var_dthcaus(src_ae, src_ds, source_datasets = list(ae = ae_ext, ds = ds_ext))

| USUBJID | EOSDT | DTHDTC | DTHDT | DTHCAUS | 
|---|---|---|---|---|
| 01-701-1211 | 2013-01-14 | 2013-01-14 | 2013-01-14 | SUDDEN DEATH | 
| 01-704-1445 | 2014-11-01 | 2014-11-01 | 2014-11-01 | COMPLETED SUICIDE | 
| 01-710-1083 | 2013-08-02 | 2013-08-02 | 2013-08-02 | MYOCARDIAL INFARCTION | 
The function also offers the option to add some traceability
variables (e.g. DTHDOM would store the domain where the
date of death is collected, and DTHSEQ would store the
xxSEQ value of that domain). To add them, the
traceability_vars argument must be added to the
dthcaus_source() arguments:
src_ae <- dthcaus_source(
  dataset_name = "ae",
  filter = AEOUT == "FATAL",
  date = AESTDTM,
  mode = "first",
  dthcaus = AEDECOD,
  traceability_vars = vars(DTHDOM = "AE", DTHSEQ = AESEQ)
)
src_ds <- dthcaus_source(
  dataset_name = "ds",
  filter = DSDECOD == "DEATH" & grepl("DEATH DUE TO", DSTERM),
  date = DSSTDT,
  mode = "first",
  dthcaus = DSTERM,
  traceability_vars = vars(DTHDOM = "DS", DTHSEQ = DSSEQ)
)
adsl <- adsl %>%
  select(-DTHCAUS) %>% # remove it before deriving it again
  derive_var_dthcaus(src_ae, src_ds, source_datasets = list(ae = ae_ext, ds = ds_ext))

| USUBJID | TRTEDT | DTHDTC | DTHDT | DTHCAUS | DTHDOM | DTHSEQ | 
|---|---|---|---|---|---|---|
| 01-701-1211 | 2013-01-12 | 2013-01-14 | 2013-01-14 | SUDDEN DEATH | AE | 9 | 
| 01-704-1445 | 2014-11-01 | 2014-11-01 | 2014-11-01 | COMPLETED SUICIDE | AE | 1 | 
| 01-710-1083 | 2013-08-01 | 2013-08-02 | 2013-08-02 | MYOCARDIAL INFARCTION | AE | 1 | 
The function derive_vars_duration() can be used to derive durations relative to death, such as the Relative Day of Death (DTHADY) or the number of days from last dose to death (LDDTHELD).
Example calls:
adsl <- adsl %>%
  derive_vars_duration(
    new_var = DTHADY,
    start_date = TRTSDT,
    end_date = DTHDT
  )

adsl <- adsl %>%
  derive_vars_duration(
    new_var = LDDTHELD,
    start_date = TRTEDT,
    end_date = DTHDT,
    add_one = FALSE
  )

| USUBJID | TRTEDT | DTHDTC | DTHDT | DTHCAUS | DTHADY | LDDTHELD | 
|---|---|---|---|---|---|---|
| 01-701-1211 | 2013-01-12 | 2013-01-14 | 2013-01-14 | SUDDEN DEATH | 61 | 2 | 
| 01-704-1445 | 2014-11-01 | 2014-11-01 | 2014-11-01 | COMPLETED SUICIDE | 175 | 0 | 
| 01-710-1083 | 2013-08-01 | 2013-08-02 | 2013-08-02 | MYOCARDIAL INFARCTION | 12 | 1 | 
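The effect of add_one can be illustrated with a short base R sketch. The helper `days_between()` is hypothetical and only covers the simple case where the start date is not after the end date; it is not the {admiral} implementation. The dates are the TRTEDT and DTHDT values from the table above.

```r
# Hypothetical helper: whole days from start to end, optionally counting
# the start day as day 1 (assumes start_date <= end_date).
days_between <- function(start_date, end_date, add_one = TRUE) {
  as.integer(end_date - start_date) + as.integer(add_one)
}

trtedt <- as.Date(c("2013-01-12", "2014-11-01", "2013-08-01"))
dthdt  <- as.Date(c("2013-01-14", "2014-11-01", "2013-08-02"))

# Days from last dose to death: a death on the day of the last dose gives 0.
lddtheld <- days_between(trtedt, dthdt, add_one = FALSE)
print(lddtheld) # 2 0 1

# With add_one = TRUE the same dates would be counted as study-day style values.
print(days_between(trtedt, dthdt)) # 3 1 2
```

A relative day such as DTHADY uses the add-one convention so that the reference date itself is day 1, whereas an elapsed-time variable such as LDDTHELD does not.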
Last Known Alive Date (LSTALVDT)

As with the cause of death (DTHCAUS), the last known alive date (LSTALVDT) can be derived from multiple sources, and the user must ensure the sources (date_source()) are correctly defined.
date_source() specifications:
- dataset_name: the name of the dataset where to search for date information,
- filter: the filter to apply on the datasets,
- date: the date of interest,
- date_imputation: whether and how to impute partial dates,
- traceability_vars: whether the traceability variables need to be added (e.g. source domain, sequence, variable).

An example could be:
ae_start_date <- date_source(
  dataset_name = "ae",
  date = AESTDT
)
ae_end_date <- date_source(
  dataset_name = "ae",
  date = AEENDT
)
lb_date <- date_source(
  dataset_name = "lb",
  date = LBDT,
  filter = !is.na(LBDT)
)
trt_end_date <- date_source(
  dataset_name = "adsl",
  date = TRTEDT
)

Once the sources are defined, the function derive_var_extreme_dt() can be used to derive LSTALVDT:
# impute AE start and end date to first
ae_ext <- ae %>%
  derive_vars_dt(
    dtc = AESTDTC,
    new_vars_prefix = "AEST",
    highest_imputation = "M"
  ) %>%
  derive_vars_dt(
    dtc = AEENDTC,
    new_vars_prefix = "AEEN",
    highest_imputation = "M"
  )
# impute LB date to first
lb_ext <- derive_vars_dt(
  lb,
  dtc = LBDTC,
  new_vars_prefix = "LB",
  highest_imputation = "M"
)
adsl <- adsl %>%
  derive_var_extreme_dt(
    new_var = LSTALVDT,
    ae_start_date, ae_end_date, lb_date, trt_end_date,
    source_datasets = list(ae = ae_ext, adsl = adsl, lb = lb_ext),
    mode = "last"
  )

| USUBJID | TRTEDT | DTHDTC | LSTALVDT | 
|---|---|---|---|
| 01-701-1015 | 2014-07-02 | | 2014-07-02 | 
| 01-701-1023 | 2012-09-01 | | 2012-09-02 | 
| 01-701-1028 | 2014-01-14 | | 2014-01-14 | 
| 01-701-1033 | 2014-03-31 | | 2014-04-14 | 
| 01-701-1034 | 2014-12-30 | | 2014-12-30 | 
| 01-701-1047 | 2013-03-09 | | 2013-04-07 | 
| 01-701-1097 | 2014-07-09 | | 2014-07-09 | 
| 01-701-1111 | 2012-09-16 | | 2012-09-17 | 
| 01-701-1115 | 2013-01-23 | | 2013-01-23 | 
| 01-701-1118 | 2014-09-09 | | 2014-09-09 | 
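Conceptually, the last known alive date is the latest non-missing date per subject across all sources. The following base R sketch shows that idea on toy data; it is not the implementation of derive_var_extreme_dt(). The TRTEDT values are taken from the table above, while the LB dates and the names `candidates`, `last_alive` and `source` are hypothetical.

```r
# Stack the candidate dates from every source into one long data frame.
candidates <- rbind(
  data.frame(
    USUBJID = c("01-701-1015", "01-701-1023"),
    date    = as.Date(c("2014-07-02", "2012-09-01")), # TRTEDT from the table
    source  = "ADSL.TRTEDT",
    stringsAsFactors = FALSE
  ),
  data.frame(
    USUBJID = c("01-701-1015", "01-701-1023"),
    date    = as.Date(c("2014-06-25", "2012-09-02")), # hypothetical LB dates
    source  = "LB.LBDTC",
    stringsAsFactors = FALSE
  )
)

# Ignore missing dates, then sort so that the latest date is last per subject.
candidates <- candidates[!is.na(candidates$date), ]
candidates <- candidates[order(candidates$USUBJID, candidates$date), ]

# Keep the last row per subject (the equivalent of mode = "last").
last_alive <- candidates[!duplicated(candidates$USUBJID, fromLast = TRUE), ]
print(last_alive)
```

Keeping the `source` column alongside the date shows where each selected value came from, which is the purpose of traceability variables.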
Similarly to dthcaus_source(), the traceability
variables can be added by specifying the traceability_vars
argument in date_source().
ae_start_date <- date_source(
  dataset_name = "ae",
  date = AESTDT,
  traceability_vars = vars(LALVDOM = "AE", LALVSEQ = AESEQ, LALVVAR = "AESTDTC")
)
ae_end_date <- date_source(
  dataset_name = "ae",
  date = AEENDT,
  traceability_vars = vars(LALVDOM = "AE", LALVSEQ = AESEQ, LALVVAR = "AEENDTC")
)
lb_date <- date_source(
  dataset_name = "lb",
  date = LBDT,
  filter = !is.na(LBDT),
  traceability_vars = vars(LALVDOM = "LB", LALVSEQ = LBSEQ, LALVVAR = "LBDTC")
)
trt_end_date <- date_source(
  dataset_name = "adsl",
  date = TRTEDTM,
  traceability_vars = vars(LALVDOM = "ADSL", LALVSEQ = NA_integer_, LALVVAR = "TRTEDTM")
)
adsl <- adsl %>%
  select(-LSTALVDT) %>% # created in the previous call
  derive_var_extreme_dt(
    new_var = LSTALVDT,
    ae_start_date, ae_end_date, lb_date, trt_end_date,
    source_datasets = list(ae = ae_ext, adsl = adsl, lb = lb_ext),
    mode = "last"
  )

| USUBJID | TRTEDT | DTHDTC | LSTALVDT | LALVDOM | LALVSEQ | LALVVAR | 
|---|---|---|---|---|---|---|
| 01-701-1015 | 2014-07-02 | | 2014-07-02 | ADSL | NA | TRTEDTM | 
| 01-701-1023 | 2012-09-01 | | 2012-09-02 | LB | 107 | LBDTC | 
| 01-701-1028 | 2014-01-14 | | 2014-01-14 | ADSL | NA | TRTEDTM | 
| 01-701-1033 | 2014-03-31 | | 2014-04-14 | LB | 107 | LBDTC | 
| 01-701-1034 | 2014-12-30 | | 2014-12-30 | ADSL | NA | TRTEDTM | 
| 01-701-1047 | 2013-03-09 | | 2013-04-07 | LB | 134 | LBDTC | 
| 01-701-1097 | 2014-07-09 | | 2014-07-09 | ADSL | NA | TRTEDTM | 
| 01-701-1111 | 2012-09-16 | | 2012-09-17 | LB | 73 | LBDTC | 
| 01-701-1115 | 2013-01-23 | | 2013-01-23 | ADSL | NA | TRTEDTM | 
| 01-701-1118 | 2014-09-09 | | 2014-09-09 | ADSL | NA | TRTEDTM | 
Grouping Variables (e.g. AGEGR1 or REGION1)

Numeric and categorical variables (AGE, RACE, COUNTRY, etc.) may need to be grouped to perform the required analysis. {admiral} does not currently have functionality to assist with all required groupings. So, the user will often need to create their own function to meet the study requirements.
For example, if
- AGEGR1 is required to categorize AGE into <18, 18-64 and >64, or
- REGION1 is required to categorize COUNTRY into North America and Rest of the World,

the user-defined functions would look like the following:
format_agegr1 <- function(var_input) {
  case_when(
    var_input < 18 ~ "<18",
    between(var_input, 18, 64) ~ "18-64",
    var_input > 64 ~ ">64",
    TRUE ~ "Missing"
  )
}
format_region1 <- function(var_input) {
  case_when(
    var_input %in% c("CAN", "USA") ~ "North America",
    !is.na(var_input) ~ "Rest of the World",
    TRUE ~ "Missing"
  )
}

These functions are then used in a mutate() statement to derive the required grouping variables:
adsl <- adsl %>%
  mutate(
    AGEGR1 = format_agegr1(AGE),
    REGION1 = format_region1(COUNTRY)
  )

| USUBJID | AGE | SEX | COUNTRY | AGEGR1 | REGION1 | 
|---|---|---|---|---|---|
| 01-701-1015 | 63 | F | USA | 18-64 | North America | 
| 01-701-1023 | 64 | M | USA | 18-64 | North America | 
| 01-701-1028 | 71 | M | USA | >64 | North America | 
| 01-701-1033 | 74 | M | USA | >64 | North America | 
| 01-701-1034 | 77 | F | USA | >64 | North America | 
| 01-701-1047 | 85 | F | USA | >64 | North America | 
| 01-701-1057 | 59 | F | USA | 18-64 | North America | 
| 01-701-1097 | 68 | M | USA | >64 | North America | 
| 01-701-1111 | 81 | F | USA | >64 | North America | 
| 01-701-1115 | 84 | M | USA | >64 | North America | 
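As an alternative sketch, the same age grouping can be expressed in base R with cut(). This is only one possible approach and assumes that age is recorded in whole years; for fractional ages between 64 and 65 it would behave differently from a between()-based rule. The ages 63, 64 and 71 come from the table above, while 17 and NA are hypothetical values added to exercise the remaining branches.

```r
age <- c(63, 64, 71, 17, NA)

# Left-closed intervals: [-Inf, 18), [18, 65), [65, Inf).
agegr1 <- as.character(
  cut(
    age,
    breaks = c(-Inf, 18, 65, Inf),
    labels = c("<18", "18-64", ">64"),
    right = FALSE
  )
)

# cut() returns NA for missing input; map it to an explicit category.
agegr1[is.na(agegr1)] <- "Missing"
print(agegr1)
```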
Population Flags (e.g. SAFFL)

Since the population flags are mainly company/study specific, no dedicated functions are provided, but in most cases they can easily be derived using derive_var_merged_exist_flag().
An example of an implementation could be:
adsl <- adsl %>%
  derive_var_merged_exist_flag(
    dataset_add = ex,
    by_vars = vars(STUDYID, USUBJID),
    new_var = SAFFL,
    condition = (EXDOSE > 0 | (EXDOSE == 0 & str_detect(EXTRT, "PLACEBO")))
  )

| USUBJID | TRTSDT | ARM | ACTARM | SAFFL | 
|---|---|---|---|---|
| 01-701-1015 | 2014-01-02 | Placebo | Placebo | Y | 
| 01-701-1023 | 2012-08-05 | Placebo | Placebo | Y | 
| 01-701-1028 | 2013-07-19 | Xanomeline High Dose | Xanomeline High Dose | Y | 
| 01-701-1033 | 2014-03-18 | Xanomeline Low Dose | Xanomeline Low Dose | Y | 
| 01-701-1034 | 2014-07-01 | Xanomeline High Dose | Xanomeline High Dose | Y | 
| 01-701-1047 | 2013-02-12 | Placebo | Placebo | Y | 
| 01-701-1057 | NA | Screen Failure | Screen Failure | NA | 
| 01-701-1097 | 2014-01-01 | Xanomeline Low Dose | Xanomeline Low Dose | Y | 
| 01-701-1111 | 2012-09-07 | Xanomeline Low Dose | Xanomeline Low Dose | Y | 
| 01-701-1115 | 2012-11-30 | Xanomeline Low Dose | Xanomeline Low Dose | Y | 
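The existence-flag logic can be sketched in base R as follows. This mimics the outcome shown above, where subjects without a qualifying exposure record get a missing flag, and is not the {admiral} implementation. All subject identifiers and exposure values in this toy example are hypothetical.

```r
# Toy subject-level and exposure data (hypothetical values).
adsl_toy <- data.frame(
  USUBJID = c("S1", "S2", "S3", "S4"),
  stringsAsFactors = FALSE
)
ex_toy <- data.frame(
  USUBJID = c("S1", "S1", "S2", "S4"),
  EXTRT   = c("PLACEBO", "PLACEBO", "XANOMELINE", "XANOMELINE"),
  EXDOSE  = c(0, 0, 0, 54),
  stringsAsFactors = FALSE
)

# A record counts as a valid dose if the dose is positive, or if it is a
# zero dose of placebo.
valid_dose <- ex_toy$EXDOSE > 0 |
  (ex_toy$EXDOSE == 0 & grepl("PLACEBO", ex_toy$EXTRT))

# Flag subjects that have at least one valid dose record.
dosed_subjects <- unique(ex_toy$USUBJID[valid_dose])
adsl_toy$SAFFL <- ifelse(adsl_toy$USUBJID %in% dosed_subjects, "Y", NA_character_)
print(adsl_toy)
```

Here S2 has an exposure record but only a zero dose of active treatment, and S3 has no exposure record at all, so neither is flagged.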
Users can add specific code to cover their analysis needs.
The following functions are helpful for many ADSL derivations:
- derive_vars_merged() - Merge Variables from a Dataset to the Input Dataset
- derive_var_merged_exist_flag() - Merge an Existence Flag
- derive_var_merged_cat() - Merge a Categorization Variable
- derive_var_merged_character() - Merge a Character Variable
- derive_var_merged_summary() - Merge a Summary Variable

See also Generic Functions.
Adding labels and attributes for SAS transport files is supported by the following packages:
- metacore: establish a common foundation for the use of metadata within an R session.
- metatools: enable the use of metacore objects. Metatools can be used to build datasets or enhance columns in existing datasets as well as checking datasets against the metadata.
- xportr: functionality to associate all metadata information to a local R data frame, perform dataset-level validation checks and convert into a transport v5 file (xpt).

NOTE: All these packages are in the experimental phase, but the vision is to have them associated with an End to End pipeline under the umbrella of the pharmaverse. An example of applying metadata and performing associated checks can be found at the pharmaverse E2E example.
| ADaM | Sample Code | 
|---|---|
| ADSL | ad_adsl.R |