For this example we’ll use the Eunomia synthetic data from the CDMConnector package.
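# Packages providing the cohort functions, attrition summaries and the pipe used below
library(CohortConstructor)
library(CohortCharacteristics)
library(dplyr)
con <- DBI::dbConnect(duckdb::duckdb(), dbdir = CDMConnector::eunomia_dir())
cdm <- CDMConnector::cdm_from_con(con, cdm_schema = "main", 
                    write_schema = c(prefix = "my_study_", schema = "main"))
Let’s start by creating two drug cohorts, one for users of diclofenac and another for users of acetaminophen.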
cdm$medications <- conceptCohort(cdm = cdm, 
                                 conceptSet = list("diclofenac" = 1124300,
                                                   "acetaminophen" = 1127433), 
                                 name = "medications")
settings(cdm$medications)
#> # A tibble: 2 × 4
#>   cohort_definition_id cohort_name   cdm_version vocabulary_version
#>                  <int> <chr>         <chr>       <chr>             
#> 1                    1 acetaminophen 5.3         v5.0 18-JAN-19    
#> 2                    2 diclofenac    5.3         v5.0 18-JAN-19
cohortCount(cdm$medications)
#> # A tibble: 2 × 3
#>   cohort_definition_id number_records number_subjects
#>                  <int>          <int>           <int>
#> 1                    1           9365            2580
#> 2                    2            830             830
Cohort 1 contains users of acetaminophen and has 2,580 subjects and 9,365 records. Cohort 2 contains users of diclofenac and has 830 subjects and 830 records.
Individuals can contribute multiple records per cohort. Here we keep
only each individual’s earliest cohort entry using
requireIsFirstEntry() from CohortConstructor.
cdm$medications <- cdm$medications %>% 
  requireIsFirstEntry()
summary_attrition <- summariseCohortAttrition(cdm$medications)
plotCohortAttrition(summary_attrition, cohortId = 1)
The flow chart above illustrates changes to cohort 1 (acetaminophen users) when restricted to only the first record for each individual. While the number of individuals remains unchanged, 6,785 records are excluded.
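To make the first-entry logic concrete, here is a minimal base R sketch on a toy table. The data frame is hypothetical (it is not drawn from Eunomia) and the code only mimics the idea behind requireIsFirstEntry(); it is not the package's implementation.

```r
# Toy cohort table: hypothetical data for illustration only.
cohort <- data.frame(
  subject_id = c(1, 1, 1, 2, 3, 3),
  cohort_start_date = as.Date(c("2010-01-01", "2011-06-01", "2012-03-15",
                                "2009-05-20", "2013-02-01", "2014-08-10"))
)

# Sort by subject and start date, then keep the first row per subject.
cohort <- cohort[order(cohort$subject_id, cohort$cohort_start_date), ]
first_entry <- cohort[!duplicated(cohort$subject_id), ]

# Every subject keeps exactly one record, so the number of records
# dropped equals total records minus number of subjects.
n_excluded <- nrow(cohort) - nrow(first_entry)
n_excluded
```

The same arithmetic applied to the counts reported for cohort 1 gives 9,365 - 2,580 = 6,785 excluded records, which matches the flow chart.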
We can also choose a specific range of records using
requireIsEntry() from CohortConstructor.
cdm$medications <- cdm$medications %>%
  requireIsEntry(c(1,5)) 
summary_attrition <- summariseCohortAttrition(cdm$medications)
plotCohortAttrition(summary_attrition, cohortId = 1)
The flow chart above illustrates the changes to cohort 1 when restricted to only the first five records for each individual. While the number of individuals remains unchanged, 6,785 records are excluded.
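The idea of an entry range can be sketched in base R by numbering each individual's records in date order and keeping those whose number falls in the range. The toy data and the variable names (entry, entry_range) are assumptions made for illustration; this is not how requireIsEntry() is implemented internally.

```r
# Toy cohort table: subject 1 has seven records, subject 2 has two.
cohort <- data.frame(
  subject_id = c(rep(1, 7), rep(2, 2)),
  cohort_start_date = as.Date("2010-01-01") +
    c(0, 30, 60, 90, 120, 150, 180, 10, 400)
)

# Number each record within its subject in date order (1 = earliest).
cohort <- cohort[order(cohort$subject_id, cohort$cohort_start_date), ]
cohort$entry <- ave(seq_along(cohort$subject_id), cohort$subject_id,
                    FUN = seq_along)

# Keep entries 1 to 5 (both ends included) for each subject.
entry_range <- c(1, 5)
kept <- cohort[cohort$entry >= entry_range[1] &
                 cohort$entry <= entry_range[2], ]
table(kept$subject_id)
```

Subjects with five or fewer records keep all of them, so only individuals with more records than the upper bound lose any.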
It is also possible to only include the last record for each
individual using requireIsLastEntry() from
CohortConstructor.
cdm$medications <- cdm$medications %>% 
  requireIsLastEntry()
summary_attrition <- summariseCohortAttrition(cdm$medications)
plotCohortAttrition(summary_attrition, cohortId = 1)
The flow chart above illustrates changes to cohort 1 when restricted to only the last record for each individual. While the number of individuals remains unchanged, 6,785 records are excluded.
Individuals may contribute multiple records over extended periods. We
can define the study’s start and end dates, filtering out records that
fall outside the specified date range using the
requireInDateRange function from CohortConstructor.
cdm$medications <- cdm$medications %>% 
  requireInDateRange(dateRange = as.Date(c("2010-01-01", "2015-01-01")))
summary_attrition <- summariseCohortAttrition(cdm$medications)
plotCohortAttrition(summary_attrition, cohortId = 1)
The flow chart above illustrates the changes to cohort 1 when restricted to a specified date range. 1,948 individuals and 8,660 records are excluded.
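Unlike the entry-based restrictions, a date range can remove individuals as well as records. The base R sketch below shows why, using a toy table. It assumes the filter is applied to cohort_start_date and that both bounds are inclusive; both are assumptions of this illustration rather than statements about the package's behaviour.

```r
# Toy cohort table: hypothetical data for illustration only.
cohort <- data.frame(
  subject_id = c(1, 1, 2, 3),
  cohort_start_date = as.Date(c("2009-12-31", "2012-04-01",
                                "2015-01-01", "2016-07-01"))
)

date_range <- as.Date(c("2010-01-01", "2015-01-01"))

# Keep records whose start date lies inside the range
# (bounds treated as inclusive in this sketch).
in_range <- cohort$cohort_start_date >= date_range[1] &
  cohort$cohort_start_date <= date_range[2]
kept <- cohort[in_range, ]

# Subject 1 loses one record; subject 3 has no record left and drops out.
unique(kept$subject_id)
```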
Some studies require a minimum cohort size. We can set this minimum
and drop any cohort that falls below it
using the requireMinCohortCount function from
CohortConstructor.
cdm$medications <- cdm$medications %>% 
  requireMinCohortCount(minCohortCount = 1000)
summary_attrition <- summariseCohortAttrition(cdm$medications)
plotCohortAttrition(summary_attrition, cohortId = 1)
Cohort 1 includes 2,580 individuals, so none were excluded due to the minimum cohort size restriction of 1,000.
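A minimum-size restriction acts on whole cohorts rather than on individual records. The base R sketch below illustrates this on a toy table. It assumes that size is measured in distinct subjects and that a cohort exactly at the threshold is kept; the names min_cohort_count and keep_ids are hypothetical.

```r
# Toy cohort table with two cohorts: hypothetical data for illustration.
cohort <- data.frame(
  cohort_definition_id = c(1, 1, 1, 1, 2, 2),
  subject_id = c(1, 1, 2, 3, 4, 4)
)

min_cohort_count <- 3

# Count distinct subjects in each cohort.
n_subjects <- tapply(cohort$subject_id, cohort$cohort_definition_id,
                     function(x) length(unique(x)))

# Keep every record of cohorts that reach the minimum; drop the rest entirely.
keep_ids <- as.numeric(names(n_subjects)[n_subjects >= min_cohort_count])
kept <- cohort[cohort$cohort_definition_id %in% keep_ids, ]
unique(kept$cohort_definition_id)
```

Here cohort 1 has three subjects and is kept in full, while cohort 2 has a single subject and is removed.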
Multiple restrictions can be applied to a cohort, but care must be taken to apply them in the correct order, because each restriction acts on the records left by the previous one. For example, it is recommended to apply the minimum size restriction last.
cdm$medications <- cdm$medications %>% 
  requireIsFirstEntry() %>%
  requireInDateRange(dateRange = as.Date(c("2010-01-01", "2016-01-01")))
summary_attrition <- summariseCohortAttrition(cdm$medications)
plotCohortAttrition(summary_attrition, cohortId = 1)
The flow chart above illustrates the changes to cohort 1 when restricted to only include the first record of each individual over a specified date range. 2,529 individuals and 9,314 records are excluded.
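The effect of ordering can be seen on a toy table in base R. The helpers first_entry() and in_date_range() below are hypothetical stand-ins written for this sketch; they imitate the idea of the two restrictions and are not the CohortConstructor functions.

```r
# Toy cohort table: subject 1 has one record before and one inside the range.
cohort <- data.frame(
  subject_id = c(1, 1, 2),
  cohort_start_date = as.Date(c("2008-03-01", "2012-05-01", "2011-09-01"))
)
date_range <- as.Date(c("2010-01-01", "2016-01-01"))

# Hypothetical helper: keep the earliest record per subject.
first_entry <- function(x) {
  x <- x[order(x$subject_id, x$cohort_start_date), ]
  x[!duplicated(x$subject_id), ]
}

# Hypothetical helper: keep records starting inside the range (inclusive).
in_date_range <- function(x, range) {
  x[x$cohort_start_date >= range[1] & x$cohort_start_date <= range[2], ]
}

# First entry, then date range: subject 1's first record is from 2008,
# so subject 1 is lost altogether.
first_then_range <- in_date_range(first_entry(cohort), date_range)

# Date range, then first entry: subject 1's 2012 record becomes
# their first record within the study period and is kept.
range_then_first <- first_entry(in_date_range(cohort, date_range))

nrow(first_then_range)
nrow(range_then_first)
```

The first ordering answers "was the individual's first ever record inside the study period?", while the second answers "what was the individual's first record inside the study period?". Which one is appropriate depends on the study question.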