To support the safety analysis, it is quite common to define specific grouping of events. One of the most common ways is to group events or medications by a specific medical concept such as a Standard MedDRA Queries (SMQs) or WHO-Drug Standardized Drug Groupings (SDGs).
To help with the derivation of these variables, the {admiral} function derive_vars_query() can be used. This function takes as input the dataset (dataset) where the grouping must occur (e.g ADAE) and a dataset containing the required information to perform the derivation of the grouping variables (dataset_queries).
The dataset passed to the dataset_queries argument of the derive_vars_query() function can be created by the create_query_data() function. For SMQs and SDGs company-specific functions for accessing the SMQ and SDG database need to be passed to the create_query_data() function (see the description of the get_smq_fun and get_sdq_fun parameter for details).
This vignette describes the expected structure and content of the dataset passed to the dataset_queries argument in the derive_vars_query() function.
| Variable | Scope | Type | Example Value |
|---|---|---|---|
| VAR_PREFIX | The prefix used to define the grouping variables | Character | “SMQ01” |
| QUERY_NAME | The value provided to the grouping variables name | Character | “Immune-Mediated Guillain-Barre Syndrome” |
| TERM_LEVEL | The variable used to define the grouping. Used in conjunction with TERM_NAME | Character | “AEDECOD” |
| TERM_NAME | A term used to define the grouping. Used in conjunction with TERM_LEVEL | Character | “GUILLAIN-BARRE SYNDROME” |
| TERM_ID | A code used to define the grouping. Used in conjunction with TERM_LEVEL | Integer | 10018767 |
| QUERY_ID | Id number of the query. This could be a SMQ identifier | Integer | 20000131 |
| QUERY_SCOPE | For SMQs, scope (Broad/Narrow) of the query | Character | BROAD, NARROW, NA |
| QUERY_SCOPE_NUM | For SMQs, scope (Broad/Narrow) of the query | Integer | 1, 2, NA |
Bold variables are required in dataset_queries: an error is issued if any of these variables is missing. Other variables are optional.
Each row must be unique within the dataset.
As described above, the variables VAR_PREFIX, QUERY_NAME, TERM_LEVEL, TERM_NAME and TERM_ID are required. The combination of these variables will allow the creation of the grouping variable.
VAR_PREFIX must be a character string starting with 2 or 3 letters, followed by a 2-digits number (e.g. “CQ01”).
QUERY_NAME must be a non missing character string and it must be unique within VAR_PREFIX.
TERM_LEVEL must be a non missing character string.
Each value in TERM_LEVEL represents a variable from dataset used to define the grouping variables (e.g. AEDECOD,AEBODSYS, AELLTCD).
The function derive_vars_query() will check that each value given in TERM_LEVEL has a corresponding variable in the input dataset and issue an error otherwise.
Different TERM_LEVEL variables may be specified within a VAR_PREFIX.
TERM_NAME must be a character string. This must be populated if TERM_ID is missing.
TERM_ID must be an integer. This must be populated if TERM_NAME is missing.
VAR_PREFIX will be used to create the grouping variable appending the suffix “NAM”. This variable will now be referred to as ABCzzNAM: the name of the grouping variable.
E.g. VAR_PREFIX == "SMQ01" will create the SMQ01NAM variable.
For each VAR_PREFIX, a new ABCzzNAM variable is created in dataset.
QUERY_NAME will be used to populate the corresponding ABCzzNAM variable.
TERM_LEVEL will be used to identify the variables from dataset used to perform the grouping (e.g. AEDECOD,AEBODSYS, AELLTCD).
TERM_NAME (for character variables), TERM_ID (for numeric variables) will be used to identify the records meeting the criteria in dataset based on the variable defined in TERM_LEVEL.
Result:
For each record in dataset, where the variable defined by TERM_LEVEL match a term from the TERM_NAME (for character variables) or TERM_ID (for numeric variables) in the datasets_queries, ABCzzNAM is populated with QUERY_NAME.
Note: The type (numeric or character) of the variable defined in TERM_LEVEL is checked in dataset. If the variable is a character variable (e.g. AEDECOD), it is expected that TERM_NAME is populated, if it is a numeric variable (e.g. AEBDSYCD), it is expected that TERM_ID is populated, otherwise an error is issued.
In this example, one standard MedDRA query (VAR_PREFIX = "SMQ01") and one customized query (VAR_PREFIX = "CQ02") are defined to analyze the adverse events.
The standard MedDRA query variable SMQ01NAM [VAR_PREFIX] will be populated with “Standard Query 1” [QUERY_NAME] if any preferred term (AEDECOD) [TERM_LEVEL] in dataset is equal to “AE1” or “AE2” [TERM_NAME]
The customized query (CQ02NAM) [VAR_PREFIX] will be populated with “Query 2” [QUERY_NAME] if any Low Level Term Code (AELLTCD) [TERM_LEVEL] in dataset is equal to 10 [TERM_ID] or any preferred term (AEDECOD) [TERM_LEVEL] in dataset is equal to “AE4” [TERM_NAME].
ds_query)| VAR_PREFIX | QUERY_NAME | TERM_LEVEL | TERM_NAME | TERM_ID | |
|---|---|---|---|---|---|
| SMQ01 | Standard Query 1 | AEDECOD | AE1 | ||
| SMQ01 | Standard Query 1 | AEDECOD | AE2 | ||
| CQ02 | Query 2 | AELLTCD | 10 | ||
| CQ02 | Query 2 | AEDECOD | AE4 |
ae)| USUBJID | AEDECOD | AELLTCD |
|---|---|---|
| 0001 | AE1 | 101 |
| 0001 | AE3 | 10 |
| 0001 | AE4 | 120 |
| 0001 | AE5 | 130 |
Generated by calling derive_vars_query(dataset = ae, dataset_queries = ds_query).
| USUBJID | AEDECOD | AELLTCD | SMQ01NAM | CQ02NAM |
|---|---|---|---|---|
| 0001 | AE1 | 101 | Standard Query 1 | |
| 0001 | AE3 | 10 | Query 2 | |
| 0001 | AE4 | 120 | Query 2 | |
| 0001 | AE5 | 130 |
Subject 0001 has one event meeting the Standard Query 1 criteria (AEDECOD = "AE1") and two events meeting the customized query (AELLTCD = 10 and AEDECOD = "AE4").
When standardized MedDRA Queries are added to the dataset, it is expected that the name of the query (ABCzzNAM) is populated along with its number code (ABCzzCD), and its Broad or Narrow scope (ABCzzSC).
The following variables can be added to queries_datset to derive this information.
QUERY_ID must be an integer.
QUERY_SCOPE must be a character string. Possible values are: “BROAD”, “NARROW” or NA.
QUERY_SCOPE_NUM must be an integer. Possible values are: 1, 2 or NA.
QUERY_ID, QUERY_SCOPE and QUERY_SCOPE_NUM will be used in the same way as QUERY_NAME (see here) and will help in the creation of the ABCzzCD, ABCzzSC and ABCzzSCN variables.These variables are optional and if not populated in dataset_queries, the corresponding output variable will not be created:
| VAR_PREFIX | QUERY_NAME | QUERY_ID | QUERY_SCOPE | QUERY_SCOPE_NUM | Variables created |
|---|---|---|---|---|---|
| SMQ01 | Query 1 | XXXXXXXX | NARROW | 2 | SMQ01NAM, SMQ01CD, SMQ01SC, SMQ01SCN |
| SMQ02 | Query 2 | XXXXXXXX | BROAD | SMQ02NAM, SMQ02CD, SMQ02SC |
|
| SMQ03 | Query 3 | XXXXXXXX | 1 | SMQ03NAM, SMQ03CD, SMQ03SCN |
|
| SMQ04 | Query 4 | XXXXXXXX | SMQ04NAM, SMQ04CD |
||
| SMQ05 | Query 5 | SMQ05NAM |