Creating ADSL

Introduction

This article describes creating an ADSL ADaM dataset specific to vaccine studies. Examples are currently presented and tested using the DM and EX SDTM domains. However, other domains could be used.

Note: All examples assume CDISC SDTM and/or ADaM format as input unless otherwise specified.

Programming Flow

Read in Data

To start, all data frames needed for the creation of ADSL should be read into the environment. This will be a company-specific process. The data frames needed may include DM and EX.

library(admiral)
library(admiralvaccine)
library(pharmaversesdtm)
library(dplyr, warn.conflicts = FALSE)
library(lubridate)
library(stringr)
library(admiraldev)

data("dm_vaccine")
data("ex_vaccine")

dm <- convert_blanks_to_na(dm_vaccine)
ex <- convert_blanks_to_na(ex_vaccine)
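As a rough mental model of the blank-to-NA step, the base R sketch below mimics it on a small made-up data frame. The helper blanks_to_na_sketch() and the toy data are hypothetical and are not part of any package; the real convert_blanks_to_na() may handle more cases, so this is only an illustration.

```r
# Hypothetical stand-in for convert_blanks_to_na(): empty strings in
# character columns become NA, all other columns are left untouched.
blanks_to_na_sketch <- function(df) {
  is_chr <- vapply(df, is.character, logical(1))
  df[is_chr] <- lapply(df[is_chr], function(x) {
    x[!is.na(x) & x == ""] <- NA_character_
    x
  })
  df
}

# Toy DM-like data (made up for illustration)
dm_toy <- data.frame(
  USUBJID = c("ABC-1001", "ABC-1002"),
  ETHNIC = c("NOT HISPANIC OR LATINO", ""),
  AGE = c(74, 70),
  stringsAsFactors = FALSE
)

dm_clean <- blanks_to_na_sketch(dm_toy)
is.na(dm_clean$ETHNIC)
```

SAS-style blank strings would otherwise be treated as valid values by filters such as !is.na(), which is why this step comes first.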

The DM domain is used as the basis for ADSL:

adsl <- dm %>%
  select(-DOMAIN)

USUBJID | RFSTDTC | COUNTRY | AGE | SEX | RACE | ETHNIC | ARM | ACTARM
ABC-1001 | 2021-11-03T10:50:00 | USA | 74 | F | WHITE | NOT HISPANIC OR LATINO | VACCINE A VACCINE B | VACCINE A VACCINE B
ABC-1002 | 2021-10-07T12:48:00 | USA | 70 | F | BLACK OR AFRICAN AMERICAN | HISPANIC OR LATINO | VACCINE A VACCINE B | VACCINE A VACCINE B

Derive Period, Subperiod, and Phase Variables (e.g. APxxSDT, APxxEDT, …)

The {admiral} core package has separate functions to handle period variables, since these variables are study-specific.

See the “Visit and Period Variables” vignette for more information.

If the variables are not derived based on a period reference dataset, they may be derived at a later point of the flow. For example, phases like “Treatment Phase” and “Follow up” could be derived based on treatment start and end date.

Derive Treatment Variables (TRT0xP, TRT0xA)

The mapping of the treatment variables is left to the ADaM programmer. An example mapping for a study without periods may be:

# Planned treatments are taken from the two halves of ARM,
# e.g. "VACCINE A VACCINE B" gives "VACCINE A" and "VACCINE B"
adsl <- adsl %>%
  mutate(
    TRT01P = substring(ARM, 1, 9),
    TRT02P = substring(ARM, 11, 100)
  ) %>%
  # Actual treatments are taken from the EX record of each vaccination
  derive_vars_merged(
    dataset_add = ex,
    filter_add = EXLNKGRP == "VACCINATION 1",
    new_vars = exprs(TRT01A = EXTRT),
    by_vars = exprs(STUDYID, USUBJID)
  ) %>%
  derive_vars_merged(
    dataset_add = ex,
    filter_add = EXLNKGRP == "VACCINATION 2",
    new_vars = exprs(TRT02A = EXTRT),
    by_vars = exprs(STUDYID, USUBJID)
  )
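The fixed character positions in this mapping depend on the layout of ARM. As a quick check, assuming ARM values look like "VACCINE A VACCINE B" as in the sample data, the base R sketch below shows what each substring() call picks up. The variable arm is only an illustrative stand-in for one ARM value.

```r
# One ARM value as it appears in the sample data
arm <- "VACCINE A VACCINE B"

# Characters 1 to 9 hold the first vaccine, character 10 is the separator,
# and everything from character 11 onwards holds the second vaccine
trt01p <- substring(arm, 1, 9)
trt02p <- substring(arm, 11, 100)

trt01p
trt02p
```

If the arm names in a study have different lengths, the positions would need to be adapted, or the mapping replaced by an explicit lookup.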

Derive/Impute Numeric Treatment Date/Time and Duration (TRTSDTM, TRTEDTM, TRTDURD)

The function derive_vars_merged() can be used to derive the treatment start and end date/times using the ex domain. A pre-processing step for ex is required to convert the variables EXSTDTC and EXENDTC to datetime variables and impute missing date or time components. Conversion and imputation are done by derive_vars_dtm().
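To make the idea of time imputation concrete, here is a minimal base R sketch of a "first"/"last" rule applied to partial ISO 8601 datetimes. The helper impute_time() is hypothetical and is not how derive_vars_dtm() is implemented; it only illustrates the rule that collected time components are kept, missing ones are filled in, and incomplete dates are left missing.

```r
# Hypothetical helper (not part of {admiral}): fill missing time components
# of an ISO 8601 string with the first or last possible value.
impute_time <- function(dtc, time_imputation = c("first", "last")) {
  time_imputation <- match.arg(time_imputation)
  pad <- if (time_imputation == "first") "00:00:00" else "23:59:59"

  # Split into the date part and the (possibly partial or empty) time part
  date_part <- substring(dtc, 1, 10)
  time_part <- substring(dtc, 12)

  # Keep the collected components and take the remainder from the padding
  full_time <- paste0(time_part, substring(pad, nchar(time_part) + 1, 8))

  # Incomplete dates are not imputed in this sketch
  ifelse(nchar(date_part) == 10, paste(date_part, full_time), NA_character_)
}

dtc <- c("2021-12-30", "2021-12-30T09:10", "2021-12")
imputed_first <- impute_time(dtc, "first")
imputed_last <- impute_time(dtc, "last")
imputed_last
```

In this sketch a start datetime would use "first" and an end datetime "last", so that an exposure interval with unknown times covers the whole day.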

Example calls:

# Convert EXSTDTC and EXENDTC to datetimes; impute a missing start time to
# the first and a missing end time to the last possible time, do not impute dates
ex_ext <- ex %>%
  derive_vars_dtm(
    dtc = EXSTDTC,
    new_vars_prefix = "EXST"
  ) %>%
  derive_vars_dtm(
    dtc = EXENDTC,
    new_vars_prefix = "EXEN",
    time_imputation = "last"
  )

adsl <- adsl %>%
  derive_vars_merged(
    dataset_add = ex_ext,
    filter_add = (EXDOSE > 0 |
      (EXDOSE == 0 &
        str_detect(EXTRT, "VACCINE"))) &
      !is.na(EXSTDTM),
    new_vars = exprs(TRTSDTM = EXSTDTM),
    order = exprs(EXSTDTM, EXSEQ),
    mode = "first",
    by_vars = exprs(STUDYID, USUBJID)
  ) %>%
  derive_vars_merged(
    dataset_add = ex_ext,
    filter_add = (EXDOSE > 0 |
      (EXDOSE == 0 &
        str_detect(EXTRT, "VACCINE"))) & !is.na(EXENDTM),
    new_vars = exprs(TRTEDTM = EXENDTM),
    order = exprs(EXENDTM, EXSEQ),
    mode = "last",
    by_vars = exprs(STUDYID, USUBJID)
  )

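These calls return the original data frame with the columns TRTSDTM and TRTEDTM added. To carry the time imputation flags as well, the flag variables created by derive_vars_dtm() (EXSTTMF, EXENTMF) can be added to new_vars, for example as TRTSTMF and TRTETMF. Exposure observations with an incomplete date are ignored, as are zero-dose records whose EXTRT does not contain "VACCINE". Missing time parts are imputed as first or last for the start and end date respectively.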
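The selection logic behind the first of these merges (filter the exposure records, sort them, keep the first one per subject, then join) can be sketched with plain {dplyr}. The data frames ex_toy and adsl_toy are made up for illustration, and the sketch is only an approximation of what derive_vars_merged() does with filter_add, order, and mode = "first".

```r
library(dplyr, warn.conflicts = FALSE)

# Toy exposure data (made up): one zero-dose non-vaccine record and
# one record with a missing datetime
ex_toy <- data.frame(
  USUBJID = c("ABC-1001", "ABC-1001", "ABC-1001", "ABC-1002"),
  EXSEQ = c(1, 2, 3, 1),
  EXTRT = c("VACCINE A", "VACCINE B", "SALINE", "VACCINE A"),
  EXDOSE = c(0.5, 0, 0, 0.5),
  EXSTDTM = as.POSIXct(
    c("2021-11-03 10:50:00", "2021-12-30 09:10:00", "2022-01-15 08:00:00", NA),
    tz = "UTC"
  )
)
adsl_toy <- data.frame(USUBJID = c("ABC-1001", "ABC-1002"))

# Same condition as filter_add in the calls above
eligible <- ex_toy %>%
  filter(
    (EXDOSE > 0 | (EXDOSE == 0 & grepl("VACCINE", EXTRT))) & !is.na(EXSTDTM)
  )

# order = exprs(EXSTDTM, EXSEQ) with mode = "first", per subject
trtsdtm <- eligible %>%
  arrange(USUBJID, EXSTDTM, EXSEQ) %>%
  group_by(USUBJID) %>%
  slice_head(n = 1) %>%
  ungroup() %>%
  select(USUBJID, TRTSDTM = EXSTDTM)

# Subjects without an eligible record keep a missing TRTSDTM
adsl_out <- left_join(adsl_toy, trtsdtm, by = "USUBJID")
adsl_out
```

The end datetime follows the same pattern with the end variable and slice_tail(n = 1) in place of slice_head(n = 1).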

The datetime variables returned can be converted to dates using the derive_vars_dtm_to_dt() function.

adsl <- adsl %>%
  derive_vars_dtm_to_dt(source_vars = exprs(TRTSDTM, TRTEDTM))

Now that TRTSDT and TRTEDT are derived, the function derive_var_trtdurd() can be used to calculate the treatment duration (TRTDURD).

adsl <- adsl %>%
  derive_var_trtdurd()

USUBJID | RFSTDTC | TRTSDTM | TRTSDT | TRTEDTM | TRTEDT | TRTDURD
ABC-1001 | 2021-11-03T10:50:00 | 2021-11-03 10:50:00 | 2021-11-03 | 2021-12-30 09:10:00 | 2021-12-30 | 58
ABC-1002 | 2021-10-07T12:48:00 | 2021-10-07 12:48:00 | 2021-10-07 | 2021-12-16 12:41:00 | 2021-12-16 | 71
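The values in the output above can be reproduced by hand. The base R sketch below assumes that the duration is computed as end date minus start date plus one day, which matches the sample output; the vectors are copied from the two subjects shown.

```r
# Datetimes of the two sample subjects
trtsdtm <- as.POSIXct(c("2021-11-03 10:50:00", "2021-10-07 12:48:00"), tz = "UTC")
trtedtm <- as.POSIXct(c("2021-12-30 09:10:00", "2021-12-16 12:41:00"), tz = "UTC")

# Datetime to date: the time part is dropped
trtsdt <- as.Date(trtsdtm, tz = "UTC")
trtedt <- as.Date(trtedtm, tz = "UTC")

# Duration in days, counting both the first and the last day
trtdurd <- as.integer(trtedt - trtsdt) + 1L
trtdurd
# 58 71
```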

Population Flags (e.g. SAFFL)

Since population flags are mainly company- or study-specific, no dedicated functions are provided, but in most cases they can easily be derived using derive_var_merged_exist_flag().

An example of an implementation could be:

adsl <- derive_var_merged_exist_flag(
  dataset = adsl,
  dataset_add = ex,
  by_vars = exprs(STUDYID, USUBJID),
  new_var = SAFFL,
  condition = (EXDOSE > 0 | (EXDOSE == 0 & str_detect(EXTRT, "VACCINE")))
) %>%
  mutate(
    PPROTFL = "Y"
  )

USUBJID | TRTSDT | ARM | ACTARM | SAFFL | PPROTFL
ABC-1001 | 2021-11-03 | VACCINE A VACCINE B | VACCINE A VACCINE B | Y | Y
ABC-1002 | 2021-10-07 | VACCINE A VACCINE B | VACCINE A VACCINE B | Y | Y
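What such an existence flag amounts to can be sketched with plain {dplyr}: a subject is flagged "Y" if at least one exposure record meets the condition, and stays missing otherwise. The toy data frames below are made up, and the sketch only approximates the default behaviour of derive_var_merged_exist_flag().

```r
library(dplyr, warn.conflicts = FALSE)

# Toy data (made up): ABC-1002 only has a zero-dose non-vaccine record,
# ABC-1003 has no exposure records at all
adsl_toy <- data.frame(USUBJID = c("ABC-1001", "ABC-1002", "ABC-1003"))
ex_toy <- data.frame(
  USUBJID = c("ABC-1001", "ABC-1001", "ABC-1002"),
  EXTRT = c("VACCINE A", "VACCINE B", "SALINE"),
  EXDOSE = c(0.5, 0, 0)
)

# One row per subject: "Y" if any record fulfils the condition
flags <- ex_toy %>%
  group_by(USUBJID) %>%
  summarise(
    SAFFL = if_else(
      any(EXDOSE > 0 | (EXDOSE == 0 & grepl("VACCINE", EXTRT))),
      "Y",
      NA_character_
    )
  )

adsl_flagged <- left_join(adsl_toy, flags, by = "USUBJID")
adsl_flagged
```

If "N" is wanted instead of a missing value, derive_var_merged_exist_flag() offers the false_value and missing_value arguments for that purpose.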

Derive Vaccination Date Variables

In this step, we create vaccination date variables from the EX domain. The function derive_vars_vaxdt() adds the variables VAX01DT, VAX02DT, … to the adsl dataset, one per vaccination.

If a subject has multiple vaccination records for the same visit, a warning is issued and only the first observation is kept, based on the variable order specified in the order argument. In this case, the user needs to select the by_vars appropriately.

adsl <- derive_vars_vaxdt(
  dataset = ex,
  dataset_adsl = adsl,
  by_vars = exprs(USUBJID, VISITNUM),
  order = exprs(USUBJID, VISITNUM, VISIT, EXSTDTC)
)

USUBJID | VAX01DT | VAX02DT
ABC-1001 | 2021-11-03 | 2021-12-30
ABC-1002 | 2021-10-07 | 2021-12-16

This call returns the adsl dataset with the columns VAX01DT and VAX02DT added.
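The reshaping behind this step (first record per subject and visit, then one date column per vaccination) can be sketched with plain {dplyr}. The toy data are made up and the sketch is limited to two vaccinations; it is an approximation of the idea, not the implementation of derive_vars_vaxdt().

```r
library(dplyr, warn.conflicts = FALSE)

# Toy exposure data (made up): ABC-1001 has two records at visit 1,
# ABC-1002 received only one vaccination
ex_toy <- data.frame(
  USUBJID = c("ABC-1001", "ABC-1001", "ABC-1001", "ABC-1002"),
  VISITNUM = c(1, 1, 2, 1),
  EXSTDTC = c(
    "2021-11-03T10:50:00", "2021-11-03T11:30:00",
    "2021-12-30T09:10:00", "2021-10-07T12:48:00"
  )
)

vax <- ex_toy %>%
  # Date part of the ISO 8601 start datetime
  mutate(EXSTDT = as.Date(substr(EXSTDTC, 1, 10))) %>%
  # Keep the first record per subject and visit
  arrange(USUBJID, VISITNUM, EXSTDTC) %>%
  distinct(USUBJID, VISITNUM, .keep_all = TRUE) %>%
  # One column per vaccination; a missing vaccination gives NA
  group_by(USUBJID) %>%
  summarise(VAX01DT = EXSTDT[1], VAX02DT = EXSTDT[2])

vax
```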

Create Period Variables (Study Specific)

In this step, we create period variables. These are study-specific, so users can change the logic to match their study requirements.

adsl <- adsl %>%
  mutate(
    AP01SDT = VAX01DT,
    AP01EDT = if_else(!is.na(VAX02DT), VAX02DT - 1, as.Date(RFPENDTC)),
    AP02SDT = if_else(!is.na(VAX02DT), VAX02DT, NA_Date_),
    AP02EDT = if_else(!is.na(AP02SDT), as.Date(RFPENDTC), NA_Date_)
  )

USUBJID | AP01SDT | AP01EDT | AP02SDT | AP02EDT
ABC-1001 | 2021-11-03 | 2021-12-29 | 2021-12-30 | 2022-04-27
ABC-1002 | 2021-10-07 | 2021-12-15 | 2021-12-16 | 2022-06-14

This call would return the input dataset with columns AP01SDT, AP01EDT, AP02SDT, AP02EDT added.
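Both sample subjects received two vaccinations, so the fallback branches of this logic are not visible in the output. The sketch below applies the same rules to a made-up subject (ABC-2001, hypothetical) who received only one vaccination; base as.Date(NA) is used in place of NA_Date_ so that only {dplyr} is needed.

```r
library(dplyr, warn.conflicts = FALSE)

# Toy ADSL (made up): ABC-2001 has no second vaccination
adsl_toy <- data.frame(
  USUBJID = c("ABC-1001", "ABC-2001"),
  VAX01DT = as.Date(c("2021-11-03", "2021-10-07")),
  VAX02DT = as.Date(c("2021-12-30", NA)),
  RFPENDTC = c("2022-04-27", "2022-06-14")
)

periods <- adsl_toy %>%
  mutate(
    AP01SDT = VAX01DT,
    # Period 1 ends the day before the second vaccination,
    # or at the end of participation if there is none
    AP01EDT = if_else(!is.na(VAX02DT), VAX02DT - 1, as.Date(RFPENDTC)),
    # Period 2 exists only if a second vaccination was given
    AP02SDT = VAX02DT,
    AP02EDT = if_else(!is.na(AP02SDT), as.Date(RFPENDTC), as.Date(NA))
  )

periods
```

With these rules the periods are contiguous: where a second period exists, it starts the day after the first one ends.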

Derive Other Variables

Users can add study-specific code to cover further analysis needs.

The functions used throughout this article, derive_vars_merged() and derive_var_merged_exist_flag(), are helpful for many ADSL derivations.

See also Generic Functions.

Add Labels and Attributes

Adding labels and attributes for SAS transport files is supported by pharmaverse packages such as metacore, metatools, and xportr.

NOTE: All these packages are in the experimental phase, but the vision is to have them associated with an End to End pipeline under the umbrella of the pharmaverse. An example of applying metadata and performing associated checks can be found at the pharmaverse E2E example.
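Independently of any particular package, a variable label in R is conventionally stored as a "label" attribute on the column. The base R sketch below sets labels by hand on a made-up data frame; the label texts and the var_labels lookup are illustrative assumptions, and in practice the labels would come from the study specification.

```r
# Toy ADSL (made up)
adsl_toy <- data.frame(
  USUBJID = c("ABC-1001", "ABC-1002"),
  SAFFL = c("Y", "Y")
)

# Hypothetical lookup of variable labels
var_labels <- c(
  USUBJID = "Unique Subject Identifier",
  SAFFL = "Safety Population Flag"
)

# Attach each label as a "label" attribute of the corresponding column
for (var in names(var_labels)) {
  attr(adsl_toy[[var]], "label") <- var_labels[[var]]
}

attr(adsl_toy$SAFFL, "label")
```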

Example Script

ADaM | Sample Code
ADSL | ad_adsl.R