Determines whether the required data elements in a data standard are found in a given data frame

evaluateStandard(data, meta, domain, standard)

Arguments

data

A data frame in which to detect the data standard

meta

The metadata containing the data standards.

domain

The domain to evaluate. Should match a value in meta$domain.

standard

The standard to evaluate, for example "adam" or "sdtm".
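
Because domain is expected to match a value in meta$domain, it can help to check that before calling the function. The following is a minimal sketch under stated assumptions: toy_meta is a hypothetical stand-in for real metadata and check_domain is a hypothetical helper, neither of which is part of the package.

```r
# Hypothetical metadata with a `domain` column (stand-in for real metadata)
toy_meta <- data.frame(
  domain   = c("labs", "labs", "aes"),
  text_key = c("id_col", "value_col", "id_col"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: TRUE when `domain` is one of the values in meta$domain
check_domain <- function(meta, domain) {
  domain %in% unique(meta$domain)
}

check_domain(toy_meta, "labs")    # domain is present in toy_meta
check_domain(toy_meta, "vitals")  # domain is absent from toy_meta
```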

Value

A list describing how closely the data set matches the data standard. The field names given here follow the example output.

"standard" is the name of the standard that was evaluated, and "label" is a display label for the result (for example "ADaM" or "Partial SDTM"). "match" describes compliance with the standard as "full", "partial" or "none". "total_count", "valid_count" and "invalid_count" give the number of data elements checked, found and not found. "match_percent" is calculated as valid_count/total_count, so it is a proportion between 0 and 1. "details" summarizes the counts as text, for example "(4/9 cols/fields matched)"; in the examples it appears for the partial match but not for the full match.

"mapping" is a data frame describing the detected standard for each "text_key" in the provided metadata. Its columns are "text_key", "current" (the name of the matched column or field value in the data, or NA when nothing matched) and "valid" (a logical indicating whether the data matches the standard for that element).
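
The relationship between the counts, "match_percent" and "match" can be sketched in base R. This is an illustration, not the package's implementation: toy_mapping and summarize_mapping are hypothetical, the field names follow the example output, and the rule used for "full", "partial" and "none" (all valid, some valid, none valid) is an assumption.

```r
# Hypothetical mapping with the documented columns
toy_mapping <- data.frame(
  text_key = c("id_col", "value_col", "measure_col"),
  current  = c("USUBJID", "LBSTRESN", NA),
  valid    = c(TRUE, TRUE, FALSE),
  stringsAsFactors = FALSE
)

# Hypothetical helper: derive the summary fields from the `valid` column
summarize_mapping <- function(mapping) {
  total_count   <- nrow(mapping)
  valid_count   <- sum(mapping$valid)
  invalid_count <- total_count - valid_count

  # Assumed rule: all valid is "full", none valid is "none", otherwise "partial"
  match <- if (valid_count == total_count) {
    "full"
  } else if (valid_count == 0) {
    "none"
  } else {
    "partial"
  }

  list(
    total_count   = total_count,
    valid_count   = valid_count,
    invalid_count = invalid_count,
    match_percent = valid_count / total_count,
    match         = match
  )
}

res <- summarize_mapping(toy_mapping)
res$match          # "partial": 2 of 3 elements are valid
res$match_percent  # 2/3
```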

Examples

# Full match: all 8 ADaM data elements are found in the ADaM lab data
evaluateStandard(
 data=safetyData::adam_adlbc, 
 meta=safetyCharts::meta_labs, 
 domain="labs", 
 standard="adam"
) 
#> $standard
#> [1] "adam"
#> 
#> $mapping
#> # A tibble: 8 × 3
#> # Rowwise: 
#>   text_key        current  valid
#>   <chr>           <chr>    <lgl>
#> 1 id_col          USUBJID  TRUE 
#> 2 value_col       AVAL     TRUE 
#> 3 measure_col     PARAM    TRUE 
#> 4 normal_col_low  A1LO     TRUE 
#> 5 normal_col_high A1HI     TRUE 
#> 6 studyday_col    ADY      TRUE 
#> 7 visit_col       VISIT    TRUE 
#> 8 visitn_col      VISITNUM TRUE 
#> 
#> $total_count
#> [1] 8
#> 
#> $valid_count
#> [1] 8
#> 
#> $invalid_count
#> [1] 0
#> 
#> $match_percent
#> [1] 1
#> 
#> $match
#> [1] "full"
#> 
#> $label
#> [1] "ADaM"
#> 

# Partial match: only 4 of the 9 SDTM data elements are found in the ADaM lab data
evaluateStandard(
 data=safetyData::adam_adlbc, 
 meta=safetyCharts::meta_labs, 
 domain="labs", 
 standard="sdtm"
) 
#> $standard
#> [1] "sdtm"
#> 
#> $mapping
#> # A tibble: 9 × 3
#> # Rowwise: 
#>   text_key        current  valid
#>   <chr>           <chr>    <lgl>
#> 1 id_col          USUBJID  TRUE 
#> 2 value_col       LBSTRESN TRUE 
#> 3 measure_col     NA       FALSE
#> 4 normal_col_low  NA       FALSE
#> 5 normal_col_high NA       FALSE
#> 6 studyday_col    NA       FALSE
#> 7 visit_col       VISIT    TRUE 
#> 8 visitn_col      VISITNUM TRUE 
#> 9 unit_col        NA       FALSE
#> 
#> $total_count
#> [1] 9
#> 
#> $valid_count
#> [1] 4
#> 
#> $invalid_count
#> [1] 5
#> 
#> $match_percent
#> [1] 0.4444444
#> 
#> $match
#> [1] "partial"
#> 
#> $label
#> [1] "Partial SDTM"
#> 
#> $details
#> [1] "(4/9 cols/fields matched)"
#>
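
When the match is partial, the "mapping" element shows which data elements were not found. The sketch below rebuilds the SDTM mapping printed above by hand (res_sdtm is a hypothetical stand-in, so the snippet runs without the packages) and extracts the unmatched text_key values.

```r
# Hand-built stand-in for the list returned in the SDTM example
res_sdtm <- list(
  match = "partial",
  mapping = data.frame(
    text_key = c("id_col", "value_col", "measure_col", "normal_col_low",
                 "normal_col_high", "studyday_col", "visit_col",
                 "visitn_col", "unit_col"),
    current  = c("USUBJID", "LBSTRESN", NA, NA, NA, NA,
                 "VISIT", "VISITNUM", NA),
    valid    = c(TRUE, TRUE, FALSE, FALSE, FALSE, FALSE, TRUE, TRUE, FALSE),
    stringsAsFactors = FALSE
  )
)

# text_key values with no matching column or field in the data
unmatched <- res_sdtm$mapping$text_key[!res_sdtm$mapping$valid]
unmatched
#> [1] "measure_col"     "normal_col_low"  "normal_col_high" "studyday_col"
#> [5] "unit_col"
```

With a real result, the same subsetting applies to the list returned by evaluateStandard().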