Mapping and Executing a List of Functions in R

Jul 17, 2021

Calling an arbitrary list of functions on an object can be a useful data analysis pattern. There are a number of ways to do this in R. The demonstration below will walk through a basic scenario using exec() and the map_* family of functions from the purrr package.
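As a minimal sketch of the pattern before bringing in twoxtwo, the snippet below applies a named list of base R summary functions to one numeric vector. The vector `x` and the list `summary_funs` are made up for illustration and are not part of the original example.

```r
library(purrr)

# Hypothetical input and function list, for illustration only
x <- c(2, 4, 4, 10)
summary_funs <- list(mean = mean, median = median, max = max)

# map_dbl() iterates over the functions; exec() calls each one with x
results <- map_dbl(summary_funs, exec, x)
results
```

The result is a named numeric vector with one element per function, because each function here returns a single number.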

The example will use functions from the twoxtwo package, which provides an interface for performing epidemiological data analysis with two-by-two tables.

First load twoxtwo and tidyverse (for purrr, dplyr, and tidyr):

library(twoxtwo)
library(tidyverse)

Next create a data set to motivate the demonstration. The counts are expanded into observation-level data (one row per individual) with binary exposure and outcome variables:

dat <-
  tribble(~exposed, ~diseased, ~n,
          TRUE, TRUE, 250,
          TRUE, FALSE, 250,
          FALSE, TRUE, 50,
          FALSE, FALSE, 450) %>%
  uncount(n)

dat

## # A tibble: 1,000 x 2
##    exposed diseased
##    <lgl>   <lgl>   
##  1 TRUE    TRUE    
##  2 TRUE    TRUE    
##  3 TRUE    TRUE    
##  4 TRUE    TRUE    
##  5 TRUE    TRUE    
##  6 TRUE    TRUE    
##  7 TRUE    TRUE    
##  8 TRUE    TRUE    
##  9 TRUE    TRUE    
## 10 TRUE    TRUE    
## # … with 990 more rows
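To make the expansion step concrete, here is a rough base R equivalent of what uncount() does with these counts: each row is repeated `n` times and the count column is dropped. The object names `counts` and `expanded` are assumptions for this sketch.

```r
# Same cell counts as above, stored in a plain data frame
counts <- data.frame(
  exposed  = c(TRUE, TRUE, FALSE, FALSE),
  diseased = c(TRUE, FALSE, TRUE, FALSE),
  n        = c(250, 250, 50, 450)
)

# Repeat each row index n times, then keep only the two binary columns
row_index <- rep(seq_len(nrow(counts)), times = counts$n)
expanded <- counts[row_index, c("exposed", "diseased")]
rownames(expanded) <- NULL

# Cross-tabulating recovers the original cell counts
cell_counts <- table(exposed = expanded$exposed, diseased = expanded$diseased)
nrow(expanded)
cell_counts
```

Cross-tabulating the expanded rows is a quick way to confirm that no observations were lost or duplicated.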

The twoxtwo data structure summarizes binary exposures and outcomes as two-by-two counts:

dat %>%
  twoxtwo(exposure = exposed, outcome = diseased)

## |         |              |OUTCOME       |OUTCOME        |
## |:--------|:-------------|:-------------|:--------------|
## |         |              |diseased=TRUE |diseased=FALSE |
## |EXPOSURE |exposed=TRUE  |250           |250            |
## |EXPOSURE |exposed=FALSE |50            |450            |

One can then use functions on the object to conduct analyses such as computing measures of effect:

dat %>%
  twoxtwo(exposure = exposed, outcome = diseased) %>%
  odds_ratio()

## # A tibble: 1 x 6
##   measure    estimate ci_lower ci_upper exposure            outcome             
##   <chr>         <dbl>    <dbl>    <dbl> <chr>               <chr>               
## 1 Odds Ratio        9     6.40     12.7 exposed::TRUE/FALSE diseased::TRUE/FALSE

dat %>%
  twoxtwo(exposure = exposed, outcome = diseased) %>%
  risk_ratio()

## # A tibble: 1 x 6
##   measure    estimate ci_lower ci_upper exposure            outcome             
##   <chr>         <dbl>    <dbl>    <dbl> <chr>               <chr>               
## 1 Risk Ratio        5     3.79     6.60 exposed::TRUE/FALSE diseased::TRUE/FALSE

dat %>%
  twoxtwo(exposure = exposed, outcome = diseased) %>%
  risk_diff()

## # A tibble: 1 x 6
##   measure        estimate ci_lower ci_upper exposure           outcome          
##   <chr>             <dbl>    <dbl>    <dbl> <chr>              <chr>            
## 1 Risk Differen…      0.4    0.349    0.451 exposed::TRUE/FAL… diseased::TRUE/F…
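The point estimates above can be checked by hand from the four cell counts using the standard definitions of each measure. This sketch covers only the estimates, not the confidence intervals, and the variable names are chosen here for illustration.

```r
# Cell counts from the two-by-two table
exposed_cases      <- 250
exposed_noncases   <- 250
unexposed_cases    <- 50
unexposed_noncases <- 450

# Risk of disease in each exposure group
risk_exposed   <- exposed_cases / (exposed_cases + exposed_noncases)
risk_unexposed <- unexposed_cases / (unexposed_cases + unexposed_noncases)

# Odds ratio: cross-product of the table cells
or_estimate <- (exposed_cases * unexposed_noncases) /
  (exposed_noncases * unexposed_cases)

# Risk ratio and risk difference compare the two group risks
rr_estimate <- risk_exposed / risk_unexposed
rd_estimate <- risk_exposed - risk_unexposed

c(odds_ratio = or_estimate, risk_ratio = rr_estimate, risk_diff = rd_estimate)
```

These match the 9, 5, and 0.4 reported in the `estimate` column of the twoxtwo output.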

Rather than calling each of these functions separately, it might be preferable to pass them as a list to be invoked over the object in one pass.

Note that each of the twoxtwo effect measure functions returns a tibble with the same column names, so the outputs can be stacked on top of one another in a single data frame.
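The stacking idea does not depend on purrr. As one of the other ways to do this in R, the sketch below uses base lapply() and rbind() with two hypothetical helper functions (`describe_mean` and `describe_median`, invented here) that each return a one-row data frame with the same columns.

```r
# Hypothetical helpers: same input, same output columns
describe_mean <- function(x) {
  data.frame(measure = "Mean", estimate = mean(x))
}
describe_median <- function(x) {
  data.frame(measure = "Median", estimate = median(x))
}

x <- c(1, 2, 3, 10)

# Call every function on x, collecting the one-row data frames in a list
pieces <- lapply(list(describe_mean, describe_median), function(f) f(x))

# Stack the pieces; this works because the column names agree
stacked <- do.call(rbind, pieces)
stacked
```

If the functions returned different column names, rbind() would fail, which is why the shared output structure matters.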

To do so, create a named list of the functions to be passed:

measure_funs <- c(odds_ratio = odds_ratio, 
                  risk_ratio = risk_ratio, 
                  risk_diff = risk_diff)
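Since functions are not atomic values, c() cannot build an atomic vector from them and returns a list instead, so the object above behaves like one created with list(). A small check using base functions as stand-ins (the name `funs_c` is made up for this sketch):

```r
# c() applied to functions produces a named list
funs_c <- c(mean = mean, median = median)

is_a_list <- is.list(funs_c)
fun_names <- names(funs_c)

# Elements can be pulled out by name and called like any function
median_value <- funs_c$median(c(1, 3, 20))

is_a_list
fun_names
median_value
```

Writing list(odds_ratio = odds_ratio, ...) would be an equally valid, and arguably more explicit, way to express the same thing.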

Next wrap the twoxtwo object in a list, so that it becomes the single element of a length-one list:

dat %>%
  twoxtwo(., exposure = exposed, outcome = diseased) %>%
  list(.)

## [[1]]
## |         |              |OUTCOME       |OUTCOME        |
## |:--------|:-------------|:-------------|:--------------|
## |         |              |diseased=TRUE |diseased=FALSE |
## |EXPOSURE |exposed=TRUE  |250           |250            |
## |EXPOSURE |exposed=FALSE |50            |450            |
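The `[[1]]` in the output shows that list() wraps the whole object as one element rather than breaking it apart. The sketch below illustrates the difference from as.list() using a plain data frame as a stand-in, since the internal structure of a twoxtwo object is not assumed here.

```r
# A small data frame standing in for a more complex object
df <- data.frame(exposed = c(TRUE, FALSE), diseased = c(TRUE, TRUE))

# list() wraps the object: one element, the data frame itself
wrapped <- list(df)
length(wrapped)

# as.list() converts the object: one element per column
converted <- as.list(df)
length(converted)
names(converted)
```

Mapping over `wrapped` would call the function once with the whole object, while mapping over `converted` would call it once per column.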

The list can then be passed to a map_* function from purrr. In this case the desired output is a single tibble, so map_df(), which row-binds the results of each call, is a natural fit. The operation to be mapped is an anonymous function that performs another map_df() over the list of analysis functions and calls exec() on each one, passing the current list element (.x) as the argument:

dat %>%
  twoxtwo(., exposure = exposed, outcome = diseased) %>%
  list(.) %>%
  map_df(~ measure_funs %>% map_df(exec, .x))

## # A tibble: 3 x 6
##   measure        estimate ci_lower ci_upper exposure           outcome          
##   <chr>             <dbl>    <dbl>    <dbl> <chr>              <chr>            
## 1 Odds Ratio          9      6.40    12.7   exposed::TRUE/FAL… diseased::TRUE/F…
## 2 Risk Ratio          5      3.79     6.60  exposed::TRUE/FAL… diseased::TRUE/F…
## 3 Risk Differen…      0.4    0.349    0.451 exposed::TRUE/FAL… diseased::TRUE/F…
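The nested structure above has two levels: the outer map iterates over objects and the inner map iterates over functions. With only one object the outer level runs once, so the wrapper mainly pays off when there are several objects to analyze. The sketch below shows both cases using hypothetical stand-ins (`toy_odds_ratio`, `toy_risk_ratio`, and a second set of counts, `study_2`, all invented for illustration) rather than twoxtwo itself.

```r
library(purrr)

# Hypothetical stand-ins for effect measure functions. Each takes a named
# vector of cell counts and returns a one-row data frame.
toy_odds_ratio <- function(n) {
  data.frame(measure = "Odds Ratio",
             estimate = (n[["a"]] * n[["d"]]) / (n[["b"]] * n[["c"]]))
}
toy_risk_ratio <- function(n) {
  risk_exposed   <- n[["a"]] / (n[["a"]] + n[["b"]])
  risk_unexposed <- n[["c"]] / (n[["c"]] + n[["d"]])
  data.frame(measure = "Risk Ratio", estimate = risk_exposed / risk_unexposed)
}
toy_funs <- list(odds_ratio = toy_odds_ratio, risk_ratio = toy_risk_ratio)

# study_1 uses the counts from this post; study_2 is made up
tables <- list(
  study_1 = c(a = 250, b = 250, c = 50, d = 450),
  study_2 = c(a = 100, b = 400, c = 50, d = 450)
)

# One object: a single map over the functions is enough
single <- map_df(toy_funs, exec, tables$study_1)

# Several objects: outer map over objects, inner map over functions.
# .id records which list element each row came from.
combined <- map_df(
  tables,
  function(tbl) map_df(toy_funs, exec, tbl),
  .id = "table"
)
combined
```

Note that map_df() relies on dplyr being installed, and in purrr 1.0.0 and later it is marked as superseded in favor of map() followed by list_rbind(), so newer code may prefer that form.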
