Calling an arbitrary list of functions on an object can be a useful data analysis pattern. There are a number of ways to do this in R. The demonstration below walks through a basic scenario using exec() and the map_* family of functions from the purrr package.
The example uses functions from the twoxtwo package, which provides an interface for performing epidemiological data analysis with two-by-two tables.
First load twoxtwo and tidyverse (for purrr, dplyr, and tidyr):
library(twoxtwo)
library(tidyverse)
Next create a data set to motivate the demonstration. This will be expanded, observation-level data with binary exposure and outcome variables:
dat <-
  tribble(~exposed, ~diseased, ~n,
          TRUE,     TRUE,      250,
          TRUE,     FALSE,     250,
          FALSE,    TRUE,      50,
          FALSE,    FALSE,     450) %>%
  uncount(n)
dat
## # A tibble: 1,000 x 2
## exposed diseased
## <lgl> <lgl>
## 1 TRUE TRUE
## 2 TRUE TRUE
## 3 TRUE TRUE
## 4 TRUE TRUE
## 5 TRUE TRUE
## 6 TRUE TRUE
## 7 TRUE TRUE
## 8 TRUE TRUE
## 9 TRUE TRUE
## 10 TRUE TRUE
## # … with 990 more rows
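
As a quick sanity check, the expanded rows can be collapsed back into cell counts. The sketch below rebuilds the same table so that it runs on its own; the names counts, expanded, and recovered are introduced here for illustration and do not appear in the original example.

```r
library(tibble)
library(tidyr)
library(dplyr)

# Aggregated counts, one row per exposure/outcome combination
counts <- tribble(
  ~exposed, ~diseased, ~n,
  TRUE,     TRUE,      250,
  TRUE,     FALSE,     250,
  FALSE,    TRUE,      50,
  FALSE,    FALSE,     450
)

# uncount() repeats each row n times and drops the n column
expanded <- uncount(counts, n)

# count() reverses the expansion, giving one row per combination again
recovered <- count(expanded, exposed, diseased)
recovered
```

If the four recovered counts sum to 1,000, the expansion produced one row per observation as intended.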
The twoxtwo data structure summarizes binary exposures and outcomes as two-by-two counts:
dat %>%
  twoxtwo(exposure = exposed, outcome = diseased)
## | | |OUTCOME |OUTCOME |
## |:--------|:-------------|:-------------|:--------------|
## | | |diseased=TRUE |diseased=FALSE |
## |EXPOSURE |exposed=TRUE |250 |250 |
## |EXPOSURE |exposed=FALSE |50 |450 |
One can then use functions on the object to conduct analyses such as computing measures of effect:
dat %>%
  twoxtwo(exposure = exposed, outcome = diseased) %>%
  odds_ratio()
## # A tibble: 1 x 6
## measure estimate ci_lower ci_upper exposure outcome
## <chr> <dbl> <dbl> <dbl> <chr> <chr>
## 1 Odds Ratio 9 6.40 12.7 exposed::TRUE/FALSE diseased::TRUE/FALSE
dat %>%
  twoxtwo(exposure = exposed, outcome = diseased) %>%
  risk_ratio()
## # A tibble: 1 x 6
## measure estimate ci_lower ci_upper exposure outcome
## <chr> <dbl> <dbl> <dbl> <chr> <chr>
## 1 Risk Ratio 5 3.79 6.60 exposed::TRUE/FALSE diseased::TRUE/FALSE
dat %>%
  twoxtwo(exposure = exposed, outcome = diseased) %>%
  risk_diff()
## # A tibble: 1 x 6
## measure estimate ci_lower ci_upper exposure outcome
## <chr> <dbl> <dbl> <dbl> <chr> <chr>
## 1 Risk Differen… 0.4 0.349 0.451 exposed::TRUE/FAL… diseased::TRUE/F…
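
The point estimates above can be reproduced by hand from the four cell counts. The base R sketch below does that arithmetic. The confidence intervals are computed as 95% Wald intervals, which is an assumption: the example does not state how twoxtwo computes its intervals, although the values agree with the printed output to the displayed precision. All variable names are illustrative.

```r
# Cell counts from the two-by-two table
n11 <- 250  # exposed, diseased
n10 <- 250  # exposed, not diseased
n01 <- 50   # unexposed, diseased
n00 <- 450  # unexposed, not diseased

risk_exposed   <- n11 / (n11 + n10)  # 0.5
risk_unexposed <- n01 / (n01 + n00)  # 0.1

or_est <- (n11 * n00) / (n10 * n01)         # odds ratio
rr_est <- risk_exposed / risk_unexposed     # risk ratio
rd_est <- risk_exposed - risk_unexposed     # risk difference

# Assumed method: 95% Wald intervals (log scale for the ratios)
z <- qnorm(0.975)

se_log_or <- sqrt(1 / n11 + 1 / n10 + 1 / n01 + 1 / n00)
or_ci <- exp(log(or_est) + c(-1, 1) * z * se_log_or)

se_log_rr <- sqrt(1 / n11 - 1 / (n11 + n10) + 1 / n01 - 1 / (n01 + n00))
rr_ci <- exp(log(rr_est) + c(-1, 1) * z * se_log_rr)

se_rd <- sqrt(risk_exposed * (1 - risk_exposed) / (n11 + n10) +
              risk_unexposed * (1 - risk_unexposed) / (n01 + n00))
rd_ci <- rd_est + c(-1, 1) * z * se_rd

signif(c(or_est, or_ci), 3)  # 9.00  6.40 12.70
signif(c(rr_est, rr_ci), 3)  # 5.00  3.79  6.60
signif(c(rd_est, rd_ci), 3)  # 0.400 0.349 0.451
```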
Rather than calling each of these functions separately, it might be preferable to pass them as a list to be invoked on the object in one pass.
Note that each of the twoxtwo effect measure functions returns a tibble with the same column names, so the outputs can be stacked on top of one another in a single data frame.
To do so, create a named list of the functions to be passed:
measure_funs <- c(odds_ratio = odds_ratio,
                  risk_ratio = risk_ratio,
                  risk_diff = risk_diff)
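
To see the mechanics in isolation, here is a minimal sketch of the same pattern using base R summary functions in place of the twoxtwo functions. The names summary_funs and x are made up for this illustration. Note that c() applied to functions returns a list, so c() and list() are interchangeable when building the collection.

```r
library(purrr)

# A named list of functions; c() would also produce a list here
summary_funs <- list(mean = mean, median = median, max = max)
is.list(c(mean = mean, median = median))  # TRUE

x <- c(1, 2, 3, 10)

# map_dbl() passes each function to exec() as its first argument,
# and exec() then calls that function with x
results <- map_dbl(summary_funs, exec, x)
results
# mean = 4, median = 2.5, max = 10
```

The names of the list carry through to the result, which is one reason to name the functions up front.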
Next wrap the twoxtwo object in a list, so that it becomes the single element of a list of length one:
dat %>%
  twoxtwo(., exposure = exposed, outcome = diseased) %>%
  list(.)
## [[1]]
## | | |OUTCOME |OUTCOME |
## |:--------|:-------------|:-------------|:--------------|
## | | |diseased=TRUE |diseased=FALSE |
## |EXPOSURE |exposed=TRUE |250 |250 |
## |EXPOSURE |exposed=FALSE |50 |450 |
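
The distinction between wrapping with list() and coercing with as.list() matters for the mapping step, since a map_* call iterates over list elements. The sketch below uses an ordinary data frame as a stand-in, because it makes no assumptions about how a twoxtwo object is stored internally.

```r
# Stand-in object; a data frame is itself a list of columns
df <- data.frame(x = 1:3, y = 4:6)

wrapped <- list(df)     # one element: the whole data frame
coerced <- as.list(df)  # two elements: the columns x and y

length(wrapped)              # 1
length(coerced)              # 2
identical(wrapped[[1]], df)  # TRUE
```

Mapping over wrapped visits the object once as a whole, which is the behavior the example relies on.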
The list can then be passed to a map_* function from purrr. In this case the desired output is a tibble, so map_df() is a natural choice. The operation to be mapped is an anonymous function that performs another map_df() over the list of analysis functions and calls exec() on each one, with the twoxtwo object as its argument:
dat %>%
  twoxtwo(., exposure = exposed, outcome = diseased) %>%
  list(.) %>%
  map_df(~ measure_funs %>% map_df(exec, .x))
## # A tibble: 3 x 6
## measure estimate ci_lower ci_upper exposure outcome
## <chr> <dbl> <dbl> <dbl> <chr> <chr>
## 1 Odds Ratio 9 6.40 12.7 exposed::TRUE/FAL… diseased::TRUE/F…
## 2 Risk Ratio 5 3.79 6.60 exposed::TRUE/FAL… diseased::TRUE/F…
## 3 Risk Differen… 0.4 0.349 0.451 exposed::TRUE/FAL… diseased::TRUE/F…
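
With only one object to analyze, the outer list and outer map are optional: mapping over the functions directly gives the same stacked result. The sketch below shows that shorter form with hypothetical stand-in functions (toy_odds_ratio and toy_risk_ratio, which are not part of twoxtwo) acting on a plain matrix, so that it runs without the package. It uses list_rbind(), which assumes purrr 1.0.0 or later, where map_df() is superseded in favor of map() followed by list_rbind(). A base R equivalent is included for comparison.

```r
library(purrr)

# Two-by-two counts as a plain matrix (rows: exposed, unexposed)
tab <- matrix(c(250, 250, 50, 450), nrow = 2, byrow = TRUE)

# Hypothetical stand-ins that each return a one-row data frame
toy_odds_ratio <- function(m) {
  data.frame(measure = "Odds Ratio",
             estimate = (m[1, 1] * m[2, 2]) / (m[1, 2] * m[2, 1]))
}
toy_risk_ratio <- function(m) {
  data.frame(measure = "Risk Ratio",
             estimate = (m[1, 1] / sum(m[1, ])) / (m[2, 1] / sum(m[2, ])))
}
toy_funs <- list(odds_ratio = toy_odds_ratio, risk_ratio = toy_risk_ratio)

# One map over the functions, then row-bind; names_to records the list names
stacked <- list_rbind(map(toy_funs, exec, tab), names_to = "fun")
stacked

# Base R equivalent: lapply() plus rbind()
stacked_base <- do.call(rbind, lapply(toy_funs, function(f) f(tab)))
```

The nested form in the example becomes useful once there are several objects in the outer list, for instance one table per stratum.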