Since initial publishing my forestplot package, dplyr and tidyverse have become evermore dominant in how we think about data and data management. I have therefore just published a 2.0 version that uses many of the awesome select & group features that make tidyverse such a compelling tool.
Basic change
Below is how you can work with some standard meta-analysis data. I assume that you have a base dataset with all your studies and then one additional with the summary data. We can easily append rows for header or empty rows without all the manual fuzz the previous solution required:
# Cochrane data base_data <- tibble(mean = c(0.578, 0.165, 0.246, 0.700, 0.348, 0.139, 1.017), lower = c(0.372, 0.018, 0.072, 0.333, 0.083, 0.016, 0.365), upper = c(0.898, 1.517, 0.833, 1.474, 1.455, 1.209, 2.831), study = c("Auckland", "Block", "Doran", "Gamsu", "Morrison", "Papageorgiou", "Tauesch"), deaths_steroid = c("36", "1", "4", "14", "3", "1", "8"), deaths_placebo = c("60", "5", "11", "20", "7", "7", "10"), OR = c("0.58", "0.16", "0.25", "0.70", "0.35", "0.14", "1.02")) summary <- tibble(mean = 0.531, lower = 0.386, upper = 0.731, study = "Summary", OR = "0.53", summary = TRUE) header <- tibble(study = c("", "Study"), deaths_steroid = c("Deaths", "(steroid)"), deaths_placebo = c("Deaths", "(placebo)"), OR = c("", "OR"), summary = TRUE) empty_row <- tibble(mean = NA_real_) cochrane_output_df <- bind_rows(header, base_data, empty_row, summary) cochrane_output_df %>% forestplot(labeltext = c(study, deaths_steroid, deaths_placebo, OR), is.summary = summary, clip = c(0.1, 2.5), hrzl_lines = list("3" = gpar(lty = 2), "11" = gpar(lwd = 1, columns = 1:4, col = "#000044")), xlog = TRUE, col = fpColors(box = "royalblue", line = "darkblue", summary = "royalblue", hrz_lines = "#444444"))
Just as a comparison you can see the old code for achieving the same thing:
cochrane_from_rmeta <- structure(list(mean = c(NA, NA, 0.578, 0.165, 0.246, 0.700, 0.348, 0.139, 1.017, NA, 0.531), lower = c(NA, NA, 0.372, 0.018, 0.072, 0.333, 0.083, 0.016, 0.365, NA, 0.386), upper = c(NA, NA, 0.898, 1.517, 0.833, 1.474, 1.455, 1.209, 2.831, NA, 0.731)), .Names = c("mean", "lower", "upper"), row.names = c(NA, -11L), class = "data.frame") tabletext <- cbind(c("", "Study", "Auckland", "Block", "Doran", "Gamsu", "Morrison", "Papageorgiou", "Tauesch", NA, "Summary"), c("Deaths", "(steroid)", "36", "1", "4", "14", "3", "1", "8", NA, NA), c("Deaths", "(placebo)", "60", "5", "11", "20", "7", "7", "10", NA, NA), c("", "OR", "0.58", "0.16", "0.25", "0.70", "0.35", "0.14", "1.02", NA, "0.53")) forestplot(tabletext, hrzl_lines = list("3" = gpar(lty = 2), "11" = gpar(lwd = 1, columns = 1:4, col = "#000044")), cochrane_from_rmeta, is.summary = c(TRUE,TRUE,rep(FALSE,8),TRUE), clip = c(0.1,2.5), xlog = TRUE, col = fpColors(box = "royalblue", line = "darkblue", summary = "royalblue", hrz_lines = "#444444"))
The difference perhaps doesn’t seem huge here but the ability to work directly with data.frame
objects instead of matrices or lists makes life much easier.
Grouped coefficients
One of the features that was rather unique when I started working on this package was the ability to have multiple bands per row. This allows us to package data more densely and thus makes it easier to compare results.
data(dfHRQoL) dfHRQoL %>% group_by(group) %>% forestplot(legend_args = fpLegend(pos = list(x = .25, y = 0.95), gp = gpar(col = "#CCCCCC", fill = "#F9F9F9")), fn.ci_norm = c(fpDrawNormalCI, fpDrawCircleCI), boxsize = .25, # We set the box size to better visualize the type line.margin = .2, # We need to add this to avoid crowding clip = c(-.125, 0.075), col = fpColors(box = c("blue", "darkred")), xlab = "EQ-5D index")
Do I have to change everything?
The major version number change is due to the API change, although most of you should not have to change anything with this update. The core things to look for are:
- In loops/functions, you have to manually call print (or plot) on the returned object. This as the 2.0 returns an object that is only plotted once print is called on it – the same way ggplot works. This should be automatic unless you’ve encapsulated the forestplot function.
- If you have provided a data.frame as your first input argument then there may be issues as this will result that the
tidyselect
syntax is used.
This is absolutely outstanding! Is there a convenient way of including subscripts in the header (e.g., expression(paste(“HR” [“Adj”], “(95% CI)”)) )?
Thanks, all you need to do is supply as you suggest a plotmath expression for that position in the labeltext argument. Check out the vignette. Each column should be a separate list for this to work, e.g. for the first labeltext column just do:
labeltext[[1]][[1]] <- expression(paste(plain(HR) [ plain(Adj) ]))
Hope this helps