Update: With the latest RStudio versions, getting tables from R into Word is even easier; see my new post on the subject.
Background: Because most journals that I submit to want documents in Word and not LaTeX, converting my output into Word is essential. I used to rely on converting LaTeX into Word, but this was tricky, full of bugs, and still needed tweaking at the end. With R Markdown and LibreOffice it’s actually rather smooth sailing, although I must admit that I’m disappointed at how badly Word handles HTML.
We start by loading the package and labeling the dataset. The label() and units() functions come from the Hmisc package:
    library(Gmisc, verbose=FALSE)
    # label() and units() live in Hmisc; loading it explicitly is harmless
    # if Gmisc has already attached it
    library(Hmisc)
    data(mtcars)

    label(mtcars$mpg) <- "Gas"
    units(mtcars$mpg) <- "Miles/gal"

    label(mtcars$wt) <- "Weight"
    units(mtcars$wt) <- "10<sup>3</sup> lb"

    mtcars$am <- factor(mtcars$am, levels=0:1,
                        labels=c("Automatic", "Manual"))
    label(mtcars$am) <- "Transmission"

    mtcars$gear <- factor(mtcars$gear)
    label(mtcars$gear) <- "Gears"

    # Make up some data for making it slightly more interesting
    mtcars$col <- factor(sample(c("red", "black", "silver"),
                                size=NROW(mtcars), replace=TRUE))
    label(mtcars$col) <- "Car color"
Now we calculate the statistics. getDescriptionStatsBy() is a more interesting alternative to just running table(). It can also run the simple statistics that are often reported in a Table 1.
    # Continuous variables: use the units as row names
    mpg_data <- getDescriptionStatsBy(mtcars$mpg, mtcars$am, html=TRUE)
    rownames(mpg_data) <- units(mtcars$mpg)

    wt_data <- getDescriptionStatsBy(mtcars$wt, mtcars$am, html=TRUE)
    rownames(wt_data) <- units(mtcars$wt)

    # Categorical variables: one row per factor level
    gear_data <- getDescriptionStatsBy(mtcars$gear, mtcars$am, html=TRUE)
    col_data <- getDescriptionStatsBy(mtcars$col, mtcars$am, html=TRUE)
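For a feel of what ends up in those cells, here is a rough base-R sketch of the same kind of summaries: mean and standard deviation per group for a continuous variable, and counts with column percentages for a factor. This is not how getDescriptionStatsBy() is implemented. The helper names describe_mean_sd and describe_prop are made up for this illustration, and the formatting will differ from what Gmisc produces.

```r
# Hypothetical helpers, not part of Gmisc: they only mimic the kind of
# numbers that a "Table 1" usually reports.
data(mtcars)
am <- factor(mtcars$am, levels = 0:1, labels = c("Automatic", "Manual"))

# Continuous variable: "mean (± SD)" within each group
describe_mean_sd <- function(x, by, digits = 1) {
  m <- as.vector(tapply(x, by, mean))
  s <- as.vector(tapply(x, by, sd))
  out <- paste0(formatC(m, format = "f", digits = digits),
                " (± ", formatC(s, format = "f", digits = digits), ")")
  matrix(out, nrow = 1, dimnames = list("Mean (SD)", levels(by)))
}

# Categorical variable: "n (%)" with percentages calculated per column
describe_prop <- function(x, by, digits = 1) {
  counts <- table(x, by)
  n   <- as.vector(counts)                                  # column-major
  pct <- as.vector(100 * prop.table(counts, margin = 2))    # same order
  out <- paste0(n, " (", formatC(pct, format = "f", digits = digits), "%)")
  matrix(out, nrow = nrow(counts),
         dimnames = list(levels(x), levels(by)))
}

mpg_sketch  <- describe_mean_sd(mtcars$mpg, am)
gear_sketch <- describe_prop(factor(mtcars$gear), am)
print(mpg_sketch)
print(gear_sketch)
```

Both helpers return character matrices with one column per transmission type, which is the shape that makes it possible to stack the pieces with rbind() later on.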
Next we create the actual table with htmlTable(). We can also link to the table from elsewhere in the document with an internal reference such as <a href="#Table1">click here</a>. The parameters, which I modeled on the latex() function (to be able to quickly switch between the two), can feel a little overwhelming:
- x - just the matrix with all the cells
- caption - nothing fancy, just the table caption
- label - this is transferred into an href anchor, <a name="#label"></a>
- rowlabel - the contents of the top left cell
- rgroup - the label of the groups, this is the unindented header of each group
- n.rgroup - the number of rows that each group contains, note that this is not the position of the group but the number of elements in them, i.e. sum(n.rgroup) == nrow(x)
- ctable - a formatting option from LaTeX that gives top/bottom border as single lines instead of double.
    htmlTable(
      x        = rbind(gear_data, col_data, mpg_data, wt_data),
      caption  = paste("My table 1. All continuous values are reported with",
                       "mean and standard deviation, x̄ (± SD), while categories",
                       "are reported in percentages, no (%)."),
      label    = "Table1",
      rowlabel = "Variables",
      rgroup   = c(label(gear_data),
                   label(col_data),
                   label(mpg_data),
                   label(wt_data)),
      n.rgroup = c(NROW(gear_data),
                   NROW(col_data),
                   NROW(mpg_data),
                   NROW(wt_data)),
      ctable   = TRUE)
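Since n.rgroup is the easiest parameter to get wrong, here is a small base-R sketch of how the group labels map onto row ranges. The helper rgroup_ranges is hypothetical and only illustrates the bookkeeping; htmlTable does its own handling internally, and the group sizes below are stand-ins rather than values taken from the real output.

```r
# Hypothetical helper (not part of Gmisc/htmlTable): turns group sizes into
# the first and last row that each group label covers.
rgroup_ranges <- function(rgroup, n.rgroup, n_rows) {
  # One size per label, and the sizes must add up to the number of rows
  stopifnot(length(rgroup) == length(n.rgroup),
            sum(n.rgroup) == n_rows)
  last  <- cumsum(n.rgroup)
  first <- last - n.rgroup + 1
  data.frame(rgroup = rgroup, first_row = first, last_row = last)
}

# Stand-in sizes: three gear levels, three colors, and one row each
# for the two continuous variables
ranges <- rgroup_ranges(
  rgroup   = c("Gears", "Car color", "Gas", "Weight"),
  n.rgroup = c(3, 3, 1, 1),
  n_rows   = 8)
print(ranges)
```

Passing sizes instead of positions means you never have to recompute offsets by hand when a factor gains or loses a level; using NROW() on each piece, as in the call above, keeps the sizes in sync automatically.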
Below is the table. Note: the table is formatted by this blog's CSS; it will look different after running the Rmd document through knitr.
Now install LibreOffice and use LibreOffice Writer to open the HTML document that knitr has created:
Now select the table, copy it, and paste it into Word. Voilà!