Tables from R into Word

A good looking table matters!

This tutorial is on how to create a neat table in Word by combining knitr and R Markdown. I’ll be using my own function, htmlTable, from the Gmisc package.

Update: With the latest RStudio versions, getting tables from R into Word is even easier; see my new post on the subject.

Background: Because most journals that I submit to want the documents in Word and not LaTeX, converting my output into Word is essential. I used to rely on converting LaTeX into Word, but this was tricky, full of bugs and still needed tweaking at the end. With R Markdown and LibreOffice it’s actually rather smooth sailing, although I must admit that I’m disappointed at how badly Word handles HTML.

The tutorial

We start with loading the package, and labeling the dataset. The labels and the units are from the Hmisc package:

library(Hmisc, verbose=FALSE)  # provides label() and units()
library(Gmisc, verbose=FALSE)


label(mtcars$mpg) <- "Gas"
units(mtcars$mpg) <- "Miles/gal"

label(mtcars$wt) <- "Weight"
units(mtcars$wt) <- "10<sup>3</sup> lb"

mtcars$am <- factor(mtcars$am, 
                    labels=c("Automatic", "Manual"))
label(mtcars$am) <- "Transmission"

mtcars$gear <- factor(mtcars$gear)
label(mtcars$gear) <- "Gears"

# Make up some data for making it slightly more interesting
mtcars$col <- factor(sample(c("red", "black", "silver"),
                            size=NROW(mtcars), replace=TRUE))
label(mtcars$col) <- "Car color"
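
To check that a label or unit actually stuck, it can help to look at the attributes of the vector. The following is only a minimal sketch: it assumes Hmisc is installed and that label() and units() store their values in the "label" and "units" attributes. The variable name x is just for illustration.

```r
library(Hmisc)

# Work on a copy so that mtcars itself is left untouched
x <- mtcars$mpg
label(x) <- "Gas"
units(x) <- "Miles/gal"

# Both values are kept as plain attributes on the vector
attr(x, "label")
attr(x, "units")
```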

Now we calculate the statistics. The getDescriptionStatsBy() function is a more interesting alternative to just running table(). It can also run the simple statistics that are often reported in a Table 1.

# Naming the by argument keeps the calls unambiguous
mpg_data <- getDescriptionStatsBy(mtcars$mpg, by=mtcars$am, html=TRUE)
rownames(mpg_data) <- units(mtcars$mpg)
wt_data <- getDescriptionStatsBy(mtcars$wt, by=mtcars$am, html=TRUE)
rownames(wt_data) <- units(mtcars$wt)

gear_data <- getDescriptionStatsBy(mtcars$gear, by=mtcars$am, html=TRUE)
col_data <- getDescriptionStatsBy(mtcars$col, by=mtcars$am, html=TRUE)
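
To make it clearer what these summaries contain, here is a rough base-R sketch of the same idea: mean (± SD) per group for a continuous variable, and count (column percentage) per group for a factor. It is not how getDescriptionStatsBy() is implemented; the helper mean_sd and the object names are made up for illustration.

```r
# Grouping factor, labelled the same way as in the tutorial
am <- factor(mtcars$am, labels = c("Automatic", "Manual"))

# Continuous variable: mean (± SD) per group
mean_sd <- function(x) sprintf("%.1f (± %.1f)", mean(x), sd(x))
mpg_by_am <- tapply(mtcars$mpg, am, mean_sd)

# Categorical variable: count (column percentage) per group
counts <- table(Gears = factor(mtcars$gear), Transmission = am)
pct    <- prop.table(counts, margin = 2) * 100
gear_by_am <- matrix(
  sprintf("%d (%.1f %%)", counts, pct),
  nrow = nrow(counts),
  dimnames = dimnames(counts)
)

print(mpg_by_am)
print(gear_by_am)
```

The Gmisc function additionally takes care of labels, HTML formatting and optional p-values, so the sketch is only meant to show where the numbers come from.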

Next we create the actual table with htmlTable(). We can also link to the table from elsewhere in the document with an internal reference such as <a href="#Table1">click here</a>. The parameters are modeled on the latex() function (so that I can quickly switch between the two) and can feel a little overwhelming:

  • x - just the matrix with all the cells
  • caption - nothing fancy, just the table caption
  • label - this is turned into an anchor that an href can point to, <a name="label"></a>
  • rowlabel - the contents of the top left cell
  • rgroup - the label of the groups, this is the unindented header of each group
  • n.rgroup - the number of rows that each group contains, note that this is not the position of the group but the number of elements in them, i.e. sum(n.rgroup) == nrow(x)
  • ctable - a formatting option from LaTeX that gives top/bottom border as single lines instead of double.
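
htmlTable(
  x        = rbind(gear_data, col_data, mpg_data, wt_data),
  caption  = paste("My table 1. All continuous values are reported with",
                   "mean and standard deviation, x̄ (± SD), while categories",
                   "are reported in percentages, no (%)."),
  label    = "Table1",
  rowlabel = "Variables",
  # One group header per block, in the same order as in rbind()
  rgroup   = c(label(gear_data),
               label(col_data),
               label(mpg_data),
               label(wt_data)),
  # Number of rows in each group; these must sum to nrow(x)
  n.rgroup = c(NROW(gear_data),
               NROW(col_data),
               NROW(mpg_data),
               NROW(wt_data)),
  ctable   = TRUE)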
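
The relationship between rgroup and n.rgroup is easiest to see on a tiny example. Below is a minimal sketch that assumes the htmlTable package is installed; the toy matrix borrows a few values from the example, and the object names are only for illustration.

```r
library(htmlTable)

# Toy matrix: 4 rows, 2 columns
x <- matrix(
  c("15", "4", "0", "17.1",
    "0",  "8", "5", "24.4"),
  ncol = 2,
  dimnames = list(c("3", "4", "5", "Miles/gal"),
                  c("Automatic", "Manual"))
)

# Two groups: the first spans three rows, the second a single row
rgroup   <- c("Gears", "Gas")
n.rgroup <- c(3, 1)

# n.rgroup holds group sizes, not positions, so it must add up to nrow(x)
stopifnot(sum(n.rgroup) == nrow(x))

tbl <- htmlTable(x,
                 rowlabel = "Variables",
                 rgroup   = rgroup,
                 n.rgroup = n.rgroup,
                 caption  = "A toy table with two row groups")
```

Printing tbl inside a knitr chunk (or at the console) should show the two group headers with their rows indented beneath them.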

Below is the table. Note: the table is formatted by this blog's CSS; it will look different after running the Rmd document through knitr.

My table 1. All continuous values are reported with mean and standard deviation, x̄ (± SD), while categories are reported in percentages, no (%).

Variables        Automatic        Manual
Gears
  3              15 (78.9 %)      0 (0.0 %)
  4              4 (21.1 %)       8 (61.5 %)
  5              0 (0.0 %)        5 (38.5 %)
Car color
  black          6 (31.6 %)       4 (30.8 %)
  red            7 (36.8 %)       3 (23.1 %)
  silver         6 (31.6 %)       6 (46.2 %)
Gas
  Miles/gal      17.1 (± 3.8)     24.4 (± 6.2)
Weight
  10³ lb         3.8 (± 0.8)      2.4 (± 0.6)

Now install LibreOffice and open the HTML document that knitr has created in LibreOffice Writer:

The table actually looks a little funny in Writer, but don't worry, it'll be great!

Now select the table, copy it and paste it into Word, voilà!

Now this looks better!

18 thoughts on “Tables from R into Word”

    • Thanks! Interesting post, it never stops amazing me how much tinkering one can do with a simple table. I’ll consider adding to the code options of super cgroups (not sure what to call them) and for styling each row. I just hope that people don’t find the options overwhelming. I’ve tried to document as much as I can but I know from my own experience that reading manuals is not that exciting…

  1. I’ve copied and pasted your entire code into R Markdown and used knitr. I get the following output:

    Reproducing example



    15 (78.9 %)
    0 (0.0 %)

    4 (21.1 %)
    8 (61.5 %)

    0 (0.0 %)
    5 (38.5 %)…

    I feel like I’m making a very silly and obvious mistake here, but I can’t for the life of me understand what it is. Your example looks amazing and I hope I can use it to create my own tables if I only figure out what’s going wrong. Any suggestions? Thanks.

  2. Hi, I’ve been using htmlTables for a while now and they are great…. but have you found a way to get them directly into word or pdf with pandoc or knitr, without copy-pasting from a compiled html file?

    • Strange, works fine here. What errors do you get, what system are you using and how have you tried to install the package? I’ve uploaded a new version, although the previous should work OK.

  3. Hi, great package.

    How would you get the column header “Transmission” to appear above “Automatic” and “Manual” in your htmlTable ?

  4. I like your package very much. Is it possible to allow more flexibility by allowing multiple levels of headings? The current “cgroup” and ‘n.cgroup’ only take a vector, it will be nice to allow a matrix so several layers of (nested) headings can be displayed. Thanks!

  5. Is it possible to have data in the same row as the row group header? I want a column for p-values, but when it is a multivariate test, there is only one p-value per row group. Thanks!

    • This is currently not supported, but it’s easy to work around the problem by adding the row group header manually and prefixing the names of all the sub-elements with “&nbsp;&nbsp;” (two non-breaking spaces) in order to attain the same look and feel.

  6. I am using describeFactors() and the rownames are quite long. I would like to wrap the long rownames. Is this possible? Another question is how to manually put horizontal and vertical lines in the table?

  7. Thanks for this package…the tables are beautiful! I just worked through this tutorial and I have encountered a problem. In my table one of my variables is “Sex”, the attribute label of this variable in my data frame is also “Sex”. However, when the table prints out, the rgroup label says “Male sex”. When I check the attributes of the data object created by getDescriptionStatsBy, it now says: “Male sex” for the label. Incidentally, in the table, the first category listed is also “Male sex”, even though the labels for the factors are “Male” and “Female”. Do you know why it changed?

    • This is partially a bug that is fixed in the develop branch (see how to install that branch here). If you have a true proportion then the function automatically adds the level + varname as this makes more sense – the bug occurred when forcing factors by changing the prop_fn function to describeFactors when this should not occur.

      I generally recommend using the mergeDesc-function that makes everything much easier, there is a vignette in the package that describes how to use it – I’m currently lagging a little behind on my R-blogging but I hope to write a post about it soon.
