# The forestplot of dreams

Displaying large regression models without overwhelming the reader can be challenging. I believe that forestplots are amazingly well suited for this. The plot gives a quick understanding of each estimate’s position relative to the other estimates, while also showcasing the uncertainty. This project started with some minor tweaks to Prof. Thomas Lumley’s forestplot function and ended up as a complete remake of it. In this post I’ll show you how to tame the plot using data from my latest article.

I’ve used the plots in my two latest publications and they have had a warm reception. In the latest study we compared Swedish and Danish patients’ health-related quality of life after a total hip arthroplasty. We were interested in whether the effects of common explanatory variables differed between Denmark and Sweden, i.e. the generalizability. The Swedish data set was vast while the Danish was a tiny sample, resulting in very different confidence intervals. You can find the main graph below.

## Tutorial data

Below you can find a data dump for the estimates that I used to generate the plot.

## The forestplot tutorial

Let’s start with the basic code to generate a simple forestplot. The xticks parameter is not necessary, but in this particular example the 0.05 tick is otherwise not included, so I added it. Note: for this tutorial to work you need to download my package, Gmisc; you can find the Gmisc-package here.
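The tick vector itself can be prepared in base R before it is handed to the plotting function. The sketch below is not the original tutorial code: the interval bounds are made-up placeholders, and it only assumes that the resulting vector is what gets passed as `xticks`.

```r
# Placeholder confidence limits, NOT the estimates from the article
lower <- c(-0.35, -0.20, -0.10)
upper <- c(-0.15,  0.02,  0.05)

# Let R pick "pretty" tick positions that span the full range of the intervals
auto_ticks <- pretty(c(lower, upper))

# Append the tick we want to guarantee, then drop duplicates and sort
xticks <- sort(unique(c(auto_ticks, 0.05)))

print(xticks)
```

Building on `pretty()` keeps the automatic spacing and only adds the one value that would otherwise be missing.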

## Greek letters

One of the first things that got me tweaking the original forestplot function was that I wanted to have a simple expression in the header. Getting a matrix into the original function was rather simple, as you can see below:

As many of you know, it is impossible to combine different types in one matrix/vector, so you need to supply the function with a list instead. The number of elements in the list has to be m x n, as in any matrix. Below is the example plot that I was originally aiming for:
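The list layout can be sketched in base R. This is an illustration rather than the original code: the row labels and numbers are placeholders, and the sketch assumes the function expects one inner list per column, each with one element per row.

```r
# Placeholder labels and values, NOT the article's data
row_names <- list("Variable", "Factor A", "Factor B")
estimates <- list(expression(beta), "0.12", "-0.05")

# Assumed layout: one inner list per column, one element per row (m x n cells)
label_list <- list(row_names, estimates)

n_cols <- length(label_list)
n_rows <- unique(lengths(label_list))

# Every column must have the same number of rows, as in a matrix
stopifnot(length(n_rows) == 1)

# The header cell stays an expression; the other cells stay plain strings
header_is_expr  <- is.expression(label_list[[2]][[1]])
body_is_string  <- is.character(label_list[[2]][[2]])
```

Unlike a character matrix, the list keeps `expression(beta)` as an expression, so it can later be rendered as a Greek letter.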

Ironically, after I had figured this out, my main supervisor correctly pointed out that while the β is correct, it doesn’t really add that much and risks alienating less statistically oriented readers…

## Multi-line forestplots

After this setback I was at least familiar with the forestplot function, which allowed further tweaking. Early on I realized that it can be convenient to display the same risk factors multiple times. The idea was that in situations where there are different outcomes, for instance hip replacement re-operation due to infection, dislocation or fracture, it can be useful to see the estimates adjacent to each other. I also used it for comparing Cox proportional hazards models with competing risk regressions and Poisson regression. It turned out to be rather useful.

In the Sweden/Denmark paper the comparison is simply between countries. The code is fairly straightforward: you now need to provide the function with an m x n matrix for each of the mean, lower and upper parameters. Here n is the number of comparison groups; in the Sweden vs Denmark paper there are two groups:
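A minimal sketch of how such matrices can be assembled in base R follows. The risk factor names and all numbers are placeholders, not the results from the paper; the sketch only assumes that the three matrices are then passed as the mean, lower and upper arguments.

```r
# Placeholder names and numbers for illustration only, NOT the paper's results
risk_factors <- c("Factor A", "Factor B", "Factor C")
groups <- c("Sweden", "Denmark")

# One row per risk factor (m), one column per comparison group (n)
mean_mat <- matrix(c(0.10, -0.05, 0.02,    # Sweden
                     0.12, -0.01, 0.08),   # Denmark
                   ncol = 2,
                   dimnames = list(risk_factors, groups))

# Half-widths of the intervals: narrow for the large sample, wide for the small one
half_width <- c(0.01, 0.01, 0.01,          # Sweden
                0.06, 0.07, 0.09)          # Denmark
lower_mat <- mean_mat - half_width
upper_mat <- mean_mat + half_width

# All three matrices must share the same m x n shape
stopifnot(identical(dim(mean_mat), dim(lower_mat)),
          identical(dim(mean_mat), dim(upper_mat)))

# Each estimate must lie inside its own interval
stopifnot(all(lower_mat <= mean_mat & mean_mat <= upper_mat))
```

Checking the dimensions up front is cheap and catches the common mistake of mixing up rows and columns.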

Originally I was using the legend() function but it turned out to be rather complicated. You can now achieve the same thing with the legend parameters; here’s an example with a rounded box at the top left corner:

I hope you’ll find the forestplot function as useful as I have.

### 8 Responses to The forestplot of dreams

1. G. Grothendieck says:

Very nice. I would make sure that the legend is in the same order as the graphic content as in the second plot but not in the first plot. Also since the content is arranged vertically it would be easier to read if the legend were also arranged vertically as in the first few plots but not in the first plot in the Multi-Line Forestplots section.

• Max Gordon says:

Thanks, excellent suggestion. I’ll make sure to include the enhancement in the next version.

2. Rick says:

Beautiful idea, Max. Thanks so much for the effort and the code. I will try to use it in my next paper.

3. Rick says:

Is there a way to produce a forest plot with something other than the BOX as the marker? I know it’s a common way to show the mean value, but it’s a bit imprecise. It would be nice to use a diamond, for instance.

Also — is it possible to increase the size of the X axis tick mark labels? It seems to ignore cex.

Thanks.

• Max Gordon says:

Hi Rick,

I’ve added the option of customizing the drawing of the confidence interval. You can now use diamonds, circles and points. Download the latest github-version, please try the different options and please report any bugs. Any clarifications to the manual are greatly appreciated. In the update you can also choose different drawing functions for each confidence band.

I’ve also added the cex.axis option that addresses the axis size issue.

/Max

4. Shakthi says:

Dear Max Gordon,

The forestplot you have made is extremely useful. I am trying to make a similar plot for X and Y.
But X and Y are two groups in my plot and do not contain an equal number of coefficients and lower and upper limits. X contains 22 sets with values and 2 sets with NA (24 in total), while Y contains 24 sets with values. When I tried to plot the values, the last two were not displayed and left a blank space in the picture.

Could you please tell me how to handle the NA values in X and make a picture with both X and Y? I would like to plot all 24 values in the same picture, with the last 2 rows showing only a single line for Y and no value for X.

Best,
Shakthi.

5. Penguin says:

Hi Max,

Thanks for this great code, and the code over on “pimp my forest plot”. I’m trying to replicate what you’re doing, but with a slight modification. I want 3 strata (in comparison to your 4) and within each stratum I want to plot three estimates (a main estimate and then two other secondary estimates for two subgroups).

The problem is: the code works great for 4 strata, but not for three.