Since initial publishing my forestplot package, dplyr and tidyverse have become evermore dominant in how we think about data and data management. I have therefore just published a 2.0 version that uses many of the awesome select & group features that make tidyverse such a compelling tool. Continue reading
Category Archives: R
A few tips for beginners on how to engage with the open source community
I have been writing code in multiple languages since 1994 and for almost a decade now I have been active within the open source community, primarily in R but I’ve also published some JavaSript/TypeScript packages. Publishing something that other people enjoy is a pure joy and the fact that some of my packages have download counts in the thousands is something that truly warms my heart.
Throughout the years I have though noticed that there is newcomers to the open source community often struggle with how to get help, which is why I decided to write this post. I’ll start out with some basics and then a little more advanced topics such as how to write an issue or a pull request. Continue reading
Easiest flowcharts eveR?
A flowchart is a type of diagram that represents a workflow or process. The building blocks are boxes and the arrows that connect them. If you have submitted any research paper the last 10 years you have almost inevitably been asked to produce a flowchart on how you generated your data. While there are excellent click-and-draw tools I have always found it to be much nicer to code my charts. In this post I will go through some of the abilities in my Gmisc package that makes this process smoother. Continue reading
News in htmlTable 2.0
The htmlTable 2.0 package was just released on CRAN! It is my most downloaded package with 160 000+ downloads/month and this update is something that I have been wanting to do for a long time. For those of you that never encountered htmlTable it is a package that takes a `matrix`/`data.frame` and outputs a nicely formatted HTML table. When I created the package there weren’t that many alternatives and knitr was this new thing that everyone was excited about, magrittr with its ubiquitous `%>%` pipe had not even entered the scene. The current update should make it easier to streamline table look, separate layout from content and use tidyverse functionality. Continue reading
Long-awaited updates to htmlTable
One of the most pleasant surprises has been the popularity of my htmlTable-package with more than 100k downloads per month. This is all thanks to more popular packages relying on it and the web expanding beyond its original boundaries. As a thank you I have taken the time to update and fix some features (the 1.13 release) – enjoy! Continue reading
Dealing with non-proportional hazards in R
Since I’m frequently working with large datasets and survival data I often find that the proportional hazards assumption for the Cox regressions doesn’t hold. In my most recent study on cardiovascular deaths after total hip arthroplasty the coefficient was close to zero when looking at the period between 5 and 21 years after surgery. Grambsch and Thernau’s test for non-proportionality hinted though of a problem and as I explored it there was a clear correlation between mortality and hip arthroplasty surgery. The effect increased over time, just as we had originally thought, see below figure. In this post I’ll try to show how I handle with non-proportional hazards in R. Continue reading
R trends in 2015 (based on cranlogs)
It is always fun to look back and reflect on the past year. Inspired by Christoph Safferling’s post on top packages from published in 2015, I decided to have my own go at the top R trends of 2015. Contrary to Safferling’s post I’ll try to also (1) look at packages from previous years that hit the big league, (2) what top R coders we have in the community, and then (2) round-up with my own 2015-R-experience. Continue reading
Introducing the htmlTable-package
My htmlTable
-function has perhaps been one of my most successful projects. I developed it in order to get tables matching those available in top medical journals. As the function has grown I’ve decided to separate it from my Gmisc-package into a separate package, and at the time of writing this I’ve just released the 1.3 version. While htmlTable
allows for creating plain tables without any fancy formatting (see usage vignette) it is primarily aimed at complex tables. In this post I’ll try to show you what you can do and how to tame some of the more advanced features. Continue reading
How-to go parallel in R – basics + tips
Today is a good day to start parallelizing your code. I’ve been using the parallel package since its integration with R (v. 2.14.0) and its much easier than it at first seems. In this post I’ll go through the basics for implementing parallel computations in R, cover a few common pitfalls, and give tips on how to avoid them. Continue reading
An exercise in non-linear modeling
In my previous post I wrote about the importance of age and why it is a good idea to try avoiding modeling it as a linear variable. In this post I will go through multiple options for (1) modeling non-linear effects in a linear regression setting, (2) benchmark the methods on a real dataset, and (3) look at how the non-linearities actually look. The post is based on the supplement in my article on age and health-related quality of life (HRQoL). Continue reading