Getting to the point – an alternative to the bezier arrow

An alternative bezier arrow to the regular grid-bezier. Apart from a cool gradient, its advantages are exact width, exact start/end points, and axis alignment.


About two weeks ago I got frustrated with the bezierGrob function in the grid package. The lwd parameter is interpreted differently depending on device, the arrow at the end does not follow the line but is perpendicular (probably following the spline control), and the line parameter makes it difficult to control exactly where the line starts/ends. Thus I decided to make my own fancy line with an arrow at the end. At the time I thought: how hard can it be? In retrospect, I wish I had never thought of it… This article is about the painful process of creating an alternative to the bezierGrob.

Using the SVD to find the needle in the haystack

Sitting on a data set with too many variables? The SVD can be a valuable tool when you're trying to sift through a large group of continuous variables. The image is CC by Jonas in China.


It can feel like a daunting task to find the few variables that you actually “need” when you have more than 20 of them. In this article I describe how the singular value decomposition (SVD) can be applied to this problem. While the traditional approach to using SVDs isn’t that applicable in my research, I recently attended Jeff Leek’s Coursera class on data analysis, which introduced me to a new way of using the SVD. In this post I expand somewhat on his ideas, provide a simulation, and hopefully give you an additional tool for exploring data.
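To give a flavour of the idea, here is a minimal sketch in base R. It is not the simulation from the post or from the course. The data are simulated, and the names (X, signal, signal_vars, top) are made up for illustration. Three of 25 continuous variables share a hidden signal, and the absolute weights of the first right singular vector are used to rank the variables.

```r
set.seed(42)

n <- 1000  # observations (assumed for this sketch)
p <- 25    # variables, i.e. "more than 20"

# Pure noise to start with
X <- matrix(rnorm(n * p), nrow = n, ncol = p)
colnames(X) <- paste0("v", seq_len(p))

# Hide a common signal in three arbitrarily chosen variables
signal_vars <- c(3, 11, 17)
signal <- rnorm(n)
X[, signal_vars] <- X[, signal_vars] + 3 * signal

# SVD of the centred and scaled data matrix
s <- svd(scale(X))

# Share of the total variance carried by each singular vector
var_explained <- s$d^2 / sum(s$d^2)

# Variables with the largest absolute weight in the first right
# singular vector are the candidates worth a closer look
loadings <- abs(s$v[, 1])
names(loadings) <- colnames(X)
top <- names(sort(loadings, decreasing = TRUE))[1:3]
print(top)
```

With a signal this strong the three planted variables should dominate the first singular vector. Real data are rarely this clean, so treat the ranking as a starting point for exploration rather than a selection rule.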

Exporting plain, lattice, or ggplot graphics

A blend between a basic scatterplot, lattice scatterplot and a ggplot


In a recent post I compared the Cairo packages with the base package for exporting graphs. Matt Neilson was kind enough to share in a comment that the Cairo library is now included in R by default, although you need to specify the type="cairo" option to invoke it. In this post I examine how the ggplot and the lattice packages behave when exporting.
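As a rough illustration of the type="cairo" option, the sketch below writes a base scatterplot to a PNG file. The file name, the dimensions, and the use of the built-in mtcars data are assumptions made for the example. It also falls back to the default device type when the R build lacks cairo support.

```r
# Write to a temporary file so the example leaves the working directory alone
out_file <- file.path(tempdir(), "scatter_cairo.png")

# capabilities("cairo") reports whether this R build has cairo support
if (capabilities("cairo")) {
  png(out_file, width = 1600, height = 1200, res = 200, type = "cairo")
} else {
  png(out_file, width = 1600, height = 1200, res = 200)
}

plot(mtcars$wt, mtcars$mpg,
     xlab = "Weight (1000 lbs)", ylab = "Miles per gallon", pch = 19)

# Closing the device is what actually writes the file
invisible(dev.off())
```

Note that lattice and ggplot objects need an explicit print() when they are drawn inside a function, a loop, or a sourced script, otherwise the exported file ends up empty.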

Exporting nice plots from R

It's not always easy getting the right size. The image is CC by Kristina Gill.


A vital part of statistics is producing nice plots, an area where R is outstanding. The graphical ability of R is often listed as a major reason for choosing the language. It is therefore funny that exporting these plots is such an issue in Windows. This post is all about how to export anti-aliased, high resolution plots from R in Windows.

Tables from R into Word

A good looking table matters!


This tutorial is on how to create a neat table in Word by combining knitr and R Markdown. I’ll be using my own function, htmlTable, from the Gmisc package.

Update: With the latest RStudio versions, getting tables from R into Word is even easier; see my new post on the subject.

Background: Because most journals that I submit to want the documents in Word and not LaTeX, converting my output into Word is essential. I used to rely on converting LaTeX into Word, but this was tricky, full of bugs, and still needed tweaking at the end. With R Markdown and LibreOffice it’s actually rather smooth sailing, although I must admit that I’m disappointed at how badly Word handles HTML.

Chocolate and the Nobel Prize – a true story?


Few of us can resist chocolate, but the real question is: should we even try to resist it? The image is CC by Tasumi1968.

As a dark chocolate addict I was relieved to see Messerli’s ecological study on chocolate consumption and its relation to the Nobel prize. By scraping various on-line sources he made a robust case that increased chocolate consumption correlates with the number of Nobel prizes. Combined with the possibility that chocolate has a positive impact on blood pressure, the evidence is strong enough for me to avoid changing any habits, at least over Christmas 🙂

Tutorial: Scraping the chocolate data with R

Inspired by Messerli’s article I decided to look into how to repeat the analysis in R.

Creating an R package in Windows


A nice package can be both beautiful and functional. The image is CC by MIAD Communication Design.

Inspired by this post by Szilard Pafka I decided to attempt a similar adventure in a Windows environment and see what problems I would run into.

Start by installing Eclipse & StatET; the installation can sometimes cause some annoyances. I’ve covered a lot of them in my previous post.

Getting started with Sweave & knitr

Cool woven artwork on the campus of Kansas University. The image is CC by Patrick Emerson (http://www.flickr.com/photos/kansasphoto/4682126666/).

I recently started to work with Sweave (by Friedrich Leisch) and found it a truly awesome package. The ease of use is amazing. In this post I’ll try to get you started first with Sweave and then with knitr (by Yihui Xie). The knitr package is a more advanced version of Sweave. Update: start with knitr, as it’s really well integrated into RStudio and is more actively developed.

Reasons for learning LaTeX & Sweave/knitr:

  • You can export formatted tables (ready for publication)
  • You connect the results with the actual calculations, minimizing risk of “copy->paste” errors
  • The code is “automatically documented” as you explain the results in the text
  • You can easily re-run the report on a new dataset

Now let’s get started…

My two favorite IDEs for R – tips & tricks

The two IDEs that I use for R are RStudio and Eclipse with StatET. They complement each other nicely: RStudio works out of the box, while getting Eclipse & StatET going is slightly challenging (I have previously shown how to do it; you can find it here).

RStudio

I use RStudio for all my statistics where I don’t want to create functions or do more advanced programming. It’s great since it gives me immediate help and code completion for variables that have already been defined. The settings are simple and you hardly need to do anything.