Integration between torchnet and torch-dataframe – a closer look at the mnist example

It's all about the numbers and getting the tensors right. The image is CC by David Asch.

In previous posts we’ve looked into the basic structure of the torch-dataframe package. In this post we’ll go through the [mnist example][mnist ex] that shows how to best integrate the dataframe with [torchnet](https://github.com/torchnet/torchnet).

The torch-dataframe – subsetting and sampling

Subsetting and batching is like dealing cards: it should be random unless you are doing a trick. The image is CC by Steven Depolo.

In my previous two posts I covered the most basic data manipulation that you may need. In this post I’ll try to give a quick introduction to some of the sampling methods that we can use in our machine learning projects.

The torch-dataframe – basics on modifications

Forming your data to your needs is crucial. The image is CC by Lennart Tange.

In my [previous post][intro post] we took a look at some of the basic functionality. In this post I’ll try to show how to manipulate your dataframe. Note, though, that [torch-dataframe][tdf github] is not about data munging; there are far more powerful tools in other languages for this. The aim of the modifications is to do simple tasks without being forced to switch to a different language.

Deep learning with torch-dataframe – a gentle introduction to Torch

[![A solid concrete foundation is always important. The image is CC by Sharon Pazner](http://gforge.se/wp-content/uploads/2016/07/Lego-house-concrete.jpg)](http://gforge.se/wp-content/uploads/2016/07/Lego-house-concrete.jpg) A solid concrete foundation is always important. The image is CC by [Sharon Pazner](https://flic.kr/p/nSNQzw).

Handling [tabular data](https://en.wikipedia.org/wiki/Table_(information)) is generally at the heart of most research projects. As I started exploring [Torch](http://torch.ch/), which uses the [Lua](https://www.lua.org/) language for [deep learning](https://en.wikipedia.org/wiki/Deep_learning), I was surprised that there was no package that would correspond to the functionality available in R’s [data.frame](https://stat.ethz.ch/R-manual/R-devel/library/base/html/data.frame.html). After some searching I found Alex Mili’s [torch-dataframe](https://github.com/AlexMili/torch-dataframe) package, which I decided to update to my needs. Over the past few months we have been developing the package, and it has now made it onto the Torch [cheat sheet](https://github.com/torch/torch7/wiki/Cheatsheet#data-formats) (partly the reason for the posting scarcity lately). This series of posts provides a short introduction to the package (version 1.5) and examples of how to implement basic networks in Torch.

Dealing with non-proportional hazards in R

As things change over time so should our statistical models. The image is CC by Prad Prathivi.

Since I’m frequently working with large datasets and survival data, I often find that the proportional hazards assumption for the Cox regressions doesn’t hold. In my most recent study on cardiovascular deaths after total hip arthroplasty, the coefficient was close to zero when looking at the period between 5 and 21 years after surgery. Grambsch and Therneau’s test for non-proportionality hinted at a problem, though, and as I explored it there was a clear correlation between mortality and hip arthroplasty surgery. The effect increased over time, just as we had originally thought; see the figure below. In this post I’ll try to show how I handle non-proportional hazards in R.

Benchmarking ReLU and PReLU using MNIST and Theano

The abilities of deep learning are fascinating, just like this Paschke arch. The image is CC by David DeHetre.

One of the successful insights into training neural networks has been the rectified linear unit, or ReLU for short, as a fast alternative to the traditional activation functions such as the sigmoid or the tanh. One of the major advantages of the simple ReLU is that it does not saturate at the upper end, thus the network is able to distinguish a poor answer from a really poor answer and correct accordingly.

A schematic of the PReLU. The LReLU has the same schematic, the only difference being that α is a constant. Courtesy of the PReLU article.

A modification to the ReLU, the Leaky ReLU, that would not saturate in the opposite direction has been tested but did not help. Interestingly, in a recent paper from the Microsoft deep learning team, He et al. revisited the subject and introduced a Parametric ReLU, the PReLU, achieving superhuman performance on ImageNet. The PReLU learns the parameter α (alpha) and adjusts it through basic gradient descent.
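
To make the difference between the units concrete, here is a minimal sketch in plain Python (not the Theano code from the benchmark). The function names, the batch-summed update, and the learning rate are my own illustrative assumptions; only the idea that α scales the negative part and is adjusted by gradient descent comes from the description above.

```python
def relu(x):
    """Standard ReLU: passes positive inputs, zeroes out negative ones."""
    return max(0.0, x)


def prelu(x, alpha):
    """PReLU: negative inputs are scaled by alpha instead of being zeroed.

    With a fixed alpha this is the Leaky ReLU; the PReLU treats alpha as
    a learnable parameter.
    """
    return max(0.0, x) + alpha * min(0.0, x)


def prelu_grad_alpha(x):
    """Derivative of prelu(x, alpha) with respect to alpha: x if x < 0, else 0."""
    return min(0.0, x)


def update_alpha(alpha, inputs, upstream_grads, learning_rate=0.01):
    """One plain gradient-descent step on alpha (hypothetical helper).

    `upstream_grads` holds the loss gradient with respect to each output;
    the contributions are summed over the batch.
    """
    grad = sum(g * prelu_grad_alpha(x) for x, g in zip(inputs, upstream_grads))
    return alpha - learning_rate * grad


print(relu(-2.0))         # negative input is zeroed
print(prelu(-2.0, 0.25))  # -0.5: negative input is scaled by alpha
print(prelu(3.0, 0.25))   # 3.0: positive input passes through unchanged
```

Only negative inputs contribute to the gradient for α, so a unit that never sees negative values leaves α untouched.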

In this tutorial I will benchmark a few different implementations of the ReLU and PReLU together with Theano. The benchmark test will be on the MNIST database, mostly for convenience.

Introducing the htmlTable-package

How should we convey complex data? The image is CC by Sacha Fernandez.

My htmlTable-function has perhaps been one of my most successful projects. I developed it in order to get tables matching those available in top medical journals. As the function has grown I’ve decided to split it out of my Gmisc-package into its own package, and at the time of writing this I’ve just released the 1.3 version. While htmlTable allows for creating plain tables without any fancy formatting (see usage vignette), it is primarily aimed at complex tables. In this post I’ll try to show you what you can do and how to tame some of the more advanced features.

How-to go parallel in R – basics + tips

Don’t waste another second, start parallelizing your computations today! The image is CC by Smudge 9000.

Today is a good day to start parallelizing your code. I’ve been using the parallel package since its integration with R (v. 2.14.0) and it’s much easier than it first seems. In this post I’ll go through the basics for implementing parallel computations in R, cover a few common pitfalls, and give tips on how to avoid them.

Fast-track publishing using the new R markdown – a tutorial and a quick look behind the scenes

The new rmarkdown revolution has started. The image is CC by Jonathan Cohen.

The new R Markdown (rmarkdown-package) introduced in RStudio 0.98.978 provides some neat features by combining the awesome knitr-package and the pandoc-system. The system allows for some neat simplifications of the fast-track-publishing (ftp) idea using so-called formats. I’ve created a new package, the Grmd-package, with an extension to the html_document format, called the docx_document. The formatter allows almost pain-free preparation of MS Word-compatible web pages.

In this post I’ll (1) give a tutorial on how to use the docx_document, (2) go behind the scenes of the new rmarkdown-package and RStudio ≥ 0.98.978, and (3) show what problems currently exist when skipping some of the steps outlined in the tutorial.

Pimping your forest plot

A forest plot using different markers for the two groups

In order to celebrate my Gmisc-package being on CRAN I decided to pimp up the forestplot2 function. I had a post on this subject and one of the suggestions I got from the comments was the ability to change the default box marker to something else. This idea had been in my mind for a while and I therefore put it into practice.