Our AI on par with humans?

The first step in orthopedic deep learning. The image is CC from Pixabay.

We finally published our first article on deep learning (a form of artificial intelligence, AI) in orthopedics! We got standard off-the-shelf neural networks to perform as well as senior orthopedic surgeons at identifying fractures. This was under the premise that both the network and the surgeons reviewed the same down-scaled images. Nevertheless, this was better than we expected and supports my belief that deep learning is suitable for analyzing orthopedic radiographs.

Stuff I learned on the way

Reading the paper, it all seems ridiculously straightforward, and it is hard to believe that I embarked on this project back in 2014, just after finishing my thesis. Although everyone was very enthusiastic and helpful, it took about a year before we had a reasonable set of images. Our first attempts at deep learning in mid-2015 failed miserably. Here are some of the things that we improved before the current paper:

  • Noisy labels: Our initial algorithm for identifying the presence of a fracture based on the radiologist’s report was too inaccurate. While some noise is OK, too noisy labels risk confusing the network. We refined the algorithm using domain knowledge of patterns and anti-patterns present in the reports together with meta-data. This is actually a common problem when applying deep learning to medical images as the reports vary widely.
  • Cropping: As the networks optimized for ImageNet work with images of less than 300 x 300 pixels, it is important to use those pixels wisely. A good radiographic image should only contain the body part of interest. To attain this, shutters focus the beam on the area of interest, which often leaves a black/white frame in the image. We managed to get a tight crop around the region of interest by using OpenCV. Note that the frame may be rotated, so you should try to align it as much as possible.
  • Contrast enhancement: The images used have a depth of 8 bits. The originals are generally between 12 and 16 bits, so it is important to make sure that you use the 8 bits efficiently.
  • Parameters: Deep learning is about tweaking your parameters. Most of the learning rates that I tried at first were too high; once I reduced the learning rate significantly it worked much better. The core principle is that you need to systematically try different learning rates, optimizers, samplers, etc. before giving up.
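
To make the contrast and learning-rate points a bit more concrete, here is a minimal Python sketch. It is not the code from the paper: the function names, the 1st/99th percentile window and the learning-rate range are all assumptions picked for illustration, and in practice you would do the pixel work with NumPy or OpenCV rather than nested lists.

```python
def window_to_8bit(pixels, low_pct=1.0, high_pct=99.0):
    """Map high-bit-depth pixel values (e.g. 12-16 bits) to 0..255.

    Values below the low percentile or above the high percentile are
    clipped, so the 8 bits are spent on the range that holds most of the
    image. The percentile defaults are illustrative assumptions.
    """
    flat = sorted(value for row in pixels for value in row)
    last = len(flat) - 1
    lo = flat[int(last * low_pct / 100)]
    hi = flat[int(last * high_pct / 100)]
    if hi <= lo:
        # Flat image: nothing to stretch.
        return [[0 for _ in row] for row in pixels]
    scale = 255.0 / (hi - lo)
    return [
        [int(round((min(max(value, lo), hi) - lo) * scale)) for value in row]
        for row in pixels
    ]


def log_spaced_learning_rates(highest=0.1, lowest=1e-5, steps=5):
    """Return learning rates spaced evenly on a log scale, highest first.

    The range is a hypothetical starting grid, not a recommendation.
    """
    if steps < 2:
        raise ValueError("steps must be at least 2")
    ratio = (lowest / highest) ** (1.0 / (steps - 1))
    return [highest * ratio ** i for i in range(steps)]


if __name__ == "__main__":
    # A tiny made-up 12-bit "image".
    image = [[0, 1000, 2000], [3000, 4095, 4095]]
    print(window_to_8bit(image, low_pct=0, high_pct=100))
    for rate in log_spaced_learning_rates():
        print(f"try learning rate {rate:.0e}")
```

Stepping through the rates on a log scale is just one way to be systematic; the same loop would also be the place to swap optimizers or samplers.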

Where are we heading from here?

While this initial paper was interesting, we are not yet at the level where we can start deploying in the real world. Many may think that a bone is either broken or not, but it is actually much more complex. We are thus expanding the number of outcomes, both by using more refined text analyses of the reports and by manual labeling. I started developing a platform for the latter last summer, and managed to get all the pieces to work about a month ago. This will allow us to efficiently annotate by hand the large quantity of images that deep learning requires.

At the same time, we are exploring options for growing the network to full-scale images. Through a collaboration with KTH’s supercomputer center (formally known as the PDC Center for High Performance Computing) we hope to be able to scale without network limitations. Since November we’ve also been exploring new architectures and combinations that work with regular GPUs.

Will doctors be around in 10-20 years?

Doctors will probably be around, but I think our job will be nothing like today. I’m not afraid of “bots taking our jobs”. Surprisingly few of my patients trust (or even perform) Google searches today, and I am not sure an AI-bot will change this. It will more likely improve our diagnostic accuracy and better explain the cost/benefit of different treatment options.

I believe that our conversation with the patient will be much more interesting, as we will be able to properly guide a patient through the possible choices. Many things in medicine are neither good nor bad. An AI may be better at deciphering, but I believe that most patients want to talk to someone and discuss how they approach this mix. Although once we have Sci-Fi humanoid robots, you will find me someplace where you can windsurf 🙂

This entry was posted in Deep learning, Orthopaedic surgery, Torch.