Blog: Zilic: Detect any disease with machine learning
Using machine learning to detect any disease with a single test
Machine learning has transformed healthcare. It’s being used to diagnose lung cancer, pneumonia, and other diseases. On some tasks, machine learning models are as accurate as real doctors, or more so, and much faster.
Despite all this innovation, there’s still a huge problem.
Errors in diagnosis still contribute to 1 in 10 patient deaths in hospitals.
To figure out what’s wrong, let’s go back to the future.
2050: Year of the RoboDoctor
The year is 2050, and you go to the hospital for a routine checkup. A RoboDoctor greets you and then takes a CT scan of your lungs. Using machine learning, the RoboDoctor checks the CT scan for the 3 most common lung disorders: tumors, pneumonia, and tuberculosis.
Lucky for you, all the tests come back negative. You’re good to go!
But a day later, you start coughing, your chest hurts, and you have trouble breathing. You go back to the RoboDoctor and start yelling at them. Strangely enough, Mr. Robot tells you this is a common problem. This time they’ll use machine learning to scan you for the top 5 most common lung disorders. Sure enough, the results come back and it turns out you have bronchitis, lung disorder #5.
They promise that from now on they’ll always check for the 5 most common disorders. But people keep coming into the hospital, being told they’re healthy, and then finding out later that they actually were sick with a disease that’s not on the list of diseases machine learning was looking for.
Here’s why the RoboDoctors keep on messing up.
Current machine learning techniques for disease detection can only find what they’re looking for.
If a machine learning model is looking for lung cancer, it’s never going to find pneumonia. Adding more diseases to test for works for common diseases. But if you’re only searching for the 50 most common diseases, your out of luck if you have disease #51. Even worse, no matter how many diseases the robot doctors check for using machine learning, they’ll never be able to detect rare or new diseases.
A good example of this is me writing “your” instead of “you’re” two sentences ago. Unless your an online grammar warrior, you probably wouldn’t have noticed. Finding things you aren’t looking for is very hard.
This problem exists even more so in machines because machine learning methods for diagnosis are what’s known as supervised learning methods. In supervised learning, you give a neural network a bunch of labelled data, and the neural network learns over time to classify new data into one of these categories.
For a supervised learning model to learn how to detect pneumonia, it needs a TON of images of lungs with pneumonia, and a whole bunch of other images of healthy lungs.
The issue here is that there’s not a lot of data for rare or new diseases.
Because supervised learning needs so much data, it’s not able to learn how to detect rare or new diseases.
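To see why a supervised classifier can only find what it was trained to look for, here's a minimal sketch. The label list and the raw scores are made up for illustration; a real model would compute its scores from a CT scan.

```python
import math

# Hypothetical label set: the only answers this classifier can ever give.
LABELS = ["healthy", "tumor", "pneumonia", "tuberculosis"]

def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def classify(scores):
    """Return the most likely label and its probability."""
    probs = softmax(scores)
    best = max(range(len(LABELS)), key=lambda i: probs[i])
    return LABELS[best], probs[best]

# Made-up scores for a scan of a lung with bronchitis. None of the
# disease classes match well, so "healthy" wins by default.
label, prob = classify([1.2, 0.3, 0.9, 0.1])
print(label, round(prob, 2))
```

No matter what the scan looks like, the answer is always one of the labels in the list. Bronchitis isn't in the list, so it can never be the answer.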
Back to the present
So how are human doctors today able to detect rare or new diseases in CT scans? A lot of the time they don’t, but when they do it’s because they find an anomaly. The doctor knows what a healthy lung looks like, and is able to notice when something doesn’t look quite right.
Anomaly detection is kinda like those “which one doesn’t belong” games you played as a kid. You look at a couple of images and figure out what they have in common. Then when given a new image you can tell whether it’s part of the group, or if it’s different (an anomaly).
Even though all the objects in this image are different, it’s pretty easy to figure out that 3 of the objects are balls, and that the block is the anomaly.
Anomaly detection for medical images is pretty similar to this. Except instead of balls we have healthy CT scans of lungs. When given a new image we want to be able to tell if it’s healthy and part of the main group (a ball), or different in some key way (like the block).
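The "which one doesn't belong" game can be sketched in a few lines. This toy example assumes each object is described by two made-up features (roundness and number of corners). The feature values and the threshold are illustrative only and are not taken from Zilic.

```python
import math

def centroid(points):
    """Average each feature across the group."""
    n = len(points)
    return [sum(p[i] for p in points) / n for i in range(len(points[0]))]

def distance(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Made-up features: (roundness from 0 to 1, number of corners).
balls = [(0.95, 0), (0.90, 0), (0.99, 0)]
center = centroid(balls)

# Assumed threshold: anything farther than this from the group is an anomaly.
THRESHOLD = 1.0

def is_anomaly(obj):
    """True when the object sits far away from the learned group."""
    return distance(obj, center) > THRESHOLD

print(is_anomaly((0.93, 0)))  # another ball
print(is_anomaly((0.10, 8)))  # a block
```

Notice that the code never sees a block during "training". It only learns what the group looks like, and anything far from the group gets flagged.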
So how do we use machine learning to look for anomalies in medical images?
You make it yourself!
I realized how big of a problem disease detection is in healthcare. It’s crazy to me that errors in diagnosis contribute to 1 in 10 patient deaths. Over the past few months I’ve built Zilic, a project that uses machine learning to detect any disease in a medical image with a single test.
Here’s how Zilic works:
- No, really, magic
- Some machine learning stuff like GANs and autoencoders
Here’s how the magic works.
Using GANs to learn what makes an image healthy
In GANs, two neural networks are trained against each other. The first neural network is the Generator, which tries to generate ‘fake’ images of healthy lungs that look just like the real thing. The second neural network is the Discriminator, which tries to identify whether an image is real, or a fake created by the Generator.
Over time the two neural networks keep on pushing the other one to get better. At the end of training, the generator is able to create realistic looking images of healthy lungs, and the discriminator is really good at spotting fakes.
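The tug-of-war between the two networks comes down to two loss functions. The sketch below uses the textbook binary cross-entropy GAN losses, with the common convention that the Discriminator outputs the probability that an image is real. The probabilities are made-up numbers, and Zilic's actual training code may differ.

```python
import math

def discriminator_loss(p_real, p_fake):
    """Low when the Discriminator says 'real' for real images (p_real near 1)
    and 'fake' for generated ones (p_fake near 0)."""
    return -(math.log(p_real) + math.log(1.0 - p_fake))

def generator_loss(p_fake):
    """Low when the Discriminator is fooled into calling a fake 'real'."""
    return -math.log(p_fake)

# Early in training: fakes are easy to spot (Discriminator gives them 0.1).
print(discriminator_loss(p_real=0.9, p_fake=0.1), generator_loss(p_fake=0.1))

# Later: fakes look realistic (Discriminator gives them 0.5).
print(discriminator_loss(p_real=0.6, p_fake=0.5), generator_loss(p_fake=0.5))
```

As the Generator improves, its loss drops and the Discriminator's loss rises, which is the pressure that pushes the Discriminator to get better too.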
This ain’t any ol’ GAN though. The secret sauce of Zilic is that the generator is made out of an autoencoder.
Autoencoders are another machine learning technique. They basically learn a compression algorithm for a specific dataset. They can take in a 200×200 color image (200 × 200 × 3 = 120,000 different numbers) and convert that into a representation only 100 values long!
Autoencoders kind of learn how to do what we do when we talk. When you describe a donut to a friend, you don’t tell them every detail. You’d just say “It’s a donut with pink icing and sprinkles.” This compressed description of the donut is a lot more useful. And that’s pretty much what an autoencoder does — takes an image and turns it into a compact number representation.
Autoencoders make these compressed descriptions with 3 parts: the encoder, the code, and the decoder.
The encoder takes in the image and compresses it down to a smaller representation, called the code. Then the decoder takes the code and (wait for it), decodes it back to the size of the original input.
If the autoencoder is working well, the encoder will learn compact features to store in the code, and the input and output (generated by the decoder) will look the same.
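To make the encoder, code, decoder flow concrete, here's a toy stand-in. A real autoencoder learns its encoder and decoder as neural networks; this sketch hard-codes them as block averaging and upsampling purely to show the shapes and the reconstruction check. All function names are hypothetical.

```python
def encode(image, block=2):
    """Compress: average each block x block patch into one number.
    Assumes the image height and width are multiples of `block`."""
    h, w = len(image), len(image[0])
    code = []
    for r in range(0, h, block):
        row = []
        for c in range(0, w, block):
            patch = [image[r + i][c + j] for i in range(block) for j in range(block)]
            row.append(sum(patch) / len(patch))
        code.append(row)
    return code

def decode(code, block=2):
    """Decompress: repeat each code value to fill a block x block patch."""
    image = []
    for row in code:
        wide = [v for v in row for _ in range(block)]
        image.extend([list(wide) for _ in range(block)])
    return image

def reconstruction_error(a, b):
    """Mean absolute difference between two images of the same size."""
    diffs = [abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb)]
    return sum(diffs) / len(diffs)

image = [[1, 1, 5, 5],
         [1, 1, 5, 5],
         [3, 3, 7, 7],
         [3, 3, 7, 7]]

code = encode(image)      # 4x4 image (16 numbers) -> 2x2 code (4 numbers)
output = decode(code)     # 2x2 code -> 4x4 reconstruction
print(code, reconstruction_error(image, output))
```

This image happens to be made of flat 2×2 blocks, so the code keeps everything that matters and the reconstruction matches the input. An image with detail the code can't capture would come back with a higher error.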
But wait, there’s more!
Once the encoder is trained, it can make accurate encodings for any image in the dataset. This means we can now generate encodings for every image of a healthy lung, and get a distribution of all these healthy images.
Let’s bring this all together
Here’s how Zilic uses GANs and autoencoders to detect any disease in a medical image
The GAN and autoencoder are only trained on healthy images of lungs. They get really good at encoding and decoding healthy images. Here are some of the reconstructions Zilic’s autoencoder learned how to make.
Since the GAN and autoencoder are only trained on images of healthy lungs, when the autoencoder is given an image of an unhealthy lung, it’s reconstruction (output of the decoder) will look really bad.
How bad is it? As bad as using “it’s” instead of “its” when you’re indicating possession.
What’s really cool is that the reconstruction is worst exactly where the disease is most prevalent. The diseased parts of the image are really different from what the autoencoder was trained on, and so the reconstruction will look most different from the original where the disease is.
Zilic can automate this! Using mean pixel difference, an algorithm that sees how different two images are, Zilic automatically highlights where the disease is in the image.
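Here's a minimal sketch of the idea, assuming images are grayscale grids of numbers between 0 and 1. The function names and the 0.3 highlight threshold are placeholders for illustration, not Zilic's exact implementation.

```python
def difference_map(original, reconstruction):
    """Per-pixel absolute difference between the scan and its reconstruction."""
    return [[abs(o - r) for o, r in zip(row_o, row_r)]
            for row_o, row_r in zip(original, reconstruction)]

def mean_pixel_difference(original, reconstruction):
    """Single score: the average of the difference map."""
    diff = difference_map(original, reconstruction)
    values = [v for row in diff for v in row]
    return sum(values) / len(values)

def highlight(original, reconstruction, threshold=0.3):
    """Mark pixels whose reconstruction is far off (assumed threshold)."""
    diff = difference_map(original, reconstruction)
    return [[1 if v > threshold else 0 for v in row] for row in diff]

# Toy 3x3 "scan" with a bright spot that the reconstruction smooths away.
scan = [[0.2, 0.2, 0.2],
        [0.2, 0.9, 0.2],
        [0.2, 0.2, 0.2]]
recon = [[0.2, 0.2, 0.2],
         [0.2, 0.25, 0.2],
         [0.2, 0.2, 0.2]]

print(mean_pixel_difference(scan, recon))
print(highlight(scan, recon))
```

The single number says how bad the reconstruction is overall, and the highlight grid says where it went wrong.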
But wait, there’s more! (part 2)
We can also compare the encoding of the test image with the distribution of healthy images. If the encoding is within the distribution, the test image is healthy. If it’s outside of the distribution it’s unhealthy.
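One simple way to check whether an encoding falls inside the healthy distribution is sketched below. It assumes each dimension of the code is modeled by its own mean and standard deviation, and it uses an average z-score as the distance. The source doesn't say which distance Zilic uses, so treat this as one possible choice. The 3-value codes and the cutoff of 3 are made up (real codes would be much longer).

```python
import statistics

def fit_healthy_distribution(codes):
    """Per-dimension mean and standard deviation of healthy encodings."""
    dims = list(zip(*codes))
    means = [statistics.mean(d) for d in dims]
    stds = [statistics.stdev(d) for d in dims]
    return means, stds

def out_of_distribution_score(code, means, stds):
    """Average number of standard deviations the code is from the healthy mean."""
    z = [abs(c - m) / s for c, m, s in zip(code, means, stds)]
    return sum(z) / len(z)

# Made-up encodings of healthy scans.
healthy_codes = [
    [0.9, 2.1, -0.5],
    [1.1, 1.9, -0.4],
    [1.0, 2.0, -0.6],
    [1.2, 2.2, -0.5],
    [0.8, 1.8, -0.5],
]
means, stds = fit_healthy_distribution(healthy_codes)

CUTOFF = 3.0  # assumed threshold, in standard deviations

for name, code in [("healthy-looking", [1.0, 2.05, -0.5]),
                   ("unusual", [2.5, 0.5, 0.8])]:
    score = out_of_distribution_score(code, means, stds)
    print(name, "unhealthy" if score > CUTOFF else "healthy")
```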
Why use 2 metrics to test for diseases? Why write Medium articles? Why not?
Also, adding this metric takes rare disease detection for lungs from 60% accurate (reconstruction only) to 80% accurate (reconstruction and distribution loss).
Finally, we can use the discriminator from the GAN. It’s been trained to give low scores for healthy images of lungs, and it’ll give slightly higher scores for images of lungs with diseases.
Putting these 3 loss metrics together, Zilic is able to detect 85% of rare diseases in CT scans of lungs. Even better, this approach can scale to any type of medical image. All you need is a few thousand healthy images, and then Zilic can spot any disease.
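The source doesn't spell out how the 3 scores get combined, so the sketch below shows one plausible approach: a weighted sum compared against a threshold. The weights, the threshold, and the example scores are all assumptions for illustration, and each score is assumed to be scaled so that higher means more anomalous.

```python
# Assumed weights; in practice these would be tuned on validation scans.
WEIGHTS = {"reconstruction": 0.5, "distribution": 0.3, "discriminator": 0.2}
THRESHOLD = 0.5  # assumed decision threshold

def anomaly_score(scores):
    """Weighted sum of the three scores, each assumed to lie in [0, 1]."""
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)

def is_unhealthy(scores):
    """Flag the scan when the combined score crosses the threshold."""
    return anomaly_score(scores) > THRESHOLD

# Made-up scores for two scans.
healthy_scan = {"reconstruction": 0.1, "distribution": 0.2, "discriminator": 0.1}
diseased_scan = {"reconstruction": 0.8, "distribution": 0.7, "discriminator": 0.4}

print(anomaly_score(healthy_scan), is_unhealthy(healthy_scan))
print(anomaly_score(diseased_scan), is_unhealthy(diseased_scan))
```

Combining the scores means a scan that slips past one metric can still get caught by another.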
Here’s an example of Zilic in action.
It’s able to find the disease in each of these images. Each disease is different, but Zilic only needs one test to detect them all.
And that’s how the sausage is made!
Here’s a quick recap of how Zilic works:
- Reconstruction loss
- Out of distribution loss
- Discriminator loss
Using a GAN, an autoencoder, and these 3 loss metrics (5 techniques in all), Zilic is able to detect rare and even completely new diseases with 85% accuracy.
Zilic acts as a safety net for all diseases, no matter how rare. It could call doctors’ attention to scans with diseases. This will save doctors a lot of time and help them make fewer mistakes. Now, it’ll be almost impossible for a rare disease to go undiagnosed if it’s possible to see it in the image. We can make sure that errors in diagnosis never cause another patient death.
The year is 2050, and because of Zilic, RoboDoctors won’t ever have to worry about missing a disease.