Blog: How relevant is Kaggle experience to developing commercial AI?
Many data science and AI hiring managers require job candidates to have a PhD and/or competition experience, which I suspect many of those managers do not have themselves. One of my PhD-qualified CEO friends said Kaggle was only for students who have tonnes of spare time.
So how applicable or transferable are the skills used in competitions to deploying commercial applications? To find out, I decided to take part in a Kaggle competition. I had joined one a while back to get a feel for it, but I wasn’t serious about it because of work commitments (just as my friend said). After finishing my last contract project, I decided to take a break and do a Kaggle competition properly: using AI to detect fine-grained fashion attributes. Here is what I learned from my commercial and Kaggle experiences.
1. “Don’t Be A Hero”
I quote this from Andrej Karpathy, Director of AI at Tesla. He said: don’t be a hero, don’t try to create an AI model from scratch. Instead, just pick one of the existing models created by top scientists. This is absolutely true for both Kaggle and commercial deployment! Most problems are not completely new; for example, there are already tonnes of models for object detection. What you should do is review the literature on state-of-the-art models, pick the one that best meets your requirements, and start from there. Nobody really creates a model from scratch; even researchers normally build their models on top of earlier ones.
When I started the competition, I wrote a basic model from scratch (one that was state-of-the-art back in 2016) and was happy to get a high score after my first submission. Within a day, I was suddenly leapfrogged by many participants, and by a huge margin. The funny thing is that many of them had exactly the same score. Then I realized that someone had published a kernel using a newer model architecture, and they had simply run the code.
2. Don’t Start From Scratch
This generalizes the first point. The AI model is only one part of the work; other code is needed to set up the whole training pipeline, including pre-processing and validation tests. Writing utility functions for complex tasks, e.g. pre-processing data to generate masks or calculating precision and recall, is tedious. They are necessary, but they are also pretty standard, so there is no need to reinvent the wheel. On Kaggle, there are usually kind participants who publish their code (known as a ‘kernel’ in Kaggle) that you can simply download and run to train your model. There are also many GitHub repos for well-known models in various machine learning frameworks, but of course quite a few of them don’t work straight out of the box because of software compatibility issues.
3. Visualize Your Data
One thing that often gets overlooked is the data. In Kaggle, the training data has been collected, annotated and formatted nicely for you. However, there is always a chance of getting some bad data, such as a corrupted image, a wrong size or a wrong label, so it is important to examine the data, for example by plotting its distribution and visualizing samples. In the commercial world, the reality is a lot uglier: data collection and cleaning take a lot of effort, often far more than the time it takes to write (or clone) an AI model. “Rubbish in, rubbish out”, so be sure to put effort into cleaning the data before you even start training.
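As a rough illustration of the kind of check I mean, here is a minimal sketch in plain Python. The records, the label vocabulary and the minimum image size are all made up for the example; in a real project they would come from your own annotation file and requirements.

```python
from collections import Counter

# Hypothetical annotation records: (file name, width, height, label).
# In a real project these would be read from the dataset's annotation file.
records = [
    ("img_001.jpg", 512, 512, "dress"),
    ("img_002.jpg", 512, 512, "dress"),
    ("img_003.jpg", 512, 512, "skirt"),
    ("img_004.jpg", 0, 0, "skirt"),       # unreadable image recorded as 0x0
    ("img_005.jpg", 512, 512, "drses"),   # misspelled label
    ("img_006.jpg", 64, 48, "coat"),      # suspiciously small image
]

KNOWN_LABELS = {"dress", "skirt", "coat"}  # assumed label vocabulary
MIN_SIDE = 128                             # assumed minimum usable side length


def label_distribution(rows):
    """Count how many samples each label has."""
    return Counter(label for _, _, _, label in rows)


def find_suspect_rows(rows):
    """Return (file name, reason) pairs for rows that deserve a manual look."""
    suspects = []
    for name, width, height, label in rows:
        if label not in KNOWN_LABELS:
            suspects.append((name, "unknown label"))
        if min(width, height) < MIN_SIDE:
            suspects.append((name, "image too small or unreadable"))
    return suspects


# Print the class balance first, then the rows to inspect by hand.
for label, count in label_distribution(records).most_common():
    print(f"{label:>8}: {count}")
for name, reason in find_suspect_rows(records):
    print(f"check {name}: {reason}")
```

A check like this does not replace actually looking at the images, but it tells you which ones to look at first.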
4. Use The (Best) Largest Models?
In Kaggle, people choose the biggest model or, better still, an ensemble of the biggest models to make predictions. There is no regard for resources like GPUs or time, as long as you can afford them. I suppose some could even use fake AI, that is, recruit humans who pretend to be the AI and do the job. This is the biggest difference from commercial deployment, where there are constraints on computational resources (GPUs and cost), inference time and latency. This is also the reason we don’t see a lot of large-scale AI deployment, and it is something I have been focusing on lately: running AI models faster and more efficiently. I once worked with a start-up developing a hardware accelerator, and they wanted to replicate an experiment that used 8 different models to make one single prediction, only because the paper claimed to achieve state-of-the-art performance. “Doesn’t squeezing 8 models into a chip (which takes up 8 times more memory, time or silicon area) defeat the very purpose of having an accelerator to make inference faster?”, I asked. They realized that a while later and the project was cancelled.
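The arithmetic behind my question is simple enough to write down. This is a back-of-the-envelope sketch only: the parameter count and per-model latency are hypothetical numbers, and it assumes the models are the same size, stored as 32-bit floats, and run one after another.

```python
def ensemble_cost(n_models, params_millions, latency_ms, bytes_per_param=4):
    """Rough cost of running n_models equally sized models sequentially.

    Returns (weight memory in MB, total latency in ms), taking 1 MB as
    one million bytes and ignoring activations and any shared layers.
    """
    memory_mb = n_models * params_millions * bytes_per_param
    total_latency_ms = n_models * latency_ms
    return memory_mb, total_latency_ms


# Hypothetical model: 25 million parameters, 20 ms per inference.
single = ensemble_cost(1, params_millions=25, latency_ms=20)
ensemble = ensemble_cost(8, params_millions=25, latency_ms=20)

print("single model  :", single)
print("8-model ensemble:", ensemble)
```

Running the models in parallel instead would bring the latency back down, but only by paying for 8 times the compute hardware, which is the silicon area I was asking about.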
5. Are Numbers All Reliable?
In competition, the only thing that matters is the test score, i.e. the metric used to decide the ranking. I would try every trick to get an improvement of 0.001 and climb the ranks! Usually, every object carries the same weight in the test score, that is to say, correctly identifying a UFO gives the same score as correctly classifying a bird, so I would optimize my model to detect UFOs at the expense of birds if that is what gives a higher score! In real-world deployment, as long as the accuracy is decent, it doesn’t really matter if a new model is 0.1% less accurate. What matters more is how humans perceive the AI’s predictions. Most people would forgive the AI if it fails to spot a tiny UFO flying at night, because that is something we would likely have missed ourselves, but not if it misses a large aeroplane right in the center of the picture. Believe me, you can still get decent metrics even if the AI’s predictions are horribly wrong from a human perspective, so always, always examine the results manually.
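Here is a small sketch of how a decent-looking number can hide a failure that any human would notice. The labels and predictions are invented for illustration, and per-class recall is just one possible way to break a score down; it is not the metric from any particular competition.

```python
from collections import defaultdict


def accuracy(y_true, y_pred):
    """Fraction of samples predicted correctly, every sample weighted equally."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def per_class_recall(y_true, y_pred):
    """For each true class, the fraction of its samples that were found."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for t, p in zip(y_true, y_pred):
        totals[t] += 1
        if t == p:
            hits[t] += 1
    return {label: hits[label] / totals[label] for label in totals}


# Made-up test set: 95 birds and 5 aeroplanes.
y_true = ["bird"] * 95 + ["aeroplane"] * 5
# A lazy model that calls everything a bird.
y_pred = ["bird"] * 100

print("accuracy:", accuracy(y_true, y_pred))
print("recall by class:", per_class_recall(y_true, y_pred))
```

The overall accuracy looks respectable, yet the model never finds a single aeroplane. A breakdown like this narrows down where to look, but it still does not replace eyeballing the predictions.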
6. Do You Need Massive GPUs?
I think most of you have heard that one needs massive, expensive GPUs for AI training. Is that true? I was tempted to be lazy and just say YES, but this is not always true. Tech giants like Google have vast computational resources, which of course they rightly use to train their AI models, so normal folks like us usually can’t replicate their results. Increasingly, companies are joining the competitions, and I heard one of them used 512 GPUs to win second place. A Kaggle Grandmaster I know told me he never takes part in computer vision competitions because he ONLY had an Nvidia 1080 Ti, which was already the top-of-the-range consumer GPU at the time. My thought is, yes, you’ll probably need not just one but 4 or 8 GPUs to secure a gold or silver medal, as that lets you quickly try different experiments and hyperparameter settings. If you are already fairly experienced and know which models to use and which hyperparameters to tune, then you don’t need that kind of firepower. If you were to buy a GPU, go for a top-of-the-range consumer GPU like the RTX 2080 Ti, which is big enough.
For practical commercial deployment, we don’t actually want to use a massive model. If the model is small, multiple GPUs don’t usually help to shorten the training time. This is because there is overhead in distributing the training data and model weights to the different GPUs, and if the model is small, the GPUs can spend a long time waiting for data transfers to complete.
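To make the overhead argument concrete, here is a toy model of it. It assumes the compute for one training step splits perfectly across GPUs and that synchronizing them adds a fixed cost per step; real systems are messier, and all the millisecond figures are hypothetical.

```python
def speedup(compute_ms, overhead_ms, n_gpus):
    """Speedup of one training step on n_gpus compared with a single GPU.

    Toy model: compute divides evenly across GPUs, and any multi-GPU
    step pays a fixed communication overhead for data and weights.
    """
    if n_gpus == 1:
        return 1.0
    step_ms = compute_ms / n_gpus + overhead_ms
    return compute_ms / step_ms


# Hypothetical small model: 10 ms of compute per step, 8 ms of overhead.
# Hypothetical large model: 400 ms of compute per step, same overhead.
for name, compute_ms in [("small", 10), ("large", 400)]:
    for n_gpus in (1, 2, 4, 8):
        s = speedup(compute_ms, overhead_ms=8, n_gpus=n_gpus)
        print(f"{name} model, {n_gpus} GPUs: {s:.2f}x")
```

Under these assumptions the large model gets close to a 4x speedup on 4 GPUs, while the small one ends up slightly slower than on a single GPU because the fixed overhead dominates.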
7. Software Engineering
Many good Kagglers write messy code, because many of them work alone and have no need to explain their code to others, and because at the end of the day all you need to submit is your prediction results. I once had a heated debate with a co-worker who was an experienced competitor. Despite the looming project deadline, he wanted to keep trying the newest and biggest AI models. I totally understood why, for the reasons above. In commercial software development, there is a lot more work to do before final deployment, including packaging the AI models into the correct format for production and a lot of testing and validation.
In my opinion, Kaggle competitions in computer vision are closer to applied research for graduate students than to designing solutions for commercially viable projects, which come with many constraints. However, the technical skills needed to win are solid ones. Eventually I won a bronze medal, and I have huge respect for all the medalists, especially gold and silver, because I understand the technical capability and effort required. Still, I don’t think hiring managers should ignore a job candidate simply because he or she did not reach the podium, because the game can be unfair when you are up against well-resourced teams.