Blog: Data science job hunt
I recently interviewed with top IT companies for roles like Applied Scientist, Machine Learning Engineer and Data Scientist in countries like Australia, the USA, the UK and Sweden. I thought it might be a good idea to share my experience in the hope that it will help someone.
I come across a lot of ads for courses on Udemy and other platforms claiming to teach machine learning/artificial intelligence skills in one or a few courses, and some platforms go as far as claiming that you do not need to know the complex mathematics behind a lot of machine learning/computer vision algorithms. Most of these courses are designed to explain the algorithms at a conceptual level (I took one course on Udemy where the instructor said “The math can get hairy really fast so we are not going to bother ourselves with it. The main thing to do is to know how to implement it”). Although I do agree that it is important to know how to implement most machine learning algorithms and how to fine-tune their hyperparameters, I do not agree with not bothering with the underlying math.
A lot of the courses focus on teaching a bunch of techniques (SVM, KNN, regression, ANN, CNN, RNN, PCA, k-means, etc.) but do not explain VC dimension or the basics of probability and linear algebra. I understand from the course designer’s perspective that it’s not possible to teach everything in one course, but claiming that this material is not important is misleading. A person who is familiar with importing libraries such as scikit-learn, Keras, PyTorch and Theano, and who understands the high-level implementations of most algorithms, might sound like a good candidate for such a position, but in reality he or she is not.
Let’s assume that the candidate is familiar with data pre-processing (missing data, outlier detection, normalization vs standardization, bias-variance tradeoff, detecting correlation) but is not familiar with linear algebra. This person will spend a lot of time debugging errors that occur while preparing the dataset for advanced libraries. I can say this because I used to be one of those people and man, I sucked at my interviews initially. I can’t say whether this is the standard process or I was just unlucky with these interview questions, but the following are some of the things that I was asked:
- What is the cost function for logistic regression? How can one derive it?
- What is the cost function for an artificial neural network (for a classification task)? Give a high-level overview of the backpropagation algorithm.
- How to avoid overfitting?
- Let’s say you have 3 matrices (sizes were given). In what order should you multiply them to minimize the number of computations?
- Given a matrix with column vectors, check for linear independence. What is the span?
- How does the random forest algorithm work? What is tree pruning?
- What is the difference between XGBoost and CatBoost?
- How does the decision tree algorithm decide the root node?
- What is the difference between bagging and boosting?
- Jacobian vs Hessian? How does computing these optimize the backpropagation algorithm?
- Explain CNNs briefly. How do dropout and batch normalization layers help avoid overfitting?
- For a given problem (some image classification or object detection task), would you prefer to start with a customized model or first check how some of the well-known CNN architectures perform on it? If the latter, which one would you choose and why?
- SQL queries, HDFS questions, some other distributed systems questions.
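To give a flavour of the matrix-ordering question in the list above, here is a minimal sketch in Python. The sizes are hypothetical (they are not the ones from the interview), and the function names are made up for illustration. It assumes the usual cost model where multiplying a p x q matrix by a q x r matrix takes p * q * r scalar multiplications.

```python
def three_matrix_costs(p, q, r, s):
    """Scalar multiplication counts for A (p x q), B (q x r), C (r x s).

    Returns a tuple: (cost of (AB)C, cost of A(BC)).
    """
    # (AB) costs p*q*r and yields a p x r matrix; multiplying by C adds p*r*s.
    left_first = p * q * r + p * r * s
    # (BC) costs q*r*s and yields a q x s matrix; multiplying by A adds p*q*s.
    right_first = q * r * s + p * q * s
    return left_first, right_first


def best_order(p, q, r, s):
    """Return the cheaper parenthesization as a string."""
    left_first, right_first = three_matrix_costs(p, q, r, s)
    return "(AB)C" if left_first <= right_first else "A(BC)"


# Hypothetical sizes: A is 10 x 100, B is 100 x 5, C is 5 x 50.
print(three_matrix_costs(10, 100, 5, 50))  # (7500, 75000)
print(best_order(10, 100, 5, 50))          # (AB)C
```

With these sizes, multiplying A and B first is ten times cheaper, because the small inner dimension (5) collapses the intermediate result early. For longer chains the same idea generalizes to the classic dynamic-programming solution, which I have not shown here.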
I will add more questions as I remember them. The point I am trying to make is that doing just a few online courses doesn’t help you answer a lot of these questions. Online courses are great for skimming the material, but people need to understand the underlying math behind most of these techniques because interviewers test on it (or at least I have been tested on it).
I know people who work in top companies with such job titles and, more often than not, they end up testing the problem at hand on several models and building custom models to address the different challenges they encounter.
I hope that this article helps people pursuing a career in data science realize that companies don’t pay (approx.) $150K + benefits for knowing some high-level APIs or completing a few online courses. Of course, familiarity with these APIs adds value, but that’s not everything that’s needed to land such a job. What they care about is your deeper understanding and your reasoning about when to choose a certain model, why, and how to customize it if needed.
I am planning to start a series on machine learning that explores the underlying math behind different machine learning algorithms and, since Medium is not known for its LaTeX integration, points to appropriate resources (apart from online courses) for getting a good grip on these topics. Let me know in the comment section if something like this would help. I am also curious to know what your experience interviewing for data science roles has been. Good luck.