Blog: What Do You Need To Do Before Hiring A Data Scientist?
Hiring a good Data Scientist is hard; retaining them is harder.
Artificial Intelligence promises exponential growth and the chance to take your business to new heights. No wonder there is a lot of excitement around the application of Artificial Intelligence (AI).
Many companies are rushing to hire their first Data Scientist or build a Data Science team right off the bat. Their enthusiasm is understandable: they want to innovate with data and not be out-competed in the market. However, the early missteps and false starts that follow carry a massive opportunity cost, and Data Scientists often move on from these companies within just a couple of years.
Here are some recommendations to help you prepare before investing in a Data Science function at your company:
1. Have A Clear Understanding Of Why You Want To Hire A Data Scientist
You can begin by identifying the business problems and opportunities you want them to address. You don’t necessarily need large amounts of data, but you definitely need some data that relates to the identified business problems. As an example, a spreadsheet with a few hundred thousand rows and the right data attributes, covering most of the population distribution, may be enough.
If it’s deemed too risky to begin with a major business problem, you can start by shortlisting and prioritising simpler use cases like:
- Voice-based fraud detection and reduction in the customer care centre.
- Product recommendations for an eCommerce site.
- Predicting churn for a B2B SaaS company.
“Let’s ask a Data Scientist to look at the data we have and let them generate the business value” is a bad strategy.
Alternatively, you can hire an independent consultant or work with a service provider if you want to test the waters before building a team yourself. (Disclaimer: I’m part of a Data Science Consulting business.)
2. Understand Data Science Profiles And Know What Kind Of Person(s) You Want To Hire
Data Scientists come from various backgrounds, as in any other profession (Marketers, Designers, Product Managers etc.). Some have expertise in building Machine Learning models while others are strong at Analytics and Visualisation. Some have only worked in Computer Vision while others only in Natural Language Processing (NLP). Some are generalists while others are specialists.
Hire the Data Scientist relevant to your problem space.
You don’t need to hire a PhD or use Kaggle as your primary means to assess someone. There are good data scientists with no PhD who have never participated in Kaggle competitions.
When you hire PhDs straight out of academia without much thought, you risk a bias towards an academic-style approach rather than business thinking.
A PhD in a relevant discipline and participation in community initiatives like Kaggle are good things, but they shouldn’t be mandatory hiring criteria. That said, candidates absolutely have to know what’s going on in their field.
Your first hire should have enough breadth of knowledge of the Data Science field in general.
It’s possible you may need a “Data Engineer” first: someone who is strong in Software Engineering, Data Storage, Extraction and Management rather than in Mathematics and Statistics, as a Data Scientist would be.
3. Define What You Want Them To Do
Data Scientists are driven by the application of their expertise to solve complex problems while utilising and contributing to academic research.
Job Descriptions that focus on the company instead of the problems and use cases, or that provide little or only generic information, don’t attract much attention. For example, instead of talking about your billion-dollar insurance company, which might attract MBAs, talk about use cases like automated inspection of car damage or increasing the accuracy of busting bogus claims to attract Data Scientists.
Looking for a person who knows all things Data Science is the equivalent of looking for a unicorn proficient in Business, Technology, Maths, Modelling, Programming and Statistics.
You should account for overall skills, not just hard technical skills. They need to work as part of your team and must fit within the company culture and align with your values.
4. Build A Data-Driven Culture That Favours Data Scientists
It’s vital to look inwards and assess if your company culture will work for or against Data Scientists. Some red flags include not having any technical or business person with a basic understanding of Data Science, or not having any data and technical support in place.
If the company doesn’t have experience working with data and doesn’t make data-informed decisions, Data Scientists will struggle to convert their work into business and customer value.
You also need to work out the overall budget to support a Data Scientist or a team. AI startups with a huge VC-funded war chest may not have to worry about this. For corporates, given the scale of data and problems you want to address, these costs can run into hundreds of thousands, which could bring your well-intended efforts to a standstill. Open source software and Data Science tools are bringing some of these costs down.
Companies that fail in their AI efforts, and many of them do, end up blaming Data Scientists for poor skills when, in reality, it may be the company that failed to provide the right support, environment, budget and team.
5. Decide Where and How They Will Fit Within Your Organisation
You need an internal leader from the Product, Technology or Business function to steer, support and lead Data Science initiatives.
The Data Science team should be wrapped in a function to build cadence, accountability and a continuous feedback loop with the rest of the business.
The inevitable question of a centralised vs decentralised team comes up sooner or later. A number of factors, such as your current processes, organisation model, data maturity and team size, will sway the decision in favour of one approach over another.
In a scale-up AI startup, you may find many Data Scientists in one team. In technology behemoths like Google and Facebook, you may find a Data Scientist per product team for data-heavy products. In corporates, you may find a centralised Data Science team accessible to all other teams. In Agile organisations, you may find a data science squad focused on an AI/ML product.
Whatever model you decide to go with, keep the flexibility to try and evolve as you build up the Data Science function.
6. Have A Supporting Or Dedicated Product Manager
Given a portion of PhD research is done in isolation, some business leaders believe a Data Scientist does the majority of their work in isolation (picture a scientist locked in a room scribbling on a blackboard). This couldn’t be further from the truth. Data Scientists need a team of engineers, business stakeholders, project managers and other team members to deliver projects.
You should assign a dedicated or supporting Product Manager to the business problem or opportunity the Data Scientist is addressing. Product Managers can take care of countless tasks like requirements gathering, customer insights, data analysis, operating model, legal guidelines, delivering a product, pooling resources and getting things to production. The Data Scientist should be involved in a number of these activities, but if there is no support, you won’t get much value out of their efforts.
A few Data Scientists with past experience of working with Product Managers and Scrum Masters are sceptical of this idea, saying:
“Product Managers want to claim a lot of credit without adding any value.”
The statement could be true for some but not all. The majority of Product Managers have never delivered Data Science projects before. They are learning and figuring things out as the field and art of Data Science product management evolves.
Product Managers can set up alignment and expectations across the business, and can be pretty valuable in removing roadblocks.
The conclusion is that, for this partnership to work, there must be clear role alignment between a Product Manager and a Data Scientist, and clarity on what each brings to the table.
Each one of the above points can be a long-form post in itself. There is so much to unravel around costs, processes, skills, team, operating model and so on.
Let me know your thoughts in the comments, and tell me if you want to hear more on anything specific.
Thanks to Jakub Langr for reviewing the draft and providing inputs.