Blog: Bridging the gap between research and big tech: applied AI/ML best practices for the modern…
Authors: Devvret Rishi, Margaret Jennings
- It’s intimidating to apply AI within an enterprise context, and for many it’s difficult to know how to start
- Our team has created a simple framework to help others begin the process
- The framework has been developed over the course of 6 months working with 10 of the world’s leading fintech startups to help bridge the gap between research, big tech, and the startup environment
Bridging the gap between research and tech
Rising Industry Interest
With the buzz around AI/ML, there has been an influx of interest in leveraging best-in-class technology. From building data-first products to staying ahead of the competition, AI/ML presents an opportunity to modernize one’s business operations and create new value as a company.
But it can be intimidating to start.
Over the past six months, we set out to bridge the gap between research and big tech by working closely with the world’s leading fintech startups on applied AI. The cohort comprised a diverse set of startups and business models, from 10-person teams to multi-billion dollar companies, and spanned the globe, with 10 countries and four continents represented.
Each startup was paired with a Google expert on a scoped project, with dedicated PMs, data science, and eng teams to support the development and deployment of the ML models. Our goal was to help both with the framing of the problem as it relates to business value and with the technical components.
Over the course of six months, we developed a framework based on common experiences and advice from both the startups and Google research scientists, product managers, and data engineers. The framework sets out to help others think through applied AI projects and to help ourselves apply theory to reality, taking into account that the world is often more dynamic, messy, and constrained than we originally expect.
In our upcoming blog series, rather than talk about the theory of AI/ML, we want to share some of the startup stories so we can all learn by practical example.
Why we believe in applied AI/ML
Follow Google’s example on how to become a machine learning first company
As Sundar Pichai, Google’s CEO, has noted, Google has transitioned to applying machine learning across its products and infrastructure. Whether it’s cutting-edge natural language understanding in the Google Assistant or highly personalized recommendation engines on YouTube, applied ML has been critical to the success of its products and to keeping them at the cutting edge. Through these experiences, Google has developed deep expertise in:
- Data pipelines
- Model development & evaluation
- Deployment at scale
- Continuous evaluation and iteration
Google benefits as more players get sophisticated in their technology adoption (think of the rise of online advertising, or the adoption of platforms like Android and Chrome) and has sought to externalize its techniques as well.
Google has a long history of publishing research, starting with the first paper by Larry and Sergey, The Anatomy of a Large-Scale Hypertextual Web Search Engine. More publications have followed every year, and Machine Intelligence is now the largest category of publications on Google’s research blog, with 1400+ articles. But in addition to thought leadership, it was also crucial to democratize AI by leveling the playing field in terms of tooling. A few examples of our approach include:
TensorFlow: released and open sourced in 2015, it allowed engineers around the world to create more complex models, like neural nets, using advanced techniques created by Google Brain. The ecosystem and adoption of such tooling really took off in the years afterwards, and other frameworks such as PyTorch and Keras gained traction as well.
Cloud AI Hub then emerged as a comprehensive way to serve the broadening needs for ML, targeting both sophisticated users, as TensorFlow did, and enterprises still undergoing the digital revolution that need a more guided tool. Today, using Cloud AI Hub, you can find data to build an initial model for your task or store your own in a service like BigQuery, use AutoML to find the best model for the data and evaluation function, and then deploy and monitor with Cloud Machine Learning Engine.
Even with these increasingly powerful and usable tools, we see that each company runs into its own set of challenges when undertaking an ML journey.
What we learned: three simple steps to improve success
We believe that AI/ML talent is global, but expertise is limited. Our hope for the series is to inspire both the developer and the entrepreneurial community to experiment with applied AI/ML and create the next generation of AI-first platforms. As our friend Sara Hooker says, “many of the problems that are currently open and unsolved in the world can benefit from simply a rigorous framework of exploration, strong hypotheses, and good old fashioned feature engineering.”
Bucketing the challenges each company in our cohort experienced, we saw a 3-step progression of how companies started to adopt AI/ML, and want to offer a few real-life experiences to motivate each of the three steps we saw.
1. Framing the Problem — Getting your organization ready for the task and understanding your data at a fundamental level to be able to articulate the business value for AI/ML in its context
- Choose a problem that matters to your company’s bottom line
- Bring together people with diverse experiences and domain expertise to help you frame the problem and to ensure that the data exploration, model creation, model deployment, and model outputs are accurate and will matter in the long run
- Educate your peers on what applied AI/ML is and is not; brown bag lunches can be a great opportunity to learn and grow as a company in this new research area
2. Building a Model — “Don’t be a hero”. Start simple and prove that the model and value proposition are worthy of your time, capital, and mindshare.
- We can’t stress enough the importance of starting simple, with a clear business goal tied to the model’s output
- Collecting data, cleaning data, augmenting your relatively small dataset, and aligning on a common business value will take the majority of your time
- Don’t oversell the magic of AI/ML to internal stakeholders. Instead, try to build a dead simple model that you can quickly determine directional value from
- The data exploration process is long and winding. That’s why a diverse pod helps: it accelerates your development, identifies dead ends early, and keeps you on the right path for the long run
- At the beginning, quickly map what types of data you have available today and what types of data could help serve you in the long run
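To make "start simple" concrete, here is a minimal sketch of what a dead simple first model could look like: a majority-class baseline compared against a single-feature threshold rule. The data, the feature (days overdue on a past payment), and the function names are all hypothetical and chosen only for illustration; they do not come from any of the cohort projects.

```python
from collections import Counter


def majority_baseline(labels):
    """Return the most common label; the baseline predicts it for every example."""
    return Counter(labels).most_common(1)[0][0]


def accuracy(predictions, labels):
    """Fraction of predictions that match the true labels."""
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)


def best_threshold(values, labels):
    """Pick the threshold on one feature that maximizes training accuracy.

    The rule predicts 1 when value >= threshold, otherwise 0.
    Ties are resolved in favor of the smallest threshold.
    """
    best_t, best_acc = None, -1.0
    for t in sorted(set(values)):
        preds = [1 if v >= t else 0 for v in values]
        acc = accuracy(preds, labels)
        if acc > best_acc:
            best_t, best_acc = t, acc
    return best_t, best_acc


# Hypothetical toy data: days overdue on a past payment, and whether the
# customer later defaulted (1) or not (0).
train_x = [0, 1, 2, 3, 10, 12, 15, 20]
train_y = [0, 0, 0, 0, 1, 0, 1, 1]
test_x = [1, 4, 11, 18]
test_y = [0, 0, 1, 1]

# Baseline: always predict the most common training label.
majority = majority_baseline(train_y)
baseline_acc = accuracy([majority] * len(test_y), test_y)

# Simple model: one threshold on one feature, scored on held-out data.
threshold, train_acc = best_threshold(train_x, train_y)
rule_acc = accuracy([1 if v >= threshold else 0 for v in test_x], test_y)

print(f"baseline accuracy: {baseline_acc:.2f}")
print(f"threshold rule (>= {threshold}) accuracy: {rule_acc:.2f}")
```

If a rule this simple does not beat the baseline on held-out data, that is a useful directional signal before investing in anything more complex.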
3. Deploying, Measuring, Monitoring — You’ve now developed a dynamic product that is utilizing live data
- Early testing of your model can help identify larger gaps in your original assumptions; be sure to keep track of your progress with a clear deployment log that’s useful not only to you and your pod but to your company as a whole
- Segment the data and test its biases before it is ingested by your model to ensure you have a representative dataset from the beginning
- Monitor your model carefully in deployment: live data can surface larger issues, such as unexpected deviations from your training data that show you overestimated your model’s achievable accuracy
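As a rough sketch of the last two points, the snippet below checks whether any segment is under-represented in the training data relative to a reference population, and flags when live accuracy falls well below the offline estimate. The segment names, reference shares, tolerance values, and function names are assumptions made for illustration, not a prescribed method; real thresholds depend on your data and business context.

```python
from collections import Counter


def segment_shares(segments):
    """Map each segment label to its share of the records."""
    counts = Counter(segments)
    total = len(segments)
    return {seg: n / total for seg, n in counts.items()}


def underrepresented(train_segments, reference_shares, tolerance=0.10):
    """List segments whose training share is below the reference share
    by more than `tolerance` (an absolute difference in share)."""
    train = segment_shares(train_segments)
    return sorted(
        seg
        for seg, expected in reference_shares.items()
        if expected - train.get(seg, 0.0) > tolerance
    )


def accuracy_dropped(offline_accuracy, live_outcomes, max_drop=0.05):
    """Compare live accuracy with the offline estimate.

    `live_outcomes` is a list of booleans, True when a live prediction
    turned out to be correct. Returns (live_accuracy, alert_flag).
    """
    live_accuracy = sum(live_outcomes) / len(live_outcomes)
    return live_accuracy, (offline_accuracy - live_accuracy) > max_drop


# Hypothetical training set: 80% urban and 20% rural customers, while the
# population the model will serve is assumed to be 60% urban, 40% rural.
train_segments = ["urban"] * 8 + ["rural"] * 2
reference_shares = {"urban": 0.6, "rural": 0.4}
gaps = underrepresented(train_segments, reference_shares)

# Hypothetical monitoring window: 7 of 10 live predictions were correct,
# against an offline accuracy estimate of 0.90.
live_outcomes = [True] * 7 + [False] * 3
live_accuracy, alert = accuracy_dropped(0.90, live_outcomes)

print("under-represented segments:", gaps)
print(f"live accuracy: {live_accuracy:.2f}, alert: {alert}")
```

Checks like these are cheap to run before training and on a schedule after deployment, and their results are natural entries for the deployment log.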
To better illustrate these points, we’ll be following this introductory blog post with a series of case studies from GOJEK, Frontier Car Group, Celo, and others that will dive into the details of each of the facets we’ve discussed here.
Dev Rishi is a Product Manager on Google Cloud AI, working to democratize machine learning and extend it to more use cases. He graduated with his bachelor’s and master’s degrees in Computer Science from Harvard University.
Margaret Jennings leads part of the Launchpad Studios program focused on applied AI & ML for startups and has previously worked in venture investing.