5 Learnings from Strata Data Conference in London
Among technology conferences, O’Reilly’s Strata Data is a large and highly regarded event in the data community. I’ve always looked up to the event and was excited to be accepted to speak at Strata London this past week. Here’s a roundup of the event and some key learnings.
With carefully curated content from academia, enterprises, and researchers, there was a lot to look forward to. O’Reilly events are known for their exceptional focus on diversity, and the 4-day event brought together an interesting collection of talks and speakers from around the world.
There was a mix of session formats to offer a little bit of everything: 16-hour trainings, 4-hour workshops, business case studies, executive briefings, and technical deep-dives. The conference chairs were deeply knowledgeable and easily accessible, which made the hallway chats enjoyable.
Here’s a recap of learnings from the top trends covered at the conference. I’ve linked to the most interesting sessions that also have public decks. Given the crazy number of parallel tracks (13 when I counted, more on that below), this list is biased toward the sessions that I could attend or follow.
1. What’s holding back AI adoption in Enterprises?
Enterprises are struggling with the adoption of AI, and this was a recurring theme across tracks, including the keynote by Ben on ‘Sustaining machine learning in the enterprise’. From a recent O’Reilly survey, here are the top bottlenecks holding back adoption:
While company culture can be a non-starter, I’ve seen the top 5 in this list across clients. There were sessions covering each of these challenges, including a separate track on enterprise case studies. A few noteworthy sessions: ‘Why managing machines is harder than you think’ by Pete Skomoroch, and Shingai Manjengwa’s session on the next-best alternative to hiring unicorn data scientists.
The keynote by Cait O’Riordan, on how the Financial Times adopted data science to hit 1 million paying subscribers a year ahead of its target, was outstanding.
2. Data governance: Life in the shark tank for CDOs
“Data governance is to data assets, as HR is to people” was a good analogy from Paco Nathan in his session on Overview of Data Governance. Starting with the historical evolution of data and enterprise architectures, the session covered the issues facing different segments of companies today and provided a future outlook.
Sundeep Reddy’s session offered a contrasting perspective on public governance and data by looking at India’s booming digital economy and the dilemma faced by a billion people navigating the public rollout of India Stack.
3. Why is it so hard to do Good with AI?
A session provocatively titled ‘Using data for evil’ drew large crowds and was the fifth part in a series by Duncan Ross. He complemented it with a session on AI for Good, where he shared examples from DataKind UK on doing good with data. AI’s dangerous contribution to fake news was covered in Alex Adam’s session on synthetic video generation and how it can be detected.
I spoke on how AI can save our planet’s biodiversity, using case studies from Gramener’s work with Microsoft AI for Earth. The 4 examples showed how NGOs are using deep learning solutions to detect, identify, count, and protect endangered species.
4. The unreasonable effectiveness of Natural Language Processing
Who better to talk about the latest in text analytics than Matthew Honnibal, the creator of spaCy, the wildly popular open-source NLP library for Python? He shared tips on improving the success of NLP projects. The session on predicting policy change in China using just the text archives of ‘People’s Daily’ was very intriguing for its simple but sound methodology.
Scarcity of data or labeled corpora is a major challenge in NLP, and Yves Peirsman’s session on Dealing with data scarcity in NLP had some useful tips on handling it.
5. Explainability and opening up of the black box models
No conversation in AI today is complete without talking about interpretability and explainability. The session by Eitan Anzenberg explained the need for explainability and covered frameworks like LIME. Yiannis Kanellopoulos covered case studies on accountability of models in Fintech.
Apart from these themes, there were technical sessions covering several aspects of streaming data processing and the implementation and productionization of ML algorithms on popular platforms and packages such as Spark, TensorFlow, AWS, Google Cloud, Azure, and R.
Sessions of such variety meant running 13 parallel tracks through the day, and the printed agenda was an unwieldy, extra-wide landscape sheet! While this gave an interesting mix of topics, picking one posed a big dilemma. This problem of plenty also had the negative effect of reducing engagement levels (and attendance) in the sessions.
There were ample networking opportunities with this highly engaged audience of practitioners. AI was not just a part of the sessions, but was on the exhibit floor, with drinks dished out through the day by Makr Shakr’s robotic bartender!
Looking to speak at upcoming O’Reilly events?
If you’re wondering how to get started on landing a speaking slot at one of the upcoming O’Reilly events, this ebook is a useful resource. Written by Alistair Croll, the person behind Strata, it talks about the way events are managed, how talks are selected, and what gives the organizers heartburn. It offers an inside view on what it takes to submit a quality entry and get picked. A useful, quick read I’d recommend.