a

Lorem ipsum dolor sit amet, consectetur adicing elit ut ullamcorper. leo, eget euismod orci. Cum sociis natoque penati bus et magnis dis.Proin gravida nibh vel velit auctor aliquet. Leo, eget euismod orci. Cum sociis natoque penati bus et magnis dis.Proin gravida nibh vel velit auctor aliquet.

  /  Project   /  Blog: DS/AI-SSH: Utilizing the Resources

Blog: DS/AI-SSH: Utilizing the Resources


Photo by Markus Spiske on Unsplash

In the last post we learnt how to work on the building blocks in order to work on Data Science.

This is the 5th post of blog post series ‘DS/AI: Self-Starter Handbook’, this post outlines the resources (mainly free except books) that DS/AI starters should refer. Following topics are covered here:

  • Books to refer
  • Courses to attend
  • Blogs to follow
  • Podcasts to listen
  • YouTube channels
  • Useful GitHub Repos

After getting to know DS/AI landscape and its building blocks in previous posts, you may ask which are the resources to refer, how do I know if a book or course is worth to spend time and/or money?

This post is my attempt to make your task easier. I am listing down major quality resources (mostly free) here and also going to provide you my critical review of these resources, which will help you to take informed decision.

Books to refer

Machine Learning with R

This is an excellent book for the R starter who wants to apply ML to any kind of project. All the main ML models are presented, as well as different performance metrics, bagging, pruning, tuning, ensembling etc. Easy to scan through, many tips with fully-solved textbook problems. Certainly a very good starting point if you plan to compete on Kaggle. If you already master both R and ML, this books is obviously not for you.

Python Machine Learning

This is a fantastic introductory book in machine learning with python. It provides enough background about the theory of each (covered) technique followed by its python code. One nice thing about the the book is that it starts implementing Neural Networks from the scratch, providing the reader the chance of truly understanding the key underlying techniques such as back-propagation. Even further, the book presents an efficient (and professional) way of coding in python, key to data science.

ISLR

The book explains concepts of Statistical Learning from the very beginning. The core ideas such as bias-variance trade-off are deeply discussed and revisited in many problems. The included R examples are particularly helpful for beginners to learn R. The book also provides a brief, but concise description of functions’ parameters for many related R packages. Compared to The Elements of Statistical Learning, it is easy for the reader to understand. It does a wonderful job of breaking things down complex concepts. If one wishes to learn more about a particular topic, I’d recommend The Element of Statistical Learning. These two pair nicely together.

Deep Learning

This is the book to read on deep learning. Written by luminaries in the field — if you’ve read any papers on deep learning, you must have heard about Goodfellow and Bengio before — and cutting through much of the BS surrounding the topic: like ‘big data’ before it, ‘deep learning’ is not something new and is not deserving of a special name. Networks with more hidden layers to detect higher-order features, networks of different types chained together in order to play to their strengths, graphs of networks to represent a probabilistic model.

This is a theoretical book, but it can be read in tandem with Hands-On Machine Learning with Scikit-Learn and TensorFlow, almost chapter-for-chapter. The Scikit-Learn and Tensorflow example code, while only moderately interesting on its own, helps to clarify the purpose of many of the topics in the Goodfellow book.

Hands-On Machine Learning with Scikit-Learn and TensorFlow

This book provides great introduction into machine learning for both developer and non developers. Authors suggest to just go through even if you don’t understand math details. Highlights of this book are:

The book demonstrate (including the code) different approaches using Scikit-Learn python package and also the TensorFlow.

Data Science for Business

This is probably the most practical book to read if you are looking for an overview of data science. Either you know when terms like k-means and ROC curves are to be used or you have some context when you start digging deeper into how some of these algorithms are implemented. You will find it at right level because there is just enough math to explain the fundamental concepts and make them stick in your head.

This isn’t a book on implementing these concepts or a bunch of algorithms. This gives the book the advantage of being something you can refer to an intelligent manager or interested developer, and they can both get a lot out of it. And if they are interested in the next level of learning there are plenty of pointers. You will also find the chapter on presenting results through ROC curves, lift curves, etc. pretty interesting. It would be cool if this book had some more hands on, but you can go to Kaggle and browse around the current and past competitions to apply what you learn here .

https://www.amazon.com/dp/1449361323/

Courses to attend

Machine Learning

Machine Learning is one of the first programming MOOCs. Coursera put online by Coursera founder and Stanford Professor Andrew Ng. This course assumes that you have basic programming skills and you have some understanding of Linear Algebra. Knowledge of Statistics & Probability is not required though.

Andrew Ng does a good job explaining dense material and slides. The course gives you a lot of structure and direction for each homework, so it is generally pretty clear what you are supposed to do and how you are supposed to do it.

https://www.coursera.org/learn/machine-learning

Deep Learning

When you are rather new to the topic, you can learn a lot of doing the deeplearning.ai specialization. First and foremost, you learn the basic concepts of NN. How does a forward pass in simple sequential models look like, what’s a backpropagation, and so on. I experienced this set of courses as a very time-effective way to learn the basics and worth more than all the tutorials, blog posts and talks, which I went through beforehand.

Doing this specialization is probably more than the first step into DL. I would say, each course is a single step in the right direction, so you end up with five steps in total. I think it builds a fundamental understanding of the field. But going further, you have to practice a lot and eventually it might be useful also to read more about the methodological background of DL variants. But doing the course work gets you started in a structured manner — which is worth a lot, especially in a field with so much buzz around it.

Fast AI

If your goal is to be able to learn about deep learning and apply what you’ve learned, the fast.ai course is a better bet. If you have the time, interleaving the deeplearning.ai and fast.ai courses is ideal — you get the practical experience, applicability, and audience interaction of fast.ai, along with the organised material and theoretical explanations of deeplearning.ai.

Kaggle Learn

Practical data skills you can apply immediately: that’s what you’ll learn in these free micro-courses.They’re the fastest (and most fun) way to become a data scientist or improve your current skills.

Blogs to follow

KD Nuggets

KDnuggets is a leading site on AI, Analytics, Big Data, Data Mining, Data Science, and Machine Learning and is edited by Gregory Piatetsky-Shapiro and Matthew Mayo. KDnuggets was founded in February of 1997. Before that, Gregory maintained an earlier version of this site, called Knowledge Discovery Mine, at GTE Labs (1994 to 1997).

Analytics Vidhya

Analytics Vidhya provides a community based knowledge portal for Analytics and Data Science professionals. The aim of the platform is to become a complete portal serving all knowledge and career needs of Data Science Professionals.

Towards Data Science

TDS joined Medium’s vibrant community in October 2016. In the beginning, their goal was simply to gather good posts and distribute them to a broader audience. Just a few months later, they were pleased to see that they had a very fast growing audience and many new contributors.

Today they are working with more than 10 Editorial Associates to prepare the most exciting content for our audience. They provide customized feedback to our contributors using Medium’s private notes. This allows them to promote their latest articles across social media without the added complexity that they might encounter using another platform.

Podcasts to listen

Data Hack

This is Analytics Vidhya’s exclusive podcast series which will feature top leaders and practitioners in the data science and machine learning industry.

So in every episode of DataHack Radio, they bring you discussions with one such thought leader in the industry. They have discussions about their journey, their learnings and plenty of other data science related things.

Super Data Science

Kirill Eremenko is a Data Science coach and lifestyle entrepreneur. The goal of the Super Data Science podcast is to bring you the most inspiring Data Scientists and Analysts from around the World to help you build your successful career in Data Science.

Data is growing exponentially and so are salaries of those who work in analytics. This podcast can help you learn how to skyrocket your analytics career. Big Data, visualization, predictive modeling, forecasting, analysis, business processes, statistics, R, Python, SQL programming, tableau, machine learning, hadoop, databases, data science MBAs, and all the analytcis tools and skills that will help you better understand how to crush it in Data Science.

YouTube Channels

Sirajology

Your host here is Siraj. He is on a warpath to inspire and educate developers to build Artificial Intelligence. Games, music, chatbots, art, he teaches you how to make it all yourself. This is the fastest growing AI community in the world. Their mission: Solve AI. Use it to benefit humanity.

He is an AI Researcher, his latest paper is here — https://drive.google.com/file/d/0BwUv84lNDk72Q1gzaXgwR2U3U2NWVlZSOFk4amZIRmV1QXI0/view

He is also a Data Scientist, AI Educator, Rapper, Author, and Director of the School of AI (www.theschool.ai)

Data School

Are you trying to learn data science so that you can get your first data science job? You’re probably confused about what you’re “supposed” to learn, and then you have the hardest time actually finding lessons you can understand! Data School focuses you on the topics you need to master first, and offers in-depth tutorials that you can understand regardless of your educational background.

Your host here is Kevin Markham, and he is the founder of Data School. He has taught data science using the Python programming language to hundreds of students in the classroom, and hundreds of thousands of students (like you) online. Finding the right teacher was so important to his data science education, and so he sincerely hopes that he can be the right data science teacher for you.

Caltech Machine Learning

This is an introductory course by Caltech Professor Yaser Abu-Mostafa on machine learning that covers the basic theory, algorithms, and applications. Machine learning (ML) enables computational systems to adaptively improve their performance with experience accumulated from the observed data. ML techniques are widely applied in engineering, science, finance, and commerce to build systems for which we do not have full mathematical specification (and that covers a lot of systems). The course balances theory and practice, and covers the mathematical as well as the heuristic aspects.

GitHub Repos

Awesome Data Science

This Repo answer the questions, “What is Data Science and what should you study to learn Data Science?” An awesome Data Science repository to learn and apply for real world problems.

As the aggregator says, “Our favorite data scientist is Clare Corthell. She is an expert in data-related systems and a hacker, and has been working on a company as a data scientist. Clare’s blog. This website helps you to understand the exact way to study as a professional data scientist.”

“Secondly, Our favorite programming language is Python nowadays for Data Science. Python’s — Pandas library has full functionality for collecting and analyzing data. We use Anaconda to play with data and to create applications.”

Essential Cheat Sheets for Machine Learning and Deep Learning Engineers

Machine learning is complex. For newbies, starting to learn machine learning can be painful if they don’t have right resources to learn from. Most of the machine learning libraries are difficult to understand and learning curve can be a bit frustrating. Kailash Ahirwar has created a repository on Github (cheatsheets-ai) containing cheatsheets for different machine learning frameworks, gathered from different sources. Have a look at the Github repository, also, contribute cheat sheets if you have any. Thanks.

HackerMath for Machine Learning

Math literacy, including proficiency in Linear Algebra and Statistics, is a must for anyone pursuing a career in data science. The goal of this workshop is to introduce some key concepts from these domains that get used repeatedly in data science applications.

As outlined by Amit Kapoor, “Our approach is what we call the ‘Hacker’s way’. Instead of going back to formulae and proofs, we teach the concepts by writing code. And in practical applications. Concepts don’t remain sticky if the usage is never taught.”

The focus here is on depth rather than breadth. Three areas are chosen — Hypothesis Testing, Supervised Learning and Unsupervised Learning. They are covered to sufficient depth — 50% of the time on the concepts and 50% of the time spent coding them.


Source: Artificial Intelligence on Medium

(Visited 20 times, 1 visits today)
Post a Comment

Newsletter