Data science is becoming more popular every day. In fact, Harvard Business Review called data scientist the sexiest job of the 21st century.

Data science helps us extract knowledge and insights from large amounts of structured and unstructured data (big data). You can learn more about what data science is here.

To learn data science, the first prerequisite is a programming language, usually Python or R. So let's learn Python.

Anaconda is a popular open-source distribution for data science, data analysis and machine learning, and it supports both Python and R. I highly recommend installing it on your system; here is the installation guide. Alternatively, you can write Python in an IDE such as PyCharm.

Hello, Python!

print("Hello, Python!") 

The code above prints Hello, Python! to the screen.

Numbers and Math

Let's do some math!

print("Hens", 25 + 30 / 6)
print("Roosters", 100 - 25 * 3 % 4)
print("Now I will count the eggs:")
print(3 + 2 + 1 - 5 + 4 % 2 - 1 / 4 + 6)
print("Is it true that 3 + 2 < 5 - 7?")
print(3 + 2 < 5 - 7)
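
The results of the arithmetic above depend on operator precedence: *, / and % are evaluated before + and -, and / always returns a float in Python 3. Here is a minimal sketch that makes the order explicit; the variable names are made up for illustration:

```python
# Operator precedence: *, / and % bind tighter than + and -.
hens = 25 + 30 / 6            # same as 25 + (30 / 6)
roosters = 100 - 25 * 3 % 4   # same as 100 - ((25 * 3) % 4)
grouped = (25 + 30) / 6       # parentheses change the order of evaluation

print(hens)      # 30.0
print(roosters)  # 97
print(grouped)
```

When in doubt, add parentheses. They cost nothing and make the intent obvious to the reader.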


Comments

Comments make code easier for readers to understand, and they are also used to document code.

# hello world
# print("hello world")
# this is comment

Put a hash sign (#) at the start of any line to turn that line into a comment in Python.
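
Comments usually serve one of two purposes: explaining a line, or temporarily disabling one. A small sketch of both, where the variable name greeting is only an illustration:

```python
# A full-line comment explains the code that follows.
greeting = "hello world"  # an inline comment can follow code on the same line

# print("this line is commented out, so it never runs")
print(greeting)
```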

Variable Assignment

In Python, a variable allows you to refer to a value with a name. To create a variable, use =, as in this example:

x = 5

You can now use the name of this variable, x, instead of the actual value, 5.
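
As a quick sketch of what that means in practice, the name can be used anywhere the value could be, and it can later be pointed at a different value (the name y is an assumption added for illustration):

```python
x = 5
y = x + 2      # x stands in for the value 5
print(x, y)    # 5 7

x = 10         # a name can be reassigned to a new value
print(x, y)    # 10 7 (y keeps the value computed earlier)
```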

Calculations with variables

Instead of calculating with the actual values, you can use variables.

# suppose we want to calculate your age
my_birth_year = 1995
present_year = 2019
my_age = present_year - my_birth_year  # 24
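
A hard-coded present year goes stale quickly. One possible variation, sketched here with the standard-library datetime module, reads the current year from the system clock. It still ignores whether the birthday has already happened this year:

```python
from datetime import date

my_birth_year = 1995               # example value from the text
present_year = date.today().year   # current year, taken from the system clock
my_age = present_year - my_birth_year

print("Age this year:", my_age)
```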

To learn more about variables and their types, read here.

Lists in Python

As opposed to int, bool etc., a list is a compound data type; you can group values together:

a = "is"
b = "nice"
my_list = ["my", "list", a, b]
print(my_list)
# output: ['my', 'list', 'is', 'nice']
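
Lists are also mutable and can hold values of different types. A short sketch building on the list above; the extra item "indeed" and the list mixed are made up for illustration:

```python
a = "is"
b = "nice"
my_list = ["my", "list", a, b]

print(len(my_list))         # 4
my_list.append("indeed")    # lists can grow after they are created
print(" ".join(my_list))    # my list is nice indeed

mixed = ["age", 24, True, 1.5]   # one list, several types
print(mixed)
```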

Subsetting Python lists is a piece of cake

x = ["a", "b", "c", "d"]
# here the index of "a" is 0, "b" is 1, and so on
print(x[1])
# output: b
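
Python also supports negative indexes, which count from the end of the list. A minimal sketch using the same list (the names first, second and last are illustrative):

```python
x = ["a", "b", "c", "d"]

first = x[0]    # indexes start at 0
second = x[1]
last = x[-1]    # -1 is the last element, -2 the one before it

print(first, second, last)   # a b d
```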

You are going to love slicing and dicing Python lists.

my_list = ["a", "b", "c", "d", "e", "f"]
print(my_list[0:3])
# output: ['a', 'b', 'c']
# index 3 is excluded
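
The general form of a slice is my_list[start:stop:step], where start is included, stop is excluded, and any part may be omitted. A few variations, sketched with illustrative variable names:

```python
my_list = ["a", "b", "c", "d", "e", "f"]

first_three = my_list[0:3]    # ['a', 'b', 'c']
same_thing = my_list[:3]      # an omitted start defaults to 0
the_rest = my_list[3:]        # ['d', 'e', 'f']
every_other = my_list[::2]    # ['a', 'c', 'e']

print(first_three, the_rest, every_other)
```

Slicing returns a new list, so the original my_list is left unchanged.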

NumPy arrays in Python

NumPy is a Python package for efficient numerical computing, which makes it a staple of data science. The NumPy array is a faster and more powerful alternative to the list for numerical data, and a good first step into data exploration.

# Create list baseball
baseball = [180, 215, 210, 210, 188, 176, 209, 200]
# Import the numpy package as np
import numpy as np
# Create a numpy array from baseball: np_baseball
np_baseball = np.array(baseball)
# Print out type of np_baseball
print(type(np_baseball))

np.array() creates a NumPy array from the baseball list; the result is stored in np_baseball.
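
To see why the array is more convenient than the list, compare how each one handles arithmetic. This is a minimal sketch assuming NumPy is installed; the names doubled_list, doubled_array and above_200 are made up for illustration:

```python
import numpy as np

baseball = [180, 215, 210, 210, 188, 176, 209, 200]
np_baseball = np.array(baseball)

doubled_list = baseball * 2       # a list is repeated: 16 items
doubled_array = np_baseball * 2   # an array is multiplied element by element: 8 items
print(len(doubled_list), len(doubled_array))   # 16 8

print(np_baseball.mean())                  # 198.5
above_200 = np_baseball[np_baseball > 200]   # boolean mask keeps matching values
print(above_200)                           # [215 210 210 209]
```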

This is just the tip of the iceberg. Python is full of amazing libraries for data science, machine learning and other subfields of artificial intelligence. If you are interested in learning data science, start with Python courses on YouTube, Udemy or Coursera. There are tons of free resources available to help you learn Python programming.

Source: Artificial Intelligence on Medium