
# 5 Powerful Scikit-Learn Models

## Herein lies just enough information to make you dangerous: 5 models you can learn and apply to become a competent practitioner of Machine Learning.

Below are 5 models. For each model we first observe the model's predicted probability that an Iris is or isn't a Virginica, and then perform classification across the three Iris species. It's fascinating to observe how these models differ. We will not spend much time going into the detailed math behind each example. The whole goal is to highlight some useful models you can add to your toolbox and point out some of their differences. Pros and cons get their own treatment in a subsequent article. No point in ranting, let's jump right in.

Note: Full code for all examples is here.

### 1. Logistic Regression

To explain Logistic Regression we can start with the Sigmoid function, which sits at the core of Logistic Regression. Fundamentally, the sigmoid takes any real-valued input and outputs a value between 0 and 1. That output P(x) is the probability that our dependent variable equals a particular case. In the following output we get the probability that a given observation is or isn't a Virginica, and we can see that as our petal width feature increases, the probability of being a Virginica increases.
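The sigmoid itself is simple enough to write down directly. A minimal sketch (independent of the scikit-learn model below) showing the squashing behavior described above:

```python
import numpy as np

def sigmoid(z):
    """Map any real-valued input to a probability in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

# Large negative inputs approach 0, zero maps to exactly 0.5,
# and large positive inputs approach 1.
probs = sigmoid(np.array([-6.0, 0.0, 6.0]))
```

This is exactly the curve Logistic Regression fits: a linear function of the features pushed through the sigmoid.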

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Fit on a single feature (petal width), then score probabilities on a grid.
log_reg = LogisticRegression(penalty="l2")
log_reg.fit(X, y)

X_new = np.linspace(0, 3, 1000).reshape(-1, 1)
y_proba = log_reg.predict_proba(X_new)
```

We can extend Logistic Regression to multiple classes, and it turns out to be very powerful: in this example we see a very low classification error across the three classes.

```python
from sklearn.linear_model import LogisticRegression

# Softmax (multinomial) regression generalizes logistic regression
# to all three Iris classes at once.
softmax_reg = LogisticRegression(multi_class="multinomial", solver="lbfgs", C=5)
softmax_reg.fit(X, y)
pred = softmax_reg.predict(X_test)
```

### 2. Support Vector Machines

Support Vector Machines work by attempting to pass a hyperplane through the dataset that separates the classes. This can be done in any number of dimensions. Check out this article if you're interested in diving deep into the details.

```python
import numpy as np
from sklearn import svm

# Note: probability=True is required for SVC to expose predict_proba.
clf = svm.SVC(gamma='scale', decision_function_shape='ovo', probability=True)
clf.fit(X, y)

X_new = np.linspace(0, 3, 1000).reshape(-1, 1)
y_proba = clf.predict_proba(X_new)
```

```python
from sklearn import svm

clf = svm.SVC(gamma='scale', decision_function_shape='ovo')
clf.fit(X, y)
pred = clf.predict(X_test)
```
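With a linear kernel you can actually see the hyperplane: scikit-learn exposes it through the fitted model's `coef_` and `intercept_` attributes. A small sketch on a hypothetical toy 2-D dataset (not the Iris setup above), where points are classified by which side of the plane w · x + b = 0 they fall on:

```python
import numpy as np
from sklearn.svm import SVC

# Toy 2-D data: two well-separated clusters.
X = np.array([[0.0, 0.0], [0.5, 0.5], [3.0, 3.0], [3.5, 3.0]])
y = np.array([0, 0, 1, 1])

lin = SVC(kernel="linear")
lin.fit(X, y)

# The learned separating hyperplane is w . x + b = 0.
w, b = lin.coef_[0], lin.intercept_[0]

# Class is determined by the sign of the decision function w . x + b:
# negative side -> class 0, positive side -> class 1.
signs = np.sign(X @ w + b)
```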

### 3. Naive Bayes

We arrive now at Naive Bayes, perhaps the simplest of all the models discussed in this article. Naive Bayes is great for the small amount of data necessary to estimate its parameters. It applies Bayes' theorem and is called naive because of its assumption of conditional independence between features. In this example I apply Gaussian Naive Bayes:

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB

clf = GaussianNB()
clf.fit(X, y)

X_new = np.linspace(0, 3, 1000).reshape(-1, 1)
y_proba = clf.predict_proba(X_new)
```

```python
from sklearn.naive_bayes import GaussianNB

clf = GaussianNB()
clf.fit(X, y)
pred = clf.predict(X)
```

### 4. Random Forest

Random Forest is a popular ensemble model; you can see ensemble models popping up all over the place, especially in Kaggle competitions. Random Forest works by fitting decision tree classifiers on subsamples of the dataset, then averaging the trees' predictions to garner superior accuracy whilst avoiding overfitting. Setting n_estimators to 100 gives the forest 100 trees, and max_depth caps the depth of each tree.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

clf = RandomForestClassifier(n_estimators=100, max_depth=2, random_state=0)
clf.fit(X, y)

X_new = np.linspace(0, 3, 1000).reshape(-1, 1)
y_proba = clf.predict_proba(X_new)
```

```python
from sklearn.ensemble import RandomForestClassifier

clf = RandomForestClassifier(n_estimators=100, max_depth=2, random_state=0)
clf.fit(X, y)
pred = clf.predict(X)
```
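The "averaging" point is literal in scikit-learn: a fitted forest's `predict_proba` is the mean of its individual trees' predicted class probabilities, and the trees themselves are reachable through the `estimators_` attribute. A quick sketch checking that by hand:

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

X, y = load_iris(return_X_y=True)
forest = RandomForestClassifier(n_estimators=100, max_depth=2, random_state=0)
forest.fit(X, y)

# Average the per-tree class probabilities by hand...
tree_probs = np.mean([tree.predict_proba(X) for tree in forest.estimators_], axis=0)

# ...and compare with the forest's own soft-voting output.
forest_probs = forest.predict_proba(X)
```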

### 5. AdaBoost

AdaBoost is another popular ensemble model. It works by fitting many classifiers on the dataset, reweighting incorrectly classified instances at each round so that later classifiers focus on the hard cases. AdaBoost training favors the features known to increase the classification power of the model. This of course acts as a form of dimension reduction, which is a plus as long as classification capabilities are preserved.

```python
import numpy as np
from sklearn.ensemble import AdaBoostClassifier

clf = AdaBoostClassifier(n_estimators=100)
clf.fit(X, y)

X_new = np.linspace(0, 3, 1000).reshape(-1, 1)
y_proba = clf.predict_proba(X_new)
```

```python
from sklearn.ensemble import AdaBoostClassifier

clf = AdaBoostClassifier(n_estimators=100)
clf.fit(X, y)
pred = clf.predict(X)
```
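The implicit feature selection can be inspected after fitting: like the forest above, a fitted AdaBoost model exposes `feature_importances_`, and features the boosted learners rarely split on receive little weight. A sketch ranking the four Iris features by their contribution to the ensemble:

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import AdaBoostClassifier

X, y = load_iris(return_X_y=True)
clf = AdaBoostClassifier(n_estimators=100)
clf.fit(X, y)

# Importances are non-negative and sum to 1; low-weight features
# contribute little to the ensemble's decisions.
importances = clf.feature_importances_
ranked = np.argsort(importances)[::-1]  # feature indices, most important first
```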

### Congrats

Hooray! You made it to the end. Now it’s your job to ask questions and try to understand these models on a deeper level. In the next article I will dive into the pros and cons of each model. Until next time…

Some more Scikit-Learn examples: https://scikit-learn.org/stable/auto_examples/classification/plot_classifier_comparison.html