Classifying MNIST using a Convolutional Neural Network with MXNet

MNIST Visualization


In this article, I’m going to show you how to classify the MNIST dataset, a collection of images of handwritten digits (0–9) that is often considered the “Hello, World” of deep learning. If you have no prior experience in deep learning or know nothing about convolutional neural networks, this is a good place to start.

I am going to implement a CNN with Apache MXNet. This article will help you understand the basic structure and implementation of neural networks in MXNet.

MXNet — Brief Introduction

Apache MXNet is an open-source deep learning software framework used to train and deploy deep neural networks. It is scalable, allowing for fast model training, and supports a flexible programming model and multiple programming languages (including C++, Python, Julia, Matlab, JavaScript, Go, R, Scala, Perl, and Wolfram Language).

The MXNet library is portable and can scale to multiple GPUs[1] and multiple machines. MXNet is supported by public cloud providers including Amazon Web Services (AWS)[2] and Microsoft Azure.[3] Amazon has chosen MXNet as its deep learning framework of choice at AWS.[4][5] Currently, MXNet is supported by Intel, Dato, Baidu, Microsoft, Wolfram Research, and research institutions such as Carnegie Mellon, MIT, the University of Washington, and the Hong Kong University of Science and Technology.[6]

Source: Wikipedia

Let’s get started…

The code used in this article can be found in my GitHub repository.

Jupyter notebook


MXNet can be installed using pip:

pip install mxnet

If you have a GPU, you can install the GPU version of MXNet:

pip install mxnet-cuXX

where XX is your CUDA version. For example, I have CUDA 10 installed on my PC, so I run:

pip install mxnet-cu100

You will also need the following packages to follow along:

pip install numpy scikit-learn pandas seaborn matplotlib

The dataset used here can be downloaded from Kaggle:

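Let’s start by importing the required packages:

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn

# MXNet modules used later in the article (mx, gluon, nd)
import mxnet as mx
from mxnet import gluon, nd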

Load our train dataset:

#load our train dataset.
train = pd.read_csv("~/datasets/mnist/train.csv")

Split the training data into X (image data) and Y (labels), reshape each image to 28×28, and scale the pixel values to the range [0, 1]:

# Split the train dataset into X and Y; normalize the pixels by dividing by 255
X = train.iloc[:, 1:].values.reshape(-1, 28, 28) / 255
Y = train.iloc[:, 0].values
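
To see what this step does without downloading anything, here is a minimal sketch on a made-up table. The names `toy`, `X_toy` and `Y_toy` and the random pixel values are assumptions for illustration only; the column layout (label first, then 784 pixel columns) follows the indexing used above.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for train.csv: one label column followed by
# 784 pixel columns with values in 0..255.
rng = np.random.default_rng(0)
pixels = rng.integers(0, 256, size=(5, 784))
labels = np.array([3, 1, 4, 1, 5])
toy = pd.DataFrame(np.column_stack([labels, pixels]))

# Same slicing as in the article: column 0 is the label, the rest are pixels.
X_toy = toy.iloc[:, 1:].values.reshape(-1, 28, 28) / 255
Y_toy = toy.iloc[:, 0].values

print(X_toy.shape)   # (5, 28, 28)
print(X_toy.min() >= 0.0, X_toy.max() <= 1.0)
print(Y_toy)
```

Dividing by 255 keeps every pixel between 0 and 1, which tends to make training better behaved than feeding raw 0–255 values.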

Split our data into training and validation sets:

from sklearn.model_selection import train_test_split

trn_x,val_x,trn_y,val_y = train_test_split(X,Y,test_size=0.2)
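
The call above shuffles randomly, so the split changes on every run. If you want a repeatable split with the same share of each digit in both parts, one option is to pass `random_state` and `stratify`. This is an optional variation, not what the article's code does; the toy arrays and the seed value 42 are assumptions.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical toy data: 100 blank "images", 10 per digit class.
X_toy = np.zeros((100, 28, 28))
Y_toy = np.repeat(np.arange(10), 10)

trn_x_toy, val_x_toy, trn_y_toy, val_y_toy = train_test_split(
    X_toy, Y_toy,
    test_size=0.2,      # 20% of the rows go to validation, as in the article
    random_state=42,    # assumed seed, makes the split reproducible
    stratify=Y_toy,     # keep the class proportions equal in both parts
)

print(trn_x_toy.shape, val_x_toy.shape)   # (80, 28, 28) (20, 28, 28)
print(np.bincount(val_y_toy))             # two validation samples per digit
```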

The Network:

This is the structure of the CNN we define in MXNet, as printed by Gluon:

(cnn1): Conv2D(None -> 20, kernel_size=(4, 4), stride=(1, 1))
(mxp1): MaxPool2D(size=(2, 2), stride=(2, 2), padding=(0, 0), ceil_mode=False)
(cnn2): Conv2D(None -> 25, kernel_size=(2, 2), stride=(1, 1))
(mxp2): MaxPool2D(size=(2, 2), stride=(2, 2), padding=(0, 0), ceil_mode=False)
(cnn3): Conv2D(None -> 30, kernel_size=(2, 2), stride=(1, 1))
(mxp3): MaxPool2D(size=(2, 2), stride=(2, 2), padding=(0, 0), ceil_mode=False)
(flat): Flatten
(fc): Dense(None -> 128, linear)
(out): Dense(None -> 10, linear)

The block of code below selects a GPU if one is available (preferred), otherwise the CPU, and initializes our network.

device = mx.gpu(0) if mx.context.num_gpus() > 0 else mx.cpu(0)
cnn.initialize(mx.init.Xavier(), ctx=device)

We define a ‘trainer’, which manages the optimization of our network:

trainer = gluon.Trainer(
    cnn.collect_params(), 'sgd',  # assumption: the optimizer name is missing from this listing
    optimizer_params={'learning_rate': 0.01})

We define a few objects that we will use during training to track loss and accuracy, and we also define the loss function for our model.

accuracy_fn = mx.metric.Accuracy()
loss_function = gluon.loss.SoftmaxCrossEntropyLoss()
ce_loss = mx.metric.CrossEntropy()
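
To make the loss less of a black box, here is a small NumPy sketch of the quantity softmax cross-entropy computes for each sample: the negative log of the softmax probability assigned to the true class. The function name `softmax_cross_entropy` and the example scores are made up for illustration; this is not MXNet's implementation.

```python
import numpy as np


def softmax_cross_entropy(logits, labels):
    """Per-sample -log(softmax(logits)[label]) for integer class labels."""
    # Subtract the row maximum first so that exp() cannot overflow.
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels]


logits = np.array([
    [0.0] * 10,          # no preference among the 10 digits
    [5.0] + [0.0] * 9,   # strongly prefers digit 0
])
labels = np.array([3, 0])

losses = softmax_cross_entropy(logits, labels)
print(losses)
```

The first sample has no preference, so its loss is ln(10) ≈ 2.30. The second sample puts most of its probability on the correct class and gets a much smaller loss.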

Convert the NumPy arrays into MXNet NDArrays:

# Convert our NumPy arrays into mxnet.nd.array and add a channel axis: (N, 1, 28, 28)

trn_x = nd.array(trn_x).reshape(-1,1,28,28)
trn_y = nd.array(trn_y)

val_x = nd.array(val_x).reshape(-1,1,28,28)
val_y = nd.array(val_y)

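Now, let’s train our model.

Define a small function to reset the metric values after we have calculated them:

def reset_metrics():
    # Clear the running totals of both metrics defined above
    accuracy_fn.reset()
    ce_loss.reset()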

For every epoch, we print metrics to see how good our model is. You should see something like this as output:

epoch: 1 | trn_loss: 1.091759033203125 | trn_acc: 0.654 | val_loss: 1.0010985107421875
epoch: 2 | trn_loss: 0.434418212890625 | trn_acc: 0.871 | val_loss: 0.44930450439453123
epoch: 3 | trn_loss: 0.27909747314453126 | trn_acc: 0.911 | val_loss: 0.24485662841796876
epoch: 4 | trn_loss: 0.20936044311523438 | trn_acc: 0.931 | val_loss: 0.20494595336914062
epoch: 5 | trn_loss: 0.1736647491455078 | trn_acc: 0.937 | val_loss: 0.16943218994140624
epoch: 6 | trn_loss: 0.15489833068847655 | trn_acc: 0.949 | val_loss: 0.14828227233886718
epoch: 7 | trn_loss: 0.14918743896484374 | trn_acc: 0.952 | val_loss: 0.1260661087036133
epoch: 8 | trn_loss: 0.12303134155273437 | trn_acc: 0.961 | val_loss: 0.11447244262695312
epoch: 9 | trn_loss: 0.07636952972412109 | trn_acc: 0.975 | val_loss: 0.07003825378417969
epoch: 10 | trn_loss: 0.08552651977539062 | trn_acc: 0.971 | val_loss: 0.07812557983398437
epoch: 30 | trn_loss: 0.020007244110107424 | trn_acc: 0.994 | val_loss: 0.02010601806640625

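You can calculate the validation accuracy by running the following lines:

from sklearn.metrics import accuracy_score

pred = cnn(val_x.as_in_context(device))
predictions = []

for p in pred.asnumpy():
    # The predicted digit is the index of the highest score
    predictions.append(np.argmax(p))

acc = accuracy_score(val_y.asnumpy(), predictions)
print("Accuracy:", acc * 100, "%")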



Accuracy: 98.1785714286 %
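
Turning scores into predictions is just an argmax over the class axis, and it can be done for the whole batch at once. A small NumPy sketch with made-up scores for three classes (the arrays here are illustrative, not model output):

```python
import numpy as np

# Hypothetical network outputs: one row per sample, one column per class.
scores = np.array([
    [0.1, 2.0, -1.0],
    [1.5, 0.2, 0.3],
    [0.0, 0.1, 3.0],
    [0.9, 0.8, 0.1],
])
true_labels = np.array([1, 0, 2, 1])

predicted = scores.argmax(axis=1)             # index of the highest score per row
accuracy = (predicted == true_labels).mean()  # fraction of correct predictions

print(predicted)   # [1 0 2 0]
print(accuracy)    # 0.75
```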


You just learned how to define and train a neural network with MXNet and classify MNIST with an accuracy of over 98%.

Source: Artificial Intelligence on Medium
