Blog: Drive AI — Drive Well, Drive Safe

Drive AI is a web application developed by the team MEAN Machines during hackNSUT’19, the IEEE-NSUT hackathon. It provides three AI features for your vehicle:

  • Drowsiness Detection — Alarm Notification
  • Mood Recognizer using Facial Expressions — Music Player
  • Object Detection and Distance Estimation — Alarm Notification

Angular 7 + Python 3 + Flask

The frontend was built with the Angular 7 framework and the backend with Flask. Because the client and the server run on different origins, Cross-Origin Resource Sharing (CORS) was enabled so that the browser would allow requests between them.

Drowsiness Detection

This feature was implemented using the OpenCV library for Python. The following steps were performed:

Calculating the Eye Aspect Ratio (EAR) from facial landmarks
  • Capture facial landmarks and features
  • Mathematically estimate the Eye Aspect Ratio in real time frames
  • Monitor the variation in EAR over a select number of frames and buzz the alarm when the set condition is met

from scipy.spatial import distance as dist

def eye_aspect_ratio(eye):
    # compute the euclidean distances between the two sets of
    # vertical eye landmarks (x, y)-coordinates
    A = dist.euclidean(eye[1], eye[5])
    B = dist.euclidean(eye[2], eye[4])
    # compute the euclidean distance between the horizontal
    # eye landmark (x, y)-coordinates
    C = dist.euclidean(eye[0], eye[3])
    # compute the eye aspect ratio
    ear = (A + B) / (2.0 * C)
    # return the eye aspect ratio
    return ear
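
The third step, monitoring the EAR over a number of frames, can be sketched as a small counter of consecutive low-EAR frames. This is only an illustration: the class name, the 0.25 threshold and the frame counts are assumptions, not values taken from the project.

```python
class DrowsinessMonitor:
    """Counts consecutive low-EAR frames and reports when the alarm should fire."""

    def __init__(self, ear_threshold=0.25, min_closed_frames=20):
        # Both defaults are illustrative assumptions, not the project's settings.
        self.ear_threshold = ear_threshold
        self.min_closed_frames = min_closed_frames
        self.closed_frames = 0

    def update(self, ear):
        """Feed the EAR of one frame; return True while the alarm should sound."""
        if ear < self.ear_threshold:
            self.closed_frames += 1
        else:
            # Eyes are open again (for example after a blink), so start over.
            self.closed_frames = 0
        return self.closed_frames >= self.min_closed_frames


# A short blink (two frames) followed by a longer eye closure.
monitor = DrowsinessMonitor(ear_threshold=0.25, min_closed_frames=3)
ears = [0.31, 0.22, 0.21, 0.30, 0.20, 0.19, 0.18, 0.17]
alarms = [monitor.update(e) for e in ears]
print(alarms)
```

Resetting the counter whenever the eyes reopen keeps ordinary blinks from triggering the alarm; only a sustained closure reaches the frame limit.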

Mood Music

In any machine learning detection task, the most important factor is the dataset the model is trained on. To recognize facial expressions we used the fer2013 dataset, which was introduced at the International Conference on Machine Learning (ICML) in 2013. fer2013 is an open-source dataset that was first created for an ongoing project by Pierre-Luc Carrier and Aaron Courville and then shared publicly for a Kaggle competition shortly before ICML 2013. It consists of 35,887 labeled grayscale face images of 48×48 pixels, covering 7 emotions.

Emotions labeled in the fer2013 database are:

0: 4953 images- Angry
1: 547 images- Disgust
2: 5121 images- Fear
3: 8989 images- Happy
4: 6077 images- Sad
5: 4002 images- Surprise
6: 6198 images- Neutral
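
The label distribution is uneven, which a few lines of Python make visible. The sketch below tallies the counts and each class's share of the dataset; the dictionary name and layout are illustrative, and the Angry count is taken as 4953, the value for which the seven classes add up to the 35,887 images stated above.

```python
# Per-class image counts for fer2013, keyed by emotion name.
FER2013_COUNTS = {
    "Angry": 4953,  # the value for which the seven classes sum to 35,887
    "Disgust": 547,
    "Fear": 5121,
    "Happy": 8989,
    "Sad": 6077,
    "Surprise": 4002,
    "Neutral": 6198,
}

total = sum(FER2013_COUNTS.values())
shares = {label: count / total for label, count in FER2013_COUNTS.items()}
rarest = min(FER2013_COUNTS, key=FER2013_COUNTS.get)
most_common = max(FER2013_COUNTS, key=FER2013_COUNTS.get)

for label, count in FER2013_COUNTS.items():
    print(f"{label:<9}{count:>6}{shares[label]:>8.1%}")
print("total:", total, "| rarest:", rarest, "| most common:", most_common)
```

A class as small as Disgust is likely to be harder for a model to learn than Happy, which is worth keeping in mind when reading the accuracy figure.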

With the dataset in place, we turn to the model used to detect facial expressions. I used a CNN architecture inspired by ResNet50, with roughly 700,000 (7 lakh) trainable parameters. The model was built and trained entirely on Google Colaboratory.

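from keras.layers import (Input, Convolution2D, MaxPooling2D, AveragePooling2D,
                          BatchNormalization, Activation, Dropout, Flatten, Dense, add)
from keras.models import Model
from keras.optimizers import Adam

def Unit(x, filters, pool=False):
    # residual unit: two BatchNorm-ReLU-Conv stages plus a shortcut connection
    res = x
    if pool:
        # halve the spatial size on both paths so they can still be added
        x = MaxPooling2D(pool_size=(2, 2), padding="same")(x)
        res = Convolution2D(filters=filters, kernel_size=[1, 1], strides=[2, 2], padding="same")(res)
    out = BatchNormalization()(x)
    out = Activation("relu")(out)
    out = Convolution2D(filters=filters, kernel_size=[3, 3], strides=[1, 1], padding="same")(out)
    out = BatchNormalization()(out)
    out = Activation("relu")(out)
    out = Convolution2D(filters=filters, kernel_size=[3, 3], strides=[1, 1], padding="same")(out)
    out = add([res, out])
    return out

def makeModel(input_shape):
    images = Input(input_shape)
    net = Convolution2D(filters=64, kernel_size=[3, 3], strides=[1, 1], padding="same")(images)
    net = Unit(net, 64, pool=True)
    net = Unit(net, 64)
    net = Unit(net, 128, pool=True)
    net = Unit(net, 128)
    net = BatchNormalization()(net)
    net = Activation("relu")(net)
    net = Dropout(0.25)(net)
    net = AveragePooling2D(pool_size=(2, 2))(net)
    net = Flatten()(net)
    # one output per fer2013 emotion
    net = Dense(units=7, activation="softmax")(net)
    model = Model(inputs=images, outputs=net)
    return model

# fer2013 images are 48x48 grayscale, hence a single channel
model = makeModel((48, 48, 1))
adam = Adam(lr=0.0001)  # newer Keras releases name this argument learning_rate
model.compile(optimizer=adam, loss='categorical_crossentropy', metrics=['accuracy'])
# data holds the preprocessed images, label the one-hot encoded emotions
hist = model.fit(data, label, epochs=20, shuffle=True, batch_size=512, validation_split=0.25)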
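
As a rough check on the parameter count quoted above (about 7 lakh, i.e. 700,000), the trainable parameters can be tallied by hand from the layer definitions. The sketch below assumes a 48×48×1 input and counts only trainable weights (the moving averages kept by batch normalization are excluded); the helper names are made up for this illustration.

```python
def conv_params(kernel, c_in, c_out):
    """Weights plus biases of a 2D convolution with a square kernel."""
    return kernel * kernel * c_in * c_out + c_out


def bn_params(channels):
    """Trainable parameters of batch normalization (one scale and one shift per channel)."""
    return 2 * channels


def unit_params(c_in, filters, pool):
    """Trainable parameters of one residual unit as defined in the listing above."""
    total = 0
    if pool:
        total += conv_params(1, c_in, filters)  # 1x1 convolution on the shortcut
    total += bn_params(c_in) + conv_params(3, c_in, filters)
    total += bn_params(filters) + conv_params(3, filters, filters)
    return total


side = 48                      # input height and width (assumed 48x48x1)
total = conv_params(3, 1, 64)  # first convolution on the grayscale image
channels = 64
for filters, pool in [(64, True), (64, False), (128, True), (128, False)]:
    total += unit_params(channels, filters, pool)
    channels = filters
    if pool:
        side //= 2             # each pooling unit halves the feature map
total += bn_params(channels)   # final batch normalization
side //= 2                     # AveragePooling2D(pool_size=(2, 2))
flat = side * side * channels  # size of the flattened feature vector
total += flat * 7 + 7          # dense softmax layer with 7 outputs
print(total)
```

Under these assumptions the tally comes to 711,367 trainable parameters, in line with the figure given in the text.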

Once the facial expression has been detected, it is used to select a song from the matching category, which is then played. When the song ends, the process repeats.
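
The selection step could look roughly like the sketch below, which maps the model's softmax output to an emotion name and picks a random song from that category. The playlist contents, file names and the pick_song function are hypothetical; the label order follows the fer2013 list above.

```python
import random

# Index order matches the fer2013 labels 0 to 6.
EMOTIONS = ["Angry", "Disgust", "Fear", "Happy", "Sad", "Surprise", "Neutral"]

# Hypothetical playlists; the project's actual song library is not described.
PLAYLISTS = {
    emotion: [f"{emotion.lower()}_{i}.mp3" for i in (1, 2, 3)]
    for emotion in EMOTIONS
}


def pick_song(probabilities, playlists=PLAYLISTS, rng=random):
    """Return (emotion, song) for the most likely class in a softmax output."""
    label = max(range(len(probabilities)), key=probabilities.__getitem__)
    emotion = EMOTIONS[label]
    return emotion, rng.choice(playlists[emotion])


# Example softmax output in which "Happy" (index 3) dominates.
emotion, song = pick_song([0.05, 0.01, 0.04, 0.70, 0.10, 0.05, 0.05])
print(emotion, song)
```

Calling pick_song again when the current track finishes gives the repeat-after-each-song behaviour described above.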

Object Detection, Distance Estimation, and Alert Generation

  • Object detection was done using transfer learning on the YOLO-tiny model. Only a selected set of labels was detected: Car, Bus, Truck, Train, Bicycle, Person and Bike.
Model architecture for YOLO-tiny
  • Distance estimation was done using the OpenCV library for Python. We followed the triangle similarity rule from the pinhole camera model. We could have estimated distance through lane detection instead, but that would have caused problems on roads without lane markings. Triangle similarity works like this: say we have a marker or object with a known width W. We place this marker at some distance D from our camera, take a picture of it, and measure its apparent width in pixels, P. This lets us derive the perceived focal length of the camera as F = (P x D) / W. For any later image, we can then apply triangle similarity to determine the distance of the object from the camera as D’ = (W x F) / P, where P is the apparent width measured in that image.
Distance estimation using Triangle Similarity (Using the width of the object)
  • For objects closer than 5 meters, an alarm sound is played to alert the driver. The results can be seen below:
Detecting car and person along with estimating the distance. No alarm triggered
Alarm triggered
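
The triangle similarity calculation and the 5 meter alarm rule described above can be written out in a few lines. This is a minimal sketch, not the project's code: the function names and the calibration numbers (a 2.0 m wide object photographed from 10 m and appearing 100 px wide) are assumptions chosen to keep the arithmetic easy to follow.

```python
def focal_length(pixel_size, known_distance, known_size):
    """Perceived focal length from one calibration shot: F = (P x D) / size."""
    return (pixel_size * known_distance) / known_size


def distance_to_camera(known_size, focal, pixel_size):
    """Distance of an object of known real size: D' = (size x F) / P."""
    return (known_size * focal) / pixel_size


def should_alarm(distance_m, threshold_m=5.0):
    """True when the object is closer than the alert distance."""
    return distance_m < threshold_m


# Calibration (assumed numbers): a 2.0 m wide object at 10 m appears 100 px wide.
F = focal_length(pixel_size=100, known_distance=10.0, known_size=2.0)

# Later frames: the same kind of object appears 250 px and then 80 px wide.
near = distance_to_camera(known_size=2.0, focal=F, pixel_size=250)
far = distance_to_camera(known_size=2.0, focal=F, pixel_size=80)
print(F, near, should_alarm(near), far, should_alarm(far))
```

The estimate is only as good as the assumed real-world size, so in practice each detected label (car, person, and so on) would need its own typical size.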

Why Drive AI?

  • A simplified yet optimal safety tool that utilizes fast ML/AI algorithms
  • Requires no explicit interaction from the user
  • Mood-based music player improves overall UX and adds to the functionality
  • The USP — Convenience, Functionality & State-Of-The-Art Technology
    We aim to develop a local yet efficient software solution that can create an impact in the community and help propagate the novel uses of ML/AI

Tech Stack

  • Front End: Angular 7, Bootstrap, HTML5, SCSS, Typescript
  • Back End: Flask-RESTful, Python, OpenCV, Numpy, Scipy, Pandas, Sklearn, Keras, Tensorflow, MongoDB Atlas
  • IDEs and Other Resources: Jupyter Notebook, VS Code, Google Colaboratory, Sublime Text, Kaggle

Roadmap For Future

  • Completing the optimized Progressive Web App (PWA) to enhance feasibility and testing of the product
  • Integrating the 3 features with portable hardware: Raspberry Pi & USB cameras on the dash
  • Making the project open source and available to the community for further development


An MDX presentation was created for the project. The link to it can be found below:


Source: Artificial Intelligence on Medium
