Modeling and Deploying a Ticket Classifier on AWS
Team — Abhilasha Kanitkar, Jitender Phogat, Pankaj Kishore
In this article, we address a real business problem. In the IT world, most production issues are handled as IT support tickets. We first cover the dataset and the model we used, and then how we deployed the model on AWS and were able to make real-time predictions. Let’s get started.
Several platforms, such as BMC Remedy and ServiceNow, are used to handle support tickets and keep track of them. Whenever an issue is raised, somebody assigns the ticket to the team they believe is most relevant.
Often the ticket is not assigned directly to the team responsible for resolving it. It keeps rolling from team to team, and by the time it reaches the right team it has breached its SLA. Some issues are critical and need immediate remediation, and in those cases the delay hurts productivity.
So we decided to address this problem with machine learning and build a platform that automatically assigns tickets to teams and can learn and improve over time.
This is what the overall workflow looks like.
We used Microsoft’s support ticket dataset which you can find here.
The next step was to build a classifier that assigns each ticket to a category. We had previously done this classification with RNN and LSTM models, but this time we chose to keep the model simple because we also wanted to integrate it with AWS. Starting with a simple model is common practice at many companies and a good habit in general.
We chose the Multinomial Naive Bayes model for classification. Naive Bayes is a family of algorithms that apply Bayes’ theorem under the assumption that every feature is independent of the others given the category, in order to predict the category of a given sample. They are probabilistic classifiers: they calculate the probability of each category using Bayes’ theorem and output the category with the highest probability.
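To make the "highest probability wins" idea concrete, here is a minimal from-scratch sketch of a multinomial Naive Bayes classifier with add-one (Laplace) smoothing. It is an illustration only, not the scikit-learn implementation; the four toy tickets and the category names `network` and `hardware` are made up and do not come from the dataset.

```python
import math
from collections import Counter, defaultdict


def train_nb(docs, labels, alpha=1.0):
    """Estimate log-priors and smoothed per-category word log-likelihoods."""
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    vocab = set()
    for doc, label in zip(docs, labels):
        tokens = doc.lower().split()
        word_counts[label].update(tokens)
        vocab.update(tokens)

    log_prior = {c: math.log(n / len(labels)) for c, n in class_counts.items()}
    log_likelihood = {}
    for c in class_counts:
        # alpha avoids zero probabilities for words unseen in a category.
        denom = sum(word_counts[c].values()) + alpha * len(vocab)
        log_likelihood[c] = {
            w: math.log((word_counts[c][w] + alpha) / denom) for w in vocab
        }
    return log_prior, log_likelihood, vocab


def predict_nb(model, doc):
    """Return the category with the highest posterior log-probability."""
    log_prior, log_likelihood, vocab = model
    scores = {}
    for c in log_prior:
        score = log_prior[c]
        for token in doc.lower().split():
            if token in vocab:  # words never seen in training are ignored
                score += log_likelihood[c][token]
        scores[c] = score
    return max(scores, key=scores.get)


# Hypothetical toy tickets, for illustration only.
docs = [
    "cannot login to vpn",
    "vpn connection drops",
    "laptop screen broken",
    "replace laptop battery",
]
labels = ["network", "network", "hardware", "hardware"]

model = train_nb(docs, labels)
print(predict_nb(model, "vpn login fails"))  # network
```

Working in log space turns the product of per-word probabilities into a sum, which avoids numeric underflow on long descriptions.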
But, why Naive Bayes classifiers?
We do have other alternatives when tackling NLP problems, such as Support Vector Machines (SVM) and neural networks. However, the simple design of Naive Bayes classifiers makes them very attractive for such tasks. Moreover, they have been shown to be fast, reliable and accurate in a number of NLP applications.
IV. Data Pre-processing
For text classification, if you collect the data yourself via scraping, you may end up with a messy dataset and have to put a lot of effort into cleaning it before applying any model. In our case the dataset was not that messy, so we did not need much cleaning. We performed the following very common but crucial data pre-processing steps:
- Lower-casing and removing stop words: convert the entire input description to lower case and remove the stop words, as they don’t add anything to the categorization
- Lemmatizing words: this groups together different inflections of the same word, like organize, organizes, organizing, etc.
- n-grams: using n-grams we can count sequences of words instead of counting only single words
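The lower-casing, stop-word and n-gram steps above can be sketched in a few lines of plain Python. This is a simplified illustration: the stop-word set below is a tiny hypothetical stand-in for a full list such as NLTK's, and lemmatization is left out because it needs a dictionary-backed library (for example NLTK's `WordNetLemmatizer`).

```python
import re

# Hypothetical, deliberately tiny stop-word set (a real list is much longer).
STOP_WORDS = {"a", "an", "the", "is", "to", "on", "my", "i", "and", "of", "in"}


def tokenize(text):
    """Lower-case the text and keep alphanumeric tokens that are not stop words."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def ngrams(tokens, n):
    """Return every run of n consecutive tokens, joined by a space."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


tokens = tokenize("The VPN is disconnecting on my laptop")
features = tokens + ngrams(tokens, 2)  # unigrams plus bigrams
print(features)
# ['vpn', 'disconnecting', 'laptop', 'vpn disconnecting', 'disconnecting laptop']
```

Note that the bigrams here are built after stop words are removed, so "vpn disconnecting" becomes a feature even though "is" sat between the two words in the original text.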
To perform classification we have to represent the input description as vectors using the bag-of-words technique. There are two approaches to this.
- Counting the number of times each word appears in a document
- Calculating the frequency with which each word appears in a document out of all the words in the document
The counting approach (CountVectorizer in scikit-learn) works on term frequency, i.e. it counts the occurrences of tokens and builds a sparse document-by-token matrix.
TF-IDF stands for Term Frequency-Inverse Document Frequency. The TF-IDF weight is a statistical measure used to evaluate how important a word is to a document in a collection or corpus. The importance increases proportionally to the number of times a word appears in the document but is offset by the frequency of the word in the corpus.
Term frequency is the frequency of a word in a particular document.
Inverse Document Frequency gives us a measure of how rare a term is. The rarer a term is, the higher its IDF score.
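As a small worked example, the sketch below computes TF-IDF by hand using the textbook definitions tf = count / document length and idf = log(N / document frequency). These formulas are an assumption chosen for clarity: scikit-learn's `TfidfTransformer` uses a smoothed IDF and L2-normalizes each row by default, so its numbers will differ. The three tokenized documents are made up.

```python
import math
from collections import Counter

# Hypothetical, already-tokenized ticket descriptions.
docs = [
    ["password", "reset", "request"],
    ["password", "expired"],
    ["printer", "offline"],
]


def term_frequency(doc):
    """Share of the document's tokens taken up by each term."""
    counts = Counter(doc)
    return {term: count / len(doc) for term, count in counts.items()}


def inverse_document_frequency(docs):
    """log(N / df): terms found in fewer documents get a higher score."""
    doc_freq = Counter(term for doc in docs for term in set(doc))
    return {term: math.log(len(docs) / df) for term, df in doc_freq.items()}


idf = inverse_document_frequency(docs)
tfidf = [
    {term: tf * idf[term] for term, tf in term_frequency(doc).items()}
    for doc in docs
]

# "password" occurs in two of three documents, "reset" in only one,
# so "reset" carries more weight in the first document.
print(tfidf[0])
```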
V. Model Fitting
Now that the data is prepared, we fit the Multinomial Naive Bayes model to it. We created a sklearn pipeline that includes all the pre-processing steps, because a new incoming ticket description must be turned into a vector using the vocabulary learned during training rather than a new one built from that description.
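from nltk.corpus import stopwords
from sklearn.feature_extraction.text import CountVectorizer, TfidfTransformer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import Pipeline
text_clf = Pipeline([
    ('vect', CountVectorizer(ngram_range=(1, 2), stop_words=stopwords.words('english'))),
    ('tfidf', TfidfTransformer()),
    ('clf', MultinomialNB()),
])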
After the model was created, we tested its performance on our test dataset and got a pretty good accuracy of 92.167%. Then we exported the model to a pickle file.
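The fit, evaluate and export steps could look roughly like the sketch below. It is an illustration under stated assumptions, not the project's training script: the eight toy tickets and two categories are invented, scikit-learn's built-in `stop_words="english"` list is swapped in for the NLTK list so the example needs no corpus download, and the file name `ticket_classifier.pkl` is hypothetical. With so little data the accuracy it reports means nothing.

```python
import os
import pickle
import tempfile

from sklearn.feature_extraction.text import CountVectorizer, TfidfTransformer
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import Pipeline

# Hypothetical toy data standing in for the real ticket dataset.
descriptions = [
    "cannot connect to vpn from home",
    "vpn connection keeps dropping",
    "wifi network is very slow",
    "no internet access on office network",
    "laptop screen is cracked",
    "keyboard keys not working on laptop",
    "replace faulty laptop battery",
    "monitor does not power on",
]
categories = ["network"] * 4 + ["hardware"] * 4

text_clf = Pipeline([
    ("vect", CountVectorizer(ngram_range=(1, 2), stop_words="english")),
    ("tfidf", TfidfTransformer()),
    ("clf", MultinomialNB()),
])

# Hold out a stratified quarter of the tickets for evaluation.
X_train, X_test, y_train, y_test = train_test_split(
    descriptions, categories, test_size=0.25, random_state=0, stratify=categories
)
text_clf.fit(X_train, y_train)
accuracy = text_clf.score(X_test, y_test)  # mean accuracy on the held-out set
print(f"accuracy: {accuracy:.3f}")

# Export the whole pipeline (vectorizer + classifier) and load it back.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "ticket_classifier.pkl")  # hypothetical file name
    with open(path, "wb") as f:
        pickle.dump(text_clf, f)
    with open(path, "rb") as f:
        restored = pickle.load(f)
```

Pickling the pipeline rather than only the classifier keeps the fitted vocabulary and IDF weights together with the model, so raw text can be passed straight to `predict` at inference time.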
VI. Integration with AWS
Now comes the most important part of the project: deploying the model on AWS and configuring an AWS Lambda function to make real-time predictions. This is where we spent a lot of time.
First, what is AWS Lambda and why are we using it?
AWS Lambda is a serverless computing platform provided by Amazon. It lets us run code without having to worry about provisioning and managing servers and resources. You define the event triggers that should run the Lambda function and upload your code, and Lambda takes care of everything required to run and scale it. One advantage of Lambda is that you pay only for compute time: you are charged while your code is running and pay nothing when it is not.
Let’s get started with the process.
Set up a fresh EC2 instance
First, set up a fresh EC2 instance on which you install all the libraries that your code (the model) will use while running on Lambda. The reason for using a new EC2 instance is that you configure the Python environment from scratch and install only the required libraries.
Once that is done, zip the entire Python environment together with the code you will run on Lambda, download it to your local machine, and then upload it to an Amazon S3 bucket. The Lambda function will use this zip file for execution.
Whenever an event occurs that invokes the Lambda function, it uses the specified zip file for execution. Say I specified that sample.py, which is inside the zip file, should run on the event: the Lambda function will look for everything it needs (the Python environment and the libraries required to run the code) in that zip file, and if something is missing, execution of your code will fail.
Create a Lambda Function
Once you have the zip file ready in the S3 bucket, you can create a new Lambda function.
Click on Create Function, provide a meaningful name, choose Python 3.6 as the runtime, and choose the permissions that best suit your needs.
Go to your Lambda function. In the Function Code section you can either upload a zip file or specify the address of the zip stored in the S3 bucket, as in the screenshot.
FYI: in the screenshot, Function is the code file name (Function.py) and handler is the name of the method defined in that file which will run.
Test your code
You can test whether your code runs correctly. Click on the Test tab at the top right and configure a test event. You can pass the input your code needs and then see whether it runs.
As in the screenshot, we pass the description and run the code. If you are trying this with our code, you have to uncomment line 29, which reads the test description, and comment out line 30, which reads the input from the AWS queue that I will explain now.
AWS SQS (Simple Queue Service)
Create an SQS service by selecting SQS from the designer selection box. It is pretty straightforward. In SQS we configured two queues. The first passes the input to the Lambda function; we have an event trigger on this queue, which means that whenever a new message arrives in it, Lambda is triggered to run.
The second is the output queue, which shows the predicted value for the input message.
Select the input queue and, from Queue Actions, select ‘send a message’. Once you click the submit button, it triggers the Lambda function, and your code runs, makes the prediction and writes it to the output queue.
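The input-queue-to-output-queue flow can be sketched as follows. When SQS triggers a Lambda function, the event carries a `Records` list and each record's `body` holds the message text. Everything else here is an assumption made for illustration: the helper names `process_records` and `make_sqs_sender`, the JSON shape of the output message, and the queue URL you would pass in. The prediction and sending steps are injected as callables so the logic can be tried locally without AWS.

```python
import json


def process_records(event, predict, send):
    """Classify every SQS message in the event and forward each result."""
    results = []
    for record in event.get("Records", []):
        description = record["body"]  # message text from the input queue
        message = json.dumps(
            {"description": description, "category": predict(description)}
        )
        send(message)
        results.append(message)
    return results


def make_sqs_sender(queue_url):
    """Build a send() that writes to the output queue (queue_url is supplied by you)."""
    import boto3  # imported lazily so the local demo below runs without AWS

    sqs = boto3.client("sqs")

    def send(body):
        sqs.send_message(QueueUrl=queue_url, MessageBody=body)

    return send


# Local dry run with a stand-in predictor and an in-memory "output queue".
sent = []
event = {"Records": [{"body": "vpn is down"}, {"body": "printer offline"}]}
process_records(event, predict=lambda text: "network", send=sent.append)
print(sent)
```

Inside Lambda, `predict` would wrap the unpickled model and `send` would come from `make_sqs_sender` with the output queue's URL.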
As you can see in the screenshot, the output queue has 7 messages available and the input queue has 0, which means there is no new message in the input queue and your Lambda function is not running your code.
VII. User feedback to retrain the model
We created a simple AngularJS UI for collecting user feedback. The idea was to get the user’s confirmation of whether a ticket was classified correctly. If the ticket was put in the wrong category, the user can select the correct category from the drop-down and click Save, which saves a file to the S3 bucket. We use this file to retrain our model periodically.
For this, we set up another Lambda function, which you can schedule to run every day or every week depending on your requirements. It reads the model from the pickle file, retrains it, and updates the pickle file.
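One way the retraining step could work is sketched below. The feedback file format is not specified above, so this assumes a CSV with `description`, `predicted_category` and `correct_category` columns; the function names are hypothetical too. The sketch simply refits the model on the original training data plus the corrected examples and then overwrites the pickle.

```python
import csv
import io
import pickle


def read_feedback(csv_text):
    """Parse feedback rows into (description, corrected category) pairs."""
    rows = csv.DictReader(io.StringIO(csv_text))
    return [(row["description"], row["correct_category"]) for row in rows]


def retrain(model, base_texts, base_labels, feedback):
    """Refit the model on the original data plus the user-corrected tickets."""
    texts = list(base_texts) + [text for text, _ in feedback]
    labels = list(base_labels) + [label for _, label in feedback]
    model.fit(texts, labels)
    return model


def save_model(model, path):
    """Overwrite the pickle file with the retrained model."""
    with open(path, "wb") as f:
        pickle.dump(model, f)


# Example feedback content (made up): one ticket the user re-labelled.
feedback_csv = (
    "description,predicted_category,correct_category\n"
    "vpn is down,hardware,network\n"
)
feedback = read_feedback(feedback_csv)
print(feedback)  # [('vpn is down', 'network')]
```

In the scheduled Lambda, the feedback CSV and the pickle would be read from and written back to S3; that I/O is left out to keep the sketch focused on the retraining logic.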
AWS Lambda is a very good choice for scalable models because you don’t have to worry about provisioning and managing servers. It makes models easy to deploy, automatically scales resources to match demand, and charges you only while your code is running, which makes it very cost-effective.