
Introduction

Generating original content is the future of Artificial Intelligence. In this post, we document a method to generate lyrics in the style of six chosen rap artists. The goal is to outline a pipeline for anyone pursuing a similar project.

Note: Sample outputs are at the end of the article. There is also a Google Colab Notebook attached for people who would like to generate unique lyrics on their own.

We will try to cover the intuition behind each decision as clearly as possible. This post is reasonably beginner friendly, but we recommend a basic understanding of Recurrent Neural Networks (RNNs) and Long Short-Term Memory networks (LSTMs) before reading, to get a better sense of what is being discussed. For people short on time, a concise explanation is incorporated into the post later on. This approach follows Robbie Barrat’s Rapping Neural Network method.


Data Gathering

We started the process by gathering data for six rappers: A$AP Rocky, Eminem, J. Cole, Kanye West, Snoop Dogg, and Tyler, The Creator.
We could easily have written a small script to scrape the web for each artist’s lyrics; however, we preferred to keep the data entry manual so that we had better control over what went into the dataset.
Since a dataset of 50–60 songs per artist is quite small, any disparities in an artist’s style could affect the result quite drastically. Therefore, we manually copied and pasted all lyrics featuring each artist into a separate text file. Annotations such as [Intro] and [Verse 1: …] were removed from the data. The text files are placed in appropriately named sub-directories (this becomes important later).
We then wrote a quick Python script to merge all lyrics for each artist into one file before preprocessing the data.

Script to combine the lyric files in the current directory. Move this script to the desired directory before running.
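The original script is embedded as a Gist in the source post; a minimal sketch of the same idea might look like this (the output filename combined_lyrics.txt is an assumption):

import glob

# Merge every lyric file in the current directory into one corpus file.
# Assumes one .txt file per song, with one bar per line.
with open("combined_lyrics.txt", "w") as out:
    for path in sorted(glob.glob("*.txt")):
        if path == "combined_lyrics.txt":
            continue  # skip the output file itself on re-runs
        with open(path) as f:
            out.write(f.read().strip() + "\n")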

Data Preprocessing

First, we split the lyrics into bars to better capture how rhymes work. We treat each new line as a new bar (generally, all songs on Genius are structured this way).

To understand how to use the data, we needed to know how basic rhyming schemes work.
Intuitively, we chose the natural approach of considering the last word of each bar. Instead of installing the “behemoth” that is NLTK, we use pronouncing by Allison Parrish to generate rhyming words for the last word of each bar. We then append these to our rhymes_list.
Once we have this set, we use the last two letters of each word in the list to store a basic rhyme_scheme.
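As a rough sketch of this step (the function name rhyme_scheme is illustrative, not the original’s):

import pronouncing
from collections import Counter

def rhyme_scheme(line):
    # Crude rhyme key for a bar: the most common two-letter ending
    # among the words that rhyme with the bar's last word.
    word = line.strip().split()[-1].lower()
    rhymes = pronouncing.rhymes(word)
    if not rhymes:
        return word[-2:]  # fall back to the word's own ending
    endings = Counter(r[-2:] for r in rhymes)
    return endings.most_common(1)[0][0]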

The complete code can be found here

Building the Model

Taking each bar as an input, we count its syllables. Then, using rhyme_index, we represent each bar’s rhyming word as an integer.
This approach is quite similar to a count vectorizer.
We take the first two lines as input and the next two lines as the target, and train the network to reduce the mean-squared error between its prediction and the target.
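A hedged sketch of this featurization, reusing rhyme_scheme from above (the syllable fallback for words missing from the CMU dictionary is our own assumption):

import pronouncing

def syllables(line):
    # Count syllables in a bar via the CMU pronouncing dictionary,
    # falling back to a rough vowel count for unknown words.
    total = 0
    for word in line.split():
        phones = pronouncing.phones_for_word(word.lower())
        if phones:
            total += pronouncing.syllable_count(phones[0])
        else:
            total += max(1, sum(ch in "aeiouy" for ch in word.lower()))
    return total

def bar_vector(line, rhyme_keys):
    # Encode a bar as (rhyme index, syllable count); rhyme_keys is the
    # sorted list of distinct rhyme endings seen in the corpus.
    # Endings not seen in the corpus map to index 0.
    key = rhyme_scheme(line)
    index = rhyme_keys.index(key) if key in rhyme_keys else 0
    return [index, syllables(line)]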

Then, we use Markovify by Jeremy Singer-Vine, to create Markov chains from each line of the lyric data specific to the artist.

Note: We use markovify.NewlineText instead of markovify.Text to delineate sentences, since the data is arranged so that each new line is a new sentence in the Markov chain.
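For example, assuming the merged file from earlier:

import markovify

# NewlineText treats each line of the file as its own sentence,
# i.e. one bar per sentence.
with open("combined_lyrics.txt") as f:
    markov_model = markovify.NewlineText(f.read())

print(markov_model.make_sentence(tries=100))  # one candidate bar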

To predict the next line based on the current one, we use a Recurrent Neural Network (RNN), or more specifically, a Long Short-Term Memory network (LSTM).
For people unaware of what these are, here is a quick rundown.

Recurrent Neural Networks (RNN)

The most widespread representation of a neural network is a web-like figure containing multiple input neurons and one or more output neurons.

This network works on the idea that there are multiple input features, and the order of the input is insignificant. This approach works just fine for inputs which are non-consecutive; however, in our case, the order of the input is of the utmost importance. If we do not consider the order of the input, we will be training our model incorrectly, and we will end up with gibberish in our predictions.

RNNs address this problem by creating a network which takes activations from previous nodes as a parameter for the next one while maintaining a continuous flow of inputs.

The repeating module in a standard RNN contains a single layer. Source: http://colah.github.io/posts/2015-08-Understanding-LSTMs/

Long Short Term Memory Network (LSTM)

The reason we do not use a basic RNN for our implementation is that it lacks an essential feature we require. A plain RNN carries information forward only through a single, repeatedly overwritten hidden state, so over longer sentences it struggles to retain context and predict coherent text. Introducing LSTMs:

The repeating module in an LSTM contains four interacting layers. Source: http://colah.github.io/posts/2015-08-Understanding-LSTMs/

LSTMs solve this problem by carrying an explicit cell state through the sequence, with gates that decide what to keep and what to forget. An LSTM therefore remembers what happened at earlier steps, preserving context and making better predictions wherever context matters. LSTM networks generally do a better job of capturing long-term dependencies in sequential data.

However, in this approach, we do not use the LSTM to generate the lyrics themselves; the lines are generated by the Markov chains, which is part of the reason the output is sometimes somewhat incoherent.
We use the LSTM network to capture the structure of the song rather than the words themselves.

Creating the Network

We use Keras as the framework to implement an eight-layer LSTM network, optimised with the RMSprop optimiser.
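The exact architecture is in the linked code; a minimal sketch of an eight-layer LSTM compiled with RMSprop might look like this (layer widths and the (2, 2) input shape, i.e. two bars of (rhyme, syllables) pairs, are assumptions):

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM
from tensorflow.keras.optimizers import RMSprop

def build_model(depth=8):
    model = Sequential()
    # Input: two bars, each encoded as a (rhyme index, syllable count) pair
    model.add(LSTM(4, input_shape=(2, 2), return_sequences=True))
    for _ in range(depth - 2):
        model.add(LSTM(8, return_sequences=True))
    # Output one (rhyme, syllables) pair per input bar
    model.add(LSTM(2, return_sequences=True))
    model.compile(optimizer=RMSprop(), loss="mse")
    return model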

The complete code can be found here

Training the Model

While training the model, we use a batch size of two and run for five epochs.
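In code, with placeholder arrays standing in for the real bar-pair vectors (shapes follow the sketch above):

import numpy as np

# X: vectors for pairs of bars; y: vectors for the pairs that follow them.
X = np.zeros((100, 2, 2))
y = np.zeros((100, 2, 2))

model = build_model()
model.fit(X, y, batch_size=2, epochs=5)
model.save_weights("rap_lstm.h5")  # assumed filename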

The complete code can be found here

Generating the Lyrics

After training the network on the processed data, we proceed to generate the lyrics. To generate lyrics, we choose a random initial index from the range of the length of the input data, len(lyrics). We use this random initial_index to find random initial_lines to initiate the rap.
Using these initial_lines, we will generate an array of starting_input, which contains a tuple of the following form:

(desired rhyme, desired count of syllables)

Using this starting_input, we generate more vectors with the same formatting. The generated vectors are called rap_vectors. The rap_vectors are then converted into songs.
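A hedged sketch of this loop, reusing the pieces above; lyrics (the list of corpus bars) and rhyme_keys are assumed to come from preprocessing, and the candidate-selection strategy is our own simplification of the redacted code:

import random
import numpy as np

def closest_bar(target, candidates, rhyme_keys):
    # Keep the Markov candidate whose (rhyme, syllables) vector is
    # nearest the LSTM's target vector.
    return min(candidates,
               key=lambda bar: np.linalg.norm(
                   np.array(bar_vector(bar, rhyme_keys)) - target))

# Seed the rap with two random consecutive bars from the corpus.
initial_index = random.randint(0, len(lyrics) - 2)
initial_lines = lyrics[initial_index:initial_index + 2]
starting_input = [bar_vector(line, rhyme_keys) for line in initial_lines]

# Let the LSTM predict (rhyme, syllables) targets for the bars to come.
rap_vectors = []
vector = np.array([starting_input], dtype=float)
for _ in range(8):  # sixteen bars, two per prediction
    vector = model.predict(vector)
    rap_vectors.extend(vector[0])

# Convert each target vector into an actual bar via the Markov model.
song = []
for target in rap_vectors:
    candidates = [markov_model.make_sentence(tries=100) for _ in range(30)]
    candidates = [c for c in candidates if c]
    if candidates:
        song.append(closest_bar(np.array(target), candidates, rhyme_keys))

print("\n".join(song))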

A major chunk of the code is redacted from this Gist; complete code can be found here.

Generated Lyrics

Below you can find a few examples of bars generated by our implementation.

Note: The lyrics generated are explicit and completely uncensored. Please view at your own discretion.

Tyler, the Creator

You muthafuckas want war, then come get it for the night
I’m insecure and start to itch my throat
So, what’s going on, Wolf? Talk to me, I’m going full monty
This my Zombie Circus, I hope you die in a suit

All I needed was a fucking unicorn,
The only thing I regret
You Nigerian fuck, now I’m making plates, you just repeat
And leave it dripping green and finger filled with hate

If you got sucked in that Bugatti
Evident that I’m a fucking Nazi
We can kick it tonight
I don’t feel right
But in my passenger seat
Now she thinks that I don’t know, we’ll never fucking meet

Kanye West

Would you ride with Ne-Yo, if he say she wanna move South
She in her Hampton mouth
I am the day to night
Light like when people don’t hate then it won’t be right
You say you feel my heart,
But I can’t fight

Them other niggas have seen my story, my glory
Platinum to go, tell me that ain’t okay no more
With the power to you in the store
And give me Life, I’m out of school seems so secure

Eminem

So she’s been on the cable channel
Now I know it
They take a shit
I sit back with a headache and I sit
That wasn’t my beef
We touch, I feel amazin’, and I’m — 
Man, we’re losing him, he won’t have it, he wasn’t gonna go after him
She fucking hates me and call me

So she’s been on the floor stacked against the wall of shame
And they wonder why I rise and I love that name
Now you’re in the eyeball
Sometimes I wanna get one ’cause they feed me the signal
But in fact, I see you smile
Am I lucky to be outta your mind and outta control
It’s lust, it’s torturous, you must be gone off that water bottle
Why are you there? I love y’all too much to walk in the mail

J.Cole

She fine enough to cure blindness
But I’m in the sadness
Ay, dear Lord, can I last in this?
Your wish is my canvas

A$AP Rocky

F-F-Faded, drinking codeine and bricks for sale
And them college girls write a nigga “be you”, fuck tough be cool
Sell the whole hood look like I’m awake
I know your heart broke
You’re crying on my whores, they be fake
Told me I should die before I wake
My niggas, the only folk
I can share my soul to take

Snoop Dogg

Some niggas think they Vietnamese, we gonna get you to this
You know the G with the fitness
Let’s go get a witness?
What up motherfuckers, this is the premise
Big 808, now feel the breeze for s***
So what you wanna party with us
Checking out your stress
And the G’s and 30 ki’s!
So you can vanish
I’m still on the news
And last but not least my nigga 2pac rest in peace

To generate new unique lyrics, visit the Google Colab Notebook, go to Runtime in the menu bar, and hit Run All. Just wait a few seconds while the lyrics are generated and, voila, we have got ourselves an entirely new track from an artist of our choice.


I hope you got something out of this blog. If you found this post illuminating at all, consider 👏 clapping 👏 for this post and following for more upcoming content.
