Blog: Neural Style Transfer — Image style transfer with PyTorch
Neural style transfer takes the style of one image and the content of another and generates a hybrid image that combines the two.
I am writing this to further my own understanding, and I obtained most of the code from the PyTorch tutorials. The example below uses a photo of a muay thai fighter (Buakaw Banchamek) and a Picasso painting, and the goal is to transfer the Picasso style onto the fighter’s photo.
We start by:
- Importing libraries
- Loading images
We then download and initialize the pretrained VGG19 convolutional neural network. Passing pretrained = True loads the ImageNet weights, and calling .eval() puts the network in evaluation mode, since some layers behave differently during training than during evaluation. The pretrained weights themselves stay fixed because we will be optimizing the input image, not the network. Since I am running this on a GPU, I move the model to device, where device is ‘cuda’.
cnn = models.vgg19(pretrained=True).features.to(device).eval()
Next we define the content loss and style loss as classes. Both are mean squared error (MSE) losses. Note that they are not training losses used to adjust the network’s weights; they only measure how far the generated image is from the content and style targets.
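A minimal sketch of the two loss classes, following the PyTorch tutorial: each is a “transparent” layer whose forward pass returns its input unchanged while stashing the MSE against a fixed, detached target, so gradients later flow only into the input image.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ContentLoss(nn.Module):
    """Transparent layer: returns its input, records MSE vs. a fixed target."""
    def __init__(self, target):
        super().__init__()
        # detach so the target is a constant, not part of the graph
        self.target = target.detach()

    def forward(self, input):
        self.loss = F.mse_loss(input, self.target)
        return input

def gram_matrix(input):
    b, c, h, w = input.size()
    features = input.view(b * c, h * w)
    G = features @ features.t()
    # normalize by the number of elements in each feature map
    return G.div(b * c * h * w)

class StyleLoss(nn.Module):
    """Same idea, but compares Gram matrices of the feature maps."""
    def __init__(self, target_feature):
        super().__init__()
        self.target = gram_matrix(target_feature).detach()

    def forward(self, input):
        self.loss = F.mse_loss(gram_matrix(input), self.target)
        return input
```

The Gram matrix captures correlations between feature maps, which is what makes the style loss sensitive to texture rather than layout.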
We also create a small class that normalizes input images with the mean and standard deviation VGG was trained with.
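A sketch of that normalization module, using the standard ImageNet statistics that the pretrained VGG19 expects:

```python
import torch
import torch.nn as nn

# ImageNet statistics used to train VGG19
cnn_normalization_mean = torch.tensor([0.485, 0.456, 0.406])
cnn_normalization_std = torch.tensor([0.229, 0.224, 0.225])

class Normalization(nn.Module):
    def __init__(self, mean, std):
        super().__init__()
        # reshape to [C, 1, 1] so the stats broadcast over [B, C, H, W]
        self.mean = mean.view(-1, 1, 1)
        self.std = std.view(-1, 1, 1)

    def forward(self, img):
        return (img - self.mean) / self.std
```

Making it an nn.Module (rather than a plain function) lets it sit as the first layer of the Sequential model built in the next step.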
We need to add our content loss and style loss layers immediately after the convolution layers whose activations they measure. To do this, we create a new Sequential module with the content-loss and style-loss modules correctly inserted.
Finally, we must define a function that performs the neural transfer. On each iteration, the network is fed the updated input image and computes new losses (the sum of the style and content losses). We run the backward method of each loss module to dynamically compute its gradients. The optimizer requires a “closure” function, which reevaluates the model and returns the loss.
Limited-memory BFGS (L-BFGS or LM-BFGS) is an optimization algorithm in the family of quasi-Newton methods that approximates the Broyden–Fletcher–Goldfarb–Shanno (BFGS) algorithm using a limited amount of computer memory. It is a popular algorithm for parameter estimation in machine learning.
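To show the closure pattern in isolation, here is a minimal, self-contained example that uses L-BFGS to optimize an input image toward a fixed target. The plain MSE here is a stand-in for the real style + content loss sum, but the structure (optimizing the image rather than any weights, clamping inside the closure) matches the actual transfer loop:

```python
import torch
import torch.nn.functional as F

target = torch.rand(1, 3, 16, 16)                  # stand-in "target" image
input_img = torch.rand(1, 3, 16, 16).requires_grad_(True)

# note: the parameter being optimized is the image itself, not weights
optimizer = torch.optim.LBFGS([input_img])

for _ in range(10):
    def closure():
        with torch.no_grad():
            input_img.clamp_(0, 1)                 # keep pixel values valid
        optimizer.zero_grad()
        loss = F.mse_loss(input_img, target)
        loss.backward()                            # gradient lands on input_img
        return loss
    optimizer.step(closure)

final_loss = F.mse_loss(input_img, target).item()
```

L-BFGS calls the closure multiple times per step to reevaluate the loss, which is why it must be passed as a function rather than computed once.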
After all the functions have been defined, we run the style transfer model, passing it the CNN, the normalization statistics, and the content, style, and input images.
You can save this image by using torchvision: torchvision.utils.save_image(output, 'images/final_output_img.jpg')
Other pretrained models, such as ResNet, AlexNet, SqueezeNet, and DenseNet, could also be used for feature extraction and may give different results.
Fine-tuning the learning rate and increasing the number of optimization steps would further improve the style transfer.
I thought this topic was pretty interesting and hence decided to go through the PyTorch tutorials to understand the methodology further. I have also used countless MOOCs and online resources to understand image style transfer better. Please let me know what you guys think.