How to Colorize Black & White Photos with Just 100 Lines of Neural Network Code

Black and white photography has a timeless beauty, but there's something magical about seeing those photos in color. Colorizing old images has traditionally been done by Photoshop experts who spend hours upon hours researching colors and meticulously recoloring the photos pixel by pixel. But researchers have now developed deep learning algorithms that can automatically colorize black and white photos with impressive results.

In this post, we'll explore how to use convolutional neural networks to colorize black and white images, with a very simple implementation in Python using the Keras deep learning library. We'll walk through the entire process of building the model and training it on a dataset of color images. By the end, you'll have your own colorization tool that you can use on any grayscale image.

The key to the approach is to treat the colorization process as an image-to-image mapping problem. We want to build a model that takes in a black-and-white image as input and generates a colorized version as output. This is similar to other tasks like image segmentation or style transfer where neural networks learn to map one image to another.

Here are the basic steps:

  1. Prepare a dataset of color images for training
  2. Convert the images to Lab color space and extract the L channel as the black-and-white version
  3. Train a convolutional neural network to map the L channel to the corresponding ab color channels
  4. Use the trained model to predict the ab channels for a new grayscale input, then combine them with its L channel to obtain the final result (sketched in code just below)
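
To make the plan concrete, here is the whole pipeline condensed into one sketch. The colorize helper and its gray_rgb argument are my own names, and it assumes an already-trained Keras model and 256×256 inputs; each step is built for real in the sections below:

import numpy as np
from skimage.color import rgb2lab, lab2rgb

def colorize(model, gray_rgb):
    # gray_rgb: a grayscale photo loaded as an RGB array in [0, 1],
    # shape (256, 256, 3), sized to match the model (hypothetical input)
    L = rgb2lab(gray_rgb)[:, :, 0]                    # step 2: extract lightness
    ab = model.predict(L.reshape(1, 256, 256, 1))[0]  # step 3/4: predict color
    lab = np.zeros((256, 256, 3))
    lab[:, :, 0] = L
    lab[:, :, 1:] = ab * 128                          # undo the ab scaling
    return lab2rgb(lab)                               # step 4: back to RGB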

Let's go through each part in more detail. First we need to put together a dataset of color images for the model to learn from. The more images, and the more diverse their content and colors, the better the model can be expected to generalize. For this example, let's use a dataset of 10,000 images from the Unsplash dataset on FloydHub.

We'll download the images and load them in Python as RGB color arrays with the following code:

from keras.preprocessing.image import img_to_array, load_img
from skimage.color import rgb2lab, lab2rgb
from skimage.io import imsave
import numpy as np
import os

# Load every image in the folder as a 256x256 RGB array
X = []
for filename in os.listdir('Unsplash_Images'):
    X.append(img_to_array(load_img('Unsplash_Images/'+filename, target_size=(256, 256))))
X = np.array(X, dtype=float)

# Set up the training split: first 95% of images, normalized to [0, 1]
split = int(0.95*len(X))
Xtrain = 1.0/255 * X[:split]

This gives us a 4D array Xtrain with dimensions (number of images, height, width, 3) containing the RGB values normalized between 0 and 1.
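
As a quick sanity check, you can confirm the shape and value range before going further:

print(Xtrain.shape)                # (num_images, 256, 256, 3)
print(Xtrain.min(), Xtrain.max())  # values should fall within [0.0, 1.0]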

Next we need to convert the images to the Lab color space so we can separate out the grayscale and color information. The Lab color space has one channel L for lightness and two channels a and b for color. The L channel is basically a black-and-white version of the image. We'll use it as the input to our model, and train the model to predict the corresponding ab channels.

For the conversion itself we can use the rgb2lab and lab2rgb functions imported from skimage above. Under the hood they route through the XYZ color space; written out by hand, the equivalents look like this:

from skimage.color import rgb2xyz, xyz2lab, lab2xyz, xyz2rgb

# Named rgb_to_lab / lab_to_rgb to avoid shadowing the skimage imports
def rgb_to_lab(rgb):
    return xyz2lab(rgb2xyz(rgb))

def lab_to_rgb(lab):
    return xyz2rgb(lab2xyz(lab))

And here's how we apply the conversion and extract the L and ab channels:

# Convert RGB to Lab
lab = rgb2lab(Xtrain)
X = lab[:,:,:,0]            # extract the L channel
X = X.reshape(X.shape+(1,)) # reshape to (num_images, 256, 256, 1)

Y = lab[:,:,:,1:]  # extract the ab channels
Y = Y / 128        # ab spans roughly [-128, 127], so this scales to about [-1, 1]

We now have our input X and target Y ready for training. X has shape (num_images, 256, 256, 1) and Y has shape (num_images, 256, 256, 2).
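
Before training, it's worth sanity-checking the round trip: recombining one image's L and rescaled ab channels should reproduce the original colors. A minimal check using the arrays above:

# Rebuild the first training image from its L and ab channels
check = np.zeros((256, 256, 3))
check[:,:,0] = X[0][:,:,0]  # L channel
check[:,:,1:] = Y[0] * 128  # undo the ab scaling
imsave("roundtrip_check.png", lab2rgb(check))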

For the model architecture, we'll use a U-Net style convolutional neural network. This type of model has been successful in many image-to-image mapping problems. It consists of a contracting path that downsamples the input, followed by an expanding path that upsamples back to the output resolution, with skip connections between corresponding layers on the two paths to help the model combine high-level and low-level features.

Here's the code to build the model in Keras:

from keras.layers import Input, Conv2D, MaxPooling2D, UpSampling2D, concatenate
from keras.models import Model

input_shape = (256, 256, 1)

# Encoder
inputs = Input(shape=input_shape)
conv1 = Conv2D(64, (3, 3), activation='relu', padding='same')(inputs)
conv1 = Conv2D(64, (3, 3), activation='relu', padding='same')(conv1)
pool1 = MaxPooling2D(pool_size=(2, 2))(conv1)

conv2 = Conv2D(128, (3, 3), activation='relu', padding='same')(pool1)
conv2 = Conv2D(128, (3, 3), activation='relu', padding='same')(conv2)
pool2 = MaxPooling2D(pool_size=(2, 2))(conv2)

conv3 = Conv2D(256, (3, 3), activation='relu', padding='same')(pool2)
conv3 = Conv2D(256, (3, 3), activation='relu', padding='same')(conv3)
pool3 = MaxPooling2D(pool_size=(2, 2))(conv3)

# Bottleneck and decoder
conv4 = Conv2D(256, (3, 3), activation='relu', padding='same')(pool3)
conv4 = Conv2D(256, (3, 3), activation='relu', padding='same')(conv4)
up1   = concatenate([UpSampling2D(size=(2, 2))(conv4), conv3], axis=-1)

conv5 = Conv2D(128, (3, 3), activation='relu', padding='same')(up1)
conv5 = Conv2D(128, (3, 3), activation='relu', padding='same')(conv5)
up2   = concatenate([UpSampling2D(size=(2, 2))(conv5), conv2], axis=-1)

conv6 = Conv2D(64, (3, 3), activation='relu', padding='same')(up2)
conv6 = Conv2D(64, (3, 3), activation='relu', padding='same')(conv6)
up3   = concatenate([UpSampling2D(size=(2, 2))(conv6), conv1], axis=-1)

conv7 = Conv2D(32, (3, 3), activation='relu', padding='same')(up3)
conv7 = Conv2D(2, (3, 3), activation='tanh', padding='same')(conv7)

model = Model(inputs, conv7)

The encoder downsamples the input through three pooling stages to a 32×32 representation, and the decoder upsamples back to the original 256×256 resolution. We use relu activations in the intermediate layers and tanh in the final layer, since the ab targets were scaled to roughly [-1, 1].
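
To double-check those shapes, model.summary() is handy; the abridged output should show the 32×32 bottleneck and the two-channel output (exact layer names will vary between sessions):

model.summary()
# ...
# max_pooling2d_3 (MaxPooling2D)  (None, 32, 32, 256)
# ...
# conv2d_14 (Conv2D)              (None, 256, 256, 2)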

To train the model, we use the Adam optimizer and a mean squared error loss. Without augmentation we could simply call model.fit(X, Y, ...). To increase the diversity of the training data, though, we also want random rotations, shifts, shears, zooms, and flips, and those spatial transforms have to be applied identically to the input and its pixel-aligned target (datagen.flow(X, Y) would transform the inputs but leave the targets untouched). The easiest way to guarantee alignment is to augment the original RGB images and redo the Lab split on each augmented batch:

from keras.preprocessing.image import ImageDataGenerator

model.compile(optimizer='adam', loss='mse')

datagen = ImageDataGenerator(
        rotation_range=40,
        width_shift_range=0.2,
        height_shift_range=0.2,
        shear_range=0.2,
        zoom_range=0.2,
        horizontal_flip=True)

batch_size = 32
epochs = 100

# Augment the RGB images, then split each augmented batch into
# L input and ab target so the two stay spatially aligned
def image_a_b_gen(batch_size):
    for batch in datagen.flow(Xtrain, batch_size=batch_size):
        lab_batch = rgb2lab(batch)
        X_batch = lab_batch[:,:,:,0]
        X_batch = X_batch.reshape(X_batch.shape+(1,))
        Y_batch = lab_batch[:,:,:,1:] / 128
        yield (X_batch, Y_batch)

model.fit_generator(image_a_b_gen(batch_size),
                    steps_per_epoch=len(Xtrain)//batch_size, epochs=epochs)

After training for 100 epochs, let's test the model on some black and white images. We load the images, convert them to Lab color space, extract the L channel, and predict the ab channels using our trained model.

color_me = []
for filename in os.listdir('test_images'):
    color_me.append(img_to_array(load_img('test_images/'+filename, target_size=(256, 256))))
color_me = np.array(color_me, dtype=float)
color_me = rgb2lab(1.0/255*color_me)[:,:,:,0]  # normalize and keep only the L channel
color_me = color_me.reshape(color_me.shape+(1,))

output = model.predict(color_me)
output = output * 128  # undo the ab scaling

# Recombine each predicted ab pair with its L channel and convert back to RGB
for i in range(len(output)):
    cur = np.zeros((256, 256, 3))
    cur[:,:,0] = color_me[i][:,:,0]
    cur[:,:,1:] = output[i]
    imsave("result/img_"+str(i)+".png", lab2rgb(cur))

Looking at the results, the model generates believable colorizations: realistic blues in skies, greens in trees and grass, and even plausible skin tones on faces. The results aren't perfect; there are still some patchy artifacts and incorrect color assignments. But it's impressive that such a relatively simple model can automatically add color to black and white images.

There are many potential ways to improve these results further:

  • Use a larger and more diverse training dataset
  • Experiment with deeper or more sophisticated architectures such as ResNets
  • Add techniques like class rebalancing, global priors, or postprocessing, since predicting each pixel's color independently doesn't guarantee overall color consistency
  • Combine the model with a classifier or object detection model to incorporate semantic information, helping it assign more realistic colors to specific objects (sketched below)
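
As a rough sketch of that last idea (my own illustration, not part of this tutorial's code): run each image through a pretrained classifier, tile the resulting embedding across the 32×32 bottleneck, and let the decoder condition on it. Assuming a 1000-dimensional embedding and the pool3 tensor from the model above:

from keras.layers import RepeatVector, Reshape

# Hypothetical semantic-fusion branch; embed_input would hold a 1000-d
# feature vector from a pretrained classifier (an assumption)
embed_input = Input(shape=(1000,))
fusion = RepeatVector(32 * 32)(embed_input)     # copy the vector to every
fusion = Reshape((32, 32, 1000))(fusion)        # cell of the 32x32 grid
fusion = concatenate([pool3, fusion], axis=-1)  # merge with encoder features
fusion = Conv2D(256, (1, 1), activation='relu', padding='same')(fusion)

# The decoder would then start from fusion instead of pool3, and the model
# becomes Model([inputs, embed_input], conv7)

A similar fusion-layer trick appears in more advanced colorization networks, where the global embedding helps the decoder choose object-appropriate colors.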

I hope this post gave you a taste of the incredible power of deep learning and neural networks for computer vision problems like colorization. With a straightforward model and a bit of training data, we can already achieve remarkable results.

The field of automated colorization is rapidly advancing, with new techniques published regularly; if you're interested in going deeper, the recent research literature on the topic is well worth exploring.

At the end of the day, there's still no replacement for the human eye and artistic touch in bringing old photos fully back to life. But deep learning algorithms are getting closer and closer to that bar. It's an exciting area that combines the cutting edge of AI with the universal and timeless appeal of visual art and photography.

So find some old black and white photos, hack together a neural network, and see if you can add a bit of color back to a piece of history! The complete code for this tutorial is available on GitHub. Feel free to use it as a starting point for your own experiments.
