Colorizing the Prokudin-Gorskii photo collection
Sergei Mikhailovich Prokudin-Gorskii was a pioneer in the field of color photography. From 1909 to 1915, he traveled the Russian Empire, photographing each scene as a series of three black-and-white exposures taken through red, green, and blue filters (see the example below or to the right).
His ultimate goal was to create a projector capable of overlaying these black-and-white photographs into a color image. Unfortunately, he was forced out of the country by the Russian Revolution and was unable to achieve his goal during his lifetime. However, we now have all of the necessary technology to realize his life's work.
The rest of this page details how we can build an automated restoration process for these images.
Theoretically, we should just be able to stack the 3 greyscale images into an RGB image tensor and get a color image:
We see, however, that without any preprocessing the channels are not well aligned, which produces poor results. We need an automated way of aligning the channels.
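The naive stacking described above can be sketched as follows. This is only an illustration, not the code behind the results on this page: the function names are hypothetical, and the assumption that each scan holds the three exposures stacked vertically in blue, green, red order is mine rather than something stated here.

```python
import numpy as np

def split_plate(plate):
    """Split one scan into three equal-height channels.

    Assumption: the exposures are stacked top to bottom as blue, green, red.
    """
    h = plate.shape[0] // 3
    b = plate[:h]
    g = plate[h:2 * h]
    r = plate[2 * h:3 * h]
    return r, g, b

def stack_rgb(r, g, b):
    """Stack three greyscale channels into an (H, W, 3) RGB tensor, with no alignment."""
    return np.dstack([r, g, b])
```

Calling `stack_rgb(*split_plate(plate))` gives the unaligned color image, which is where the color fringing comes from.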
I started off with a window search of [-15, 15] pixels in each direction, starting from the top-left pixel of the image. I experimented with both maximizing the normalized cross-correlation (NCC) and minimizing the L2 distance. I found that the L2 distance generally yielded more reliable results. Here are the same images, aligned by minimizing the L2 distance over the window search:
The image of the port aligned properly, so we are on the right track with this method. However, the other images are poorly aligned. This happens because the messy edges around the image make the L2 distance very unreliable as a similarity metric. By cropping the edges of each channel before aligning them, we can make the alignment algorithm much more reliable:
The window search algorithm detailed above scales quadratically with the side length of the window. For high-resolution images, where the window must be larger in order to achieve the proper alignment, the brute-force algorithm is impractically slow.
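As a rough sketch of the brute-force search with edge cropping described above: the function names, the 10% crop fraction, and the use of `np.roll` (which wraps pixels around the border) are assumptions made for illustration, not details taken from this write-up.

```python
import numpy as np

def crop_interior(img, frac=0.1):
    """Drop a border of `frac` of each dimension so messy plate edges are ignored."""
    h, w = img.shape
    dh, dw = int(h * frac), int(w * frac)
    return img[dh:h - dh, dw:w - dw]

def l2_distance(a, b):
    """Sum of squared differences; lower means more similar."""
    return np.sum((a - b) ** 2)

def ncc(a, b):
    """Normalized cross-correlation of mean-centered images; higher means more similar."""
    a = a - a.mean()
    b = b - b.mean()
    return np.sum(a * b) / (np.linalg.norm(a) * np.linalg.norm(b))

def align_window(moving, reference, radius=15, frac=0.1):
    """Try every (dy, dx) in [-radius, radius]^2 and return the shift with minimal L2."""
    moving = moving.astype(float)
    ref = crop_interior(reference.astype(float), frac)
    best_score, best_shift = np.inf, (0, 0)
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            shifted = np.roll(moving, (dy, dx), axis=(0, 1))
            score = l2_distance(crop_interior(shifted, frac), ref)
            if score < best_score:
                best_score, best_shift = score, (dy, dx)
    return best_shift
```

The double loop evaluates (2 * radius + 1)^2 candidate shifts, which is the quadratic cost mentioned above. To use NCC instead, one would maximize `ncc` rather than minimize `l2_distance`.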
By using an image pyramid, we can make the number of candidate offsets we evaluate grow only logarithmically with the image size, even if the channels are egregiously misaligned. The algorithm works by repeatedly downsampling the image until it reaches a certain base resolution. We run the alignment on this low-resolution image, which gives us a coarse estimate of the offset. We can then unwind the stack of images one by one. At each step, we already know approximately what the offset is (the estimate from the coarser level, scaled up to the current resolution), so we only need to search a very small window.
In my case, I do a 20 x 20 grid search once the image has been downsampled to about 100 x 100 pixels. At each step back up the stack, I do a 3 x 3 grid search to refine the estimate until we arrive back at the original resolution.
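A minimal recursive sketch of this coarse-to-fine scheme follows. Several details are assumptions rather than the method used here: downsampling by a factor of 2 via 2 x 2 block averaging, a search radius of 10 standing in for the "20 x 20" grid, a radius of 1 for the 3 x 3 refinement, and a 10% interior crop. All names are hypothetical.

```python
import numpy as np

def downsample2(img):
    """Halve the resolution by averaging 2x2 blocks (a simple stand-in for a proper rescale)."""
    h, w = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2
    img = img[:h, :w]
    return img.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def search(moving, reference, center, radius, frac=0.1):
    """L2 search over shifts within `radius` of `center`, scored on the image interior."""
    h, w = reference.shape
    dh, dw = int(h * frac), int(w * frac)
    ref = reference[dh:h - dh, dw:w - dw]
    best_score, best = np.inf, center
    for dy in range(center[0] - radius, center[0] + radius + 1):
        for dx in range(center[1] - radius, center[1] + radius + 1):
            shifted = np.roll(moving, (dy, dx), axis=(0, 1))
            score = np.sum((shifted[dh:h - dh, dw:w - dw] - ref) ** 2)
            if score < best_score:
                best_score, best = score, (dy, dx)
    return best

def align_pyramid(moving, reference, base_size=100, coarse_radius=10, fine_radius=1):
    """Estimate the shift at the coarsest level, then refine it at each finer level."""
    if max(reference.shape) <= base_size:
        return search(moving, reference, (0, 0), coarse_radius)
    dy, dx = align_pyramid(downsample2(moving), downsample2(reference),
                           base_size, coarse_radius, fine_radius)
    # One pixel at the coarser level corresponds to two pixels at this level.
    return search(moving, reference, (2 * dy, 2 * dx), fine_radius)
```

Each level doubles the coarser estimate and checks only a 3 x 3 neighborhood around it, so the number of shifts tested per level stays constant while the number of levels grows with the logarithm of the image size.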
Aligning the images based on the L2 distance between the raw channel matrices often causes problems. Crucially, if the three channels differ greatly from one another (for example, an object that is bright in one channel and dark in another), the images may fail to align properly.
Another way to align the images is to apply an edge detector to each channel before calculating the L2 distance. I noticed that this generally aligned the images more sharply and had fewer instances of completely failing to align the images:
Red Channel Edges
Green Channel Edges
Blue Channel Edges
Aligned Image Tensor
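To illustrate the edge-based alignment, here is a small sketch. The text above does not say which edge detector was used, so the finite-difference gradient magnitude below is an assumption chosen for simplicity, and both function names are hypothetical.

```python
import numpy as np

def edge_magnitude(img):
    """Gradient magnitude from finite differences (a stand-in for any edge detector)."""
    gy, gx = np.gradient(img.astype(float))
    return np.hypot(gx, gy)

def best_shift(moving, reference, radius=15):
    """Brute-force L2 search over shifts in [-radius, radius]^2."""
    best_score, best = np.inf, (0, 0)
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            shifted = np.roll(moving, (dy, dx), axis=(0, 1))
            score = np.sum((shifted - reference) ** 2)
            if score < best_score:
                best_score, best = score, (dy, dx)
    return best
```

Passing `edge_magnitude(channel)` instead of the raw channel to the search makes the score depend on where intensity changes rather than on absolute brightness, so a uniform brightness offset between channels no longer affects the match.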
By rescaling the image so that the lowest-intensity pixel is 0 and the highest-intensity pixel is 1, we can increase the image's contrast and make it appear sharper.
Admittedly, the difference is not large, but the images with autocontrast do appear slightly sharper and brighter.
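The rescaling described above amounts to a min-max normalization, sketched below. Whether the minimum and maximum are taken over the whole image or per channel is not stated here; this sketch assumes the whole array, and the function name is hypothetical.

```python
import numpy as np

def autocontrast(img):
    """Linearly rescale so the darkest pixel maps to 0 and the brightest to 1."""
    img = img.astype(float)
    lo, hi = img.min(), img.max()
    if hi == lo:
        # A constant image has no contrast to stretch; avoid dividing by zero.
        return np.zeros_like(img)
    return (img - lo) / (hi - lo)
```

If the image already spans nearly the full intensity range, this changes very little, which is consistent with the modest difference noted above.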
Here are the given images, processed with my best stack: