Sunday, July 21, 2013

Fourier transforms

In this exercise, we worked with Fourier transforms and convolution of images using the built-in fast Fourier transform (FFT) functions in Scilab. Fourier transforming a signal gives us a representation of the distribution of frequencies present in that signal: the signal is decomposed into a set of sines and cosines whose amplitudes and frequencies are obtained from the discrete Fourier transform.

An image is basically a signal, since it is a collection of information stored in a series of pixels. In photography, for example, information (color, brightness, etc.) is transferred from the subject to the pixels of the camera: depending on the properties of each portion of the subject, the corresponding region of the camera's sensor stores a matching signal. After processing these signals, we are able to view the subject in a photograph, which would hopefully be as colorful (or vibrant or whatever) as the subject.

Anyway, so there. Scilab has the functions fft() and fft2() to perform a discrete Fourier transform, for one-dimensional and two-dimensional signals respectively. In our case, since we're dealing with images, which are 2D, we will be using fft2(). Conveniently, these same functions can also perform the inverse Fourier transform.
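Here's a minimal sketch of these steps in Scilab (assuming the SIVP toolbox for imread(), rgb2gray(), imshow(), and mat2gray(); "circle.bmp" is just a placeholder file name):

im = double(rgb2gray(imread("circle.bmp")));  // load the image as a matrix of gray levels
FIm = fft2(im);                               // forward 2D discrete Fourier transform
imshow(mat2gray(fftshift(abs(FIm))));         // view the modulus, shifted to the center
back = abs(fft2(FIm));                        // transforming twice brings the image back (inverted)
imshow(mat2gray(back));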


Figure 1. Transformation of two images to frequency space and back to the original.

Side note: It took me so long to finish this exercise because I couldn't figure out the problem in my Scilab code. Actually, there were no errors. It's just that I didn't use the function mat2gray(), which converts a matrix (containing the values from the Fourier-transformed image) into a grayscale image. Meryl told me to add this function to create the desired image of the Fourier transform, and likewise for the other images. Without her suggestion, I wouldn't have been able to do the entire exercise. Seriously hahaha.


Figure 2. Conversion of the matrix to an image using mat2gray()


Going back to Figure 1, after performing the Fourier transform, we still need to use the function fftshift(). As seen in the second column of Figure 1, the white regions, which contain the information from the Fourier transform, are distributed among the four corners of the image; fftshift() moves these regions to the center. As seen in the Fourier transform of the circular aperture, the result is an Airy pattern. Also, applying the forward transform twice to the image containing the letter "A" returns an inverted copy of the original image, since transforming f(x, y) twice yields f(-x, -y).

Simulation of an imaging device

Due to the finite sizes of the optical components of an imaging device, a photograph can differ noticeably from the subject: the recorded image is the convolution of the subject with the transfer function of the imaging device. A lens with a large diameter will produce sharper images than one with a small diameter, since it admits more of the rays coming from the subject, and therefore a wider range of spatial-frequency information.

By the convolution theorem, the Fourier transform of a convolution of two functions is the product of their individual Fourier transforms. So we can obtain the convolution by multiplying the Fourier transform of the subject with the Fourier transform of the imaging device's aperture, and then performing the inverse Fourier transform on the product.

In an optical system, lenses automatically perform a Fourier transform on an image [1]. We can think of the aperture as already sitting in the Fourier plane, so we simply multiply it with the Fourier transform of the image.
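As a sketch, the simulation can go like this ("vip.bmp" and "circle.bmp" are placeholder file names for the object and the aperture; both images must have the same dimensions):

obj = double(rgb2gray(imread("vip.bmp")));     // the object, e.g. the letters "VIP"
ap = double(rgb2gray(imread("circle.bmp")));   // circular aperture of a given diameter
Fobj = fft2(obj);                              // Fourier transform of the object
apShift = fftshift(ap);                        // the aperture is already in the Fourier plane,
                                               // so it is only shifted, not transformed
img = abs(fft2(Fobj .* apShift));              // back to image space; modulus for display
imshow(mat2gray(img));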


Figure 3. Simulation of an imaging device of different aperture diameters.

As expected, the smaller apertures produced blurrier images than the larger ones.

Template matching

We can also perform template matching using correlation. Similar to the previous part, we work with the Fourier transforms of two images. Here, we use an image containing the text:
THE RAIN IN SPAIN STAYS MAINLY IN THE PLAIN
and a template image, of the same size as the first, containing only the letter "A", which is the pattern we are looking for. Using correlation, we can determine the locations of the letter "A" in the first image. The two images use the same font size and style.

However, instead of multiplying the two Fourier transforms right away, we first take the complex conjugate (using the function conj()) of the Fourier transform of the template image. This is because correlation is the inverse transform of the product of one transform with the complex conjugate of the other.
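In Scilab, the whole correlation is a short sketch ("text.bmp" and "a_template.bmp" are placeholder file names):

txt = double(rgb2gray(imread("text.bmp")));        // the sentence image
tpl = double(rgb2gray(imread("a_template.bmp")));  // the letter "A", same dimensions
corr = abs(fft2(conj(fft2(tpl)) .* fft2(txt)));    // correlation theorem: conjugate, multiply, transform
corr = mat2gray(corr);                             // normalize for viewing and thresholding
imshow(corr);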

Figure 4. Text image and template image

Using the described method, I got this image:

Figure 5. Correlated image

By correlating the two images, peaks (the bright spots) occur at the locations where an A is found in the text image. Correlation effectively smears the template A over every letter of the text image, which causes the overall blur in the correlated image. At the locations where the template lands exactly on an identical letter, bright spots form, indicating strong correlation.

Performing thresholding on Figure 5, I was able to locate the bright spots:

Figure 6. Thresholding of the correlated image

This was produced using the function SegmentByThreshold(), with the threshold set to 0.8. These bright spots are the locations of the letter A in the text image. 
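For reference, the thresholding step is just this (corr being the normalized correlation image from the sketch above; 0.8 was chosen by trial and error):

peaks = SegmentByThreshold(corr, 0.8);  // IPD toolbox: binary image, 1 where corr exceeds 0.8
imshow(peaks);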

Edge Detection

Edge detection can also be performed by convolving an image with a 3x3 matrix of a specific pattern, where the elements of each matrix sum to zero. A sketch of how this is done follows. In the figure below, each row shows the matrix, a 128x128-pixel image with that matrix at its center, and the result of convolving it with the original image (the "VIP" image from Figure 3).
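A sketch for the first (horizontal) pattern, again using multiplication in frequency space ("vip.bmp" is a placeholder file name):

obj = double(rgb2gray(imread("vip.bmp")));  // 128x128 image of the word "VIP"
pat = [-1 -1 -1; 2 2 2; -1 -1 -1];          // horizontal-edge pattern; elements sum to zero
ker = zeros(128, 128);
ker(64:66, 64:66) = pat;                    // place the 3x3 pattern at the center
edges = abs(fft2(fft2(ker) .* fft2(obj)));  // convolution via the convolution theorem
imshow(mat2gray(edges));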

Figure 7. Convolution of the original image with each 3x3 matrix, performed by multiplying their Fourier transforms.

The first matrix detected the horizontal edges of the image containing the word "VIP", while the second detected the vertical edges. The diagonal strokes were also picked up by the second matrix, since they are composed of short vertical segments. The third matrix detected both the vertical and horizontal edges of the original image.

For this activity, I would give myself a grade of 10 since I was able to perform the activity correctly.

Reference
[1] M. Soriano. Activity 7 - Fourier transform model of image transformation. 2013.


Thursday, July 4, 2013

Revealing dark secrets using Histogram manipulation

Or not.

In this activity, I tried to manipulate the histogram of a dark image to enhance the contrast. This allows the details in the dark regions of the image to appear.

Wait a second, what exactly is a histogram?

I mentioned the term previously but I haven't discussed it yet. My bad.

A histogram is a representation of how often each value occurs in a set of data. In image processing, we are concerned with the brightness histogram of an image: each pixel has a specific brightness, so the histogram displays the frequency of each brightness level (from 0 to 255 for an 8-bit image). In brightness histograms, we do not care about where in the image those brightness values occur. Since the histogram displays the frequency of each value, it can also be viewed, once normalized, as a probability distribution. Hereafter, histogram and/or probability distribution function (PDF) will be used in place of brightness histogram, for simplicity.

For the activity, we will be dealing with grayscale images. We expect that for a dark image, the histogram will be skewed towards the far left, near the zero value of intensity.
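Here's a sketch of how the normalized histogram can be computed in Scilab (SIVP toolbox assumed; "court.jpg" is a placeholder file name):

im = double(rgb2gray(imread("court.jpg")));  // gray levels from 0 to 255
counts = zeros(1, 256);
for v = 0:255
    counts(v + 1) = length(find(im == v));   // number of pixels at each gray level
end
pdf = counts / sum(counts);                  // divide by the total pixel count
plot(0:255, pdf);                            // the normalized histogram / PDF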

So before we proceed, here's a summary of the activity:


Figure 1. Flowchart

To be able to understand this, I'll give an example.

In this picture (I can't remember who took this photo; I just found it somewhere in my astrophoto folders), the small amount of light makes it difficult to see details at the bottom part. Well, the CS basketball court would be the best place to set up a telescope because there are no lamps nearby (except those around NIP) and it's safe. So if you want to observe the night sky with a telescope, this is one of the nicest places in UP for observation. :D

In any case, at the lower part of the image, that's me preparing the telescope, with some bags nearby. But since it's dark, they're difficult to notice.

Figure 2. Image to be manipulated

Converting this to grayscale with Scilab:

Figure 3. Grayscale version of Figure 2

Why do we need to convert it? This makes things simpler, as we only need to manipulate the brightness of the image. Ma'am Soriano told us that we can split the colored image into red, green, and blue channels and treat each channel as a grayscale image. With these grayscale channels, the brightness of all three colors can be manipulated individually. How? I don't know yet; Ma'am Soriano has yet to teach that method in their Applied Physics 187 class. TOO BAD, I don't have Applied Physics 187.

Here's the normalized histogram/PDF of Figure 3:
Figure 4. Normalized histogram of Figure 3. The frequencies of the intensities are divided by the total number of pixels in the image to normalize the histogram.

As you can see, the frequencies are high only at the 0-50 intensity levels. The image really is dark. To enhance it, we introduce the concept of the cumulative distribution function (CDF). Simply put, it is obtained by summing the normalized frequencies of all values up to each intensity level. The CDF of Figure 4 is:
Figure 5. Cumulative distribution function of the histogram in Figure 4.

We need a second CDF: the one we want the image to have. First, we use a linear CDF as our target. For each brightness level in the image, there is a CDF value in Figure 5. We find that same CDF value on the target CDF and take the corresponding brightness level as the new pixel value.
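Here's a sketch of this backprojection for a linear target CDF, G(z) = z/255, reusing im and pdf from the earlier sketch:

cdf = cumsum(pdf);                            // CDF of the dark image
im2 = zeros(im);                              // output image of the same size
for v = 0:255
    idx = find(im == v);                      // all pixels at the original level v
    if length(idx) > 0 then
        im2(idx) = round(255 * cdf(v + 1));   // invert the linear target: z_new = 255*CDF(v)
    end
end
imshow(mat2gray(im2));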


Figure 6. Obtaining the new pixel value using the original CDF (left) and the desired CDF (right)

If it's hard to process the description above, maybe Figure 6 can help. It shows that all pixels in the grayscale image with a brightness level of 30 will now have an intensity of 210. The final image, after applying this transformation, is:

Figure 7. Enhanced image using a linear CDF

Works like magic. Haha. There, you can see me clearly and also my friend's bag. In addition, the letters from the word "NATIONAL" became more visible. You can also see the three-point "line" and the difference in the type/color of paint between the parts of the basketball court.

Here are the PDFs and CDFs of the grayscale and the generated image:
Figure 8. Histogram and CDF of the image before (top) and after (bottom)

After linearization of the CDF, the histogram spreads more widely across 0 to 255. Also, the plot at the lower right of Figure 8 shows the CDF of the new image, confirming that the CDF did indeed become linear.

I also tried manipulating the image using Gaussian target distributions with varying standard deviations. Here are the generated images, each followed by the PDFs and CDFs.
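The only change from the linear case is the target CDF, which is now the cumulative sum of a Gaussian centered at 255/2 (a sketch, reusing im and cdf from above):

z = 0:255;
sigma = 20;                                  // or 40 for the wider distribution
g = exp(-((z - 255/2).^2) / (2 * sigma^2));  // Gaussian PDF, centered at 255/2
G = cumsum(g) / sum(g);                      // Gaussian target CDF
im3 = zeros(im);
for v = 0:255
    idx = find(im == v);
    if length(idx) > 0 then
        [m, k] = min(abs(G - cdf(v + 1)));   // level where the target CDF matches CDF(v)
        im3(idx) = k - 1;                    // k is 1-based; gray levels start at 0
    end
end
imshow(mat2gray(im3));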


Figure 9. Enhanced grayscale image. A standard deviation of 20 brightness levels was used for the Gaussian distribution, centered at 255/2.

Figure 10. Histogram and CDF of the image before (top) and after (bottom)

Figure 11. Enhanced grayscale image. A standard deviation of 40 brightness levels was used for the Gaussian distribution, centered at 255/2.

Figure 12. Histogram and CDF of the image before (top) and after (bottom)

You can see that the transitions between different objects in Figure 11 are more pronounced. This is because the brightness levels near 0 and 255 occur more frequently, as seen in the lower-left plot of Figure 12, compared to that of Figure 10; in Figure 9, the brightness levels cluster around the mean because the distribution is narrow (standard deviation of 20 levels). Going back to the original idea, what we wanted was a histogram with values scattered from 0 to 255, right? We couldn't see details in Figure 3 because its histogram was skewed towards zero. Even though the image became brighter in Figure 9, the distinctions between adjacent objects are not clear. With Figure 11, we obtained a relatively better image.


Let's use another image!

Figure 13. New image to be manipulated.


I took this image last February at the PAGASA Observatory in UP Diliman. The kids in the photo were on a field trip, waiting for the sun to set. That night, they were able to spot Jupiter using PAGASA's 45-cm telescope. And here is its grayscale version:

Figure 14. Grayscale version of Figure 13

Applying the same linear CDF, I obtained this image, followed by the original and resulting histograms and CDFs:

Figure 15. Resulting image after using a linear CDF


Figure 16. Histogram and CDF of the image before (top) and after (bottom)

Surprisingly, the enhanced image showed the shadow of the kids produced by the setting sun. The details of the plants, bushes, and trees appeared, as well as those on the building at the left (National Institute of Molecular Biology Building) and at the right (Kamia Residence Hall, I guess?). 

It really worked like magic haha. 

I also tried reproducing the enhancement with GIMP, using the Curves tool under the Colors menu:

Figure 17. Enhancing the image in Figure 14 using GIMP.

Figure 18. Resulting image from GIMP

Figure 19. Histogram of Figure 18

With GIMP, I was able to spread the histogram over all the brightness levels, which is roughly similar to the lower-left ("after") plot in Figure 16.

Out of 10, I would give myself a grade of 10 since I performed all the tasks well. Without Alix, I could not have done this activity, haha. At first, I manipulated the image pixel by pixel using a loop within a loop. It took an eternity to finish running, and produced nothing. Then she approached me and gave an overview of how she performed the manipulation. She taught me how to use the find() function properly. What I had in mind at first was to look for the pixel coordinates with certain brightness levels and collect their indices one by one. She told me that I can instead call the indices directly, by putting the find() call inside the matrix index:
imgray2(find(imgray1==0)) = 255;
This line, for example, uses two matrices of equal dimensions. The first one, imgray1, contains the brightness levels of all the pixels of an image; imgray2 starts as a copy of it. With find(imgray1==0), the indices of all black pixels are collected. Putting that inside imgray2() selects those same positions in the second matrix, and we assign them a value of 255. Creating an image from imgray2, we will see that it is the same as the original image, except that the black pixels have become white.
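A fuller, hypothetical version of the same trick ("photo.jpg" is a placeholder file name):

imgray1 = double(rgb2gray(imread("photo.jpg")));  // brightness levels of the image
imgray2 = imgray1;                                // start from a copy
imgray2(find(imgray1 == 0)) = 255;                // every black pixel becomes white
imshow(mat2gray(imgray2));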

So there you have it: manipulation of images using histograms. Hope you enjoyed it. :D

References:
[1] M. Soriano. Applied Physics 186 activity manual - Enhancement by histogram manipulation.