Signal Processing with NumPy II - Image Fourier Transform : FFT & DFT

bogotobogo.com site search:

Functions and values of NumPy FFT

The DFT (Discrete Fourier Transform) is defined as $$A_k = \sum_{m=0}^{n-1} a_m exp\{ -2\pi i \frac{mk}{n} \} \quad\quad k=0,...,n-1.$$ The DFT is in general defined for complex inputs and outputs, and a single-frequency component at linear frequency f is represented by a complex exponential $a_m = exp\{2\pi i f m \Delta t \}$, where $\Delta t$ is the interval for sampling.

Here are the NumPy's fft functions and the values in the result:

$A=fft(a,n)$
$A[0]$ contains the zero-frequency term which is the mean of the signal. It is always purely real for real inputs.
$A[1:n/2]$ contains the positive-frequency terms.
$A[n/2+1:]$ contains the negative-frequency terms in the order of decreasing negative frequency $A[n/2]$ represents both positive and negative Nyquist frequency for an even number of input points. It is also purely real for real input.
$A[(n-1)/2]$ contains the largest positive frequency for an odd number of input points.
$A[(n+1)/2]$ contains the largest negative frequency.
$fftfreq(n)$
returns an array giving the frequencies of corresponding elements in the output.
$fftshift(A)$
shifts transforms and their frequencies to put the zero-frequency components in the middle.
$ifftshift(A)$
undoes the shift done by $fftshift(A)$.

When $a$ is a time-domain signal in $A = fft(a)$

$abs(A)$ : its amplitude spectrum
$\{ abs(A) \}^2$ : its power spectrum
$angle(A)$ : its phase spectrum

bogotobogo.com site search:

Image FFT

import cv2
import numpy as np
from matplotlib import pyplot as plt

img = cv2.imread('xfiles.jpg',0)
f = np.fft.fft2(img)
fshift = np.fft.fftshift(f)
magnitude_spectrum = 20*np.log(np.abs(fshift))

plt.subplot(121),plt.imshow(img, cmap = 'gray')
plt.title('Input Image'), plt.xticks([]), plt.yticks([])
plt.subplot(122),plt.imshow(magnitude_spectrum, cmap = 'gray')
plt.title('Magnitude Spectrum'), plt.xticks([]), plt.yticks([])
plt.show()

The '2' in $fft2()$ indicates that we're using 2-D fft. The $fft2()$ provides us the frequency transform which will be a complex array. Its first argument is the input image, which is grayscale. Second argument is optional which decides the size of output array.

If we use interactive shell, the $f$ looks like this:

>>> f.shape
(194, 259)

We can see $fft2()$ returns a complex array:

>>> f
array([[ 842784.00000000    +0.j        , -517752.56442341 -3696.20384447j,
          91285.56071617 +4921.45028062j, ...,
         -46601.18635079 -4814.84303878j,   91285.56071617 -4921.45028062j,
        -517752.56442341 +3696.20384447j],
       [ 123232.14686474+22090.02037498j,    -856.26680325 +9182.45344817j,
        -136191.25175358-51466.45725155j, ...,
          71425.20605824+14903.15993681j, -158452.77640809-31181.5124568j ,
          -3060.27297752+19352.80159523j],
       [ -36743.19890402-14189.86326576j,   28037.83427034+45830.48236385j,
          -5472.34460100  +751.10794956j, ...,
         -17431.81487318-56998.95644394j,  -27977.32223743-17396.44346126j,
          55230.60393901+23180.07097916j],
       ..., 
       [-107889.78457811+41341.29205987j,   15392.03432529 +8349.74473343j,
          61145.55093605-59049.02715308j, ...,
          40527.83113022+19449.36030634j,   73913.64630259-52167.06835863j,
          -7173.60813531 +8395.54640485j],
       [ -36743.19890402+14189.86326576j,   55230.60393901-23180.07097916j,
         -27977.32223743+17396.44346126j, ...,
         -29773.75673263+47038.82271301j,   -5472.34460100  -751.10794956j,
          28037.83427034-45830.48236385j],
       [ 123232.14686474-22090.02037498j,   -3060.27297752-19352.80159523j,
        -158452.77640809+31181.5124568j , ...,
          58992.94700240 +3233.35145839j, -136191.25175358+51466.45725155j,
           -856.26680325 -9182.45344817j]])

We did $ffshift()$ because we want to place the zero-frequency component to the center of the spectrum. In other words, once we got the result, zero frequency component (DC component) will be at top left corner. Because we want to bring it to center, we needed to shift the result by $\frac{n}{2}$ in both the directions using $np.fft.fftshift()$.

Once we found the frequency transform, we can find the magnitude spectrum:

magnitude_spectrum = 20*np.log(np.abs(fshift))

Now we can see more whiter region at the center showing we have more low frequency content.

If we undo the shift using $ffshift()$:

f = np.fft.fft2(img)
fshift = np.fft.fftshift(f)
fshift = np.fft.ifftshift(fshift)

The picture will look like this:

More Samples

Here are more fft samples:

DFT and inverse DFT

We found the frequency transform for an image in the previous section. So, now we can do some operations in frequency domain, like high pass filtering (HPF) and reconstruct the image using inverse DFT.

We can simply remove the low frequencies by masking with a rectangular window of size 60x60. Then, apply the inverse shift using $ifftshift()$ so that DC component again come at the top-left corner. Then, find inverse FFT using $ifft2()$ function. The result, again, will be a complex number. We can take its absolute value.

The DFT (Discrete Fourier Transform) is defined as: $$A_k = \sum_{m=0}^{n-1} a_m exp\{ -2\pi i \frac{mk}{n} \} \quad\quad k=0,...,n-1.$$ The inverse DFT is defined as this: $$A_m = \frac{1}{n} \sum_{k=0}^{n-1} A_k exp\{ 2\pi i \frac{mk}{n} \} \quad\quad m=0,...,n-1.$$

Code

import cv2
import numpy as np
from matplotlib import pyplot as plt

img = cv2.imread('xfiles.jpg',0)

# fft to convert the image to freq domain 
f = np.fft.fft2(img)

# shift the center
fshift = np.fft.fftshift(f)

rows, cols = img.shape
crow,ccol = rows/2 , cols/2

# remove the low frequencies by masking with a rectangular window of size 60x60
# High Pass Filter (HPF)
fshift[crow-30:crow+30, ccol-30:ccol+30] = 0

# shift back (we shifted the center before)
f_ishift = np.fft.ifftshift(fshift)

# inverse fft to get the image back 
img_back = np.fft.ifft2(f_ishift)

img_back = np.abs(img_back)

plt.subplot(131),plt.imshow(img, cmap = 'gray')
plt.title('Input Image'), plt.xticks([]), plt.yticks([])
plt.subplot(132),plt.imshow(img_back, cmap = 'gray')
plt.title('Image after HPF'), plt.xticks([]), plt.yticks([])
plt.subplot(133),plt.imshow(img_back)
plt.title('Fianl Result'), plt.xticks([]), plt.yticks([])

plt.show()

The result shows High Pass Filtering (HPF) is an edge detection operation. This also shows that most of the image data is present in the Low frequency region of the spectrum.

If we watch the result more closely, especially the last image, we can see some artifacts which shows some ripple like structures there, and it is called ringing effects. It is caused by the rectangular window we used for masking. This mask is converted to sine shape which causes this problem. So rectangular windows is not used for filtering. Better option is Gaussian Windows.

Now that we know how to find DFT, IDFT etc. in Numpy. So, let's do it with OpenCV but in next chapter.

References