Sunday, June 19, 2022
%matplotlib inline
import numpy as np
import matplotlib.pyplot as plt
from skimage.io import imread
import scipy.ndimage
Background
"Convolution" refers to a process similar to "mixing" two functions. It can be performed in one dimension (e.g. on audio data), or many dimensions. In this problem, we'll look at 2D convolution of images and what we can do with it.
Let's first read an image. Once again, to make things easier, we'll separate the channels. We can work on all three channels separately or at once but it's easier to work with one channel only.
original_image = imread("https://upload.wikimedia.org/wikipedia/commons/d/d9/Norwegian_Forest_Cat_Portrait.JPG")

def display(image, title=None):
    # If there is only one channel to show, display it as grayscale
    cm = None
    if len(image.shape) == 2:
        cm = "gray"
    plt.figure(figsize=(5, 10))
    plt.imshow(image, cmap=cm)
    plt.xticks([])
    plt.yticks([])
    if title:
        plt.title(title)
    plt.show()

display(original_image)
r = original_image[:, :, 0]
display(r)


Convolution means taking a special square matrix (usually 3x3 or 5x5), called a convolution kernel, and applying it to the image like this: the pixel under the kernel's centre in the resulting image is the sum of element-wise products between the kernel and the image region it covers:
After that, the kernel moves 1px to the right and continues. It "slides" across the entire image. The edge pixels are a bit problematic, but there are several ways to deal with that. The most common way is to copy whatever pixel value happens to be at the border.
The algorithm is always the same. The output depends on the kernel. Different kernels produce different results: some detect edges (lines), others detect corners; some apply blurring and sharpening; some remove noise, etc.
The results can be useful for analyzing what's on the image, or just for artistic purposes.
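To make the sliding-window idea concrete, here is a minimal sketch (an addition to the text, with made-up patch and kernel values) that computes a single output pixel by hand:
# Illustrative only: one output pixel is the sum of element-wise products.
# Note that true convolution also flips the kernel; for the symmetric kernels
# used in this notebook, flipping makes no difference.
patch = np.array([
    [10, 20, 30],
    [40, 50, 60],
    [70, 80, 90],
])
kernel_3x3 = np.array([
    [1, 0, -1],
    [1, 0, -1],
    [1, 0, -1],
])
centre_pixel = np.sum(patch * kernel_3x3)
print(centre_pixel)  # -60 for these made-up values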
Let's examine this filter, for example:
$$ F = \begin{bmatrix} 1/9 & 1/9 & 1/9 \\ 1/9 & 1/9 & 1/9 \\ 1/9 & 1/9 & 1/9 \end{bmatrix} $$
This calculates the average of all surrounding pixels and basically smooths the image.
Note that in order to preserve brightness, the sum of all elements in $F$ must be equal to 1. If it's not, the image will be darker or brighter (which may or may not be a desired effect).
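As a quick sanity check (an extra illustration, not part of the original text), we can build that 3x3 averaging kernel and verify that its elements sum to 1:
# The 3x3 averaging kernel from the formula above; its elements sum to 1,
# so the overall brightness is preserved.
F = np.ones((3, 3)) / 9
print(F.sum())  # 1.0 (up to floating-point rounding)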
scipy.ndimage has a method for performing 1D and multi-dimensional convolution. Read the docs here.
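For reference, scipy.ndimage.convolve exposes a mode parameter that controls the border handling mentioned above: mode="nearest" repeats the edge pixel outwards, while the default is "reflect". A small sketch, added here for illustration:
# Border handling: "nearest" copies the edge pixel outwards,
# "constant" pads with cval, and the default "reflect" mirrors the image.
example_kernel = np.ones((3, 3)) / 9
blurred_nearest = scipy.ndimage.convolve(r, example_kernel, mode="nearest")
blurred_constant = scipy.ndimage.convolve(r, example_kernel, mode="constant", cval=0)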
Tasks
Apply the convolution. To see better how it performs, you can plot only a part of the image - this effectively zooms in. Compare the "before" and "after" images.
size = 5
kernel = np.ones(shape=(size, size)) / size ** 2
result = scipy.ndimage.convolve(r, kernel)
display(r[300:500, 300:500], "Original")
display(result[300:500, 300:500], "Blurred")


Let us explore the horizontal-line and vertical-line kernels.
hkernel = [
    [1, 0, -1],
    [1, 0, -1],
    [1, 0, -1],
]
vkernel = [
    [1, 1, 1],
    [0, 0, 0],
    [-1, -1, -1],
]
hresult = scipy.ndimage.convolve(r, hkernel)
vresult = scipy.ndimage.convolve(r, vkernel)
display(hresult, "Horizontal Lines Kernel")
display(vresult, "Vertical Lines Kernel")


Play around with more kernels (they're also called filters). You can find examples on the Internet, or you can create your own. Have fun :).
Try these filters:
- Gaussian blur
- Sobel edge detector - vertical, horizontal
- Corner detector
- Gradient detector
- Sharpening
- Unsharp mask
For each filter, show the result before and after its application.
Sources: Wikipedia (https://en.wikipedia.org/wiki/Kernel_(image_processing)), online playground, Image Kernels explained visually.
Optional: Think about how you might use edge, corner, and gradient detectors in image processing.
1. Gaussian Blur
The Gaussian smoothing operator is a 2D convolution operator that is used to blur images and remove detail and noise. In this sense it is similar to the mean filter, but it uses a different kernel that represents the shape of a Gaussian ("bell-shaped") hump. From an engineering perspective, applying a Gaussian blur amounts to applying a low-pass filter that removes high frequencies. However, it is worth noting that Gaussian low-pass filters are not very popular in one-dimensional signal processing; they are used mostly in 2D settings such as image processing, because they visually simulate optical blur quite well. Also, all elements of the Gaussian blur kernel are non-negative, ensuring that it always produces a valid image (the pixels stay between 0 and 255). (Source)
To calculate the values of its elements, the Gaussian kernel uses the values of the 2D Gaussian distribution:
$$ G(x, y)=\frac{1}{2\pi \sigma ^{2}}e^{-\frac{x^{2}+y^{2}}{2\sigma ^{2}}} $$
# Adapted from https://www.geeksforgeeks.org/gaussian-filter-generation-c/
def gaussian(size):
    assert size % 2 == 1
    kernel = np.zeros((size, size))
    sigma = (size + 1) / 6.0  # A gaussian kernel requires 6 * sigma - 1 values
    s = 2.0 * sigma * sigma
    # sum is for normalization
    final_sum = 0.0
    # generating the kernel
    min_range, max_range = -(size // 2), size - (size // 2)
    for x in range(min_range, max_range):
        for y in range(min_range, max_range):
            r = np.sqrt(x * x + y * y)
            kernel[x - min_range][y - min_range] = np.exp(-(r * r) / s) / (np.pi * s)
            final_sum += kernel[x - min_range][y - min_range]
    # normalising the kernel
    return np.divide(kernel, final_sum)

kernel = gaussian(25)
result = scipy.ndimage.convolve(r, kernel)
display(r, "Original")
display(result, "Gaussian Blur 25x25")
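As an aside (not part of the original task), scipy.ndimage also ships a ready-made scipy.ndimage.gaussian_filter that takes sigma directly; with a sigma roughly matching the hand-built 25x25 kernel it should give a visually similar blur:
# Equivalent blur using the built-in Gaussian filter; sigma roughly matches
# the 25x25 kernel generated above (sigma = (size + 1) / 6).
builtin_blur = scipy.ndimage.gaussian_filter(r, sigma=(25 + 1) / 6.0)
display(builtin_blur, "scipy.ndimage.gaussian_filter")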


2. Sobel edge detector - vertical, horizontal
A Sobel edge detection operator consists of a pair of convolution kernels where the second kernel is simply the transpose of the first. These kernels are designed to respond maximally to edges running vertically and horizontally relative to the pixel grid, with one kernel for each perpendicular orientation. (Source)
$$ S_x = \begin{pmatrix} +1 & 0 & -1 \\ +2 & 0 & -2 \\ +1 & 0 & -1 \end{pmatrix}, \quad S_y = \begin{pmatrix} +1 & +2 & +1 \\ 0 & 0 & 0 \\ -1 & -2 & -1 \end{pmatrix} $$
sobel_hkernel = [
    [1, 0, -1],
    [2, 0, -2],
    [1, 0, -1],
]
sobel_vkernel = [
    [1, 2, 1],
    [0, 0, 0],
    [-1, -2, -1],
]
hresult = scipy.ndimage.convolve(r, sobel_hkernel)
vresult = scipy.ndimage.convolve(r, sobel_vkernel)
display(hresult, "Sobel edge detector - vertical")
display(vresult, "Sobel edge detector - horizontal")
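The two directional responses are often combined into a single edge-strength map via the gradient magnitude. This step is an addition to the notebook, sketched under the assumption that we work in floating point to avoid overflow in the intermediate results:
# Combine the two directional responses into one edge map:
# magnitude = sqrt(Gx^2 + Gy^2)
gx = scipy.ndimage.convolve(r.astype(float), sobel_hkernel)
gy = scipy.ndimage.convolve(r.astype(float), sobel_vkernel)
sobel_magnitude = np.hypot(gx, gy)
display(sobel_magnitude, "Sobel gradient magnitude")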


3. Corner Detector
Edge detectors such as the Sobel Edge Detector or Canny Edge detector perform poorly at corners. Corners differ from edges and flat regions as follows:
- If there is no change in the intensity of pixels surrounding a region then this is a flat region.
- If there is no change along one direction (the edge direction) but significant change across it, then this is an edge region.
- If there is significant change in all directions, then this is a corner region.
One method for corner detection is Harris corner detection, which leverages image gradients. Depending on the value obtained from the response function $R$, it is possible to determine whether a region is flat, an edge, or a corner.
$$ R = \text{det}(M) - k(\text{trace}(M))^2 $$
$$ M = \sum\limits_{x,y} w(x,y) \begin{pmatrix} I_x I_x & I_x I_y \\ I_x I_y & I_y I_y \end{pmatrix} $$
Here, $I_x$ and $I_y$ represent the image gradients along the $x$-axis and $y$-axis.
# Adapted from https://www.geekering.com/programming-languages/python/brunorsilva/harris-corner-detector-python/
window_size = 5
threshold = 0.30
k = 0.04

height = r.shape[0]
width = r.shape[1]
matrix_R = np.zeros((height, width))

# Apply a gaussian filter
gaussian_r = scipy.ndimage.convolve(r, gaussian(3))

# Calculate the gradient
dy, dx = np.gradient(gaussian_r)
dx2 = np.square(dx)
dy2 = np.square(dy)
dxy = dx * dy

offset = window_size // 2
for y in range(offset, height - offset):
    for x in range(offset, width - offset):
        # Calculate the sum of the products of the derivatives over the window
        Sx2 = np.sum(dx2[y - offset : y + 1 + offset, x - offset : x + 1 + offset])
        Sy2 = np.sum(dy2[y - offset : y + 1 + offset, x - offset : x + 1 + offset])
        Sxy = np.sum(dxy[y - offset : y + 1 + offset, x - offset : x + 1 + offset])
        # Define the matrix H(x, y) = [[Sx2, Sxy], [Sxy, Sy2]]
        H = np.array([
            [Sx2, Sxy],
            [Sxy, Sy2],
        ])
        det = np.linalg.det(H)
        tr = np.trace(H)
        matrix_R[y, x] = det - k * (tr ** 2)

# Scale the responses so the threshold below is a fraction of the strongest corner response
matrix_R = matrix_R / matrix_R.max()

for y in range(offset, height - offset):
    for x in range(offset, width - offset):
        value = matrix_R[y, x]
        gaussian_r[y, x] = 255 if value > threshold else 0

display(gaussian_r)
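For comparison (not part of the original notebook), scikit-image provides a ready-made Harris detector; the parameter values below are only illustrative:
# Harris response via scikit-image, with corner_peaks picking local maxima.
from skimage.feature import corner_harris, corner_peaks

harris_response = corner_harris(r, k=0.04)
corners = corner_peaks(harris_response, min_distance=5)
print(corners.shape)  # (number_of_corners, 2) row/column coordinates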

4. Gradient Detector
An image gradient is defined as a directional change in image intensity. At each pixel of the input (grayscale) image, a gradient measures the change in pixel intensity in a given direction. By estimating the direction or orientation along with the magnitude (i.e. how strong the change is), we are able to detect regions of an image that look like edges. Besides the Sobel edge detector that we have already discussed (its vertical and horizontal kernels correspond to the gradients in those particular directions), there are others such as the Prewitt operator, which formally computes an approximation of the gradient of the image intensity function. (Source)
prewitt_hkernel = [
    [1, 0, -1],
    [1, 0, -1],
    [1, 0, -1],
]
prewitt_vkernel = [
    [1, 1, 1],
    [0, 0, 0],
    [-1, -1, -1],
]
hresult = scipy.ndimage.convolve(r, prewitt_hkernel)
vresult = scipy.ndimage.convolve(r, prewitt_vkernel)
display(hresult, "Prewitt edge detector - vertical")
display(vresult, "Prewitt edge detector - horizontal")
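Since the text above mentions estimating both magnitude and direction, here is a short sketch (added for illustration) that combines the two Prewitt responses into those two quantities:
# Gradient magnitude and orientation from the two Prewitt responses,
# computed in floating point to avoid uint8 overflow.
gx = scipy.ndimage.convolve(r.astype(float), prewitt_hkernel)
gy = scipy.ndimage.convolve(r.astype(float), prewitt_vkernel)
magnitude = np.hypot(gx, gy)
direction = np.arctan2(gy, gx)  # gradient angle at each pixel, in radians
display(magnitude, "Prewitt gradient magnitude")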


5. Sharpening
Sharpening an image consists of adding a high-pass filtered result to the original image, where the high-pass result is the difference between the original image and a low-pass filtered version of it, such as the output of the Gaussian blur filter. (Source)
# Work in floating point to avoid uint8 wrap-around when subtracting
r_float = r.astype(float)
low_pass = scipy.ndimage.convolve(r_float, gaussian(3))
high_pass = r_float - low_pass
display(r, "Original image")
display(high_pass, "High pass filter")
display(r_float + high_pass, "Sharpened image")
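The same effect can be approximated with a single convolution. The classic 3x3 sharpening kernel below is a common choice and is included here only as an extra illustration, not as part of the original assignment solution:
# Identity-plus-Laplacian sharpening kernel; its elements sum to 1,
# so the overall brightness is preserved.
sharpen_kernel = np.array([
    [ 0, -1,  0],
    [-1,  5, -1],
    [ 0, -1,  0],
])
sharpened = scipy.ndimage.convolve(r.astype(float), sharpen_kernel)
display(sharpened, "Sharpening kernel")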



6. Unsharp mask
The unsharp mask filter enhances the edges and other high-frequency components in an image via a procedure which subtracts an unsharp, or smoothed, version of an image from the original image. The term "unsharp" refers to this blurred (unsharp) copy that is used to build the mask, which results in a more refined sharpening effect. Fewer artifacts are produced, so the technique is usually the preferred way to sharpen images. (Source)
amount = 3
# Work in floating point to avoid uint8 wrap-around when subtracting
r_float = r.astype(float)
gaussian_r = scipy.ndimage.convolve(r_float, gaussian(3))
mask = r_float - gaussian_r
display(r, "Original")
display(r_float + amount * mask, "Unsharp mask")
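scikit-image also has a built-in unsharp_mask, added here only for comparison (not part of the original notebook); it returns a float image scaled to [0, 1] by default:
# Built-in unsharp masking for comparison; radius plays the role of the
# Gaussian kernel size and amount scales the mask, as above.
from skimage.filters import unsharp_mask

sharpened_builtin = unsharp_mask(r, radius=3, amount=3)
display(sharpened_builtin, "skimage unsharp_mask")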

