Image Processing

Published May 19, 2022 - By Marco Garosi

Images can be processed in various ways. Sometimes you just want to enhance them; other times, you want to extract features. To achieve your goal, you should know what kinds of operations can be applied, how they work and how they affect the image.

Now, there are three different kinds of image processing techniques:

pixel-based, which mainly work on the contrast;
local;
global.

You can find an introduction to signals here. This post is part of a series on Image and Signal Processing. If you are looking for Images and Statistics, you may read this note.

Pixel-based filtering

There are a bunch of different pixel-based filters you may want to apply to an image. Here are the most important ones.

Identity

It does literally nothing: for each pixel $(x, y)$ it gives back its exact value.

Negative

It inverts the image. Every pixel value $r$ is replaced by a new value $s = L - 1 - r$, where $L - 1$ is the maximum grey level (usually $256 - 1 = 255$).
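As a minimal sketch of the negative, assuming the image is a plain list of rows of integer grey levels (the function name `negative` and the `levels` parameter are illustrative, not from the original post):

```python
def negative(image, levels=256):
    """Return the negative of a grayscale image given as a list of rows."""
    max_level = levels - 1  # L - 1, the highest grey level
    return [[max_level - r for r in row] for row in image]


image = [[0, 64], [128, 255]]
print(negative(image))  # [[255, 191], [127, 0]]
```

Applying the function twice gives back the original image, which is a quick sanity check.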

Clamping

It is defined as follows:

$$T(r) = \begin{cases} a &\text{if } r \lt a \\ r &\text{if } a \le r \le b \\ b &\text{if } r \gt b \end{cases}$$

It “cuts” the left-most and right-most parts of the histogram: values lower than $a$ are replaced with $a$ and values higher than $b$ are replaced with $b$, so every output value lies in the range $[a, b]$.
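The piecewise definition of clamping could be sketched like this, again assuming a list-of-rows image (the name `clamp` is illustrative):

```python
def clamp(image, a, b):
    """Clamp every grey level of the image into the range [a, b]."""
    # max(r, a) lifts values below a; min(..., b) lowers values above b
    return [[min(max(r, a), b) for r in row] for row in image]


print(clamp([[10, 50, 120, 200, 250]], 50, 200))  # [[50, 50, 120, 200, 200]]
```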

Stretching

It stretches the image from a range $[r_{min}, r_{max}]$ to a new range $[a, b]$ . The function that does that is:

$$T(r) = \left[\frac{r - r_{min}}{r_{max} - r_{min}}\right](b - a) + a$$

Of course, the stretching is “linear”: the mapping $T(r)$ is a straight line, so the histogram keeps its shape and is simply spread over the new range $[a, b]$.
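A small sketch of the stretching formula follows. It assumes that $r_{min}$ and $r_{max}$ are taken from the image itself and that results are rounded to integers; the handling of a completely flat image is an extra assumption, since the formula would otherwise divide by zero.

```python
def stretch(image, a=0, b=255):
    """Linearly stretch the grey levels of the image to the range [a, b]."""
    flat = [r for row in image for r in row]
    r_min, r_max = min(flat), max(flat)
    if r_max == r_min:
        # Assumption: a flat image has nothing to stretch, so map it to a
        return [[a for _ in row] for row in image]
    return [
        [round((r - r_min) * (b - a) / (r_max - r_min) + a) for r in row]
        for row in image
    ]


print(stretch([[50, 75, 150]]))  # [[0, 64, 255]]
```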

Logarithmic transform

It is defined as $s = c \ln (1 + r)$, with $r \in [0, L-1]$. $c$ is called the normalization constant: it ensures that the output is mapped onto the range $[0, L-1]$ without ever exceeding it.

Constant $c$ can be computed as $c = \frac{L-1}{\ln L}$
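To see the constant $c$ at work, here is a hedged sketch using the standard `math` module; rounding the result to an integer is an assumption made for display purposes.

```python
import math


def log_transform(image, levels=256):
    """Apply s = c * ln(1 + r), with c chosen so that L - 1 maps to L - 1."""
    c = (levels - 1) / math.log(levels)  # normalization constant
    return [[round(c * math.log(1 + r)) for r in row] for row in image]


print(log_transform([[0, 50, 255]]))  # [[0, 181, 255]]
```

The dark value 50 is pushed up to 181 while the endpoints stay in place, which is the expansion of lower values the transform is used for.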

Power transform

The output value $s$ is computed as follows: $s = c r^{\gamma}$ . If:

$\gamma \lt 1$, the transform behaves like the logarithmic transform (it expands lower values and compresses higher ones);
$\gamma \gt 1$, the transform does the opposite (it expands higher values and compresses lower ones);
$\gamma = 1$, the transform does nothing: with $c = 1$ it is equivalent to the identity.
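The effect of the exponent $\gamma$ can be checked with a short sketch. It assumes that $c$ is chosen so that the highest grey level maps to itself, which is the same as normalizing $r$ to $[0, 1]$ before raising it to the power; the post does not state how $c$ is chosen, so treat that as an illustrative choice.

```python
def gamma_transform(image, gamma, levels=256):
    """Apply s = c * r**gamma, normalized so that L - 1 maps to L - 1."""
    max_level = levels - 1
    # Assumption: normalize r to [0, 1], apply the power, scale back
    return [
        [round(max_level * (r / max_level) ** gamma) for r in row]
        for row in image
    ]


print(gamma_transform([[0, 64, 255]], 0.5))  # [[0, 128, 255]]
print(gamma_transform([[0, 64, 255]], 2.0))  # [[0, 16, 255]]
```

With an exponent below 1 the dark value 64 is brightened, with an exponent above 1 it is darkened, and with an exponent of exactly 1 the image is unchanged.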

Binarization

Binarization works on the assumption that there are two regions to split up. It produces a black and white image: pixels are either white (255) or black (0), with no intermediate values. One of the most used methods is Otsu’s Method.

Otsu’s thresholding tries to minimize the intra-class variance: it basically tries to binarize at each value from 0 up to 255 and keeps the threshold that yields the lowest intra-class variance.

Otsu’s thresholding works best when the histogram is bimodal, that is, when it has two bell-shaped groups of values centred on different grey levels: it is then possible to find a value between them that best divides them.

Otsu’s technique actually minimizes the following formula:

$$\sigma_w^2(T) = W_0(T) \sigma_0^2(T) + W_1(T) \sigma_1^2(T)$$

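Where $W_0(T) = \sum_{r = 0}^{T - 1} p(r)$ and $W_1(T) = \sum_{r = T}^{L - 1} p(r)$ are the weights of the two classes (the fraction of pixels below and above the threshold), $\sigma_0^2(T)$ and $\sigma_1^2(T)$ are the variances of the two classes, and $p(r)$ is the normalized histogram.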

It loops over all the possible values for $T$ ($T \in [0, L-1]$, where $L$ is usually $256$) and returns $T_{\text{best}} = \arg\min_T \sigma_w^2(T)$.
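The exhaustive search described above can be sketched as follows. This is a deliberately naive version that works on a histogram of pixel counts and recomputes everything for each $T$; skipping thresholds that leave one class empty is an assumption, since the class variance is undefined there. The name `otsu_threshold` is illustrative.

```python
def otsu_threshold(histogram):
    """Return the threshold T that minimizes the intra-class variance."""
    total = sum(histogram)
    p = [count / total for count in histogram]  # normalized histogram p(r)
    levels = len(p)
    best_t, best_var = 0, float("inf")
    for t in range(levels):
        w0 = sum(p[:t])  # weight of class 0: grey levels 0 .. t-1
        w1 = sum(p[t:])  # weight of class 1: grey levels t .. L-1
        if w0 == 0 or w1 == 0:
            continue  # assumption: skip thresholds with an empty class
        mu0 = sum(r * p[r] for r in range(t)) / w0
        mu1 = sum(r * p[r] for r in range(t, levels)) / w1
        var0 = sum((r - mu0) ** 2 * p[r] for r in range(t)) / w0
        var1 = sum((r - mu1) ** 2 * p[r] for r in range(t, levels)) / w1
        within = w0 * var0 + w1 * var1  # sigma_w^2(T)
        if within < best_var:
            best_t, best_var = t, within
    return best_t


# Two well-separated groups of grey levels on an 8-level histogram
print(otsu_threshold([5, 5, 0, 0, 0, 0, 5, 5]))  # 2
```

In this toy histogram every threshold from 2 to 6 separates the two groups equally well; the strict comparison keeps the first one found. A pixel is then set to white when its value is at least $T$ and to black otherwise.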

Histogram equalization

Histogram equalization makes the contribution of each grey level similar, thus making the histogram look as uniform as possible (a uniform histogram is the one with the highest entropy).

It is an algorithm, actually: it cannot be computed pixel-independently, since it “spreads” the values over the whole histogram, and does so by taking into account how many pixels there are for each grey value.

Algorithm

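Compute the $L$ cumulative sums $\sum_{j = 0}^{k} p_r(r_j)$, with $k \in [0, L-1]$, where $p_r(r_j)$ is here the number of pixels with grey level $r_j$ (the unnormalized histogram).
Multiply each of the previously computed cumulative sums by the highest available grey level ($L - 1$).
Normalize the values by dividing by $M N$, the total number of pixels in an $M \times N$ image, and round to the nearest integer.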
Apply the mapping produced.

Example

Let’s take a simple histogram as an example. The highest grey level (GL) is $7$, so there are $8$ different grey levels, and the image has $M N = 51$ pixels in total.

GL	0	1	2	3	4	5	6	7
#pixel	10	8	9	2	14	1	5	2

Step 1: cumulative sum

GL	0	1	2	3	4	5	6	7
cumulative sum	10	18	27	29	43	44	49	51

Step 2: multiply by $L - 1 = 8 - 1 = 7$

GL	0	1	2	3	4	5	6	7
multiplication	70	126	189	203	301	308	343	357

Step 3: normalization

GL	0	1	2	3	4	5	6	7
normalization	70/51	126/51	189/51	203/51	301/51	308/51	343/51	357/51
rounded value	1	2	4	4	6	6	7	7

Step 4: apply the mapping

For each grey level in the previous table a rounded value was produced: that’s the value to substitute for the corresponding grey level. The table is indeed a Look-Up Table (LUT) that maps grey levels to new values. Of course, the LUT mapping has to be applied to every pixel in the image.
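The four steps of the example can be reproduced with a short sketch. It assumes the histogram is a list of pixel counts indexed by grey level and the image is a list of rows; the names `equalization_lut` and `apply_lut` are illustrative.

```python
def equalization_lut(histogram):
    """Build the equalization look-up table from a histogram of pixel counts."""
    levels = len(histogram)   # L
    total = sum(histogram)    # M * N, the number of pixels
    lut = []
    cumulative = 0
    for count in histogram:
        cumulative += count   # step 1: cumulative sum
        # steps 2 and 3: multiply by L - 1, divide by M * N, then round
        lut.append(round(cumulative * (levels - 1) / total))
    return lut


def apply_lut(image, lut):
    """Step 4: replace every pixel with the value the LUT assigns to it."""
    return [[lut[r] for r in row] for row in image]


histogram = [10, 8, 9, 2, 14, 1, 5, 2]
lut = equalization_lut(histogram)
print(lut)                             # [1, 2, 4, 4, 6, 6, 7, 7]
print(apply_lut([[0, 3], [4, 7]], lut))  # [[1, 4], [6, 7]]
```

The printed LUT matches the “rounded value” row of the table above.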