# Power optimised hybrid sorting-based median filtering

# N. Sambamurthy\* and M. Kamaraju

ECE Department, Gudlavalleru Engineering College, JNTUK, Gudlavalleru, India Email: sambanaga009@gmail.com Email: profmkr@gmail.com \*Corresponding author

**Abstract:** Embedded video and image processing is increasingly in demand, and with it the need for high image quality. Digital image noise mostly arises in the communication channel. It appears as a random variation of white and black dots on the surface of an image and seriously degrades the image quality. Median filters have excellent image denoising capabilities. They are particularly effective at reducing salt-and-pepper noise and offer high throughput with low complexity. In contrast, power efficiency remains an untapped area for improvement in sorting-based networks. To address this, a median filter is designed with a hybrid sorting network, and an intelligent clock gating technique is also presented. The median filter is implemented on an ARTIX-7 (90 nm) FPGA. The practical results show that the combined parallel and pipelined architecture with clock gating reduces the dynamic power and the complexity in terms of FPGA resource usage and frequency.

**Keywords:** median filter; noise reduction; FPGA; parallel processing; pipeline technique; clock gating; sorting network.

**Reference** to this paper should be made as follows: Sambamurthy, N. and Kamaraju, M. (2020) 'Power optimised hybrid sorting-based median filtering', *Int. J. Digital Signals and Smart Systems*, Vol. 4, Nos. 1/2/3, pp.80–86.

**Biographical notes:** N. Sambamurthy obtained his BTech from the JNTU Hyderabad and MTech from Gudlavalleru Engineering College, Gudlavalleru, and is presently pursuing his PhD at JNTUK, Kakinada in the research area of VLSI Architecture Design. His areas of interest are microcontrollers, embedded system design, microprocessors and VLSI architecture design. He is a member of the IE. He is presently working as an Assistant Professor in the ECE Department, Gudlavalleru Engineering College, Gudlavalleru.

M. Kamaraju obtained his BE and ME from the Andhra University and PhD from JNTUH, Hyderabad, in the area of Low Power VLSI Design. His areas of interest are microprocessors, microcontrollers, digital system design, embedded system design and low power VLSI design. He has published 74 technical papers in national and international journals and conferences. He has reviewed a number of papers for international journals and conferences. He is a Fellow of the IETE, IE and a member of IEEE. He is presently working as a Professor and a mentor of the ECE Department, Gudlavalleru Engineering College, Gudlavalleru, India. He is the past Chairman of the IETE and IE State Wide Centre, Andhra Pradesh. He is also an executive counsellor member of the JNTUK, Kakinada.

## 1 Introduction

Rank order filters, or median filters, are nonlinear digital filters that are suitable for reducing impulsive noise without destroying edge information. Median filters can be of recursive or non-recursive type. In the recursive type, the window consists of the most recent median values as well as the sampled values of the image, while in the non-recursive type only the sampled values of the image are considered. Recursive median filters provide excellent noise reduction with considerably less blurring than linear filters.

FPGAs are flexible and reconfigurable, and are well suited to computationally intensive tasks. A serious and common problem in image processing is the addition of noise in the communication channel. Researchers have proposed several solutions for reducing this noise. Unlike linear filters, median filters reduce high-frequency and impulsive noise without adversely affecting the edges.

## 2 Related work

Generally, acquired video data are affected by different types of noise components (Healey and Kondepudy, 1994; Tenze et al., 1999; Amer and Schroder, 1996). During the broadcasting of video images, Gaussian and impulsive noise models are important for analysing image quality. Among the various noise-suppressing spatial-domain filtering methods, median filters are the dominant one (Wang, 2013). The median filter substitutes a pixel sample with the median value among all window samples. Generally, the median value is found at either word level or bit level (Benkrid et al., 2002). In this paper, we focus on a word-level architecture that enables power-optimised filter design for real-time applications (Prokin and Milan, 2010).

The proposed VLSI design of a sorting algorithm-based median filter on FPGA hardware enables better performance in terms of low power and high speed. Pei (2010) suggested a VLSI architecture for a simple edge-preserving denoising method that reduces impulse noise. Shih (2013) designed a low-cost VLSI-based image scaling processor using a bilinear interpolation technique for image smoothing applications, but this design suffers from aliasing and blurring.

Vasanth and Karthik (2010) investigated an area-efficient median filter design using a decomposition algorithm. This algorithm alleviates complexity issues such as slice and LUT usage in VLSI implementation.

KalaiPriya et al. (2011) implemented an FPGA-based nonlinear median filter along with a high pass filter. The output of the median filter is connected to the high pass filter, which filters the image signal and recognises and distinguishes its high-frequency components. The entire design process takes several milliseconds, so the design does not meet real-time constraints. Burian et al. (2003) developed a low-cost, area-optimised VLSI hardware median filter (Yin et al., 1996). Its timing constraints are not suitable for high-resolution images.

## 3 Proposed system

The median filter is better at preserving sharp edges than other filter architectures (Richards, 2012).

This filter stores all incoming pixels from the line buffers and replaces the middle pixel of each  $3 \times 3$  window with the median value of that window.

For example, consider a  $3 \times 3$  window of pixels centred on a pixel of value 229, as shown in Figure 1.

Figure 1 Pixels arranged in matrix form

| 29 | 38  | 224 |
|----|-----|-----|
| 55 | 229 | 244 |
| 41 | 61  | 75  |

Note: 8-bit fixed point format.

The pixels are arranged in matrix form and ranked to obtain the sorted list, which is stored in BRAM as follows: 29, 38, 41, 55, 61, 75, 224, 229 and 244. The median value, 61, replaces the centre pixel in the output image.
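
The ranking step of this example can be checked in software. The following is a minimal reference sketch in C, not the hardware design described in this paper: it uses the standard library `qsort` in place of a sorting network, and the names `median3x3`, `cmp_u8` and `WINDOW_SIZE` are illustrative assumptions.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define WINDOW_SIZE 9  /* 3 x 3 window (illustrative name) */

/* qsort comparator for 8-bit unsigned pixels. */
static int cmp_u8(const void *a, const void *b)
{
    uint8_t x = *(const uint8_t *)a;
    uint8_t y = *(const uint8_t *)b;
    return (x > y) - (x < y);
}

/* Returns the median of a 3 x 3 window without modifying the input. */
uint8_t median3x3(const uint8_t window[WINDOW_SIZE])
{
    uint8_t sorted[WINDOW_SIZE];

    memcpy(sorted, window, WINDOW_SIZE);
    qsort(sorted, WINDOW_SIZE, sizeof sorted[0], cmp_u8);
    return sorted[WINDOW_SIZE / 2];  /* 5th of the 9 ranked values */
}
```

Called on the window of Figure 1 in row order (29, 38, 224, 55, 229, 244, 41, 61, 75), `median3x3` returns 61, which matches the sorted list above.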

An FPGA-based median filter requires a sorting network (Mu et al., 2013; Vivado Design Suite User Guide: High-Level Synthesis, 2013) for arranging the window pixels in either ascending or descending order. The literature analyses the complexity of different sorting methods such as bubble sort, merge sort and bitonic sort. Usually, these techniques perform only one comparison at a time. If N samples are to be ordered with these algorithms, the computational complexity increases considerably. This in itself is a good reason to search for an alternative approach.

In image processing, hybrid sorting-based networks offer a way to achieve good parallelism, pipelining and fast running time on FPGAs. The compare and swap unit is the key block in hybrid sorting. It compares two 8-bit samples and swaps them if necessary, so that the smaller sample appears on its low output and the larger sample on its high output.
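
The behaviour of a single compare and swap unit can be sketched as a small C function. This is a behavioural model only, under the assumption that the unit is purely combinational on two 8-bit inputs; the type `cas_result_t` and the function `compare_swap` are hypothetical names, not taken from the authors' Verilog code.

```c
#include <stdint.h>

/* Outputs of one compare and swap unit (illustrative names). */
typedef struct {
    uint8_t low;   /* smaller of the two inputs */
    uint8_t high;  /* larger of the two inputs */
} cas_result_t;

/* Compares two 8-bit samples and swaps them if they are out of order. */
cas_result_t compare_swap(uint8_t a, uint8_t b)
{
    cas_result_t r;

    if (a > b) {        /* out of order: swap */
        r.low = b;
        r.high = a;
    } else {            /* already ordered: pass through */
        r.low = a;
        r.high = b;
    }
    return r;
}
```

For instance, `compare_swap(224, 38)` yields `low = 38` and `high = 224`, while inputs that are already ordered pass through unchanged.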

The advantage of the designed hybrid sorting network is that the number of compare and swap units is fixed for nine samples. If the number of samples changes, the number of compare and swap modules increases or decreases accordingly.

## 4 Optimised hybrid sorting algorithm

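Algorithm for sorting the N = K × K pixels of a K × K window:

```c
/* Load the K*K window samples and send them to the sorting network. */
#define K 3                 /* window dimension: 3, 5 or 7 */
#define N (K * K)           /* number of samples in the window */
#define MIN(x, y) ((x) > (y) ? (y) : (x))
#define MAX(x, y) ((x) > (y) ? (x) : (y))

/* Sorts the window samples into z[] in ascending order.
 * After the call, z[N / 2] holds the median. */
void hybrid_sort(const unsigned char window[N], unsigned char z[N])
{
    unsigned char t[N];
    int i, k, comp_stage;

    /* input data stored */
    for (i = 0; i < N; i++)
        z[i] = window[i];

    /* sorting: N compare stages; odd stages start at index 0,
     * even stages start at index 1 */
    for (comp_stage = 1; comp_stage <= N; comp_stage++) {
        k = (comp_stage % 2 == 1) ? 0 : 1;
        for (i = k; i < N - 1; i = i + 2) {
            t[i]     = MIN(z[i], z[i + 1]);
            t[i + 1] = MAX(z[i], z[i + 1]);
            z[i]     = t[i];
            z[i + 1] = t[i + 1];
        }
    }
}
```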

Figure 2 Combined parallel-pipelined-based hybrid sorting network without CG



The algorithm supports changing the window size from  $3 \times 3$  to  $5 \times 5$  and  $7 \times 7$ , which increases the number of samples from 9 to 25 and 49. However, the hardware complexity increases in terms of compare and swap units. In this paper, we focus on the sorting network with different window sizes, and a clock gating (CG) technique is applied to the sorting network.
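
To give a feel for how the number of compare and swap units grows with the window size, the sketch below counts the comparisons in the stage structure of the algorithm above. It assumes a fully unrolled network with one physical unit per comparison and N stages for N samples; the paper does not report these counts, and the function names are hypothetical.

```c
/* Number of compare and swap units active in one stage of the
 * network over n samples. Stages are numbered from 1; odd stages
 * pair (0,1), (2,3), ... and even stages pair (1,2), (3,4), ... */
unsigned cas_units_in_stage(unsigned n, unsigned stage)
{
    unsigned first = (stage % 2 == 1) ? 0u : 1u;  /* index of first pair */

    return (n > first) ? (n - first) / 2 : 0u;
}

/* Total compare and swap units over all n stages (assumed unrolled). */
unsigned cas_units_total(unsigned n)
{
    unsigned total = 0;

    for (unsigned stage = 1; stage <= n; stage++)
        total += cas_units_in_stage(n, stage);
    return total;
}
```

Under these assumptions the totals are 36 units for 9 samples, 300 for 25 samples and 1,176 for 49 samples, which illustrates why the larger windows cost considerably more logic.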



Figure 3 Compare and swap network

Figure 4 Logic circuitry for CG network



Note: Associated with Figure 2.

In the sorting network, each compare and swap unit has an equal delay, and the total delay for nine samples is nine clock cycles. On completion of each window, the parallel output contains the sorted data, which is stored in consecutive BRAM memory locations to produce the filtered output. The designed sorting network is optimised by two techniques. The first is to pipeline the whole network over the required stages so that the FPGA clock rate meets the pixel clock rate. The second is to reshape the way window pixels are read into the pipeline registers, so that the data can be accessed in parallel at once.

Figure 4 shows the CG logic, in which extra logic feeds the select line of the multiplexer. As a result, the clock enable (ce) signal on the gating flop enables one of the registers and also selects the multiplexer input in the next cycle. Therefore, eight of the nine registers are deselected during that clock cycle.
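
The effect of enabling only one of nine registers per cycle can be illustrated with a toy behavioural model. This is a sketch under stated assumptions rather than a model of the circuit in Figure 4: it assumes the select advances by one register each cycle, it counts register clock events as a rough proxy for switching activity, and it does not estimate power. All names (`gated_bank_t`, `bank_init`, `bank_clock`) are hypothetical.

```c
#include <stdint.h>

#define NUM_REGS 9  /* one register per sample of a 3 x 3 window */

typedef struct {
    uint8_t reg[NUM_REGS];       /* window registers */
    unsigned select;             /* register enabled in this cycle */
    unsigned long clock_events;  /* registers that received a clock edge */
} gated_bank_t;

/* Clears the registers, the select index and the event counter. */
void bank_init(gated_bank_t *b)
{
    for (unsigned i = 0; i < NUM_REGS; i++)
        b->reg[i] = 0;
    b->select = 0;
    b->clock_events = 0;
}

/* Simulates one clock cycle that loads `pixel` into the selected register.
 * With gating, only the selected register sees the clock edge; without
 * gating, every register is clocked and the unselected ones hold data. */
void bank_clock(gated_bank_t *b, uint8_t pixel, int gating_enabled)
{
    for (unsigned i = 0; i < NUM_REGS; i++) {
        int clocked = !gating_enabled || (i == b->select);

        if (clocked) {
            b->clock_events++;
            if (i == b->select)
                b->reg[i] = pixel;
        }
    }
    b->select = (b->select + 1) % NUM_REGS;
}
```

Loading one nine-sample window takes nine calls to `bank_clock`. In this model the register contents are identical in both modes, but the gated bank records 9 clock events against 81 for the ungated bank.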

## 5 Results and discussion

The designed hybrid sorting-based median filter was coded in the Verilog hardware description language and synthesised for the targeted ARTIX-7 FPGA using the Xilinx Vivado tool (*Zynq-7000 All Programmable SoC: Technical Reference Manual*, 2013). Table 1 shows that the CG-based median filter uses slightly more resources for each window size: 10 additional LUTs and 11 to 13 additional flip-flops.

Table 1 Device utilisation summary

| Device utilisation (Xilinx XC7A35T-1FTG256 Artix-7 FPGA) | Without clock gating, $3 \times 3$ | Without clock gating, $5 \times 5$ | Without clock gating, $7 \times 7$ | With clock gating, $3 \times 3$ | With clock gating, $5 \times 5$ | With clock gating, $7 \times 7$ |
|----------------------------------------------------------|--------|--------|--------|--------|--------|--------|
| Slices per logic                                         |        |        |        | 3      | 5      | 8      |
| LUTs                                                     | 457    | 3,673  | 14,257 | 467    | 3,683  | 14,267 |
| Flip-flops                                               | 656    | 5,424  | 21,216 | 667    | 5,437  | 21,227 |
| BRAMs                                                    |        |        |        |        |        |        |
| DSP48E                                                   |        |        |        |        |        |        |

Table 2 shows the power consumption of the FPGA-based median filters with  $3 \times 3$ ,  $5 \times 5$  and  $7 \times 7$  windows, designed using the hybrid sorting-based network. The Xilinx XPower tool was used to calculate the dynamic power consumption of each sorting network. Observe that, due to CG, the data rate on pixel values x0 to x9 decreases. The  $3 \times 3$ ,  $5 \times 5$  and  $7 \times 7$  window sizes all meet the target clock frequency of 82.4 MHz.

Table 2 Comparison of CG and without CG: dynamic power and critical path delay profile of the implemented median filter

| Parameter                | Without clock gating, $3 \times 3$ | Without clock gating, $5 \times 5$ | Without clock gating, $7 \times 7$ | With clock gating, $3 \times 3$ | With clock gating, $5 \times 5$ | With clock gating, $7 \times 7$ |
|--------------------------|-------|-------|--------|-------|------|-------|
| Dynamic power (mW)       | 109.4 | 170.8 | 248.45 | 55.45 | 108  | 125.7 |
| Critical path delay (ns) | 3.37  | 2.93  | 2.1    | 3.39  | 2.98 | 1.98  |
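
The relative dynamic power saving for each window size can be read off Table 2. As a worked check, using only the tabulated values and the shorthand $P_{\text{no CG}}$ and $P_{\text{CG}}$ (introduced here for the two halves of the dynamic power row):

$$
\begin{aligned}
\text{saving} &= \frac{P_{\text{no CG}} - P_{\text{CG}}}{P_{\text{no CG}}} \\
3 \times 3: &\quad \frac{109.4 - 55.45}{109.4} \approx 49.3\% \\
5 \times 5: &\quad \frac{170.8 - 108}{170.8} \approx 36.8\% \\
7 \times 7: &\quad \frac{248.45 - 125.7}{248.45} \approx 49.4\%
\end{aligned}
$$
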

## 6 Conclusions

The designed power-optimised median filter with a sorting network is presented for window sizes of  $3 \times 3$ ,  $5 \times 5$  and  $7 \times 7$ . The median filter occupies a small area on the FPGA at a clock frequency of 82.16 MHz. As the window size increases, the clock frequency decreases slightly but still meets the real-time requirement. The CG technique, used to minimise the switching activity on the fully parallel pipelined hybrid sorting network, shows nearly 25% reduction in dynamic power, with latencies of 9, 25 and 49 clock cycles, respectively.

## References

- Amer, A. and Schroder, H. (1996) 'A new video noise reduction algorithm using spatial subbands', in Proc. IEEE Conference on Electronics, Circuits and Systems, October, Vol. 1, pp.45–48.
- Benkrid, K., Crookes, D. and Benkrid, A. (2002) 'Design and implementation of a novel algorithm for general purpose median filtering on FPGAs', *Circuits and Systems, IEEE International Symposium*, Vol. 4, No. 6, pp.425–428.
- Burian, A., Takala, J. and Ylinen, M. (2003) 'Design and implementation of a median filtering algorithm using the median cost function', *Proceedings of the 3rd International Symposium on Image and Signal Processing and Analysis, 2003. ISPA 2003*, Vol. 2, pp.961–965.
- Healey, G.E. and Kondepudy, R. (1994) 'Radiometric CCD camera calibration and noise estimation', *IEEE Trans. on Pattern Analysis and Machine Intelligence*, Vol. 16, No. 3, pp.267–276.
- KalaiPriya, O., Ramasamy, S. and Ebenezer, D. (2011) 'VLSI implementation of nonlinear variable cutoff high pass filter algorithm', 2011 3rd International Conference on Electronics Computer Technology, Kanyakumari, pp.275–278.
- Mu, W., Jin, J., Feng, H. and Wang, Q. (2013) 'Adaptive window multistage median filter for image salt-and-pepper denoising', in Proceedings of 2013 IEEE International Instrumentation, and Measurement Technology Conference (I2MTC), May, pp.1535–1539.
- Pei, Y.C. (2010) 'A low-cost VLSI implementation for efficient removal of impulse noise', *IEEE Transactions on Very Large Scale Integration (VLSI) Systems*, Vol. 18, No. 3, pp.1540–1547.
- Prokin, D. and Milan, P. (2010) 'Low hardware complexity pipelined rank filter', *IEEE Transactions on Circuits and Systems II: Express Briefs*, Vol. 57, No. 6, pp.446–450.
- Richards, D.S. (2012) 'VLSI median filters', *IEEE Trans. Acoust., Speech, Signal Process.*, Vol. 38, No. 1, pp.145–153.
- Shih, L.C. (2013) 'VLSI implementation of a low-cost high-quality image scaling processor', *IEEE Transactions on Circuits and Systems, Express Brief*, Vol. 60, No. 1, pp.31–35.
- Tenze, L., Carrato, S., Alessandretti, C. and Olivieri, S. (1999) 'Design and real-time implementation of a low cost noise reduction video system', *in Proc. COST 254 Workshop on Intelligent Communication Technologies and Applications*, May, pp.36–40.
- Vasanth, K. and Karthik, S. (2010) 'FPGA implementation of modified decomposition filter', 2010 International Conference on Signal and Image Processing, Chennai, pp.526–530.
- Vivado Design Suite User Guide: High-Level Synthesis (2013) UG902 (v2013.2), June.
- Wang, L. (2013) 'A new fast median filtering algorithm based on FPGA', IEEE Trans. on Electronics, Circuits and Systems, December, Vol. 3, pp.55–58.
- Yin, L., Yang, R., Gabbouj, M. and Nuevo, Y. (1996) 'Weighted median filters: a tutorial', *IEEE Transactions on Circuits and Systems II, Analog and Digital Signal Processing*, March, Vol. 43, No. 3, pp.157–192.
- Zynq-7000 All Programmable SoC: Technical Reference Manual (2013) UG585 (v1.6.1), September.