Authors: Youcef Gheraibia; Sohag Kabir; Abdelouahab Moussaoui; Smaine Mazouzi
Addresses: Department of Computer Science, University of Badji Mokhtar, Annaba, 23000, Algeria; Department of Computer Science, University of Hull, HU6 7RX, Hull, UK; Department of Computer Science, University of Ferhat Abbas Setif, 19000, Algeria; Department of Computer Science, University of 20 Août 1955 Skikda, 21000, Algeria
Abstract: Classical Huffman code has been widely used to compress biological datasets. Although classical Huffman coding can reduce data size considerably, a more efficient encoding is possible if the two binary symbols are treated differently to reflect requirements such as transmission time and energy consumption. A number of techniques have modified the Huffman algorithm to obtain optimal prefix codes for unequal letter costs, with the aim of reducing the overall transmission cost (time). In this paper, we propose a new approach that improves the compression performance of one such extension, the cost considering approach (CCA), by applying a genetic algorithm to allocate codewords to symbols optimally. The idea is to sacrifice some cost in order to minimise the total number of bits; hence, the genetic algorithm works by imposing a penalty on the cost. The approach is evaluated by compressing some standard biological datasets. The experiments show that it improves the compression performance of the CCA considerably without increasing the cost significantly.
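The two quantities that the abstract trades off, total number of bits and total transmission cost under unequal letter costs, can be illustrated with a minimal Python sketch. This is not the CCA or the genetic algorithm proposed in the paper: it builds a classical Huffman code and then measures both quantities for it. The function names, the sample sequence, and the per-bit costs `cost0` and `cost1` are assumptions chosen only for illustration.

```python
import heapq
from collections import Counter


def huffman_code(text):
    """Build a classical Huffman code; returns {symbol: codeword string}."""
    freq = Counter(text)
    if len(freq) == 1:
        # Degenerate case: a single distinct symbol still needs one bit.
        return {next(iter(freq)): "0"}
    # Heap entries are (weight, tie_breaker, {symbol: partial codeword}).
    # The unique tie_breaker keeps the dicts from ever being compared.
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(sorted(freq.items()))]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        w0, _, left = heapq.heappop(heap)
        w1, _, right = heapq.heappop(heap)
        # Prefix "0" to every codeword in one subtree and "1" in the other.
        merged = {s: "0" + cw for s, cw in left.items()}
        merged.update({s: "1" + cw for s, cw in right.items()})
        heapq.heappush(heap, (w0 + w1, next_id, merged))
        next_id += 1
    return heap[0][2]


def total_bits(text, code):
    """Number of bits needed to encode text with the given code."""
    return sum(len(code[s]) for s in text)


def total_cost(text, code, cost0=1.0, cost1=2.0):
    """Transmission cost when bits 0 and 1 have different (assumed) costs."""
    return sum(
        cost0 * code[s].count("0") + cost1 * code[s].count("1") for s in text
    )


if __name__ == "__main__":
    sequence = "AAAACCGT"  # hypothetical toy DNA-like sequence
    code = huffman_code(sequence)
    print(code)
    print("bits:", total_bits(sequence, code))
    print("cost:", total_cost(sequence, code))
```

An approach of the kind described in the abstract would search over assignments of codewords to symbols and score each candidate on both measures, for example by minimising the bit count while penalising increases in cost.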
Keywords: data compression; Huffman code; information coding; genetic algorithm; cost considering approach; CCA; data communication; optimisation.
International Journal of Information and Communication Technology, 2018 Vol.13 No.3, pp.275 - 290
Received: 06 Jun 2015
Accepted: 19 Oct 2015
Published online: 10 Apr 2018