Developing an Approach to Compress Non-Repetitive Codecs of DNA Using a Novel NDCP: A Lossless Utility

Authors

  • V. Hari Prasad, Department of Technical Education, Andhra Pradesh, India

DOI:

https://doi.org/10.53555/nncse.v3i4.425

Keywords:

compression, coding, decoding, Biocompress, HuffBit Compress, DNABIT Compress, LSBD compression

Abstract

Data compression grew out of information theory, and it has helped drive the expansion of internet technology that the world now enjoys. Compression research began with lossless text compression and later spread to allied areas such as genetic data (DNA and mRNA) and multimedia data compression (lossless and lossy). The human genome was deciphered in 2004 under the Human Genome Project. Storing and maintaining a human genome requires roughly 30-35 GB, and maintaining census-scale databases of genomes would require substantially larger infrastructure. The remedy is compression of genetic sequences. As new DNA sequences arrive, public genetic databases are growing exponentially. Many state-of-the-art DNA compression algorithms have been developed to limit this growth, but their best-, worst-, and average-case performance depends on the repetitiveness of the DNA sequences. When a DNA sequence contains many infrequent fragments (non-codecs), the existing techniques may run in their worst case, so a new methodology for non-codecs is needed. This work proposes NDCP (Non-codec DNA Compression), a novel lossless utility tool, to push past the compression ratios that existing compression techniques can feasibly achieve.
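
As a point of reference for the non-repetitive case described in the abstract, the sketch below shows a fixed 2-bit-per-base packing, a common lossless baseline that does not rely on repeats. It is not the NDCP algorithm, whose details are not given here; the names `pack`, `unpack`, `CODES`, and `BASES` are hypothetical, and the input is assumed to contain only the letters A, C, G, and T.

```python
# Illustration only: fixed 2-bit packing of DNA bases, NOT the NDCP method.
# Assumes the sequence contains only the letters A, C, G, T.
CODES = {"A": 0, "C": 1, "G": 2, "T": 3}
BASES = "ACGT"


def pack(seq: str) -> bytes:
    """Pack an A/C/G/T string into bytes, four bases per byte."""
    out = bytearray()
    for i in range(0, len(seq), 4):
        chunk = seq[i:i + 4]
        byte = 0
        for base in chunk:
            byte = (byte << 2) | CODES[base]
        # Left-align a short final chunk so unpacking reads bases in order.
        byte <<= 2 * (4 - len(chunk))
        out.append(byte)
    return bytes(out)


def unpack(data: bytes, length: int) -> str:
    """Recover the original string; `length` is the number of bases stored."""
    bases = []
    for byte in data:
        for shift in (6, 4, 2, 0):
            bases.append(BASES[(byte >> shift) & 3])
    # Drop the padding bases that filled out the last byte.
    return "".join(bases[:length])


seq = "ACGTTGCAAC"
packed = pack(seq)
assert unpack(packed, len(seq)) == seq  # round trip is lossless
print(len(seq), "bases ->", len(packed), "bytes")  # 10 bases -> 3 bytes
```

Because every base costs exactly 2 bits regardless of content, this packing gives the same ratio on repetitive and non-repetitive input, which makes it a convenient yardstick when comparing methods on non-codec fragments.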

Published

2016-04-30

How to Cite

Prasad, V. (2016). Developing an Approach to Compress Non-Repetitive Codecs of DNA Using a Novel NDCP: A Lossless Utility. Journal of Advance Research in Computer Science & Engineering (ISSN 2456-3552), 3(4), 10-15. https://doi.org/10.53555/nncse.v3i4.425