University of South Wales DSpace >
University of South Wales >
PhD theses >
|???metadata.dc.title???: ||The construction of DNA codes using a computer algebra system|
|???metadata.dc.contributor.*???: ||Aboluion, Niema Ali|
|???metadata.dc.subject???: ||Genetic engineering|
|???metadata.dc.identifier.citation???: ||Aboluion,A. M. (2011)'The construction of DNA codes using a computer algebra system'. Unpublished Ph.D. thesis. University of Glamorgan.|
|???metadata.dc.description.abstract???: ||Coding theory has several applications in Genetics and Bioengineering. This
thesis concentrates on a specific application from Computational Biology. This
concerns the construction of new DNA codes which satisfy certain combinatorial
constraints, using an alphabet of four symbols. The interest in these codes
arises because it is possible to synthesise short single strands of DNA known as
oligonucleotides. The codes can be useful in the design of these oligonucleotides.
For example, the codes are used in DNA computing, as bar codes in molecular
libraries and in microarray technologies.
The computer algebra system Magma, which deals successfully with coding
theory computation, is applied initially to the construction of DNA codes sat-
isfying a GC-content constraint and a minimum Hamming distance constraint.
The constraints are specified to avoid unwanted hybridizations and to ensure
uniform melting temperatures. Additionally, another constraint, known as a
reverse-complement constraint, is added to further prevent unwanted hybridiza-
tions. This additional constraint is studied using involutions in a permutation
group. Codes constructed in this thesis are derived from linear codes over GF(4)
and Z4 and additive codes over GF(4). Previous approaches to the construction
of these codes are extended in several ways. Longer codes are constructed, the
examination of cyclic and extended cyclic codes is more comprehensive, and
cosets of codes are considered. In addition, attention is paid to the mapping
from field or ring elements to the DNA nucleotides; different mappings can give
different lower bounds. Further improvements have been made after the tech-
niques of shortening and puncturing are applied to the table of best codes, and
also by searching for codes in the tables that have an all-ones vector in their dual.
The use of a database of best known linear codes is also considered. In many
cases codes are obtained which are larger than the best codes currently known.
In the case of codes of length greater than twenty, linear DNA codes have not
been constructed previously and so all codes obtained are the best known re-
sults. Generator polynomials are given for the codes constructed. Coset leaders
are also given in cases where cosets of linear codes are used. Thus it is possible
for the reader to construct the codes without repeating the work presented in
the thesis. Additionally, files of codewords are available online when the codes
constructed are the best known codes and have fewer than 50000 codewords.|