FUZZY CODE ON RNA SECONDARY STRUCTURE

In this paper, we develop a fuzzy code technique for molecular phylogenetic analysis. The proposed theory can encode and decode information about the evolution of sequences as they pass from one stage to another in a phylogenetic tree. Using this methodology, we encode the RNA sequence of each species in a phylogenetic tree; after transcription the RNA molecule folds, and its base pairing pattern is termed the secondary structure. After encoding the RNA sequence into fuzzy code, we give a mathematical formulation of RNA secondary structure and establish a relation between RNA sequences and their secondary structures. We construct the fuzzy neural network and the fraction of neural neighbours in sequence space to distinguish compatible sequences. We use the involution metric, the symmetric group and the symmetric difference to establish the difference between secondary structures.
AMS Subject Classification: 03E72, 20Bx, 68R10, 20A15, 30Lxx


Introduction
Medicine at the turn of the century is characterised by the deepest change it has ever been subjected to in its history: its transformation from a healing profession into a branch of biotechnology. Viewed from an evolutionary perspective, these changes appear as an aspect of a Darwin-Lamarckian auto-evolution of life on earth. The nucleic acids DNA and RNA, as the genetic material of living beings and viruses, play the pivotal role in this arena. This can be understood from the following example. Suppose we have a sequence of RNA nucleotides. After chemical changes or mutations take place during evolution from one generation to the next, the resulting sequence alone does not tell us which positions of the nucleotide sequence were affected. In other words, there is uncertainty in a DNA or RNA sequence as it passes from one generation to another.
This problem calls for a reliable technique that can handle uncertainty. In this study, we present a novel methodology based on fuzzy theory [7]. Fuzzy sets were introduced by Lotfi A. Zadeh in 1965 as an extension of the classical notion of a set. A fuzzy set is a set whose elements have degrees of membership, given as numeric values in [0, 1]. A polynucleotide of a DNA or RNA molecule is a linear polymer that consists of many smaller units, called its building blocks or monomers. Each nucleotide of DNA or RNA is transformed into a fuzzy set, termed a fuzzy nucleotide or fuzzy code. A linear polymer of fuzzy nucleotides is termed a fuzzy polynucleotide. Previous studies have used fuzzy theory [22,23] as a tool in bioinformatics. In this study, we apply the fuzzy concept to phylogenetic analysis. We encode each RNA sequence in the phylogenetic tree into fuzzy code and, conversely, decode it. The fuzzy code serves as an information-preserving tool in phylogenetic analysis: it preserves the number of mutations of a nucleotide at each position and at each stage.
In this study, we construct a fuzzy graph [9] in which fuzzy nucleotides act as vertices and the interaction between two fuzzy nucleotides [17,18,19], namely the hydrogen bond between them, is treated as an edge. We treat the RNA secondary structure [8], or RNA fuzzy structure, as a fuzzy graph. Each RNA molecule folds into a three-dimensional structure, which determines its biochemical function. After constructing the edges and vertices of this structure, we study the space of fuzzy polynucleotides and the space of fuzzy structures of fixed length n to explore some of their biochemical properties. Folding of an RNA polynucleotide [10] into its secondary structure is viewed as a map that assigns a uniquely defined base pairing pattern to every sequence. This map is non-invertible, since many fuzzy polynucleotides fold into the same minimum free energy secondary structure; see for instance [11]. We use the involution metric given by Reidys and Stadler [5], the symmetric group [30] and the symmetric difference to measure the difference between two secondary structures.

Fuzzy nucleotide
DNA and RNA are linear polymers of nucleotides and are referred to as polynucleotides. In this section, we represent an RNA nucleotide as a fuzzy nucleotide.
Let X = {A, U, G, C} be the non-empty set of nucleotides. A fuzzy nucleotide W in X is characterised by its membership function $\mu_W : X \to [0, 1]$, where $\mu_W(x)$ is interpreted as the degree of membership of the element x in the fuzzy set W, for each x ∈ X, under a certain condition [...]. It is clear that W is completely determined by the set of tuples
$$W = \{(x, \mu_W(x)) : x \in X\},$$
so it can also be written as
$$W = (\mu_W(A), \mu_W(U), \mu_W(G), \mu_W(C)).$$
We can thus write each fuzzy nucleotide as a four-dimensional tuple. Each coordinate represents the degree of membership of a particular nucleotide belonging to the set X. The position of each nucleotide is fixed coordinate-wise: the first, second, third and fourth coordinates represent the degree of membership of the nucleotides A, U, G and C respectively. In particular, if in equation (2.2) we put the degree of membership of one nucleotide equal to 1 and that of the other three equal to 0, we can represent a nucleotide by a fixed code:
A = (1, 0, 0, 0), U = (0, 1, 0, 0), G = (0, 0, 1, 0), C = (0, 0, 0, 1).
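The four-coordinate representation above can be sketched in a few lines of Python. This is only an illustrative sketch: the names `ALPHABET`, `FIXED`, `fuzzy_nucleotide` and `is_fixed_code` are hypothetical and do not come from the paper, and the only constraint checked is that each degree of membership lies in [0, 1].

```python
# Coordinate order follows the text: A, U, G, C.
ALPHABET = ("A", "U", "G", "C")

# Fixed codes: membership 1 for one nucleotide and 0 for the other three.
FIXED = {x: tuple(1 if y == x else 0 for y in ALPHABET) for x in ALPHABET}


def fuzzy_nucleotide(a, u, g, c):
    """Build a fuzzy nucleotide (hypothetical helper); every degree must be in [0, 1]."""
    code = (a, u, g, c)
    if not all(0 <= v <= 1 for v in code):
        raise ValueError("degrees of membership must lie in [0, 1]")
    return code


def is_fixed_code(code):
    """True if the tuple is one of the four fixed codes."""
    return tuple(code) in FIXED.values()
```

For example, `FIXED["G"]` is `(0, 0, 1, 0)`, while `fuzzy_nucleotide(0, 0.5, 0.5, 0)` is a genuinely fuzzy code and not a fixed one.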
Fuzzy theory is a mathematical framework that belongs to the family of uncertainty theories. This makes it useful for handling the degree of presence of nucleotides in a fuzzy system. We use the fuzzy nucleotide because it can express the uncertainty caused by mutation as a DNA or RNA sequence passes from one generation to another. A fuzzy nucleotide expresses the degree of membership of each nucleotide, as shown in equation (2.2). The fixed code is a particular case of the fuzzy code that expresses certainty for each nucleotide. Using this idea, we encode and decode the evolution of an RNA sequence under molecular phylogenetic principles. In the next section, we focus on encoding the evolution of sequences into fuzzy codes and vice versa.
The molecular phylogenetic structure is generated from character datasets that provide evolutionary content and context. Evolution is modelled as a process that changes the state of a character, such as the type of nucleotide (A, U, G or C) at a specific location in an RNA sequence. Each character is a function that maps a set of taxa to distinct states. That is, the evolution of an RNA or DNA sequence passes through several stages from the ancestral sequence to its descendant sequences. Molecular evolution of RNA sequences in the phylogenetic tree represents changes of the nucleotides at positions of the sequence as it moves from one species to another over time. A phylogenetic tree is a graphical representation of the evolution of species and contains different stages, as shown in figure (3.1).
Figure (3.1) illustrates how a molecular sequence might evolve over time as a result of multiple mutations that produce small but evolutionarily important changes in a nucleotide sequence. Such changes may eventually modulate the function of the protein within divergent species [28]. The molecular evolution of the RNA sequence at each stage represents the unique characteristics of a species; this is our sequence of interest. In figure (3.1), node 1 is the ancestral sequence, nodes 2 and 3 are descendants of node 1, nodes 5 and 6 are descendants of node 3, and so on. In this study, we encode and decode this uniqueness of the sequence with an information-preserving code. Note that in this paper we use RNA sequences as the character data; however, phylogenetic trees can be accurately estimated from many different types of molecular data. The fuzzy code $w_i^m$ can also be obtained from the previous code $w_i^{m-1}$ under the following condition:

Decoding RNA fuzzy code into sequence
Conditions:
1. The following method is applicable to fuzzy codes that are not fixed codes.
2. To recover the nucleotide of a fuzzy code $w_i^m$ that is not a fixed code, it is mandatory to know the previous fuzzy code at position i and stage (m − 1), i.e. $w_i^{m-1}$.
Method:
Case 1. If the code is a fixed code, then its nucleotide is immediate.
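The decoding rule of equation (3.2) is not reproduced in this text, so the following is only a sketch under a stated assumption. The values in table (2) are consistent with each fuzzy code being the running frequency of the nucleotides observed at a position up to stage m (counts divided by m, with a vacant position adding no count). Under that assumption, the nucleotide at stage m follows from the current and the previous code, which matches condition 2 above. The function name `decode_step` and the symbol `-` for a vacant position are hypothetical.

```python
ALPHABET = "AUGC"  # coordinate order used by the fuzzy code


def decode_step(w_prev, w_curr, m):
    """Recover the nucleotide at stage m from the codes at stages m-1 and m.

    Assumes (inferred from table (2), not stated in the text) that
    w_curr = ((m - 1) * w_prev + indicator of the new nucleotide) / m.
    Pass w_prev=None for m == 1. Returns '-' for a vacant position.
    """
    prev = w_prev if w_prev is not None else (0, 0, 0, 0)
    # Undo the running average: the increment in raw counts at stage m.
    delta = [m * c - (m - 1) * p for c, p in zip(w_curr, prev)]
    k = max(range(4), key=lambda j: delta[j])
    # The 0.5 threshold tolerates the truncated decimals printed in table (2).
    return ALPHABET[k] if delta[k] > 0.5 else "-"
```

With the table (2) entries for position 1, `decode_step((0.6, 0, 0.3, 0), (0.5, 0, 0.5, 0), 4)` yields `G`, even though the printed decimals are truncated.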

Validation of Encoding and Decoding in molecular phylogenetic trees
Suppose we have an RNA sequence of length 4 moving from stage 1 to stage 5, as shown in table (1). Using equation (3.1), we encode the RNA sequence into its fuzzy code, shown in table (2). Similarly, using equation (3.2), we decode the fuzzy code back into the RNA sequence.

Table (2). Fuzzy code (degrees of membership of A, U, G, C) at each position and stage.

          position 1     position 2        position 3        position 4
stage 1   1,0,0,0        1,0,0,0           0,1,0,0           0,0,0,1
stage 2   1,0,0,0        1,0,0,0           0,0.5,0.5,0       0.5,0,0,0.5
stage 3   0.6,0,0.3,0    0.6,0,0,0.3       0.3,0.3,0.3,0     0.3,0,0,0.3
stage 4   0.5,0,0.5,0    0.5,0.25,0,0.25   0.5,0.25,0.25,0   0.25,0.25,0,0.25
stage 5   0.6,0,0.4,0    0.4,0.2,0,0.4     0.4,0.2,0.2,0.2   0.4,0.2,0,0.2

In table (1), stage 1 represents the ancestral sequence, stage 2 is the descendant of stage 1, and so on. With the encoding technique of equation (3.1), we preserve the mutation information, as shown in table (2). In table (2), each position expresses the changes in its nucleotides {A, U, G, C} as it moves through stages m = 1 to 5. That is, the code at position i and stage m records the nucleotide changes that occurred at that position during evolution from one species to another. With the help of tables (1) and (2), we can write the fuzzy code of figure (3.1). The fuzzy code preserves the information about nucleotide mutations at each stage, as given in equation (3.1). Owing to this information-preserving property, it helps to determine mutation spots in the evolutionary structure of RNA sequences. These spots can be useful in sequence alignment for identifying homologies. Suppose we want to know the nucleotide type at a particular stage m, for example at stage 4 in table (2). Then, using the decoding technique of equation (3.2), we can find the nucleotide sequence. The same can be done in the phylogenetic tree, if it is written in fuzzy code form.
The phylogenetic structure of an RNA sequence is a graphical approach. It is therefore natural to analyse the evolution of the fuzzy polynucleotide of an RNA sequence and its secondary structure with the help of graph theory. In the next section, we construct the fuzzy graph, in which fuzzy nucleotides act as vertices and the interaction between two nucleotides acts as an edge.
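As a hedged illustration of the encoding step, the sketch below assumes that the fuzzy code at position i and stage m is the running frequency of each nucleotide observed at that position up to stage m. Equation (3.1) is not reproduced in this text, so this rule is an inference from the values of table (2), not the authors' stated formula. The input sequences are likewise inferred by working backwards from table (2), since table (1) is not available here; `-` marks a vacant position, and the name `encode` is hypothetical.

```python
from fractions import Fraction

ALPHABET = "AUGC"  # coordinate order of the fuzzy code


def encode(stages):
    """Encode equal-length sequences (stage 1 first) into fuzzy codes.

    Assumed rule: code = (count of each nucleotide seen so far) / (stage number).
    Returns table[m-1][i-1] as a 4-tuple of exact fractions.
    """
    n = len(stages[0])
    counts = [[0, 0, 0, 0] for _ in range(n)]
    table = []
    for m, seq in enumerate(stages, start=1):
        row = []
        for i, x in enumerate(seq):
            if x != "-":  # a vacant position contributes no count
                counts[i][ALPHABET.index(x)] += 1
            row.append(tuple(Fraction(c, m) for c in counts[i]))
        table.append(row)
    return table


# Sequences inferred from table (2); they are an assumption, not source data.
INFERRED_STAGES = ["AAUC", "AAGA", "GCA-", "GUAU", "ACCA"]
FUZZY_TABLE = encode(INFERRED_STAGES)
```

Under these assumptions, stage 5 at position 1 comes out as (3/5, 0, 2/5, 0), matching the entry 0.6,0,0.4,0 in table (2), and stage 3 at position 1 comes out as (2/3, 0, 1/3, 0), which the table prints truncated as 0.6,0,0.3,0.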

Fuzzy Graph of RNA sequence
It is well known that graphs are models of relations. A graph is a convenient way of representing information about the relationships between objects: the objects are represented by vertices and the relations by edges. Following this logic, we construct a fuzzy graph for each fuzzy polynucleotide, in which fuzzy nucleotides act as vertices and interactions between two fuzzy nucleotides act as edges. In the phylogenetic tree of RNA sequences of different species, we characterise each sequence by its length and its stage, i.e. $V_n^m$ is a fuzzy polynucleotide of length n at stage m.
Here $w_i^m = \{a_1^m, a_2^m, a_3^m, a_4^m\}$ is the fuzzy nucleotide at position i at stage m, which expresses the variation in the degree of membership over the same set of nucleotides X, and $a_j^m$ represents the degree of membership of a particular nucleotide at stage m for each coordinate j ∈ {1, 2, 3, 4}. We assign a unique degree of membership to the fuzzy nucleotide at each position i ∈ {1, 2, 3, ..., n} at stage m. [Note: $w_i^m(X) = \{\varphi\}$ means a vacant position.]
RNA secondary structure is one type of graphical representation. In this section, we defined the graph-theoretic preliminaries in fuzzy code form. In the next section, we construct the RNA secondary structure using a fuzzy system.
[...] variety of secondary structures. In this section, we construct the mathematical formulation of secondary structure using the fuzzy system. A fuzzy polynucleotide is a sequence of fuzzy nucleotides, or fuzzy codes. A secondary structure [1] of RNA is a fuzzy structure $H_n^m = (V_n^m, E)$, where $V_n^m$ is a fuzzy polynucleotide of length n at stage m in which each fuzzy nucleotide acts as a vertex, associated with the adjacency matrix $E = (e(z_i^m, z_t^m))_{1 \le i,t \le n}$ fulfilling three conditions:

Fuzzy code on RNA secondary structure
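(a) $e(z_i^m, z_{i+1}^m) \in (0, 1]$ for all $1 \le i \le n-1$.
(b) For each i, there is at most one $t \notin \{i-1, i+1\}$ such that $e(z_i^m, z_t^m) = 1$.
(c) If $e(z_i^m, z_t^m)$ and $e(z_k^m, z_l^m)$ are two interactions of fixed codes of nucleotides and $i < k < t$, then $i < l < t$.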
Condition (c) guarantees that the fuzzy structure is free of knots and pseudoknots. The vertices of the fuzzy polynucleotide are the individual fuzzy nucleotides in the order defined by the RNA sequence of the fuzzy polynucleotide $V_n^m$, which is a string of length n at stage m over the nucleotide alphabet X. The edges $\{e(z_i^m, z_{i+1}^m)\}$ constitute the ribose-phosphate backbone of the nucleotides. An edge $e(z_i^m, z_t^m)$ with $z_t^m \notin \{z_{i-1}^m, z_{i+1}^m\}$ is called a base pair of the secondary structure. Each $z_i^m$ that is connected only to its backbone neighbours $z_{i-1}^m$ and $z_{i+1}^m$ is called an unpaired nucleotide of the fuzzy structure. The numbers of base pairs and of unpaired fuzzy nucleotides in the fuzzy structure $H_n^m$ are denoted by $n_p(H_n^m)$ and $n_u(H_n^m)$, respectively. Accordingly, the chain length of the fuzzy structure can be expressed in terms of base pairs and unpaired fuzzy nucleotides as
$$n = 2\,n_p(H_n^m) + n_u(H_n^m).$$
Each RNA fuzzy polynucleotide folds, through the base pairing property, into an RNA fuzzy structure (secondary structure). So there is a one-one correspondence between a fuzzy polynucleotide of length n at stage m and its fuzzy structure as defined above. This conventionally assumed one-one correspondence now needs to be extended, because the folding of an RNA fuzzy polynucleotide into a fuzzy structure is viewed as a map that assigns a uniquely defined base pairing pattern (AU, UA, GC, CG, UG, GU) to every fuzzy polynucleotide. It is worth pointing out that the relation between the fuzzy polynucleotide and the fuzzy structure is introduced only via the pairing rules.
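Conditions (b) and (c) and the chain-length identity can be checked mechanically. The sketch below is illustrative only: it represents a structure by its list of base pairs (i, t) with 1-based positions, ignores the membership values on the backbone edges of condition (a), and uses the hypothetical name `check_structure`.

```python
def check_structure(n, pairs):
    """Validate a secondary structure of length n given as base pairs (i, t).

    Returns (n_p, n_u), the numbers of base pairs and unpaired positions,
    or raises ValueError if a condition is violated.
    """
    seen = set()
    norm = []
    for i, t in pairs:
        i, t = min(i, t), max(i, t)
        if i < 1 or t > n:
            raise ValueError("position out of range")
        if t - i < 2:
            raise ValueError("backbone neighbours cannot form a base pair")
        if i in seen or t in seen:
            raise ValueError("condition (b): a position pairs at most once")
        seen.update((i, t))
        norm.append((i, t))
    # Condition (c): if i < k < t for pairs (i, t) and (k, l), then i < l < t.
    for i, t in norm:
        for k, l in norm:
            if i < k < t and not (i < l < t):
                raise ValueError("condition (c): pseudoknot detected")
    n_p = len(norm)
    n_u = n - 2 * n_p  # chain length n = 2 * n_p + n_u
    return n_p, n_u
```

For a hairpin of length 10 with pairs (1, 10), (2, 9), (3, 8) the function returns three base pairs and four unpaired positions, while the crossing pairs (1, 5) and (3, 8) are rejected.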
The fuzzy structure is characterised solely by its pairing scheme
$$\pi(H_n^m) = \{[w_i^m, w_t^m] : e(z_i^m, z_t^m) = 1 \text{ and } t \notin \{i-1, i+1\}\}, \qquad (5.3)$$
where $[w_i^m, w_t^m]$ represents only the i-th and t-th positions of fuzzy nucleotides at stage m. Two fuzzy structures (secondary structures) are considered to be in the same class if and only if their base pairing positions are the same, independently of which base pair from the set (AU, UA, GC, CG, GU, UG) occupies each pairing position. Likewise, a set of fuzzy structures forms one class if and only if all its members share the same base pairing positions, again independently of the base pairs involved. From each class we select the structure with minimal free energy. Through this condition, we convert the one-one correspondence into a many-one map between fuzzy polynucleotide space and fuzzy structure space. The mapping is therefore non-invertible, since many sequences fold into the same minimal free energy fuzzy structure. With this in mind, in the next section we establish a many-to-one relation between RNA fuzzy polynucleotide space and fuzzy structure space, instead of the conventionally assumed one-one correspondence.

RNA fuzzy polynucleotide space and their structure space
RNA fuzzy polynucleotide space is the collection of all evolutionary stages of the RNA fuzzy polynucleotide in the phylogenetic tree, with constant chain length n at each stage m ∈ M, over the nucleotides X. It is a generalised hyperspace of dimension 4n, denoted by $Q_n^M$, because each fuzzy nucleotide has dimension 4. Each fuzzy polynucleotide folds on itself through the base pairing property to form a fuzzy structure. The collection of all fuzzy structures is called fuzzy structure space and is denoted by $F_n^M$. The mapping from fuzzy polynucleotide space onto fuzzy structure space is many-to-one. We model this mapping through the pre-images of a particular fuzzy structure in fuzzy polynucleotide space; these pre-images are the fuzzy neural networks, which are subspaces of $Q_n^M$, since each such subspace of fuzzy polynucleotides folds into one minimal free energy fuzzy structure. The fuzzy structure is characterised by its base pairing positions (equation (5.3)). So we take a minimal free energy fuzzy structure, say $H_n^{m_1}$, where $m_1 \in M$.
A subspace of $Q_n^M(X)$ is compatible with the fuzzy structure $H_n^{m_1}$ if the condition of equation (5.1) is fulfilled for all of $\pi(H_n^{m_1})$ (equation (5.3)). In other words, the fuzzy nucleotides at the i-th and t-th positions of a compatible fuzzy polynucleotide are capable of forming a base pair whenever the pair $[w_i^m, w_t^m]$ belongs to $\pi(H_n^{m_1})$. The set of all compatible fuzzy polynucleotides is denoted by $D(H_n^{m_1})$. The cardinality of the set of fuzzy polynucleotides compatible with the fuzzy structure $H_n^{m_1}$ is given by [...].
Consider a combinatory map $f : Q_n^M \to F_n^M$, where $F_n^M$ denotes the space of all RNA fuzzy structures formed from $Q_n^M$ through the base pairing property. The mapping is many-one. The pre-image $f^{-1}(H_n^{m_1})$, which consists of all fuzzy polynucleotides folding into the fuzzy structure $H_n^{m_1}$, is contained in the set of compatible fuzzy polynucleotides. Let $H_n^{m_1}$ be a fuzzy structure; then the subspace of $Q_n^M$ compatible with $H_n^{m_1}$, denoted by $C(H_n^{m_1})$, is given by [...], where Y ⊆ M. (5.6)
Two fuzzy polynucleotides belonging to $D(H_n^{m_1})$ are neighbours if they differ either in a single position that is an unpaired fuzzy nucleotide of the fuzzy structure, or in the two positions of one base pair [2], if and only if their fuzzy structures have the same numbers of unpaired and base-paired fuzzy nucleotides with respect to position.
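The displayed cardinality formula is not reproduced in this text. For ordinary (fixed code) sequences, compatibility with a structure means that every paired position carries one of the six allowed pairs (AU, UA, GC, CG, GU, UG), which gives the count $4^{n_u} \cdot 6^{n_p}$. The sketch below illustrates that fixed-code count only; whether the paper's own formula for fuzzy polynucleotides over stages has exactly this form is an assumption. The function names are hypothetical.

```python
from itertools import product

# The six pairing rules named in the text.
ALLOWED_PAIRS = {("A", "U"), ("U", "A"), ("G", "C"),
                 ("C", "G"), ("G", "U"), ("U", "G")}


def is_compatible(seq, pairs):
    """True if every base pair (i, t), 1-based, carries an allowed pair in seq."""
    return all((seq[i - 1], seq[t - 1]) in ALLOWED_PAIRS for i, t in pairs)


def count_compatible(n, pairs):
    """Closed-form count: 4 choices per unpaired position, 6 per base pair."""
    n_p = len(pairs)
    n_u = n - 2 * n_p
    return 4 ** n_u * 6 ** n_p


def enumerate_compatible(n, pairs):
    """Brute-force enumeration; only practical for small n."""
    return ["".join(s) for s in product("AUGC", repeat=n)
            if is_compatible(s, pairs)]
```

For n = 4 with the single base pair (1, 4), both the closed form and the enumeration give 96 compatible sequences out of 256.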
The fuzzy neural network is modelled as a subspace of fuzzy polynucleotide space to provide a framework for deriving analytical results that can be used as a reference for fuzzy neural networks of RNA [3,4]. The fuzzy network model is built on the set of compatible fuzzy polynucleotides of a given fuzzy structure $H_n^{m_1}$. A fuzzy polynucleotide is selected at random from the set of compatible fuzzy polynucleotides, with unpaired fuzzy nucleotides and base-paired fuzzy nucleotides treated separately. There are two elementary moves on compatible fuzzy polynucleotides: base exchange for unpaired fuzzy nucleotides and base pair exchange for paired fuzzy nucleotides. Each fuzzy polynucleotide $V_n^m$ has a certain number of fuzzy neural neighbours, $u_i^m$ in the unpaired base exchange neighbourhood and $p_l^m$ in the paired base exchange neighbourhood.
The fraction of neural neighbours of length n moving from stage m to m′, or from m′ to m, within the compatible polynucleotides of the fuzzy structure $H_n^{m_1}$ is given by
$$\frac{\text{total number of unpaired mutations moving from stage } m \text{ to } m'}{(X - 1)\, n_u}$$
and [...] (5.7)
Averaging over all polynucleotides of the fuzzy neural network, which is the set of compatible fuzzy polynucleotides of a given fuzzy structure $H_n^{m_1}$, gives [...], where Y is the cardinality of the fuzzy neural network, which is a subset of the set M.
In this section, we established a many-one mapping between RNA fuzzy polynucleotide space and fuzzy structure space. The difference between fuzzy polynucleotide space and fuzzy structure space is due to the base pairing property. The difference between two fuzzy polynucleotides is due to differences in unpaired bases with respect to position, and the difference between two fuzzy structures is due to Watson-Crick and GU wobble pairs with respect to position. We analysed the fuzzy neural network of compatible fuzzy polynucleotide sequences, which is a collection of the evolution of primary sequences over some stages Y ⊂ M in fuzzy code form that fold into a minimal free energy fuzzy structure. We developed an equation to calculate the fraction of neural neighbours between two fuzzy polynucleotides of fixed length at different stages within the fuzzy neural network that maps to the minimal free energy fuzzy structure. In the next section, we analyse RNA fuzzy structure space, which is formed from fuzzy polynucleotide space using Watson-Crick and GU wobble pairs.
[...] (from equation (5.9)) [...] (5.10). Hence proved.
In this section, we analysed the difference between two RNA fuzzy structures in fuzzy structure space with the help of the involution metric, the symmetric group and the symmetric difference. We established a one-one correspondence between base pairs in the fuzzy structure, handled by the symmetric difference method, and transpositions in the symmetric group, because both represent abstractly similar things: one represents a base pair of fuzzy nucleotides in the structure, and the other represents a pair of fuzzy nucleotide positions, termed a transposition in the symmetric group. Following this logic, in proposition (5.2.2) we calculated the difference between two fuzzy structures by two independent methods and established a relation between them.
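The two ways of comparing structures can be sketched as follows. The sketch assumes that each structure is represented by the involution that swaps the two positions of every base pair, and that d_inv is the least number of transpositions needed to write the product of the two involutions, as stated in proposition 5.2.1; for a permutation on n points this number equals n minus its number of cycles (fixed points included). The function names are hypothetical, and the code does not assert any particular relation between the two distances.

```python
def involution(n, pairs):
    """Permutation (1-based list, index 0 unused) swapping i and t for each pair."""
    perm = list(range(n + 1))
    for i, t in pairs:
        perm[i], perm[t] = t, i
    return perm


def min_transpositions(perm):
    """Least number of transpositions whose product is perm: n minus cycle count."""
    n = len(perm) - 1
    seen = [False] * (n + 1)
    cycles = 0
    for start in range(1, n + 1):
        if not seen[start]:
            cycles += 1
            j = start
            while not seen[j]:
                seen[j] = True
                j = perm[j]
    return n - cycles


def d_inv(n, pairs_a, pairs_b):
    """Involution distance: transposition length of the product of both involutions."""
    pa, pb = involution(n, pairs_a), involution(n, pairs_b)
    prod = [0] + [pa[pb[j]] for j in range(1, n + 1)]
    return min_transpositions(prod)


def d_symdiff(pairs_a, pairs_b):
    """Base pairs present in exactly one of the two structures."""
    norm = lambda ps: {(min(i, t), max(i, t)) for i, t in ps}
    return len(norm(pairs_a) ^ norm(pairs_b))
```

For the structures {(1, 8)} and {(2, 7)} on 8 positions both distances are 2, whereas for {(1, 4), (7, 10)} and {(1, 10), (4, 7)} on 12 positions the involution distance is 2 and the symmetric difference has 4 elements, so the two measures need not coincide.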

Discussion
In this study, we converted each nucleotide into an ordered fuzzy set and each sequence of nucleotides, or polynucleotide, into a fuzzy polynucleotide. The fuzzy code is the degree of membership of the fuzzy set derived from equation (2.2). Evolution occurs through various genetic events, including transversion substitution, transition substitution, recombination, insertion and deletion. RNA sequences are frequently used in constructing molecular phylogenies. The molecular evolution of the RNA sequence at each stage represents the unique characteristics of a species, which was our sequence of interest. On this basis, we wrote a unique fuzzy code for each RNA sequence, as shown in figure (3.1), with the help of table (1) and table (2). Thus, we have represented each node of a phylogenetic tree in terms of fuzzy code. The fuzzy code preserves the information about nucleotide mutations at each stage, which is expressed mathematically in equations (3.1), (3.1.1) and (3.1.2). Owing to this information-preserving property, it helps to determine mutation spots in the evolutionary structure of RNA sequences. These spots can be useful in sequence alignment for identifying homologies. Conversely, with the decoding technique, we determine the RNA sequence at any stage from its fuzzy code using equation (3.2).
We have also constructed a fuzzy graph for the secondary structure representation, in which fuzzy nucleotides act as vertices derived from equation (4.1) and the interaction between two fuzzy nucleotides, namely the hydrogen bond between them, is treated as an edge. We treated the collection of all encoded fuzzy codes of the phylogenetic structure as a fuzzy polynucleotide space, denoted by $Q_n^M$. Each fuzzy polynucleotide folds, through Watson-Crick and GU wobble base pairing, to form a fuzzy structure (secondary structure). We called the collection of all fuzzy structures the fuzzy structure space, denoted by $F_n^M$. We developed a fuzzy model of secondary structure derived from equation (5.1). Folding of an RNA fuzzy polynucleotide into a fuzzy structure is viewed as a map that assigns a uniquely defined base pairing pattern to every polynucleotide, derived from equation (5.3). Conventionally, one sequence yields one secondary structure through the base pairing property. We considered a many-one mapping instead of the conventional one-one mapping, because many sequences fold into the same minimum free energy fuzzy structure.
We considered the pre-image of a fuzzy structure as a fuzzy neural network, which is a subspace of fuzzy polynucleotide space. We developed a method for determining the cardinality of the fuzzy neural network using equation (5.6), the fraction of neural neighbours between two fuzzy polynucleotides using equation (5.7), and the average neural neighbour of the fuzzy neural network using equation (5.8). We analysed the difference between two RNA fuzzy structures in fuzzy structure space through the involution metric, the symmetric group and the symmetric difference, and established a relation between them, which is explained in detail in propositions (5.2.1) and (5.2.2).
2.) Let $e(w_i^m, w_t^m)$ be the interaction between two fuzzy nucleotides $w_i^m$ and $w_t^m$, which is a symmetric fuzzy relation on $V_n^m \times V_n^m$ and equals the interaction between their decoded nucleotides, $e(z_i^m, z_t^m)$ [from equation (3.2)]. Mathematically, $e : z_i^m \times z_t^m \to [0, 1]$ for all $i \ne t \in \{1, 2, 3, \ldots, n\}$. The complementary bases A-U, G-C and G-U form stable base pairs with each other through hydrogen bonds. Owing to this base pairing property, the RNA sequence forms a secondary structure.
From Proposition 5.2.1: $d_{inv} = n(P(H_n^m)\,P(H_n^{m'}))$, where $n(P(H_n^m)\,P(H_n^{m'}))$ is the least number of transpositions. In this proposition, however, we compute $d_{inv}$ explicitly.