Sequence Alignment System

From GM-RKB

A Sequence Alignment System is a Range-type String Matching System that can solve a sequence alignment task by implementing a sequence alignment algorithm.



References

2009

  • (Wikipedia, 2009) ⇒ http://en.wikipedia.org/wiki/Sequence_alignment
    • In bioinformatics, a sequence alignment is a way of arranging the sequences of DNA, RNA, or protein to identify regions of similarity that may be a consequence of functional, structural, or evolutionary relationships between the sequences. Aligned sequences of nucleotide or amino acid residues are typically represented as rows within a matrix. Gaps are inserted between the residues so that identical or similar characters are aligned in successive columns.

      Sequence alignments are also used for non-biological sequences, such as those present in natural language or in financial data.
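
The row-and-gap representation described above can be illustrated with a small sketch. It uses global alignment by dynamic programming (the Needleman-Wunsch approach), which the quoted passage does not name; it is swapped in here only as one well-known way to produce such rows. The function name `global_align` and the scoring values (match +1, mismatch -1, gap -1) are assumptions chosen for illustration.

```python
def global_align(a, b, match=1, mismatch=-1, gap=-1):
    """Return (score, row_a, row_b) for one optimal global alignment.

    The scoring parameters are illustrative assumptions, not values
    taken from the quoted sources.
    """
    n, m = len(a), len(b)

    def sub(i, j):
        # Score for aligning a[i-1] with b[j-1] in the same column.
        return match if a[i - 1] == b[j - 1] else mismatch

    # score[i][j] = best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + sub(i, j),  # align two residues
                score[i - 1][j] + gap,            # gap in b
                score[i][j - 1] + gap,            # gap in a
            )

    # Trace back from the bottom-right cell to recover the two rows.
    row_a, row_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            row_a.append(a[i - 1])
            row_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            row_a.append(a[i - 1])
            row_b.append("-")
            i -= 1
        else:
            row_a.append("-")
            row_b.append(b[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(row_a)), "".join(reversed(row_b))
```

Under these assumed scores, `global_align("ACGT", "AGT")` returns `(2, "ACGT", "A-GT")`: the two rows have equal length, and the gap symbol `-` places the shared residues in the same columns.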

2002

  • (Navarro & Raffinot, 2002) ⇒ Gonzalo Navarro, and Mathieu Raffinot. (2002). “Flexible Pattern Matching in Strings.” Cambridge University Press.
    • QUOTE: Sequence comparison is about determining similarities and correspondences between two or more strings. It is related to approximate searching (Chapter 6) and has many applications in computational biology, speech recognition, computer science, coding theory, chromatography, and so on. These applications look for similarities between sequences of symbols. The general goal is to perform basic operations over the strings until they become equal.

      ... A concept of "distance" between two strings can be defined according to the minimum cost of making them equal.

      ... In some cases it is useful to measure the degree of similarity rather than of dissimilarity (i.e., a distance). One example is the LCS, a heavily studied measure. Other examples are the shortest common supersequence (SCS), longest common substring (LCG, different from LCS because the common string has to be a contiguous substring of both sequences), and shortest common superstring (SCG), as well as their versions for more than two strings.
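
The distance and similarity measures mentioned in this quote can be sketched as three short dynamic programs. This is a minimal illustration, not the book's own formulation: it assumes unit costs for insertion, deletion, and substitution (the Levenshtein distance), reads LCS as the longest common subsequence, and uses hypothetical function names.

```python
def edit_distance(a, b):
    """Minimum number of unit-cost insertions, deletions, and
    substitutions needed to make a and b equal (assumed cost model)."""
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(
                prev[j] + 1,               # delete ca
                curr[j - 1] + 1,           # insert cb
                prev[j - 1] + (ca != cb),  # substitute, or keep if equal
            ))
        prev = curr
    return prev[-1]


def lcs_length(a, b):
    """Length of the longest common subsequence (need not be contiguous)."""
    prev = [0] * (len(b) + 1)
    for ca in a:
        curr = [0]
        for j, cb in enumerate(b, 1):
            if ca == cb:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def longest_common_substring_length(a, b):
    """Length of the longest contiguous substring shared by a and b."""
    best = 0
    prev = [0] * (len(b) + 1)
    for ca in a:
        curr = [0]
        for j, cb in enumerate(b, 1):
            # A common run can only be extended by a matching character.
            curr.append(prev[j - 1] + 1 if ca == cb else 0)
            best = max(best, curr[j])
        prev = curr
    return best
```

For the pair `"ABCBDAB"` and `"BDCABA"`, `lcs_length` gives 4 while `longest_common_substring_length` gives 2, which shows the effect of the contiguity requirement that separates the two measures.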

1974