Shortest Common Superstring
- AKA: SCS.
- See: Subsequence, Common Subgraph Task, Sequence Alignment Task, Longest Common Substring (LCG), Shortest Common Superstring (SCG), Approximate String Matching.
- Task: given two or more strings, find a shortest string that contains each of them as a (contiguous) substring.
- (Navarro & Raffinot, 2002) ⇒ Gonzalo Navarro, and Mathieu Raffinot. (2002). “Flexible Pattern Matching in Strings.” Cambridge University Press.
- Sequence comparison is about determining similarities and correspondences between two or more strings. It is related to approximate searching (Chapter 6) and has many applications in computational biology, speech recognition, computer science, coding theory, chromatography, and so on. These applications look for similarities between sequences of symbols. The general goal is to perform basic operations over the strings until they become equal.
- A concept of "distance" between two strings can be defined according to the minimum cost of making them equal.
- In some cases it is useful to measure the degree of similarity rather than of dissimilarity (i.e., a distance). One example is the LCS, a heavily studied measure. Other examples are the shortest common supersequence (SCS), longest common substring (LCG, different from LCS because the common string has to be a contiguous substring of both sequences), and shortest common superstring (SCG), as well as their versions for more than two strings.
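
To make the superstring definition above concrete, the following is a minimal brute-force sketch in Python. It is not taken from Navarro & Raffinot; the function names `overlap` and `shortest_common_superstring` are hypothetical, chosen only for illustration. The sketch tries every ordering of the input strings and merges neighbours with their maximal suffix/prefix overlap, so its running time grows factorially with the number of strings and it is only suitable for very small inputs.

```python
from itertools import permutations


def overlap(a, b):
    """Length of the longest suffix of `a` that is also a prefix of `b`."""
    for k in range(min(len(a), len(b)), 0, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def shortest_common_superstring(strings):
    """Return one shortest string containing every input as a substring.

    Illustrative brute force: exponential in the number of strings.
    """
    # Drop duplicates, then drop strings already contained in another one;
    # they never lengthen the answer.
    unique = list(dict.fromkeys(strings))
    kept = [s for s in unique if not any(s != t and s in t for t in unique)]
    if not kept:
        return ""

    best = None
    for order in permutations(kept):
        merged = order[0]
        for nxt in order[1:]:
            # Append only the part of `nxt` not already covered by the overlap.
            merged += nxt[overlap(merged, nxt):]
        if best is None or len(merged) < len(best):
            best = merged
    return best


if __name__ == "__main__":
    print(shortest_common_superstring(["abc", "bcd", "cde"]))
```

For the inputs `abc`, `bcd`, and `cde`, the ordering that chains the overlaps `bc` and `cd` gives a superstring of length 5. A supersequence, by contrast, would only need each input to appear as a (not necessarily contiguous) subsequence.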