Maximal Repeating Substring

From GM-RKB
Jump to navigation Jump to search

See: Repeating Substring, Maximal, Maximal Contiguous Subsequence.



References

2003

  • (Bray et al., 2003) ⇒ Nick Bray, Inna Dubchak, and Lior Pachter. (2003). “AVID: A Global Alignment Program.” In: Genome Research, 13.
    • Finding Matches Using Suffix -Trees: A maximal repeated substring in a string is a subsequence that has the property that every subsequence that contains it is not repeated in the string (Gusfield 1997). The problem of finding all maximal repeated substrings of a single string has a straightforward solution using suffix trees (Fig. 2). Maximal matches between two sequences are a pair of matching subsequences (one from each sequence) whose flanking bases are mismatches. In AVID, the problem of finding all maximal matches between two sequences is transformed to the problem of finding maximal repeated substrings in one string. This is done by concatenating the two sequences and placing the character N between them. A maximal repeat in this string that crosses the boundary between the two sequences represents a maximal match between the two sequences. ...
  • (Mukhrejee et al., 2003) ⇒ Sayan Mukherjee, Guizhen Yang, and I.V. Ramakrishnan. (2003). “Automatic Annotation of Content-rich HTML Documents: Structural and Semantic Analysis.” In: Proceedings of the 2nd International Semantic Web Conference (ISWC 2003).
    • QUOTE: Definition 2 (Maximal Repeating Substrings) Given a string S and a support threshold value [math]\displaystyle{ \theta }[/math], a substring that repeats k times in S is a maximal repeating substring if and only if: (i) [math]\displaystyle{ k \ge 2 }[/math] and j�j � k � � � jSj (jSj denotes the length of S); and (ii) j�j�k is the maximum among all substrings that satisfy condition (i); and (iii) k is the maximum among all substrings that satisfy conditions (i) and (ii). … Essentially the above definition says that a maximal repeating substring should be, first of all, a repeating substring that covers a majority of elements. ...

1998

  • http://www.cs.umd.edu/Outreach/hsContest98/questions/node1.html
    • Maximum Repeating Substring: You must find a substring of an input string, such that the substring consecutively repeats the most number of times.

      A string s is said to be a substring of string t if the characters of s appear consecutively in order within t. For example, BAT is a substring of ACROBATS since these characters occur in the same order as follows ACROBATS.

      A substring repeats consecutively if it appears multiple times in a string, with no gaps between its occurrences. For example, the string BLAAABADBADX" has two repeating substrings: 'A', which repeats three times, and 'BAD', which repeats twice. The answer reported is '3', since the length of the repeating substring is not considered. Only consecutive repetitions of a substring are counted, so the last two occurrences of 'A' don't count toward the total number of repetitions.

1997

  • (Gusfield, 1997) ⇒ D. Gusfield. (1997). “Algorithms on Strings, Trees, and Sequences: Computer science and computational biology." Cambridge University Press.