Longest Repeating Substring

To solve the problem of finding the length of the longest repeating substring, we can use the binary search algorithm combined with the Rabin-Karp algorithm for string matching.

Here’s how the solution works:

  1. Binary Search for Length: Perform a binary search on the length of the substring we want to find (from 1 to the length of the string). For each candidate length L, we will use a rolling hash to check if there’s a repeating substring with length L.

  2. Rabin-Karp Algorithm for String Matching: The Rabin-Karp algorithm uses hashing to find substring matches. We will use it to find if there are repeating substrings with a given length L.

  3. Update Solution: If we find a repeating substring with length L, we will update our binary search range to look for longer substrings. If not, we will look for shorter substrings.

Here’s the code:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
class Solution:
    def longestRepeatingSubstring(self, s: str) -> int:
        n = len(s)

        # Function to check if there's a repeating substring with length L
        def search(L):
            seen = set()
            base = 26
            cur_hash = 0
            for i in range(L):
                cur_hash = cur_hash * base + ord(s[i]) - ord('a')
            seen.add(cur_hash)

            # Rolling hash for remaining characters
            baseL = pow(base, L)
            for i in range(L, n):
                cur_hash = (cur_hash * base - (ord(s[i - L]) - ord('a')) * baseL + ord(s[i]) - ord('a')) % (10**9 + 7)
                if cur_hash in seen:
                    return True
                seen.add(cur_hash)
            return False

        # Binary search to find the longest repeating substring
        left, right = 1, n
        while left <= right:
            mid = (left + right) // 2
            if search(mid):
                left = mid + 1
            else:
                right = mid - 1

        return right

Key Takeaways

  • Binary Search: Using binary search to look for the optimal length helps reduce the time complexity.
  • Rabin-Karp Algorithm: The Rabin-Karp algorithm is an efficient way to look for substring matches.
  • Time Complexity: The overall time complexity of this solution is O(N * log N), where N is the length of the string.
  • Space Complexity: The space complexity is O(N), as we are storing the hashes of substrings.