Greatest Common Divisor of Strings

To approach this problem, we first need to understand the underlying mathematical principle of greatest common divisor (gcd). Here, instead of numbers, we are dealing with strings. But the concept remains the same - we need to find the largest string that can divide both given strings.

The basic strategy is to find the smallest common divisor and iteratively check if it is also a divisor of the larger string.

Python solution:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
class Solution:
    def gcdOfStrings(self, str1: str, str2: str) -> str:
        if len(str1) < len(str2):  # always keep str1 >= str2
            return self.gcdOfStrings(str2, str1)

        if not str1.startswith(str2):  # if str1 doesn't start with str2, there is no common divisor
            return ''

        if len(str2) == 0:  # str2 is the gcd string
            return str1

        # Recursive call to find gcd string
        return self.gcdOfStrings(str1[len(str2):], str2)

This algorithm works by recursively reducing the problem to the gcd of the suffix of str1 (after removing str2 from the start of str1) and str2, until one string is empty. The non-empty string at that point is the gcd of strings.

The termination condition for this recursive process is when str1 doesn’t start with str2 or str2 becomes empty.

This algorithm follows the Euclidean algorithm for computing gcd of two numbers. The time complexity is O(N^2), where N is the length of the strings.

What is the base case?

The base case for this recursive solution is when the second string str2 becomes empty or when the string str1 doesn’t start with str2.

In other words, the base cases are:

  1. if not str1.startswith(str2): This checks if the larger string str1 starts with str2. If it does not, it means there’s no common divisor, and an empty string '' is returned.

  2. if len(str2) == 0: This checks if str2 has been completely subtracted from str1 and has become empty. If it is, it means that str1 is the largest string that can divide both given strings, so str1 is returned.

What is the unit of work done in each recursive call?

In each recursive call of the function, the unit of work done is checking if str1 starts with str2 and making a new recursive call with updated values for str1 and str2.

Specifically, if str1 starts with str2, str1 is updated by removing the starting part of str1 that is equal to str2, and str2 remains the same. Then, a recursive call is made with the updated str1 and str2.

If str1 does not start with str2, a new recursive call is made with str2 and the remainder of str1 after removing str2 from the start, in order to find the greatest common divisor of the two strings.

Thus, the unit of work in each recursive call consists of string comparison and manipulation, followed by a recursive function call.

It seems like we are looking for substring matches in the solution. Am I right?

Yes, you’re correct. Specifically, we’re looking for the largest substring that is a prefix of both str1 and str2 and that can completely divide both strings. That is, both str1 and str2 are composed of one or more repetitions of this substring. This is akin to finding the Greatest Common Divisor (GCD) of two numbers, except here we’re dealing with strings.

How do we reduce the size of the problem in each recursive call?

We reduce the size of the problem in each recursive call by considering the remainder of dividing the longer string by the shorter one. This is done by taking the substring of the longer string that starts from the end of a complete occurrence of the shorter string.

If str1 is longer than str2, we take the substring of str1 that starts from the end of a complete occurrence of str2 in str1 and recursively call the function with this new substring and str2. If str2 is longer than str1, we do the similar operation.

This approach is similar to the Euclidean algorithm for computing the greatest common divisor of two numbers. We continually subtract the smaller number from the larger one until we reach zero, at which point the remaining number is the GCD. In this string version, we’re continually subtracting the smaller string from the larger one by taking substrings until we reach an empty string, at which point the remaining string is the “greatest common divisor” of the original strings.