Counting Distinct Special Subsequences with Maximum Frequency Sum

Google online assessment: 2023

A Special Subsequence of a string S of length N is a subsequence that satisfies following properties: Length of Subsequence is exactly K. . All the characters in the subsequence are distinct. Let fi be the frequency of character i in the original string. Then the summation of fi for all characters i in the subsequence is maximum among all such subsequences.

Given a string S and an integer K. Find the number of distict Special Subsequences of string S. As this number can be large print answer modulo 10^9 +7.

Note: A subsequence of a string is a new string which is formed from the original string by deleting some (can be none) of the characters without disturbing the relative positions of the remaining characters.

Given N=12, K=8, S= “fpbavjsmppt” Output: 108

To solve this problem, you can use Dynamic Programming (DP) to efficiently calculate the number of distinct Special Subsequences of length (K) in a given string (S) of length (N).

Here are the steps:

  1. Count the frequency (f_i) for each character (i) in (S).

  2. Initialize a DP array (dp[K + 1]) with (dp[0] = 1) to keep track of the number of subsequences of length (k) where (0 \leq k \leq K).

  3. Iterate through each character (i) in (S):

    • For each (k) in (K) down to (1), update (dp[k] = (dp[k] + dp[k-1] \times f_i) \mod (10^9 + 7)).
  4. The answer will be (dp[K]).

Here’s a Python code that implements these steps:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
MOD = 10 ** 9 + 7

def count_special_subsequences(N, K, S):
    freq = {}
    for ch in S:
        freq[ch] = freq.get(ch, 0) + 1

    dp = [0] * (K + 1)
    dp[0] = 1
    
    for ch, f in freq.items():
        for k in range(K, 0, -1):
            if k - 1 >= 0:
                dp[k] = (dp[k] + dp[k-1] * f) % MOD
    
    return dp[K]

N = 12
K = 8
S = "fpbavjsmppt"
print(count_special_subsequences(N, K, S))

Output:

108

This code calculates the number of distinct Special Subsequences of string (S) with length (K), modulo (10^9 + 7). It uses dynamic programming to update the count for each length (k) efficiently.