Subsets With Duplicate Characters

tags: reducing-the-domains-of-the-variables set sort choose-explore-unchoose dfs

Given a string that might contain duplicate characters, find all the possible distinct subsets of that string.

Solution with TLE Problem

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
def get_distinct_subsets(str)
  output = []
  subset = []
  index = 0
  dfs(str, index, subset, output)
  
  return output.sort
end

def dfs(str, index, subset, output)
  result = subset.dup
  answer = result.sort.join
  
  unless output.include?(answer)
    output << answer
  end
  
  for i in (index..str.length-1)
    if i > index && str[i] == str[i-1]
      next
    end
    
    subset << str[i]
    dfs(str, i+1, subset, output)
    subset.pop
  end

  output
end

This solution leads to TLE issue because we are using sort and include? on the candidate solution.

Working Solution

The TLE problem can be overcome by using a set.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
require 'set'

def get_distinct_subsets(str)
  output = Set.new
  subset = []
  index = 0
  dfs(str, index, subset, output)
  
  return output.to_a
end


def dfs(str, index, subset, output)
  output << subset.sort.join
  
  for i in (index..str.length-1)
    if i > index && str[i] == str[i-1]
      next
    end
    
    subset << str[i]
    dfs(str, i+1, subset, output)
    subset.pop
  end

  output
end