Kth Largest Element in an Array

title: Kth Largest Element in an Array tags: sort heap

Find the kth largest element in an unsorted array. Note that it is the kth largest element in the sorted order, not the kth distinct element.

Example 1:
Input: [3,2,1,5,6,4] and k = 2
Output: 5
Example 2:
Input: [3,2,3,1,2,4,5,5,6] and k = 4
Output: 4

Note: You may assume k is always valid, 1 ≤ k ≤ array’s length.

If the input is sorted:

1,2,3,4,5,6

We can access the element in O(1) time. O(1) time to access the kth largest element if we store it in min heap.

Solution Outline

  1. Create a min heap of size k (k is the invariant)
  2. Traverse through the input list and add the elements which are greater than the min element to the heap
  3. Extract the min element from the heap and return as the result.

[3,2,1,5,6,4]

  • Create a max heap of size = k
  • Traverse through the nums array
    • Add first k elements into the heap
3
2
  • 1 < 3 skip the element
  • 5 > 3

We need a min heap so the minimum value gets kicked out to make room for the larger number.

  • Create a min heap of size = k

  • Traverse through the nums array

  • Add first k elements into the heap

2
3
  • 1 < 3 skip the element 1
  • 5 > 3

Add 5 to the heap

3
5
  • 6 > 5

Add 6 to the heap

5
6
  • 4 < 6 skip the element 4

Extract the first element in the heap. This is the kth largest element.

1
2
3
4
5
def find_kth_largest(nums, k)
  nums.sort!
  
  nums[-k]
end
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
def find_kth_largest(nums, k)
  min_heap = [nums[0]]
  
  nums[1..-1].each do |n|
    i = min_heap.bsearch_index { |y| y >= n } || min_heap.size
    min_heap.insert(i, n)
    
    # Heap size cannot exceed k 
    if min_heap.size > k
      min_heap.shift
    end
  end
  
  min_heap.first
end

Tracing of the program:

n : 2
i: 0
Before insert: [3]
After insert: [2, 3]
Maintain k elements: [2, 3]
n : 1
i: 0
Before insert: [2, 3]
After insert: [1, 2, 3]
Maintain k elements: [2, 3]
n : 5
i: 2
Before insert: [2, 3]
After insert: [2, 3, 5]
Maintain k elements: [3, 5]
n : 6
i: 2
Before insert: [3, 5]
After insert: [3, 5, 6]
Maintain k elements: [5, 6]
n : 4
i: 0
Before insert: [5, 6]
After insert: [4, 5, 6]
Maintain k elements: [5, 6]
5

Define the Interface 

Input: nums (array of integers), k (integer) Output: integer


Analyze the Input and Output 
 
- The input is not sorted.
- The value of k must be between [1, nums.size]
- Duplicate numbers can be in the input

Identify the constraints

1. 1 <= k <= nums.length <= 10^4
2. -10^4 <= nums[i] <= 10^4

Search the Problem Space

- kth largest element => Heap

Solve the Problem by Hand

nums = [3,2,1,5,6,4], k = 2

If the input is in ascending order: [1,2,3,4,5,6], grab the kth largest element by looking from the end of the array.

Time: O(n log n) Space: O(1) This WRONG


Sorting algorithms can use extra space. O(N).

Analyze Time and Space Complexity

Sorting and accessing the kth element from the array.

Solution Outline

- Max heap

We will keep the largest at the top. You can get the largest number in O(1) time.

6 5

   
- Min heap

5 6


[3,2,1,5,6,4]

- Create a max heap of size = k
- Traverse through the nums array
  	- Add first k elements into the heap

3 2


- 1 < 3 skip the element
- 5 > 3 
   
We need a min heap so the minimum value gets kicked out to make room for the larger number.
 
- Create a min heap of size = k
- Traverse through the nums array
  	- Add first k elements into the heap

2 3

 
- 1 < 3 skip the element 1
- 5 > 3 

Add 5 to the heap

3 5

     
- 6 > 5 

Add 6 to the heap

5 6

     
- 4 < 6 skip the element 4 

We will create a heap of size = k.

T: O(n log k) S: O(k)


```ruby
# @param {Integer[]} nums
# @param {Integer} k
# @return {Integer}
def find_kth_largest(nums, k)
  min_heap = [nums[0]]
  
  nums[1..-1].each do |n|
    i = min_heap.bsearch_index{|y| y >= n } || min_heap.size
    min_heap.insert(i, n)
      
    # heap size is always = k  
    if min_heap.size > k
      min_heap.shift
    end 
  end
  
  min_heap.first  
end

Define the Terms

Define the Interface Input: nums (array of integers), k (integer) Output: integer

Analyze the Input and Output

The input is not sorted. The value of k must be between [1, nums.size] Duplicate numbers can be in the input

Identify the Invariants

Identify the constraints

  1. 1 <= k <= nums.length <= 104
  2. -104 <= nums[i] <= 104

Search the Problem Space

  • kth largest element => Heap

Classify the Problem

Analyze the Examples

Solve the Problem by Hand

nums = [3,2,1,5,6,4], k = 2

Input in ascending order: [1,2,3,4,5,6] Grab the kth largest element by looking from the end of the array.

Time: O(n log n) Space: O(1) This WRONG

Sorting algorithm can use extra space. O(N).

Describe the Pattern

Distill the Observations to Create Insights

Brute Force Approach

Analyze Time and Space Complexity

Sorting and accessing the kth element from the array.

Outline the Solution

  • Max heap We will keep the largest at the top. You can get largest number in O(1) time.

    6 5

  • Min heap

    5 6

[3,2,1,5,6,4]

Create a max heap of size = k Traverse through the nums array

  • Add first k elements into the heap

3 2

  • 1 < 3 skip the element
  • 5 > 3

We need a min heap so the minimum value gets kicked out to make room for the larger number.

Create a min heap of size = k Traverse through the nums array

  • Add first k elements into the heap

2 3

  • 1 < 3 skip the element 1

  • 5 > 3 Add 5 to the heap

    3 5

- 6 > 5 
  Add 6 to the heap
 
 5
 6
 
- 4 < 6 skip the element 4 

We will create a heap of size = k.

Key Takeaways

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
# @param {Integer[]} nums
# @param {Integer} k
# @return {Integer}
def find_kth_largest(nums, k)
  min_heap = [nums[0]]
  
  nums[1..-1].each do |n|
    i = min_heap.bsearch_index{|y| y >= n } || min_heap.size
    min_heap.insert(i, n)

#   heap size is always = k  
    if min_heap.size > k
       min_heap.shift
    end 
  end
  
  min_heap.first  
end

To find the kth largest element without sorting the array, you can utilize the heap data structure. A min-heap will be perfect for this problem.

Here’s the idea:

  1. Create a min-heap and insert the first k elements from the array into the heap.
  2. Iterate through the remaining elements in the array. For each element, compare it with the smallest element in the heap (the root of the min-heap).
  3. If the current element is larger than the root of the heap, pop the root from the heap and insert the current element.
  4. The root of the heap will always contain the kth largest element since we are maintaining only k elements in the heap at any time.

Here’s a Python code snippet to illustrate this solution:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
from heapq import heappush, heappop

class Solution:
    def findKthLargest(self, nums: List[int], k: int) -> int:
        min_heap = []
        for i in range(k):
            heappush(min_heap, nums[i])
        
        for i in range(k, len(nums)):
            if nums[i] > min_heap[0]:
                heappop(min_heap)
                heappush(min_heap, nums[i])

        return heappop(min_heap)

This code snippet uses a min-heap to store the largest k elements seen so far. By the end of the loop, the heap will contain the k largest elements, and the root will contain the kth largest element, which is the result.

Key takeaways:

  • Min-heap is used to keep track of the k largest elements.
  • The complexity of this solution is O(n log k) where n is the length of the input array, and k is the value provided as input. It avoids sorting the entire array, which would be O(n log n).

Identifying Problem Isomorphism

“Find K Pairs with Smallest Sums” is an approximate isomorphism for “Kth Largest Element in an Array”. Both involve dealing with a ‘k’ parameter and require knowledge of sorting and heap data structures.

In “Kth Largest Element in an Array”, you need to find the kth largest element in an unsorted array. This might involve sorting the array or using a heap to efficiently find the kth largest element.

In “Find K Pairs with Smallest Sums”, you are given two integer arrays nums1 and nums2, and you are supposed to find the k pairs (u, v), where u is in nums1 and v is in nums2, that they have the smallest sums. This problem is more complex than the “Kth Largest Element in an Array” as you need to consider pairs of elements, not individual ones, and work with two separate arrays. It’s also likely to involve using a min-heap to keep track of the pairs with the smallest sums.

“Kth Largest Element in an Array” is simpler because you only need to focus on finding the kth largest element in one array, while “Find K Pairs with Smallest Sums” involves dealing with two arrays and finding pairs, which increases complexity.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
    def findKthLargest(self, nums: List[int], k: int) -> int:
        if not nums or not k or k < 0:
            return None
        maxheap, res = [], None
        for num in nums:
            heapq.heappush(maxheap, -num)

        for _ in range(k):
            res = -heapq.heappop(maxheap)
        return res

Problem Classification

The problem can be classified as follows:

  1. Sorting/Ranking: The problem is about finding the kth largest element, which essentially involves sorting or ranking elements in a particular order (descending in this case). The term “kth largest” implies an ordering of the elements.

  2. Array Manipulation: The given data structure in this problem is an array. Manipulation and analysis of this array is required to solve the problem.

  3. Selection: The task involves selecting a specific element (the kth largest one) from the array, which can be classified as a selection problem.

  4. Parameterized Complexity: The complexity of the problem depends on an external parameter (k), apart from the size of the input array.

  5. Search: This problem involves searching for a particular element in the array (the kth largest), which fits into the search problem category.

The actual solution may involve different algorithmic techniques like QuickSelect (variation of QuickSort), heap data structure, or sorting, but these are implementation details.

Language Agnostic Coding Drills

  1. Conditional Checks: Checking for conditions to validate inputs or variables. For example, checking if a list is empty or if a variable meets a certain criteria.

  2. List Operations: Creating lists and adding elements to lists.

  3. Loops: Iterating over a list using a for loop.

  4. Heap Operations: Using a heap data structure, adding elements to a heap, and retrieving elements from a heap. In Python, the heapq module is often used for this purpose.

  5. Negation Operation: Using negative numbers to switch between min heap and max heap in Python’s heapq module which supports min heap natively.

  6. Returning Values from Functions: Returning a result from a function.

Organized by increasing level of difficulty, they can be presented as follows:

  1. Conditional Checks: Understand how to use conditional checks to validate input data or to control program flow.
  2. List Operations: Learn to create lists, add elements to lists, and access list elements.
  3. Loops: Master the usage of for loops to iterate over a list.
  4. Returning Values from Functions: Learn how to return a result from a function.
  5. Negation Operation: Understand the usage of negation operation in the context of transforming max heap to min heap.
  6. Heap Operations: Understand the concept of heap, and learn how to use heap operations like push and pop.

The concepts learned from these smaller units can then be combined to solve the larger problem presented in the code: finding the kth largest element in a list.

Targeted Drills in Python

  1. Conditional Checks

    Goal: Write a function that checks if a list is empty or not. The function should return True if the list is empty and False otherwise.

    1
    2
    3
    4
    5
    
    def is_empty(lst):
        # your code here
    
    print(is_empty([]))  # should return True
    print(is_empty([1, 2, 3]))  # should return False
    
  2. List Operations

    Goal: Create a list and add elements to it. Print the list before and after adding elements.

    1
    2
    
    my_list = []
    # your code here to add elements to the list
    
  3. Loops

    Goal: Write a function that accepts a list of numbers and prints each number in the list.

    1
    2
    3
    4
    
    def print_numbers(lst):
        # your code here
    
    print_numbers([1, 2, 3, 4, 5])  # should print each number
    
  4. Returning Values from Functions

    Goal: Write a function that returns the square of a number.

    1
    2
    3
    4
    
    def square(n):
        # your code here
    
    print(square(5))  # should print 25
    
  5. Negation Operation

    Goal: Write a function that accepts a list of numbers and returns a new list with each number negated.

    1
    2
    3
    4
    
    def negate_list(lst):
        # your code here
    
    print(negate_list([1, 2, 3]))  # should print [-1, -2, -3]
    
  6. Heap Operations

    Goal: Create a min heap using heapq module in Python, add some numbers to it, and print the smallest number without popping it.

    1
    2
    3
    4
    
    import heapq
    
    heap = []
    # your code here to add numbers and print the smallest one
    

After practicing each of these drills, you should have a deeper understanding of the various concepts embedded within the code sample. Each of these drills isolates a particular concept, allowing you to focus on understanding and mastering it. Once you are comfortable with these concepts, you can then combine them as demonstrated in the initial code sample.