Quickselect

excerpt: The quickselect algorithm is a selection method. It finds the k-th smallest number. tags: three-way-comparison

The quickselect algorithm is a selection method. It finds the k-th smallest number (also called the k-th order statistic) in a unsorted list. The input consists of a numerical list a, lower and upper indices defining a subslist within it and a positive integer k. The algorithm will search for the k-th smallest number in the specified sublist.

Implementation

The size of the problem is the length of the sublist. The smallest instances of the problem correspond to lists that contain a single element. Therefore, the base case occurs when the lower and upper indices are identical, where the method can simply return the element of the list.

For larger lists partitioning scheme is used to arrange the list in three parts. A first sublist contains the elements of ‘a’ that are <= a chosen pivot element from the list. This list is followed by another one that only contains the pivot. Finally, a third list contains the elements of ‘a’ that are > the pivot.

Let ip denote the index of the pivot. Since the function that implements the partition scheme returns ip, we can check whether it is equal to k - 1 (since indices start at location 0, the j-th element appears at position j - 1). If it is, then the pivot will be the k-th smallest element in ‘a’ and the method can terminate at this base case. This scenario is illustrated in the figure.

In the recursive case the algorithm reduces the size of the problem by considering either the sublist to the left or to the right of the pivot. If ip > k - 1, then the location of the searched element will be smaller than ip. Thus the algorithm can focus on the sublist to the left of the pivot.

Instead, if ip < k - 1 then the algorithm will focus on the sublist to the right of the pivot.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
def quick_select(a, lower, upper, k)
  if lower == upper
    return a[lower]
  end
  
  pivot_index = partition(a, lower, uppper)
  
  if pivot_index == k - 1
    return a[pivot_index]
  elsif pivot_index < k - 1
    return quick_select(a, pivot_index + 1, upper, k)
  else
    return quick_select(a, lower, pivot_index - 1, k)
  end
end

The runtime cost of the algorithm depends on where the pivot is located after performing the decomposition. If it is always located at the middle position of the list, the run time is linear. This is the best case scenario.

However, if the pivot is always located at an extreme of the list, then the runtime cost is quadratic. This is the worst case scenario for the algorithm.

Quickselect vs Quicksort

The quickselect and quicksort are different algorithms for sorting and selection:

  • Quicksort is a sorting algorithm. It recursively partitions an array, then sorts the left and right partitions. This allows it to efficiently sort the entire array.

  • Quickselect is a selection algorithm. It uses the same partitioning approach as quicksort, but only recurses into one partition. This allows it to efficiently find the kth smallest/largest element in an array without needing to fully sort it.

The main differences:

  • Quicksort fully sorts an array by recursively sorting the full left and right partitions. Quickselect only recurses on one side to isolate the kth element.

  • Quicksort has average O(n log n) time complexity for sorting. Quickselect has average O(n) time complexity for selection.

  • Quicksort is comparison based. Quickselect relies on key-preserving partitions.

  • Quicksort is unstable while quickselect is stable.

So in summary:

  • Quicksort fully sorts an array using partitioning and recursion
  • Quickselect partially sorts to find the kth element using one-sided recursion
  • They rely on the same partitioning approach but have different goals

They are related algorithms but have distinct uses - full sorting vs partial selection.