Selection Problem

Selection problems involve finding the kth smallest or largest element in a list. This has applications in statistics and visualization.

Some key algorithms are:

  • Quickselect - Use quicksort partitioning
  • Median of medians - Avoid bad pivot selection
  • Heap selection - Use max/min heap

Java - Quickselect:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
int quickselect(int[] arr, int k) {
  shuffle(arr); // Randomize pivot
  return select(arr, 0, arr.length - 1, k - 1);
}

int select(int[] arr, int lo, int hi, int k) {
  if (lo <= hi) {
    int p = partition(arr, lo, hi);
    if (p == k) {
      return arr[p];
    } else if (p < k) {
      return select(arr, p + 1, hi, k); 
    } else {
      return select(arr, lo, p - 1, k);
    }
  }
  return -1; // Invalid k
}

C++ - Heap selection:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
int heapselect(vector<int>& arr, int k) {
  make_heap(arr.begin(), arr.begin()+k);

  for (int i = k; i < arr.size(); i++) {
    if (arr[i] > arr[0]) {
      pop_heap(arr.begin(), arr.begin()+k);
      arr[k-1] = arr[i];
      push_heap(arr.begin(), arr.begin()+k);
    }
  }

  return arr[0]; // Kth element
}

Python - Median of medians:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
import statistics

def select(lst, k):
  
  if len(lst) <= 5:
    return sorted(lst)[k]
  
  medians = []
  for i in range(0, len(lst), 5):
    medians.append(statistics.median(lst[i:i+5]))

  pivot = select(medians, len(medians) // 2)
  
  parts = partition(lst, pivot)
  
  if k < len(parts[0]):
    return select(parts[0], k)
  elif k < len(parts[0]) + len(parts[1]):
    return pivot
  else:
    return select(parts[2], k - (len(parts[0]) + len(parts[1])))

Selection provides rank-based statistics on datasets. Used for finding quantiles, medians, and outliers.

The selection problem is a computational task that involves finding the (k)-th smallest (or largest) element in an unordered collection. This element is known as the (k)-th order statistic. The selection problem can be solved using various algorithms, from simple ones like sorting the entire array and picking the (k)-th element to more efficient methods like QuickSelect.

Java Code for Selection Problem

In Java, the QuickSelect algorithm can solve the selection problem efficiently. Here’s a simplified example:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
public class QuickSelect {
    public static int quickSelect(int[] arr, int low, int high, int k) {
        int pivotIndex = partition(arr, low, high);

        if (pivotIndex == k) {
            return arr[pivotIndex];
        } else if (pivotIndex > k) {
            return quickSelect(arr, low, pivotIndex - 1, k);
        } else {
            return quickSelect(arr, pivotIndex + 1, high, k);
        }
    }

    public static int partition(int[] arr, int low, int high) {
        int pivot = arr[high];
        int i = low;

        for (int j = low; j < high; j++) {
            if (arr[j] < pivot) {
                int temp = arr[i];
                arr[i] = arr[j];
                arr[j] = temp;
                i++;
            }
        }

        arr[high] = arr[i];
        arr[i] = pivot;

        return i;
    }

    public static void main(String[] args) {
        int[] arr = {5, 3, 9, 1, 4};
        int k = 2;
        int result = quickSelect(arr, 0, arr.length - 1, k);
        System.out.println("The " + (k+1) + "th smallest element is: " + result);
    }
}
  1. Use quickSelect to find the (k)-th smallest element.
  2. Inside quickSelect, call partition to separate smaller and larger elements around a pivot.
  3. Recur on the correct sub-array based on the pivot index.

C++ Code for Selection Problem

In C++, QuickSelect can be implemented in a manner similar to Java:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
#include <iostream>
#include <algorithm>

int quickSelect(int arr[], int low, int high, int k);
int partition(int arr[], int low, int high);

int main() {
    int arr[] = {5, 3, 9, 1, 4};
    int k = 2;
    int result = quickSelect(arr, 0, 4, k);
    std::cout << "The " << k+1 << "th smallest element is: " << result << std::endl;
}

int quickSelect(int arr[], int low, int high, int k) {
    int pivotIndex = partition(arr, low, high);

    if (pivotIndex == k) {
        return arr[pivotIndex];
    } else if (pivotIndex > k) {
        return quickSelect(arr, low, pivotIndex - 1, k);
    } else {
        return quickSelect(arr, pivotIndex + 1, high, k);
    }
}

int partition(int arr[], int low, int high) {
    int pivot = arr[high];
    int i = low;

    for (int j = low; j < high; ++j) {
        if (arr[j] < pivot) {
            std::swap(arr[i], arr[j]);
            ++i;
        }
    }

    std::swap(arr[i], arr[high]);
    return i;
}
  1. The quickSelect function is similar to Java’s, using a partition function to separate elements.
  2. Recur on the correct partition.

Python Code for Selection Problem

In Python, QuickSelect can be implemented concisely:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
def quick_select(arr, low, high, k):
    pivot_index = partition(arr, low, high)
    
    if pivot_index == k:
        return arr[pivot_index]
    elif pivot_index > k:
        return quick_select(arr, low, pivot_index - 1, k)
    else:
        return quick_select(arr, pivot_index + 1, high, k)

def partition(arr, low, high):
    pivot = arr[high]
    i = low
    
    for j in range(low, high):
        if arr[j] < pivot:
            arr[i], arr[j] = arr[j], arr[i]
            i += 1
    
    arr[i], arr[high] = arr[high], arr[i]
    return i

if __name__ == "__main__":
    arr = [5, 3, 9, 1, 4]
    k = 2
    result = quick_select(arr, 0, len(arr) - 1, k)
    print(f"The {k+1}th smallest element is: {result}")
  1. Use quick_select to find the (k)-th smallest element.
  2. Use partition to partition the array around a pivot.
  3. Recur as necessary.

The QuickSelect algorithm implemented in Java, C++, and Python provides an efficient way to solve the selection problem. It leverages partitioning and recursion to find the (k)-th smallest element without sorting the entire array.