Bucket Sort

Bucket sort is a sorting algorithm that works by distributing elements into buckets representing subranges, then sorting each bucket individually.

The steps for bucket sort are:

  1. Set up an array of empty “buckets” - typically arrays or linked lists.

  2. Scatter: Go through the original array, putting each element in its corresponding bucket based on some hash function.

  3. Sort: Sort each non-empty bucket individually using a simple sorting algorithm like insertion sort.

  4. Gather: Visit the buckets in order and put all elements back into the original array.

For example, to bucket sort an array of integers from 0-100, we could create 10 buckets representing 10 subranges of 0-9, 10-19, …, 90-99. Each integer is hashed to the appropriate bucket, then each bucket is sorted individually.

The time complexity can be O(n) on average if the hash function evenly distributes elements and the buckets can be efficiently sorted. Bucket sort works well for sorting arrays with large ranges but small subranges.

Some key properties:

  • Typically fast and stable sort
  • Space overhead is O(n+k) for n elements and k buckets
  • Works well for evenly distributed keys
  • Often used as subroutine in radix sort

Overall, bucket sort is a useful distribution-based sorting algorithm when optimal buckets can be chosen based on the input.

Bucket sort is a sorting algorithm that works by distributing elements into buckets representing ranges, then sorting each bucket. It can achieve linear time complexity on average.

Java example:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
void bucketSort(float[] arr, int numBuckets) {

  @SuppressWarnings("unchecked")
  ArrayList<Float>[] buckets = new ArrayList[numBuckets];

  for (int i = 0; i < numBuckets; i++)
    buckets[i] = new ArrayList<>();

  // distribute elements into buckets  
  for (float elem: arr) {
    buckets[(int)Math.floor(elem * numBuckets)].add(elem);
  }

  // sort individual buckets
  for (List<Float> bucket: buckets) {
    Collections.sort(bucket);
  }

  // gather from buckets back to original array
  int index = 0;
  for (float elem: buckets) {
    for (float e: elem) {
      arr[index++] = e; 
    }
  }
}

C++ example:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
void bucketSort(vector<float>& arr, int numBuckets) {
  
  vector<vector<float>> buckets(numBuckets);
  
  for(float elem: arr) {
    buckets[floor(elem * numBuckets)].push_back(elem);  
  }
  
  for(auto &bucket: buckets) {
    sort(bucket.begin(), bucket.end()); 
  }
  
  int index = 0;
  for(auto &bucket: buckets) {
    for(float elem: bucket) {
      arr[index++] = elem;
    }
  }
  
}

Python example:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
def bucket_sort(arr, num_buckets):

  buckets = [[] for _ in range(num_buckets)]

  for elem in arr:
    buckets[int(elem * num_buckets)].append(elem)
  
  for bucket in buckets:
    bucket.sort()

  index = 0
  for bucket in buckets:
    for elem in bucket:
      arr[index] = elem
      index += 1

The key steps are:

  1. Initialize buckets
  2. Distribute elements into buckets
  3. Sort each bucket
  4. Gather elements back to array

Bucket sort is useful when the input can be evenly divided into ranges for efficient sorting.

Bucket Sort is a sorting algorithm that distributes elements into various buckets, sorts each bucket individually, and then merges them to form a sorted array. It’s useful when input data are uniformly distributed across a range.

It assumes that the input is from a uniform distribution. It is fast because it assumes something about the input. It assumes that the input is generated by a random process that distributes elements uniformly and independently over the interval [0, 1).

Code Explanation

Java

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
import java.util.*;

public class BucketSort {
    public static void sort(float[] arr) {
        int n = arr.length;
        ArrayList<Float>[] buckets = new ArrayList[n];

        // Initialize buckets
        for (int i = 0; i < n; i++) {
            buckets[i] = new ArrayList<>();
        }

        // Add elements to buckets
        for (float num : arr) {
            int index = (int) (num * n);
            buckets[index].add(num);
        }

        // Sort individual buckets
        for (ArrayList<Float> bucket : buckets) {
            Collections.sort(bucket);
        }
        
        // Concatenate buckets
        int index = 0;
        for (ArrayList<Float> bucket : buckets) {
            for (float num : bucket) {
                arr[index++] = num;
            }
        }
    }
}

C++

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
#include <iostream>
#include <vector>
#include <algorithm>

void bucketSort(float arr[], int n) {
    std::vector<float> buckets[n];
    
    // Add elements to buckets
    for (int i = 0; i < n; i++) {
        int index = arr[i] * n;
        buckets[index].push_back(arr[i]);
    }
    
    // Sort individual buckets
    for (int i = 0; i < n; i++) {
        std::sort(buckets[i].begin(), buckets[i].end());
    }
    
    // Concatenate buckets
    int index = 0;
    for (int i = 0; i < n; i++) {
        for (float num : buckets[i]) {
            arr[index++] = num;
        }
    }
}

Python

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
def bucket_sort(arr):
    n = len(arr)
    buckets = [[] for _ in range(n)]
    
    # Add elements to buckets
    for num in arr:
        index = int(num * n)
        buckets[index].append(num)
    
    # Sort individual buckets
    for bucket in buckets:
        bucket.sort()
        
    # Concatenate buckets
    index = 0
    for bucket in buckets:
        for num in bucket:
            arr[index] = num
            index += 1

In each code snippet, we first create an array of empty buckets. Each element is then placed into a bucket based on its value. Each bucket is sorted individually, and finally, the sorted elements from all the buckets are concatenated to get the sorted array.