[toc]

## preface

Statement: refer to the Internet, and leave a message if there is any dispute. Standing on the shoulders of our predecessors, we can see further.

This tutorial is hand-made, dedicated to the most practical tutorial, no reward, just hope to forward more support.

Welcome to my official account, I hope I can get to know you, and I can also urge you to search more. WeChat search: JavaPub

You can talk about any problem!

If you read the last one Count sort , do you have any doubt that when the span between each data is too large (such as sorting 20 numbers from 0-200 million numbers), a lot of space consumption is required. Bucket sorting is right Count sort Improvement.

## 1. Bucket sort

Baidu Encyclopedia:

Bucket sort, or so-called box sort, is a sort algorithm. Its working principle is to divide the array into a limited number of buckets. Each bucket will be sorted individually (it is possible to use another sorting algorithm or continue to use bucket sorting recursively). Bucket order is Pigeonhole Sort An inductive result of. When the values in the array to be sorted are evenly distributed, bucket sorting uses linear time (Θ (n)). However, bucket sorting is not comparative sorting, and it is not affected by the lower O(n log n) limit.

Continue -- >

Bucket order is Count sort Upgrade version of. It makes use of the mapping relation of the function. The key to its efficiency lies in the determination of the mapping function. In order to make bucket sorting more efficient, we need to do these two things:

- Increase the number of barrels as much as possible with sufficient additional space
- The mapping function can evenly distribute the input N data into K buckets

At the same time, for the sorting of elements in buckets, it is very important to choose which sort algorithm to compare.

Bucket sorting is to store the elements in the same value field in the set to be sorted into the same bucket. That is to say, the set is divided into multiple regions according to the characteristics of element values. Then, the multiple buckets formed after splitting are in an orderly state from the value field. If the elements in each bucket are sorted, the set of elements in all buckets is sorted.

Quick sorting is to divide the set into two value fields, which are called two buckets, and then sort the two buckets respectively, and finally complete the sorting. Bucket sorting is to split the collection into multiple buckets, and sort each bucket to complete the sorting process. The difference between the two is that fast sorting is to sort on the set itself, which belongs to the in-situ sorting method, and the sorting method for each bucket is also fast sorting. Bucket sorting provides additional operation space, in which the bucket can be sorted, avoiding the element comparison and exchange operations that constitute the bucket process. At the same time, the proper sorting algorithm can be selected independently to sort the buckets.

## 2. Principle

### 2.1. Key

- The division of element value field is the mapping rule from element to bucket. The mapping rules need to be selected according to the element distribution characteristics of the set to be sorted. If the rules are designed too fuzzy and broad, it may lead to all elements in the set to be sorted mapped to a bucket, and then the bucket sorting will evolve to the comparative sorting algorithm. If the mapping rules are too specific and strict, it may lead to the mapping of each element value in the set to be sorted to a bucket, then the bucket sorting will evolve to count sorting.
- The selection of sorting algorithm, the process of mapping elements in the set to be sorted to each bucket, does not exist element comparison and exchange operation. When sorting elements in each bucket, you can choose the appropriate sorting algorithm independently. The complexity and stability of the bucket sorting algorithm are different according to the different sorting algorithms.

### 2.2. Algorithm process

- According to the difference range and mapping rules of the maximum and minimum elements in the set to be sorted, the number of buckets applied for is determined;
- Traverse the set to be sorted, and move each element to the corresponding bucket;
- Sorts the elements in each bucket and moves them to the sorted collection.

The sorted set mentioned in step 3 is the same set as the one to be sorted in steps 1 and 2. Different from counting sorting, after step 2 of bucket sorting is completed, all elements are in the bucket, and after sorting the elements in the bucket, the original collection is no longer relied on in the process of moving elements, so you can move the elements in the bucket back to the original collection.

- Sketch Map

Elements are assigned to different buckets:

Then, the elements are sorted in each bucket:

## 3. Code

Based on Java code, the code logic is well understood and used to Insert sort , if you don't understand, click send.

`package utils; import java.util.Arrays; /** * @author wangshiyu rodert * @date 2020/6/21 15:13 * @description */ public class BucketSort { public static void main(String[] args) throws Exception { int[] array = {2, 1, 5, 3, 4}; BucketSort bucketSort = new BucketSort(); int[] sort = bucketSort.sort(array); System.out.println(Arrays.toString(sort)); } private static final InsertSort insertSort = new InsertSort(); public int[] sort(int[] sourceArray) throws Exception { // Copy the arr without changing the parameter content int[] arr = Arrays.copyOf(sourceArray, sourceArray.length); return bucketSort(arr, 5); } private int[] bucketSort(int[] arr, int bucketSize) throws Exception { if (arr.length == 0) { return arr; } int minValue = arr[0]; int maxValue = arr[0]; for (int value : arr) { if (value < minValue) { minValue = value; } else if (value > maxValue) { maxValue = value; } } int bucketCount = (int) Math.floor((maxValue - minValue) / bucketSize) + 1;//Round down + 1 int[][] buckets = new int[bucketCount][0]; // Using mapping function to allocate data to each bucket for (int i = 0; i < arr.length; i++) { int index = (int) Math.floor((arr[i] - minValue) / bucketSize); buckets[index] = arrAppend(buckets[index], arr[i]); } int arrIndex = 0; for (int[] bucket : buckets) { if (bucket.length <= 0) { continue; } // Sort each bucket using insert sort bucket = insertSort.sort(bucket); for (int value : bucket) { arr[arrIndex++] = value; } } return arr; } /** * Automatically expand capacity and save data * * @param arr * @param value */ private int[] arrAppend(int[] arr, int value) { arr = Arrays.copyOf(arr, arr.length + 1); arr[arr.length - 1] = value; return arr; } } class InsertSort { //Insert sort public int[] sort(int[] sourceArray) throws Exception { // Copy the arr without changing the parameter content int[] arr = Arrays.copyOf(sourceArray, sourceArray.length); // Select the appropriate location to insert from the element with subscript 1, because there is only one element with subscript 0, which is ordered by default for (int i = 1; i < arr.length; i++) { // Record data to insert int tmp = arr[i]; // Compare from the rightmost of the sorted series to find the smaller number int j = i; while (j > 0 && tmp < arr[j - 1]) { arr[j] = arr[j - 1]; j--; } // There are smaller numbers, inserting if (j != i) { arr[j] = tmp; } } return arr; } }`

Return result:

[1, 2, 3, 4, 5]

Arrays.copyOf() method understanding: used to copy the specified array contents to achieve the purpose of expansion. This method has corresponding overload methods for different basic data types.

## 4. Extended reading

True topic: 347. Top k frequency elements (medium), given a non empty integer array, returns the elements with the highest K before the frequency.

Given a non-empty array of integers, return the k most frequent elements.

- Explanation:

`//Solving "top K high frequency elements" based on bucket sorting class Solution { public List<Integer> topKFrequent(int[] nums, int k) { List<Integer> res = new ArrayList(); // Use the dictionary to count the number of occurrences of each element. The element is the key and the number of occurrences of the element is the value HashMap<Integer,Integer> map = new HashMap(); for(int num : nums){ if (map.containsKey(num)) { map.put(num, map.get(num) + 1); } else { map.put(num, 1); } } //Bucket sort //The frequency is used as the array index, and the corresponding array index is stored for the number sets with different frequency List<Integer>[] list = new List[nums.length+1]; for(int key : map.keySet()){ // Get the number of occurrences as a subscript int i = map.get(key); if(list[i] == null){ list[i] = new ArrayList(); } list[i].add(key); } // Traversing array in reverse order to get the order from large to small for(int i = list.length - 1;i >= 0 && res.size() < k;i--){ if(list[i] == null) continue; res.addAll(list[i]); } return res; } }`