Binary Search Template

Suppose we have a search space. It could be an array, a range, etc. Usually it’s sorted in ascending order. For most tasks, we can transform the requirement into the following generalized form:

Minimize k, s.t. condition(k) is True

The following code is the most generalized binary search template:

def binary_search(search_space) -> int:
    def condition(value) -> bool:
        pass

    left, right = min(search_space), max(search_space)  # could be [0, n], [1, n] etc., depending on the problem
    while left < right:
        mid = left + (right - left) // 2
        if condition(mid):
            right = mid
        else:
            left = mid + 1
    return left

What’s really nice is that, for most binary search problems, we only need to modify three parts after copy-pasting this template, and never need to worry about corner cases or bugs in the code any more:

  1. Correctly initialize the boundary variables left and right to specify the search space. Only one rule: set up the boundary to include all possible elements.

  2. Decide the return value. Is it return left or return left - 1? Remember this: after exiting the while loop, left is the minimal k satisfying the condition function.

  3. Design the condition function. This is the most difficult and most beautiful part. It needs lots of practice.

Explanation

The generalized binary search function described above is a template for binary search problems, which usually involve finding a particular value in a sorted sequence of values (i.e., the search space). The function works by repeatedly dividing the search space in half, hence the name “binary” search.

The three components above – initializing the boundaries, deciding the return value, and designing the condition function – are the keys to adapting this template to specific problems. Here’s a more detailed explanation of each component:

  1. Initializing the boundaries: The “left” and “right” variables determine the range of the search space. They need to be initialized in such a way that all possible solutions to the problem are included in the range. For instance, if you’re searching for a specific value in an array, “left” would be 0 (the index of the first element) and “right” would be the length of the array minus 1 (the index of the last element).

  2. Deciding the return value: The return value of the function is the result of the binary search. Because the loop runs while “left” is strictly less than “right”, it exits exactly when the two meet, and at that point “left” is the minimal k satisfying the condition function. If the problem asks for that minimal k directly, return “left”; if it asks for the last value that fails the condition (as in the Sqrt(x) problem later), return “left - 1”.

  3. Designing the condition function: The condition function is used to decide whether to move “left” or “right” at each step of the binary search. Designing this function is often the trickiest part because it requires a deep understanding of the problem. Crucially, the condition must be monotone: once it becomes true for some value, it stays true for all larger values. Binary search then converges on the exact boundary where the condition flips from false to true.

In conclusion, the generalized binary search template provides a structured approach to solving binary search problems. However, it requires a clear understanding of the problem to correctly initialize the boundaries, decide the return value, and design the condition function. Practice with different binary search problems will help you become more comfortable with these steps.
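To see the template in action before the real problems below, here is a minimal, self-contained sketch; the search space [0, 10) and the condition k >= 5 are made-up values for illustration only:

```python
def binary_search_first_true(left: int, right: int, condition) -> int:
    """Return the minimal k in [left, right) with condition(k) True."""
    while left < right:
        mid = left + (right - left) // 2
        if condition(mid):
            right = mid     # mid works; the answer is mid or something smaller
        else:
            left = mid + 1  # mid fails; the answer must be larger
    return left

# condition flips from False to True at k = 5
print(binary_search_first_true(0, 10, lambda k: k >= 5))  # 5
```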

Basic Usage

  1. First Bad Version [Easy]

You are a product manager and currently leading a team to develop a new product. Since each version is developed based on the previous version, all the versions after a bad version are also bad. Suppose you have n versions [1, 2, …, n] and you want to find out the first bad one, which causes all the following ones to be bad. You are given an API bool isBadVersion(version) which will return whether version is bad.

Example:

Given n = 5, and version = 4 is the first bad version.

call isBadVersion(3) -> false
call isBadVersion(5) -> true
call isBadVersion(4) -> true

Then 4 is the first bad version.

First, we initialize left = 1 and right = n to include all possible values. Then we notice that we don’t even need to design the condition function: it’s already given by the isBadVersion API. Finding the first bad version is equivalent to finding the minimal k satisfying isBadVersion(k) is True. Our template fits in very nicely:

class Solution:
    def firstBadVersion(self, n) -> int:
        left, right = 1, n
        while left < right:
            mid = left + (right - left) // 2
            if isBadVersion(mid):
                right = mid
            else:
                left = mid + 1
        return left

This problem is a classic application of binary search. We are given a range from 1 to n and we are to find the first version that is bad. We can consider this “being bad” as our condition for binary search.

Here’s the breakdown:

  1. We start by initializing the left to 1 and the right to n, covering the entire range of versions.

  2. The function isBadVersion(mid) is our condition function in this context. It’s given to us and it returns True if the version is bad and False otherwise. Our task is to find the smallest version number (k) that is bad. So, we can directly use this function in our template.

  3. During each iteration of the while loop, we calculate the middle version number (mid). We then check if this version is bad.

    • If isBadVersion(mid) is True, it means the mid and all versions after it are bad. But we need to find the first bad version, so our search space becomes all versions from left to mid, hence we update right = mid.

    • If isBadVersion(mid) is False, it means that all versions from left to mid are good. So, our search space becomes all versions from mid+1 to right, hence we update left = mid + 1.

  4. This process is repeated until left and right converge (i.e., left == right). At this point, both left and right point to the smallest version number that is bad, which is the solution to our problem.

  5. We return left (or right, since left == right).

This is a typical application of binary search where we’re trying to find the boundary (or “pivot”) between two types of elements (in this case, good and bad versions) in a sorted sequence.
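Since isBadVersion is only provided by the LeetCode judge, here is a self-contained sketch with a mocked API; the choice of version 4 as the first bad one simply mirrors the example above:

```python
FIRST_BAD = 4  # mock: pretend version 4 is the first bad version

def isBadVersion(version: int) -> bool:
    return version >= FIRST_BAD

def firstBadVersion(n: int) -> int:
    left, right = 1, n
    while left < right:
        mid = left + (right - left) // 2
        if isBadVersion(mid):
            right = mid
        else:
            left = mid + 1
    return left

print(firstBadVersion(5))  # 4
```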

  2. Sqrt(x) [Easy]

Implement int sqrt(int x). Compute and return the square root of x, where x is guaranteed to be a non-negative integer. Since the return type is an integer, the decimal digits are truncated and only the integer part of the result is returned.

Example:

Input: 4
Output: 2

Input: 8
Output: 2

Easy one. First we need to search for the minimal k satisfying the condition k^2 > x; then k - 1 is the answer to the question. Notice that I set right = x + 1 instead of right = x to deal with special input cases like x = 0 and x = 1.

def mySqrt(x: int) -> int:
    left, right = 0, x + 1
    while left < right:
        mid = left + (right - left) // 2
        if mid * mid > x:
            right = mid
        else:
            left = mid + 1
    return left - 1  # `left` is the minimum k value, `k - 1` is the answer

This problem involves finding the integer square root of a given non-negative number ‘x’. It asks us to find the largest integer ‘k’ such that k^2 is less than or equal to ‘x’.

Since the possible values of ‘k’ range from 0 to ‘x’, we can use binary search to efficiently find this ‘k’. This is why we initialize the left and right boundaries to 0 and x + 1 respectively. The + 1 on the right boundary is a small trick that keeps the search space valid for edge cases like x = 0 and x = 1.

Here’s how the code works:

  1. The while loop runs until ’left’ and ‘right’ meet. In each iteration of the loop, we calculate ‘mid’ as the midpoint of ’left’ and ‘right’.

  2. We then check if ‘mid’ squared is greater than ‘x’. This is our condition function in this context.

    • If ‘mid’ squared is greater than ‘x’, then ‘mid’ can’t be our answer because we are looking for ‘k’ such that k^2 <= x. So, we update ‘right’ to ‘mid’ to search in the left half.

    • If ‘mid’ squared is not greater than ‘x’, then ‘mid’ could be our answer. But we can’t be sure until we have checked the numbers greater than ‘mid’. So, we update ’left’ to ‘mid + 1’ to search in the right half.

  3. When ’left’ and ‘right’ meet, ’left’ points to the smallest ‘k’ such that k^2 > x. Therefore, our answer is ’left - 1’, which is the largest ‘k’ such that k^2 <= x.

  4. We then return ’left - 1’ as our answer.

This solution takes advantage of the fact that for any ‘k’, if k^2 <= x, then (k+1)^2 > x. This property allows us to use binary search to quickly find ‘k’.
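This invariant is easy to sanity-check with a brute-force comparison against Python’s math.isqrt; the range of test values below is an arbitrary choice:

```python
import math

def mySqrt(x: int) -> int:
    left, right = 0, x + 1
    while left < right:
        mid = left + (right - left) // 2
        if mid * mid > x:  # find minimal k with k^2 > x; the answer is k - 1
            right = mid
        else:
            left = mid + 1
    return left - 1

# cross-check against the standard-library integer square root
for x in range(1000):
    assert mySqrt(x) == math.isqrt(x)
print(mySqrt(4), mySqrt(8))  # 2 2
```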

  3. Search Insert Position [Easy]

Given a sorted array and a target value, return the index if the target is found. If not, return the index where it would be if it were inserted in order. You may assume no duplicates in the array.

Example:

Input: [1,3,5,6], 5
Output: 2

Input: [1,3,5,6], 2
Output: 1

Very classic application of binary search. We are looking for the minimal k value satisfying nums[k] >= target, and we can just copy-paste our template. Notice that our solution is correct regardless of whether the input array nums has duplicates. Also notice that the input target might be larger than all elements in nums and therefore needs to be placed at the end of the array. That’s why we should initialize right = len(nums) instead of right = len(nums) - 1.

class Solution:
    def searchInsert(self, nums: List[int], target: int) -> int:
        left, right = 0, len(nums)
        while left < right:
            mid = left + (right - left) // 2
            if nums[mid] >= target:
                right = mid
            else:
                left = mid + 1
        return left

This problem is about searching for a target value in a sorted array. If the target is found, return the index; if not, return the index where it would be if it were inserted in order.

Here is a detailed explanation of how the binary search solution works:

  1. We first initialize the search space to be the entire array by setting left to 0 and right to the length of the array. Notice that we use len(nums) instead of len(nums) - 1 because the target could be inserted after the last element, at index len(nums).

  2. The while loop continues as long as left is less than right, indicating there are still elements to consider.

  3. Within the loop, we compute the middle index mid and then compare the middle value nums[mid] with the target.

    • If nums[mid] is greater than or equal to the target, we know that the target must be in the left half of the search space (including the mid-point itself, in case nums[mid] equals the target). So, we update right to mid to reduce our search space to the left half.

    • If nums[mid] is less than the target, we know that the target must be in the right half of the search space (not including the mid-point, since nums[mid] is less than the target). So, we update left to mid + 1 to reduce our search space to the right half.

  4. When the loop ends, left will be equal to right, and they will point to the place where the target can be inserted to maintain the sorted order of the array. So, we return left as the answer.

This algorithm is efficient because it halves the search space in each iteration of the loop, giving a time complexity of O(log n), where n is the length of the input array.
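A quick check on the two examples, plus a target larger than every element, which exercises the right = len(nums) choice:

```python
from typing import List

class Solution:
    def searchInsert(self, nums: List[int], target: int) -> int:
        left, right = 0, len(nums)
        while left < right:
            mid = left + (right - left) // 2
            if nums[mid] >= target:
                right = mid
            else:
                left = mid + 1
        return left

s = Solution()
print(s.searchInsert([1, 3, 5, 6], 5))  # 2
print(s.searchInsert([1, 3, 5, 6], 2))  # 1
print(s.searchInsert([1, 3, 5, 6], 7))  # 4 (inserted at the very end)
```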

Advanced Usage

The above problems are quite easy to solve, because they already give us the array to be searched. We’d know that we should use binary search to solve them at first glance. However, more often we face situations where the search space and search target are not so readily available. Sometimes we won’t even realize that the problem should be solved with binary search – we might just turn to dynamic programming or DFS and get stuck for a very long time.

As for the question “When can we use binary search?”, my answer is: if we can discover some kind of monotonicity – for example, if condition(k) is True then condition(k + 1) is True – then we can consider binary search.
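The condition k * k >= 30 below is an arbitrary example of such monotonicity: once it turns True, it stays True for all larger k, so binary search can home in on the flip point:

```python
# A monotone condition: False for small k, then True forever after.
condition = lambda k: k * k >= 30

print([condition(k) for k in range(1, 9)])
# [False, False, False, False, False, True, True, True]

left, right = 1, 8
while left < right:
    mid = left + (right - left) // 2
    if condition(mid):
        right = mid
    else:
        left = mid + 1
print(left)  # 6, the first k where the condition flips to True
```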

  1. Capacity To Ship Packages Within D Days [Medium]

A conveyor belt has packages that must be shipped from one port to another within D days. The i-th package on the conveyor belt has a weight of weights[i]. Each day, we load the ship with packages on the conveyor belt (in the order given by weights). We may not load more weight than the maximum weight capacity of the ship.

Return the least weight capacity of the ship that will result in all the packages on the conveyor belt being shipped within D days.

Example :

Input: weights = [1,2,3,4,5,6,7,8,9,10], D = 5 Output: 15 Explanation: A ship capacity of 15 is the minimum to ship all the packages in 5 days like this: 1st day: 1, 2, 3, 4, 5 2nd day: 6, 7 3rd day: 8 4th day: 9 5th day: 10

Note that the cargo must be shipped in the order given, so using a ship of capacity 14 and splitting the packages into parts like (2, 3, 4, 5), (1, 6, 7), (8), (9), (10) is not allowed.

Binary search probably would not come to our mind when we first meet this problem. We might automatically treat weights as the search space and then realize we’ve entered a dead end after wasting lots of time. In fact, we are looking for the minimal one among all feasible capacities. We dig out the monotonicity of this problem: if we can successfully ship all packages within D days with capacity m, then we can definitely ship them all with any capacity larger than m. Now we can design a condition function, let’s call it feasible: given an input capacity, it returns whether it’s possible to ship all packages within D days. It can run in a greedy way: if there’s still room for the current package, we put it onto the ship; otherwise we wait for the next day to place it. If the total days needed exceeds D, we return False; otherwise we return True.

Next, we need to initialize our boundary correctly. Obviously capacity should be at least max(weights), otherwise the ship couldn’t carry the heaviest package. On the other hand, capacity need not be more than sum(weights), because then we can ship all packages in just one day.

Now we’ve got all we need to apply our binary search template:

def shipWithinDays(weights: List[int], D: int) -> int:
    def feasible(capacity) -> bool:
        days = 1
        total = 0
        for weight in weights:
            total += weight
            if total > capacity:  # too heavy, wait for the next day
                total = weight
                days += 1
                if days > D:  # cannot ship within D days
                    return False
        return True

    left, right = max(weights), sum(weights)
    while left < right:
        mid = left + (right - left) // 2
        if feasible(mid):
            right = mid
        else:
            left = mid + 1
    return left

This problem is about finding the minimum ship weight capacity that can ship all the packages within D days. It’s a great example of how binary search can be applied in a scenario that doesn’t initially seem related to binary search.

The binary search works by starting with a lower limit (left) of the maximum weight in the array (because any viable ship capacity must be able to handle at least the heaviest package) and an upper limit (right) of the total weight of all the packages (because a ship with that capacity could carry everything in a single day, so no larger capacity is ever needed).

Then we run a binary search, where in each iteration, we check if it is feasible to ship all the packages within D days using the current mid-value as the ship capacity. This is done by simulating the shipping process with a greedy algorithm: we keep loading packages onto the current day until we can’t fit the next package, then we move on to the next day, and so on. If we can ship all the packages within D days, this means the current ship capacity is feasible.

The feasible function iterates through the weights of the packages. For each package, it adds the package’s weight to the total weight for the current day. If the total weight exceeds the capacity, it resets the total weight to the current package’s weight and increments the day count. If the total day count exceeds D, it means the capacity is not enough to ship all the packages within D days, so it returns False. Otherwise, it returns True indicating that the current capacity is sufficient.

The binary search keeps narrowing down the search space until left equals right, at which point it returns left (or right), which is the minimum viable ship capacity.

One key aspect of this problem is identifying the property that if a certain ship capacity can successfully ship all packages within D days, then any larger ship capacity can do so as well, which is the monotonicity needed for binary search to work. The trickiest part is probably coming up with the feasible function to check whether a certain ship capacity can do the job within D days. But once you’ve figured that out, the rest is just straightforward application of the binary search template.
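Running the template on the example from the problem statement confirms the expected capacity:

```python
from typing import List

def shipWithinDays(weights: List[int], D: int) -> int:
    def feasible(capacity: int) -> bool:
        days, total = 1, 0
        for weight in weights:
            total += weight
            if total > capacity:  # too heavy, wait for the next day
                total = weight
                days += 1
                if days > D:      # cannot ship within D days
                    return False
        return True

    left, right = max(weights), sum(weights)
    while left < right:
        mid = left + (right - left) // 2
        if feasible(mid):
            right = mid
        else:
            left = mid + 1
    return left

print(shipWithinDays([1, 2, 3, 4, 5, 6, 7, 8, 9, 10], 5))  # 15
```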

  2. Split Array Largest Sum [Hard]

Given an array which consists of non-negative integers and an integer m, you can split the array into m non-empty continuous subarrays. Write an algorithm to minimize the largest sum among these m subarrays.

Example:

Input: nums = [7,2,5,10,8] m = 2

Output: 18

Explanation: There are four ways to split nums into two subarrays. The best way is to split it into [7,2,5] and [10,8], where the largest sum among the two subarrays is only 18.

If you take a close look, you will probably see how similar this problem is to LC 1011 above. Similarly, we can design a feasible function: given an input threshold, decide whether we can split the array into several subarrays such that every subarray sum is less than or equal to threshold. In this way, we discover the monotonicity of the problem: if feasible(t) is True, then all inputs larger than t satisfy the feasible function as well. You can see that the solution code is exactly the same as for LC 1011.

def splitArray(nums: List[int], m: int) -> int:        
    def feasible(threshold) -> bool:
        count = 1
        total = 0
        for num in nums:
            total += num
            if total > threshold:
                total = num
                count += 1
                if count > m:
                    return False
        return True

    left, right = max(nums), sum(nums)
    while left < right:
        mid = left + (right - left) // 2
        if feasible(mid):
            right = mid     
        else:
            left = mid + 1
    return left

But we might have doubts: it’s true that the left returned by our solution is the minimal value satisfying feasible, but how do we know that some actual split of the array achieves this subarray sum? For example, say nums = [7,2,5,10,8] and m = 2. There are 4 different ways to split the array, giving 4 different largest subarray sums: 25:[[7], [2,5,10,8]], 23:[[7,2], [5,10,8]], 18:[[7,2,5], [10,8]], 24:[[7,2,5,10], [8]]. Only 4 values. But our search space [max(nums), sum(nums)] = [10, 32] has many more than just 4 values. That is, no matter how we split the input array, we cannot obtain most of the values in our search space.

Let’s say k is the minimal value satisfying the feasible function. We can prove the correctness of our solution by contradiction. Assume that no subarray’s sum is equal to k, that is, every subarray sum is less than k. The variable total inside the feasible function keeps track of the sum of the current subarray. If our assumption were correct, then total would always be less than k. As a result, feasible(k - 1) must be True, because total would be at most k - 1 and would never trigger the if-clause if total > threshold; therefore feasible(k - 1) must have the same output as feasible(k), which is True. But we already know that k is the minimal value satisfying the feasible function, so feasible(k - 1) has to be False. Contradiction! So our assumption is incorrect, and our algorithm is proved correct.

In this problem, we’re given an array of non-negative integers and a number m, and we’re asked to split the array into m non-empty continuous subarrays such that the largest sum among these m subarrays is minimized.

The approach is similar to the previous problem where we defined a feasible function. This function checks whether it’s possible to split the array into some subarrays such that the sum of each subarray is less than or equal to a given threshold. We then apply binary search to find the minimum threshold that makes feasible return True.

The binary search starts with the left boundary as the maximum element in the array and the right boundary as the sum of all elements in the array. We then iterate, in each round checking if the mid value is a feasible threshold and narrowing our search space accordingly. If feasible(mid) is True, then we can potentially find a lower threshold, so we update right to mid. Otherwise, we need a larger threshold, so we update left to mid + 1. The binary search ends when left equals right, and we return left as the minimum feasible threshold.

It’s important to note that the feasible function packs greedily: we always put as many elements as possible into the current subarray before starting a new one. For a given threshold, this greedy strategy uses the fewest possible subarrays, which is exactly why checking whether the count stays within m is sufficient.

The interesting part of this problem is the proof that our algorithm always returns a value that we can actually achieve by splitting the original array. Although our search space includes values that we can’t achieve by any splitting of the array, we can show by contradiction that the value returned by our algorithm is achievable. If it weren’t, then we would have been able to split the array in such a way that each subarray’s sum is less than the threshold, which would contradict the definition of our feasible function and the assumption that the returned value is the minimum feasible threshold.
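Running the solution on the example above reproduces the expected answer:

```python
from typing import List

def splitArray(nums: List[int], m: int) -> int:
    def feasible(threshold: int) -> bool:
        count, total = 1, 0
        for num in nums:
            total += num
            if total > threshold:  # current subarray is full, start a new one
                total = num
                count += 1
                if count > m:
                    return False
        return True

    left, right = max(nums), sum(nums)
    while left < right:
        mid = left + (right - left) // 2
        if feasible(mid):
            right = mid
        else:
            left = mid + 1
    return left

print(splitArray([7, 2, 5, 10, 8], 2))  # 18
```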

  3. Koko Eating Bananas [Medium]

Koko loves to eat bananas. There are N piles of bananas, the i-th pile has piles[i] bananas. The guards have gone and will come back in H hours. Koko can decide her bananas-per-hour eating speed of K. Each hour, she chooses some pile of bananas, and eats K bananas from that pile. If the pile has fewer than K bananas, she eats all of them instead, and won’t eat any more bananas during this hour.

Koko likes to eat slowly, but still wants to finish eating all the bananas before the guards come back. Return the minimum integer K such that she can eat all the bananas within H hours.

Example :

Input: piles = [3,6,7,11], H = 8
Output: 4

Input: piles = [30,11,23,4,20], H = 5
Output: 30

Input: piles = [30,11,23,4,20], H = 6
Output: 23

Very similar to LC 1011 and LC 410 mentioned above. Let’s design a feasible function: given an input speed, determine whether Koko can finish all bananas within H hours with hourly eating speed speed. Obviously, the lower bound of the search space is 1, and the upper bound is max(piles), because Koko can only choose one pile of bananas to eat every hour.

def minEatingSpeed(piles: List[int], H: int) -> int:
    def feasible(speed) -> bool:
        # return sum(math.ceil(pile / speed) for pile in piles) <= H  # slower        
        return sum((pile - 1) // speed + 1 for pile in piles) <= H  # faster

    left, right = 1, max(piles)
    while left < right:
        mid = left + (right - left) // 2
        if feasible(mid):
            right = mid
        else:
            left = mid + 1
    return left

This problem is similar to the previous ones where we defined a feasible function. The task is to find the minimum integer K that allows Koko to eat all bananas within H hours.

We start by defining our feasible function, which, given an input speed, determines whether Koko can finish all bananas within H hours. The function iterates over each pile and calculates the hours Koko would need to finish it at the given speed. It does this by dividing the pile’s size by the speed and rounding up to the nearest integer. If the total hours needed to finish all piles is less than or equal to H, the function returns True; otherwise, it returns False.

The feasible function can be implemented in two ways. The first way is to use the math.ceil function to round up the division result. However, this involves floating-point division and can be slower. The second way is to subtract 1 from the pile’s size before the division and then add 1 to the result, which effectively gives the same result as rounding up but only involves integer operations and is thus faster.
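The equivalence of the two forms is easy to verify; the sample speed and pile sizes below are arbitrary:

```python
import math

speed = 4
for pile in [1, 3, 4, 5, 8, 11]:
    # integer-only ceiling division matches math.ceil for positive pile sizes
    assert (pile - 1) // speed + 1 == math.ceil(pile / speed)

# e.g. a pile of 11 bananas at speed 4 takes 3 hours
print((11 - 1) // 4 + 1)  # 3
```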

We then perform a binary search over the range of possible speeds, which are between 1 and the size of the largest pile. In each iteration, we check whether the middle speed is feasible. If it is, we update right to mid as we might find a lower feasible speed. Otherwise, we update left to mid + 1 as we need a higher speed. The binary search ends when left equals right, and we return left as the minimum feasible speed.

  4. Minimum Number of Days to Make m Bouquets [Medium]

Given an integer array bloomDay, an integer m and an integer k, we need to make m bouquets. To make a bouquet, you need to use k adjacent flowers from the garden. The garden consists of n flowers, and the i-th flower will bloom on day bloomDay[i]; it can then be used in exactly one bouquet. Return the minimum number of days you need to wait to be able to make m bouquets from the garden. If it is impossible to make m bouquets, return -1.

Examples:

Input: bloomDay = [1,10,3,10,2], m = 3, k = 1
Output: 3
Explanation: Let’s see what happened in the first three days. x means the flower bloomed and _ means it didn’t.
We need 3 bouquets, each containing 1 flower.
After day 1: [x, _, _, _, _]  // we can only make one bouquet
After day 2: [x, _, _, _, x]  // we can only make two bouquets
After day 3: [x, _, x, _, x]  // we can make 3 bouquets; the answer is 3

Input: bloomDay = [1,10,3,10,2], m = 3, k = 2
Output: -1
Explanation: We need 3 bouquets, each with 2 flowers, which means we need 6 flowers. We only have 5 flowers, so it is impossible and we return -1.

Now that we’ve solved three advanced problems above, this one should be pretty easy. The monotonicity of this problem is very clear: if we can make m bouquets after waiting for d days, then we can definitely do so if we wait longer than d days.

def minDays(bloomDay: List[int], m: int, k: int) -> int:
    def feasible(days) -> bool:
        bouquets, flowers = 0, 0
        for bloom in bloomDay:
            if bloom > days:
                flowers = 0
            else:
                bouquets += (flowers + 1) // k
                flowers = (flowers + 1) % k
        return bouquets >= m

    if len(bloomDay) < m * k:
        return -1
    left, right = 1, max(bloomDay)
    while left < right:
        mid = left + (right - left) // 2
        if feasible(mid):
            right = mid
        else:
            left = mid + 1
    return left

The problem is about determining the minimum number of days to wait to make m bouquets from a garden, where each bouquet consists of k adjacent flowers. We are given an array bloomDay where bloomDay[i] is the day the i-th flower will bloom.

The feasible function in this solution checks if it’s possible to make m bouquets after waiting for a given number of days. It goes through each flower in the garden and checks if it’s bloomed. If the flower hasn’t bloomed yet, it resets the number of flowers in the current bouquet. If the flower has bloomed, it tries to add it to the current bouquet and checks if the bouquet is complete. If the bouquet is complete, it starts a new one and increments the count of bouquets. If m or more bouquets can be made, the function returns True; otherwise, it returns False.

The first if statement before the binary search checks if there are enough flowers in the garden to make m bouquets. If not, it immediately returns -1.

The binary search range is from 1 to the maximum bloom day. In each iteration, it checks if it’s feasible to make m bouquets after waiting for the middle number of days. If it is, it updates right to mid because a smaller number of days might also be feasible. If it’s not feasible, it updates left to mid + 1 because we need to wait for more days. The binary search ends when left equals right, and the function returns left as the minimum number of days to wait to make m bouquets.
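A quick check against the two examples from the problem statement:

```python
from typing import List

def minDays(bloomDay: List[int], m: int, k: int) -> int:
    def feasible(days: int) -> bool:
        bouquets, flowers = 0, 0
        for bloom in bloomDay:
            if bloom > days:
                flowers = 0  # adjacency broken, restart the current bouquet
            else:
                bouquets += (flowers + 1) // k
                flowers = (flowers + 1) % k
        return bouquets >= m

    if len(bloomDay) < m * k:  # not enough flowers in total
        return -1
    left, right = 1, max(bloomDay)
    while left < right:
        mid = left + (right - left) // 2
        if feasible(mid):
            right = mid
        else:
            left = mid + 1
    return left

print(minDays([1, 10, 3, 10, 2], 3, 1))  # 3
print(minDays([1, 10, 3, 10, 2], 3, 2))  # -1
```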

  5. Kth Smallest Number in Multiplication Table [Hard]

Nearly everyone has used the multiplication table. But could you find the k-th smallest number quickly from the multiplication table? Given the height m and the length n of an m * n multiplication table, and a positive integer k, return the k-th smallest number in this table.

Example :

Input: m = 3, n = 3, k = 5
Output: 3
Explanation: The Multiplication Table:
1 2 3
2 4 6
3 6 9

The 5-th smallest number is 3 (1, 2, 2, 3, 3).

For Kth-Smallest problems like this, what comes to mind first is a heap. Usually we can maintain a min-heap and just pop the top of the heap k times. However, that doesn’t work here. We don’t have every single number in the entire multiplication table; instead, we only have the height and the length of the table. If we were to apply the heap method, we would need to explicitly calculate these m * n values and save them to a heap. The time and space complexity of this process are both O(mn), which is quite inefficient. This is where binary search comes in. Remember we said that designing the condition function is the most difficult part? In order to find the k-th smallest value in the table, we can design an enough function: given an input num, determine whether there are at least k values less than or equal to num. The minimal num satisfying the enough function is the answer we’re looking for. Recall that the key to binary search is discovering monotonicity. In this problem, if num satisfies enough, then of course any value larger than num satisfies it too. This monotonicity is the foundation of our binary search algorithm.

Let’s consider the search space. Obviously the lower bound should be 1, and the upper bound should be the largest value in the multiplication table, which is m * n, so we have search space [1, m * n]. The overwhelming advantage of the binary search solution over the heap solution is that it doesn’t need to explicitly calculate all the numbers in the table: all it needs is to pick one value out of the search space and apply the enough function to it, to determine whether we should keep the left half or the right half of the search space. In this way, the binary search solution only requires constant extra space, much better than the heap solution.

Next let’s consider how to implement the enough function. It can be observed that every row in the Multiplication Table consists of multiples of its row index. For example, all numbers in the 3rd row [3, 6, 9, 12, 15, …] are multiples of 3. Therefore, we can go row by row to count the total number of entries less than or equal to the input num. The complete solution follows.

def findKthNumber(m: int, n: int, k: int) -> int:
    def enough(num) -> bool:
        count = 0
        for val in range(1, m + 1):  # count row by row
            add = min(num // val, n)
            if add == 0:  # early exit
                break
            count += add
        return count >= k                

    left, right = 1, n * m
    while left < right:
        mid = left + (right - left) // 2
        if enough(mid):
            right = mid
        else:
            left = mid + 1
    return left

In LC 410 above, we had the doubt “Is the result from binary search actually a subarray sum?”. Here we have a similar doubt: “Is the result from binary search actually in the Multiplication Table?”. The answer is yes, and we can again apply proof by contradiction. Denote num as the minimal input that satisfies the enough function, and assume that num is not in the table, which means that num is not divisible by any val in [1, m], that is, num % val > 0 for all of them. Then changing the input from num to num - 1 has no effect on the expression add = min(num // val, n), so enough(num - 1) would also return True, same as enough(num). But we already know num is the minimal input satisfying the enough function, so enough(num - 1) has to be False. Contradiction! The opposite of our original assumption is true: num is actually in the table.

The problem is asking to find the k-th smallest number in a multiplication table of size m x n. To solve this problem, we are using binary search, but unlike conventional usage where binary search is used to find an element in a sorted list, here it is used to find the smallest number such that there are at least k numbers in the multiplication table that are less than or equal to it.

The binary search is conducted between 1 and m*n, which are the smallest and largest elements in the multiplication table, respectively. For each number in this range, we use the enough function to check if there are at least k numbers in the multiplication table that are smaller or equal to it.

The enough function iterates over each row of the multiplication table. Since the i-th row is i*1, i*2, ..., i*n, the number of elements in the i-th row that are less than or equal to x is min(x // i, n). If x is less than i, no number in the i-th row (nor in any later row) is less than or equal to x, hence we can break out of the loop.

Once we have checked all rows and if the count is greater or equal to k, we return True; otherwise, we return False. In the binary search, if enough of the mid number is True, we make right equal to mid, else left becomes mid + 1.

This solution has a time complexity of O(m*log(m*n)) and space complexity of O(1).

Also, the binary search guarantees that the returned number is in the multiplication table. This can be proved by contradiction. Let’s assume that the number num returned by the binary search is not in the table. It means num is not divisible by any of the numbers between 1 and m. Therefore, num // i would be the same as (num-1) // i. It means enough(num) would be the same as enough(num-1), which contradicts the fact that num is the smallest number satisfying the enough function. Therefore, num must be in the table.
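Since the text contrasts the O(mn) heap/sort approach with the binary search, a quick cross-check between the two on small tables can confirm they agree (a sketch of mine; the function names are my own, not from the article):

```python
def kth_smallest_bs(m, n, k):
    # binary search on the answer, as in findKthNumber above
    def enough(num):
        # count entries <= num, row by row
        return sum(min(num // row, n) for row in range(1, m + 1)) >= k

    left, right = 1, m * n
    while left < right:
        mid = left + (right - left) // 2
        if enough(mid):
            right = mid
        else:
            left = mid + 1
    return left

def kth_smallest_bruteforce(m, n, k):
    # the O(mn log mn) baseline: materialize and sort the whole table
    return sorted(r * c for r in range(1, m + 1) for c in range(1, n + 1))[k - 1]

# the two agree on every (m, n, k) with m, n <= 5,
# including the article's m = n = 3, k = 5 -> 3
assert all(kth_smallest_bs(m, n, k) == kth_smallest_bruteforce(m, n, k)
           for m in range(1, 6) for n in range(1, 6) for k in range(1, m * n + 1))
```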

  1. Find K-th Smallest Pair Distance [Hard] Given an integer array, return the k-th smallest distance among all the pairs. The distance of a pair (A, B) is defined as the absolute difference between A and B.

Example :

Input: nums = [1,3,1], k = 1
Output: 0
Explanation: Following are all the pairs:
(1,3) -> 2
(1,1) -> 0
(3,1) -> 2
The 1st smallest distance pair is (1,1), and its distance is 0.

Very similar to LC 668 above; both are about finding the Kth smallest. Just like LC 668, we can design an enough function: given an input distance, determine whether there are at least k pairs whose distances are less than or equal to distance. We can sort the input array and use two pointers (a fast pointer and a slow pointer, marking a pair) to scan it. Both pointers start from the leftmost end. If the current pair has a distance less than or equal to distance, all pairs between these pointers are valid (since the array is already sorted), so we move the fast pointer forward. Otherwise, we move the slow pointer forward. Once the pointers reach the rightmost end, we finish our scan and check whether the total count reaches k. Here is the implementation:

def enough(distance) -> bool:  # two pointers; nums is sorted, n and k come from the enclosing scope
    count, i, j = 0, 0, 0
    while i < n:
        while j < n and nums[j] - nums[i] <= distance:  # move fast pointer
            j += 1
        count += j - i - 1  # pairs (i, i+1), ..., (i, j-1)
        i += 1  # move slow pointer
    return count >= k

Obviously, our search space should be [0, max(nums) - min(nums)]. Now we are ready to copy-paste our template:

def smallestDistancePair(nums: List[int], k: int) -> int:
    nums.sort()
    n = len(nums)

    def enough(distance) -> bool:  # the two-pointer function above, nested for scope
        count, i, j = 0, 0, 0
        while i < n:
            while j < n and nums[j] - nums[i] <= distance:
                j += 1
            count += j - i - 1
            i += 1
        return count >= k

    left, right = 0, nums[-1] - nums[0]
    while left < right:
        mid = left + (right - left) // 2
        if enough(mid):
            right = mid
        else:
            left = mid + 1
    return left

This problem is asking to find the k-th smallest distance among all the pairs of elements in the given list. The distance between a pair (A, B) is defined as the absolute difference between A and B.

To solve this problem, we can utilize binary search similar to the previous problem. However, this time, the binary search is conducted in the range of possible distances. The smallest distance can be 0 (when there are duplicate numbers) and the largest can be the difference between the maximum and minimum numbers in the array.

So, the binary search space becomes [0, max(nums) - min(nums)].

In each step of the binary search, we need a function, enough(distance), to check if there are at least k pairs with a distance less than or equal to distance. We do this by sorting the array and then using two pointers to iterate over the pairs. While the distance of the pair (i, j) stays within distance, we move the fast pointer j to the right; the pairs (i, i+1), …, (i, j-1) all qualify, so we add j - i - 1 to the count, then advance the slow pointer i.

Finally, the binary search returns the smallest distance for which enough(distance) is True.

This solution has a time complexity of O(n log n) for sorting and O(n log(max(nums) - min(nums))) for binary search, and a space complexity of O(1).
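As a sanity check (a sketch of mine; the helper names are my own), the binary search answer can be cross-checked against an O(n²) enumeration of all pair distances:

```python
from itertools import combinations

def smallest_distance_pair(nums, k):
    # binary search on the answer, as in smallestDistancePair above
    nums = sorted(nums)
    n = len(nums)

    def enough(distance):
        count, j = 0, 0
        for i in range(n):
            while j < n and nums[j] - nums[i] <= distance:
                j += 1
            count += j - i - 1
        return count >= k

    left, right = 0, nums[-1] - nums[0]
    while left < right:
        mid = left + (right - left) // 2
        if enough(mid):
            right = mid
        else:
            left = mid + 1
    return left

def kth_distance_bruteforce(nums, k):
    # O(n^2 log n) baseline for cross-checking
    return sorted(abs(a - b) for a, b in combinations(nums, 2))[k - 1]

nums = [1, 3, 1, 9, 4]
assert all(smallest_distance_pair(nums, k) == kth_distance_bruteforce(nums, k)
           for k in range(1, 11))  # 5 elements -> 10 pairs
```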

  1. Ugly Number III [Medium] Write a program to find the n-th ugly number. Ugly numbers here are positive integers that are divisible by a, b, or c.

Example :

Input: n = 3, a = 2, b = 3, c = 5
Output: 4
Explanation: The ugly numbers are 2, 3, 4, 5, 6, 8, 9, 10, … The 3rd is 4.

Input: n = 4, a = 2, b = 3, c = 4
Output: 6
Explanation: The ugly numbers are 2, 3, 4, 6, 8, 9, 10, 12, … The 4th is 6.

Nothing special: still finding the Kth smallest. We need to design an enough function: given an input num, determine whether there are at least n ugly numbers less than or equal to num. Since a might be a multiple of b or c, or the other way around, we need the help of the greatest common divisor to avoid counting duplicate numbers.

import math

def nthUglyNumber(n: int, a: int, b: int, c: int) -> int:
    def enough(num) -> bool:
        # inclusion-exclusion: count integers <= num divisible by a, b, or c
        total = num // a + num // b + num // c - num // ab - num // ac - num // bc + num // abc
        return total >= n

    ab = a * b // math.gcd(a, b)     # lcm(a, b)
    ac = a * c // math.gcd(a, c)     # lcm(a, c)
    bc = b * c // math.gcd(b, c)     # lcm(b, c)
    abc = a * bc // math.gcd(a, bc)  # lcm(a, b, c)
    left, right = 1, 10 ** 10
    while left < right:
        mid = left + (right - left) // 2
        if enough(mid):
            right = mid
        else:
            left = mid + 1
    return left

This problem is a variant of the classic ugly number problem. Note that here a, b, and c need not be prime: an ugly number is simply defined as a positive integer divisible by a, b, or c.

The challenge is to find the n-th ugly number, i.e., the n-th number that is divisible by a, b, or c.

This problem can be solved using a binary search approach similar to the previous problems. However, the main difference is how we define the condition to check during the binary search.

We need a function enough(num) to check if there are at least n ugly numbers that are less than or equal to num. We can find the number of ugly numbers less than or equal to num by summing up how many numbers are divisible by a, b, c, and subtracting the numbers that are divisible by their pairwise least common multiples (ab, ac, bc), and adding back the numbers divisible by their triple least common multiple (abc), to avoid over-counting.

To find the least common multiple (LCM) of two numbers, we use the formula: lcm(a, b) = a * b / gcd(a, b). Here, gcd is the greatest common divisor.

Finally, the binary search is conducted over a range [1, 10 ** 10]. This is because the n-th ugly number is guaranteed to be less or equal to 2 * 10 ** 9 and all ugly numbers are positive. So, this range will definitely cover the n-th ugly number.

This solution has a time complexity of O(log(10^10)), i.e., about 34 binary search iterations with O(1) work each, and a space complexity of O(1).
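For small n, a direct walk over the integers (my own baseline sketch, not from the article) confirms the inclusion-exclusion count on both examples:

```python
def nth_ugly_bruteforce(n, a, b, c):
    # O(answer) baseline: walk the integers and count the ugly ones
    count, x = 0, 0
    while count < n:
        x += 1
        if x % a == 0 or x % b == 0 or x % c == 0:
            count += 1
    return x

print(nth_ugly_bruteforce(3, 2, 3, 5))  # 4
print(nth_ugly_bruteforce(4, 2, 3, 4))  # 6
```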

  1. Find the Smallest Divisor Given a Threshold [Medium] Given an array of integers nums and an integer threshold, we will choose a positive integer divisor and divide all the array by it and sum the result of the division. Find the smallest divisor such that the result mentioned above is less than or equal to threshold.

Each result of division is rounded to the nearest integer greater than or equal to that element. (For example: 7/3 = 3 and 10/2 = 5). It is guaranteed that there will be an answer.

Example :

Input: nums = [1,2,5,9], threshold = 6
Output: 5
Explanation: We can get a sum of 17 (1+2+5+9) if the divisor is 1. If the divisor is 4 we get a sum of 7 (1+1+2+3), and if the divisor is 5 the sum is 5 (1+1+1+2).

After so many problems introduced above, this one should be a piece of cake. We don’t even need to bother designing a condition function, because the problem has already told us explicitly what condition we need to satisfy.

def smallestDivisor(nums: List[int], threshold: int) -> int:
    def condition(divisor) -> bool:
        return sum((num - 1) // divisor + 1 for num in nums) <= threshold  # (num - 1) // divisor + 1 == ceil(num / divisor)

    left, right = 1, max(nums)
    while left < right:
        mid = left + (right - left) // 2
        if condition(mid):
            right = mid
        else:
            left = mid + 1
    return left

This problem is another instance where a binary search can be used to find the optimal value to satisfy a given condition.

The key idea in this problem is to find the smallest divisor such that the sum of the division results (each rounded up to the nearest integer) is less than or equal to a given threshold.

We can define a condition function that, given a divisor, computes the sum of the division results for each number in the array, and checks if this sum is less than or equal to the threshold.

The binary search is then conducted over the range [1, max(nums)]. The lower bound is 1, since the divisor must be a positive integer. The upper bound is max(nums), since dividing by a number larger than the maximum number in the array would result in all division results being 1, which would give the minimum possible sum. So, the optimal divisor must be within this range.

During each step of the binary search, we check if the condition is satisfied with the current mid value. If it is, we can continue searching in the left half of the range, since we want to find the smallest divisor that satisfies the condition. Otherwise, we search in the right half.

The binary search stops when the search range narrows to a single value, i.e., when left == right. At this point, left (or right) is the smallest divisor that satisfies the condition, so we return it.

The time complexity of this solution is O(n log m), where n is the length of the array and m is the maximum number in the array, because each step of the binary search computes the sum over all numbers in the array. The space complexity is O(1), since beyond the input array we only need a constant amount of space for the binary search variables.
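The ceiling trick in the condition deserves a note: (num - 1) // divisor + 1 equals ceil(num / divisor) for positive integers, without floating point. A quick check on the example’s numbers (the helper name is mine):

```python
import math

def division_sum(nums, divisor):
    # sum of ceil(num / divisor), using the integer-only ceiling trick
    return sum((num - 1) // divisor + 1 for num in nums)

nums = [1, 2, 5, 9]
print(division_sum(nums, 1))  # 17
print(division_sum(nums, 4))  # 7
print(division_sum(nums, 5))  # 5

# the trick matches math.ceil for every divisor
assert all(division_sum(nums, d) == sum(math.ceil(x / d) for x in nums)
           for d in range(1, 20))
```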

All we need is just more practice to build up our ability to discover the monotonicity of the problem and to design a beautiful condition function.

Absolutely! Practice is key in enhancing problem-solving skills, especially when it comes to identifying patterns or underlying principles, such as the concept of monotonicity in binary search problems.

Monotonicity, in this context, refers to the property of a function to always increase or always decrease. Recognizing monotonicity can greatly simplify the problem-solving process, because if a function is monotonic, you can use binary search to quickly find the solution.

Designing a condition function is another crucial part of solving binary search problems. This function should encapsulate the conditions that a potential solution must satisfy, and it should return a boolean indicating whether a given value meets these conditions.

Keep practicing and challenging yourself with different types of problems. Over time, you’ll become more proficient at spotting the use of binary search and designing condition functions.

Why do this instead of:

left, right = 0, n - 1
while left <= right:
    mid = (left + right) // 2
    if condition(mid) indicates go left:
        right = mid - 1
    elif condition(mid) indicates go right:
        left = mid + 1
    else:
        # we've found the candidate
        return mid

The above also works, but trying to understand why your solution is preferred, and I’m failing to. Doesn’t mean your solution is not better, I just don’t think I’m taking certain edge cases into account.

Great question! Both methods can work, but they address slightly different scenarios and have slightly different edge-case behaviors. The version you posted is usually referred to as the “classic binary search” where you’re looking for a specific element in the array.

The version I provided tends to be a little more robust for problems where we’re searching for a particular condition to be satisfied (rather than a specific element), and we care about finding the boundary where that condition changes. In other words, the version I provided is often used for binary search in the “answer space” or “decision space”, instead of the “input space”.

Also, the version you posted relies on the condition function being able to signal an explicit “found”. In boundary-style problems there is no single “found” moment, only a point where the condition flips from false to true, so the three-way version has no natural branch to return from; you would have to track the best candidate seen so far yourself, which is extra bookkeeping and an extra source of off-by-one bugs.

In the method I provided, while left < right: guarantees that when we exit the loop, left == right, and it ensures we have found an answer that satisfies the condition. We don’t have to return mid, because we know left and right have converged to the solution.

In summary, the version you posted works well when you’re searching for a specific element in an array, while the version I provided works well when you’re searching for a position in the “answer space” where a condition changes from false to true or true to false. The version I provided is often more robust for these more abstract binary search problems, but both versions are useful and important to understand.
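To make the contrast concrete, here is a small boundary-style search (the example values are mine): find the first index whose element is at least target. Note there is no “found” branch; the loop converges on the boundary even when the target value is absent:

```python
def first_at_least(arr, target):
    # boundary-style search: minimal index i with arr[i] >= target
    # (returns len(arr) if no such index exists)
    left, right = 0, len(arr)
    while left < right:
        mid = left + (right - left) // 2
        if arr[mid] >= target:
            right = mid
        else:
            left = mid + 1
    return left

arr = [1, 3, 3, 5, 8]
print(first_at_least(arr, 3))  # 1: index of the first 3
print(first_at_least(arr, 4))  # 3: 4 is absent, the boundary is still well defined
print(first_at_least(arr, 9))  # 5: past the end, nothing qualifies
```

A classic three-way search would have to return a sentinel for targets 4 and 9; the boundary version returns the insertion point instead.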

How are you converging the answer to the left pointer? Is there any proof for that? What should we say if an interviewer asks? Please help me understand. Thanks in advance.

The convergence of the answer to the “left” pointer is due to the properties of the binary search algorithm and the nature of the condition we’re checking in these problems.

Let’s review the binary search algorithm briefly:

  • It starts with a range [left, right] that contains a possible solution.
  • It evaluates the condition at the midpoint.
  • If the midpoint satisfies the condition, it can disregard all values to the right (since we’re looking for the smallest such number), and so it moves the “right” pointer to mid.
  • If the midpoint doesn’t satisfy the condition, it can disregard all values from “left” up to and including the midpoint, and so it moves the “left” pointer to mid + 1.
  • It repeats this process until the range has been narrowed down to a single element.

Because of the way binary search works, when the while loop ends, the “left” and “right” pointers will always be at the smallest value that satisfies the condition (if one exists).

Let’s go through a short example to illustrate this. Assume we’re searching for the smallest number in the range 1 to 10 that is greater than or equal to 6. This table shows the values of “left”, “right”, and “mid” at each step:

left  right  mid
1     10     5
6     10     8
6     8      7
6     7      6

At this point, “left” and “right” have converged to 6, which is the smallest number in the range that is greater than or equal to 6. So in this case, the algorithm correctly found the answer.

Remember, binary search is based on the principle of reducing the search space by half at each step, based on some condition. If the condition is such that for all values x in the search space, if x satisfies the condition then all values greater than x also satisfy the condition (which is what “monotonic” means), then binary search will always converge to the smallest (or largest, depending on the exact algorithm) value that satisfies the condition.

So, when your interviewer asks, you can explain that the answer converges to the “left” pointer because of the properties of binary search and the specific condition you’re checking. For all values in the search space, if a value satisfies the condition, then all values to the right also satisfy the condition. This is why the “left” pointer always ends up at the smallest value that satisfies the condition when the loop ends.
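The table above can be reproduced mechanically; here is a small helper (my own, for illustration) that records each step of the template:

```python
def trace_min_satisfying(left, right, condition):
    # run the template and record (left, right, mid) at each step
    steps = []
    while left < right:
        mid = left + (right - left) // 2
        steps.append((left, right, mid))
        if condition(mid):
            right = mid
        else:
            left = mid + 1
    return left, steps

# smallest number in [1, 10] that is >= 6
answer, steps = trace_min_satisfying(1, 10, lambda x: x >= 6)
for left, right, mid in steps:
    print(left, right, mid)
print("answer:", answer)  # 6
```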

For problem 875. Koko Eating Bananas, writing the feasible function this way makes more sense:

def feasible(K):
    countHours = 0  # hours taken to eat all bananas

    for pile in piles:
        countHours += pile // K
        if pile % K != 0:
            countHours += 1

        if countHours > H:
            return False

    return True

Yes, your feasible function for problem 875 is also correct and actually makes the code more readable and straightforward. Here is what it does:

  • For each pile of bananas, it calculates how many hours Koko would take to eat all bananas in the pile at the current eating speed K.
  • If a pile is not a multiple of K, it means Koko needs an extra hour to finish this pile. This is because Koko can eat up to K bananas per hour, so if there are some bananas left after eating K bananas per hour, she needs an extra hour.
  • If the total hours countHours exceeds H (the total hours available), the function returns False, because it’s not feasible for Koko to eat all the bananas within H hours at the speed K.
  • If countHours is less than or equal to H for all piles, the function returns True, because Koko can eat all the bananas within H hours at the speed K.

This version of the feasible function is a bit longer than the one-liner sum((pile - 1) // K + 1 for pile in piles) <= H, but it’s easier to understand and clearly shows the logic behind the calculation. So this is a great way to write the function if clarity is a priority. However, keep in mind that it has the same time complexity O(n), where n is the number of piles, as the one-liner version.

Your function:

def feasible(K):
    countHours = 0  # hours taken to eat all bananas

    for pile in piles:
        countHours += pile // K
        if pile % K != 0:
            countHours += 1

        if countHours > H:
            return False

    return True

And the one-liner version:

def feasible(K):
    return sum((pile - 1) // K + 1 for pile in piles) <= H

They are essentially doing the same thing, just expressed in different ways. Your version explicitly computes countHours and checks it against H inside the loop, while the one-liner version does the check after the loop has computed the sum.
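If you want to convince yourself the two are interchangeable, a quick check over every speed K works; piles and h below are the LC 875 sample input (function names are mine):

```python
def feasible_loop(piles, h, K):
    # explicit-loop version: hours needed at speed K, with early exit
    count_hours = 0
    for pile in piles:
        count_hours += pile // K
        if pile % K != 0:
            count_hours += 1  # a partial pile still costs a full hour
        if count_hours > h:
            return False
    return True

def feasible_oneliner(piles, h, K):
    # sum of ceil(pile / K), via the (x - 1) // K + 1 trick
    return sum((pile - 1) // K + 1 for pile in piles) <= h

piles, h = [3, 6, 7, 11], 8  # LC 875 sample input
assert all(feasible_loop(piles, h, K) == feasible_oneliner(piles, h, K)
           for K in range(1, max(piles) + 1))
```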

Problem: I literally couldn’t make a condition function for like 99 out of 100 questions. Any good ways to create a condition function?

Creating a condition function for a problem is indeed a crucial part of many algorithmic solutions, especially in Binary Search problems. However, the approach to designing this function can greatly vary depending on the problem at hand. Here are some general steps you can follow:

  1. Understand the Problem: Before you can form a condition function, you need to fully understand the problem. What is it that you’re looking for? What condition does the answer need to meet?

  2. Identify Monotonicity: In many problems, especially those where binary search can be applied, there is a property of monotonicity. In other words, if a certain condition holds for a certain value, then it will also hold for all values larger (or smaller) than this value. Identifying this property can often help in designing a condition function.

  3. Formulate the Condition: The condition function is generally a logical condition that returns True or False. It is based on the requirements of the problem. For example, in a binary search problem where you need to find the minimum value that meets a certain criteria, your condition function should return True for this value and all values larger than it, and False for all values smaller than it.

  4. Use Problem Constraints: The constraints given in the problem statement often guide the formation of the condition function. They can help you identify what variables or values to consider, and how they should be compared.

  5. Test and Refine: Finally, once you have a condition function, test it on a variety of cases to make sure it behaves as expected. You might need to refine it or change your approach based on these results.

Remember, creating a condition function requires both understanding of the problem and creativity. It is often not a straightforward process, but with practice, you can get better at it. One good way to practice is by solving more problems and studying the solutions to understand how the condition functions are created. You can start with easier problems and gradually move to harder ones as your understanding and skills improve.
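To make these steps concrete on a deliberately tiny task (a toy example of mine, not from the article): find the minimal x with x * x >= target. Squaring is monotone for non-negative x, so the condition flips from False to True exactly once, and the template applies directly:

```python
def min_x_with_square_at_least(target):
    # step 2/3: a monotone boolean condition
    def condition(x):
        return x * x >= target

    # step 1: a boundary that includes every possible answer
    left, right = 0, target
    while left < right:
        mid = left + (right - left) // 2
        if condition(mid):
            right = mid
        else:
            left = mid + 1
    return left  # minimal x satisfying the condition

print(min_x_with_square_at_least(10))  # 4, since 3*3 = 9 < 10 <= 16 = 4*4
```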

The condition is usually to minimize or maximize the given variable within our search space. So the condition will depend on that and will differ from problem to problem. Don’t think there will be a hardset way to define that.

Yes, you’re correct. The condition function is inherently problem-specific and depends on the objective of the problem, whether it’s minimization or maximization, or other specific conditions you need to satisfy. However, despite its dependence on the problem, there are still some guiding principles for creating a condition function.

Here are some general steps to create a condition function:

  1. Identify what you’re trying to minimize or maximize: Look at the problem statement and figure out what you’re trying to achieve. Are you trying to find the smallest number that satisfies a certain property? The largest? The exact value?

  2. Formulate a function that checks the condition: This is usually a boolean function that returns true if the condition is satisfied and false otherwise. This function should take in a candidate solution and return a boolean.

  3. Use the condition function to guide the search: If you’re using binary search, for example, the condition function can tell you whether to go left or right. If you’re trying to minimize something, you would generally continue the search in the direction where the condition function returns true.

As you mentioned, there is no hard and fast rule for defining a condition function, but these steps provide a good framework to approach it. With practice, you’ll get a better feel for defining condition functions effectively.

If right itself is the minimal k satisfying the condition, shouldn’t we return right? How will the left pointer ever reach right?

In the binary search template, “right” is initialized so that the range [left, right] covers every candidate answer. When we find a mid value that satisfies our condition, we update “right” to “mid”, so “right” becomes the smallest value that has been observed to satisfy the condition.

During the binary search, we keep narrowing the search space by adjusting the “left” or “right” pointers. When “left” pointer and “right” pointer meet, they point to the smallest value that satisfies the condition.

Why does the “left” pointer reach “right”? Because if “mid” satisfies the condition, we set “right” to “mid”. If “mid” does not satisfy the condition, we set “left” to “mid+1”. In either case, the gap between “left” and “right” is reduced by at least 1 in each iteration. So eventually, “left” will equal “right”.

The reason we return “left” in the end instead of “right” is just a convention, because at the end of the loop, “left” and “right” are guaranteed to be equal. So we could return either “left” or “right”.

Doesn’t seem to work for 1891. Cutting Ribbons; the mid gets stuck and doesn’t increment.

The binary search approach should work well for problem 1891. Cutting Ribbons as well. The goal is to maximize the length of the smallest piece you can cut. Here’s how you might implement it:

class Solution:
    def maxLength(self, ribbons: List[int], k: int) -> int:
        def enough(mid) -> bool:
            return sum(ribbon // mid for ribbon in ribbons) >= k

        left, right = 1, max(ribbons)
        while left < right:
            mid = (left + right + 1) // 2  # Mid is leaning right
            if enough(mid):
                left = mid
            else:
                right = mid - 1
        return left if enough(left) else 0

In this problem, we are trying to maximize the answer, so mid leans to the right ((left + right + 1) // 2), and we update left to mid when the length mid satisfies our condition (we can cut at least k pieces).

If we can’t even cut k pieces of length 1 (the scenario where k is greater than the total length of all ribbons), the function returns 0.
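A self-contained version of this maximizing variant, checked against values I recall from LC 1891’s samples (the function name is mine):

```python
def max_ribbon_length(ribbons, k):
    # maximal length such that we can cut at least k pieces of that length
    def enough(length):
        return sum(ribbon // length for ribbon in ribbons) >= k

    left, right = 1, max(ribbons)
    while left < right:
        mid = (left + right + 1) // 2  # bias right when maximizing
        if enough(mid):
            left = mid
        else:
            right = mid - 1
    return left if enough(left) else 0

print(max_ribbon_length([9, 7, 5], 3))   # 5
print(max_ribbon_length([5, 7, 9], 22))  # 0: not even 22 pieces of length 1
```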

I still don’t get this. “Remember this: after exiting the while loop, left is the minimal k satisfying the condition function”. Can you give me a more detailed explanation?

Sure, I can give you a more detailed explanation.

The binary search algorithm works by repeatedly dividing in half the portion of the array that could contain the target value. The key idea is to maintain a range [left, right] that is guaranteed to contain the target value, and adjust the range based on the comparison of the middle value with the target value.

The invariant we’re maintaining here is that our answer lies in the range [left, right]. For each iteration, we examine the middle value mid, which is calculated as (left + right) / 2. There are generally two scenarios:

  1. The mid value satisfies the condition. In this case, mid could be a potential answer. However, we want to find the minimum value that satisfies the condition, so we continue our search in the range [left, mid]. This is because there could be smaller values that satisfy the condition to the left of mid.

  2. The mid value doesn’t satisfy the condition. In this case, we know that all values to the left of mid can’t be the answer because they are even smaller than mid and won’t satisfy the condition. So we continue our search in the range [mid + 1, right].

This process continues until left equals right. When the loop ends, left (or right as they are equal) is the minimal value satisfying the condition, because:

  • No value smaller than left satisfies the condition: left only moves past a value after that value (as mid) failed the condition, and by monotonicity everything below a failing mid fails as well.
  • left itself satisfies the condition: right only ever moves onto values that satisfied the condition, and when the loop ends left == right.

Hence, at the end of the binary search process, left is indeed the minimal k satisfying the condition function.

Remember, the key to understanding this explanation lies in grasping the invariant concept, which is that the answer is always within the [left, right] range during the binary search. This range is effectively narrowed down with each iteration until we find the exact answer.

Would you mind explaining to me when to use while condition left < right and left <= right? Thanks!

The choice of whether to use left < right or left <= right in the condition for the while loop of binary search depends on how you’re updating the left and right pointers and what you’re searching for.

  1. while(left <= right): This is typically used in basic binary search where you’re searching for a specific target value in a sorted array. You keep going until left and right cross each other, which indicates that the target isn’t in the array. After the loop, if you didn’t find the target and returned from within the loop, you can say for certain that the target is not in the input array.

  2. while(left < right): This condition is typically used in binary search problems where you’re looking for an optimal value rather than a specific target value, like in a binary search for the first or last true in a boolean function. Here you’re guaranteed that a valid answer exists, so you don’t need left and right to cross each other. You keep narrowing down the range until left and right are adjacent or meet each other, at which point you’ve honed in on your answer. After the loop, you typically return left (or right as they should be the same), which is your answer.

Remember, these are just guidelines and the actual conditions might need tweaking based on the specifics of the problem at hand.

I tried this on Removable Characters, but it didn’t work,

class Solution:
    def maximumRemovals(self, s, p,removable):
        def possible(k):
            j=0
            banned=set(removable[:k])
            for i ,c in enumerate(s):
                if i in banned:
                    continue
                if c==p[j]:
                    j+=1
                    if j==len(p):
                        return True
            return False

        left,right=0,len(removable)
        while left<right:
            mid=left + (right-left) //2
            if not possible(mid):
                right=mid
            else:
                left=mid+1
        return left-1

could you tell me if I m missing something in regard to maximum value of k

Your possible function is actually fine: removable already contains indices into s, so checking i in banned is correct. The real bug is the search boundary. You are looking for the minimal k with not possible(k) and returning left - 1, but with right = len(removable), the case “every removal still leaves p a subsequence” can never be expressed: the search converges to at most len(removable), so you return len(removable) - 1 even when the true answer is len(removable). Extend the upper bound by one:

class Solution:
    def maximumRemovals(self, s, p, removable):
        def possible(k):
            j = 0
            banned = set(removable[:k])  # indices into s, as in your code
            for i, c in enumerate(s):
                if i in banned:
                    continue
                if c == p[j]:
                    j += 1
                    if j == len(p):
                        return True
            return False

        # find the minimal k that breaks the subsequence; the answer is k - 1.
        # right = len(removable) + 1 keeps "all removals work" reachable.
        left, right = 0, len(removable) + 1
        while left < right:
            mid = left + (right - left) // 2
            if not possible(mid):
                right = mid
            else:
                left = mid + 1
        return left - 1

With this boundary, when every prefix of removable keeps p a subsequence, left converges to len(removable) + 1 and we correctly return len(removable); in all other cases the behavior is unchanged.

If anybody is having trouble understanding the 1482. Bloom Day question, I have coded the article’s feasible function again, as I had a little trouble with what // and % really mean in that feasible function, so back to if :)

Thanks a ton for uploading this template !!

def minDays(self, bloomDay, m, k):
    def feasible(days):
        bouquets, flowers = 0, 0
        for bloom in bloomDay:
            if bloom > days:
                flowers = 0
            else:
                flowers += 1
                if flowers >= k:
                    flowers = 0
                    bouquets += 1
                    if bouquets == m:
                        return True
        return bouquets >= m

    if len(bloomDay) < m * k:
        return -1
    left, right = 1, max(bloomDay)
    while left < right:
        mid = left + (right - left) // 2
        if feasible(mid):
            right = mid
        else:
            left = mid + 1
    return left

Yes, that’s a valid way to write the feasible function for the problem as well. The original feasible function using // and % was trying to optimize the process a bit by calculating the number of flowers and bouquets in one go, but it might be harder to understand.

Here’s a quick explanation of the original feasible function for clarity:

  1. bouquets = flowers // k: This line calculates how many complete bouquets can be made from the current number of flowers. The // operator performs integer (floor) division.

  2. flowers %= k: This line updates the number of flowers to the remainder of flowers divided by k. The % operator gives that remainder.

These two operations together essentially divide the number of flowers into complete bouquets (flowers // k), and keep track of the leftover flowers that can’t form a complete bouquet (flowers % k).

Your alternative function, however, takes a more intuitive approach by directly counting the number of flowers and forming a bouquet whenever enough flowers have bloomed. It’s a great demonstration of how the same logic can be implemented in different ways.
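For reference, the //-and-% version under discussion looks roughly like this, reconstructed from the description above as a standalone function (so treat the exact shape as a sketch):

```python
def feasible_mod(bloomDay, m, k, days):
    """Can m bouquets of k adjacent flowers be made within `days` days?

    Reconstruction of the //-and-% style discussed above."""
    bouquets, flowers = 0, 0
    for bloom in bloomDay:
        if bloom > days:
            flowers = 0                     # streak of adjacent blooms broken
        else:
            bouquets += (flowers + 1) // k  # a 1 here means a bouquet completed
            flowers = (flowers + 1) % k     # leftover flowers in the streak
    return bouquets >= m
```

Each bloomed flower either completes a bouquet, in which case (flowers + 1) // k contributes 1 and the streak resets, or it joins the current leftover streak via (flowers + 1) % k.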

I have solved all the problems mentioned in the article, but I still don’t feel confident writing the condition method: when comparing against a threshold, should I include = or not? When returning True/False, I’m not sure whether the check should be <= threshold or < threshold.

Your concern is a common one when learning binary search problems, and it’s perfectly normal to feel unsure at first.

To determine whether to use <= or < in the condition function, it often depends on the specific problem and how it’s worded.

Here’s a general approach to help you decide:

  1. Read the problem statement carefully and try to understand what the condition function is checking.

  2. Determine what the boundary conditions are (i.e., the smallest or largest possible values that satisfy the condition).

  3. Test your condition function with these boundary cases to ensure it behaves as expected.

  4. Adjust the <= or < in your condition function as needed based on the results of your tests.

Remember that the condition function is just a way of implementing the logic of the problem. It doesn’t have a fixed structure or formula that applies to all problems, so it’s okay to adjust it as needed.

In terms of the threshold, typically you want to return True if the calculated value is less than or equal to (<=) the threshold, as it indicates that the current choice can meet the requirement. For instance, if the task is to minimize a certain value to be under a threshold, if you get a value equal to the threshold, it’s still a valid choice. So, it would be safer to include the equal sign (=).

However, again, it will depend on the specific requirements of the problem. Always refer back to the problem statement when you are uncertain. Practice and experience with a variety of problems will also help you get more comfortable with making these decisions.
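As a concrete illustration, here is a sketch using the classic Koko-eating-bananas setup (an example of my choosing, not from the discussion above): the task is to minimize the eating speed so that the total hours fit within h, and since finishing in exactly h hours is still valid, the condition uses <=.

```python
import math

def feasible(piles, h, speed):
    # Hours needed at this speed: one pile per sitting, rounded up.
    hours = sum(math.ceil(p / speed) for p in piles)
    return hours <= h   # '=' included: hitting the threshold exactly is OK

def min_eating_speed(piles, h):
    left, right = 1, max(piles)   # speed 1 .. fastest-useful speed
    while left < right:
        mid = left + (right - left) // 2
        if feasible(piles, h, mid):
            right = mid
        else:
            left = mid + 1
    return left
```

Had the condition used hours < h instead, a speed that finishes in exactly h hours would be wrongly rejected, and the search would return a larger speed than necessary.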

public int mySqrt(int x) {
    if (x == Integer.MAX_VALUE) x -= 1;
    long l = 0, r = x + 1;
    long num = (long) x;
    while (l < r) {
        long mid = l + (r - l) / 2;
        if (mid * mid > num) r = mid;
        else l = mid + 1;
    }
    return (int) l - 1;
}

Your implementation is actually correct, including the smallest inputs: for x = 0, l converges to 1 and you return 0; for x = 1, l converges to 2 and you return 1. The interesting part is the if (x == Integer.MAX_VALUE) x -= 1; guard: x + 1 is evaluated in int arithmetic before being widened for the assignment to r, so without the decrement it would overflow. The decrement is safe here, because the integer square roots of 2^31 - 1 and 2^31 - 2 are both 46340.

A slightly cleaner variant avoids the guard entirely by doing the addition in long arithmetic, with optional early returns for the trivial inputs:

public int mySqrt(int x) {
    if (x == 0 || x == 1) return x;   // trivial inputs, skip the search
    long num = (long) x;
    long l = 0, r = num + 1;          // long addition: no overflow guard needed
    while (l < r) {
        long mid = l + (r - l) / 2;
        if (mid * mid > num) r = mid; // mid is too large, shrink right bound
        else l = mid + 1;             // mid * mid <= num, answer is at least mid
    }
    return (int) l - 1;
}

Both versions find the first value whose square exceeds num and return one less, which is the floor of the square root.

Updates of the form r = mid, l = mid + 1 go with mid = (r + l) / 2, and updates of the form l = mid, r = mid - 1 go with mid = (r + l + 1) / 2.

That pairing is exactly right, and there is a concrete reason behind it: the mid-point rounding must match the direction in which the range shrinks, otherwise the loop can fail to terminate.

  1. mid = (r + l) / 2: left-biased (rounds down). Pair this with the updates r = mid and l = mid + 1. Because l = mid + 1 always moves past mid, the range shrinks on every iteration, even when mid lands on l.

  2. mid = (r + l + 1) / 2: right-biased (rounds up). Pair this with the updates l = mid and r = mid - 1. With a left-biased mid, a two-element range [l, l + 1] would give mid == l, and l = mid would make no progress, looping forever; rounding up guarantees mid == r there, so either update shrinks the range.

One practical note: l + (r - l) / 2 and l + (r - l + 1) / 2 compute the same mid-points while avoiding the integer overflow that r + l can cause in fixed-width languages such as Java and C++.
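A minimal sketch of the right-biased pairing, for a "find the last k satisfying a condition" search (last_true and the sample predicates are illustrative names, not from the article):

```python
def last_true(lo, hi, condition):
    """Largest k in [lo, hi] with condition(k) True, assuming condition
    is monotone (True...True False...False) and condition(lo) holds."""
    while lo < hi:
        # Right-biased mid: with a floor mid, the update lo = mid would
        # make no progress once hi == lo + 1, and the loop would never end.
        mid = lo + (hi - lo + 1) // 2
        if condition(mid):
            lo = mid        # mid is still valid, keep it in range
        else:
            hi = mid - 1    # mid is ruled out
    return lo
```

Swapping in the floor mid `lo + (hi - lo) // 2` here is the classic infinite-loop bug: on a two-element range with condition(lo) True and condition(hi) False, mid equals lo, lo = mid changes nothing, and the loop spins forever.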

Why is the value of the identifier “right” taken one greater than seemingly required, as in left, right = 0, n + 1? Why can’t it just be n?

The way you define the right value depends on whether your search range is inclusive or exclusive.

When we do left, right = 0, n+1, we’re defining an exclusive upper bound for the search space: n+1 itself is never tested as an answer, but it keeps n reachable. This can make the loop and exit conditions simpler, because you can use left < right as the loop condition, and when the loop exits, left is guaranteed to be the minimal value satisfying the condition function.

In contrast, if we do left, right = 0, n, this implies that n is included in the search space, and we’re defining an inclusive upper bound. In this case, the loop condition would need to be left <= right to include n in the search.

For some problems, an inclusive bound might be more intuitive, but for other problems, an exclusive bound might simplify the loop and exit conditions. Both approaches are correct; it just depends on the specific problem and your personal preference.
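A short sketch of why the exclusive bound helps when n itself can be the answer (condition here is a stand-in for any monotone predicate, not something from the article):

```python
def min_k(n, condition):
    """Minimal k in [0, n] with condition(k) True, for a monotone condition."""
    left, right = 0, n + 1        # exclusive upper bound: candidates are 0..n
    while left < right:
        mid = left + (right - left) // 2
        if condition(mid):
            right = mid           # mid may be the answer, keep it
        else:
            left = mid + 1        # mid is ruled out
    return left                   # may legitimately be n
```

A bonus of this form: if no k in [0, n] satisfies the condition, the function returns n + 1, a natural "not found" sentinel that the inclusive form right = n cannot express, since there left would end at n whether n is the answer or nothing is.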