Longest Increasing Subsequence

The problem is to find the length of the longest strictly increasing subsequence in an array of integers.

Here’s a simple explanation of the code to solve this problem:

  1. Initialize an array dp of the same length as nums and fill it with 1. This array will keep track of the length of the longest increasing subsequence ending at each position.
  2. Nested Loop: For each pair of elements i and j with i < j, check if nums[i] < nums[j]. If so, update dp[j] to be the maximum of its current value and dp[i] + 1.
  3. Find the Maximum: The length of the longest increasing subsequence is the maximum value in dp.

Here’s the code:

from typing import List

class Solution:
    def lengthOfLIS(self, nums: List[int]) -> int:
        if not nums:
            return 0

        # dp[i] = length of the longest increasing subsequence ending at index i
        dp = [1] * len(nums)

        for i in range(len(nums)):
            for j in range(i + 1, len(nums)):
                # nums[j] can extend a subsequence that ends with nums[i]
                if nums[i] < nums[j]:
                    dp[j] = max(dp[j], dp[i] + 1)

        return max(dp)

This code uses a dynamic programming approach to efficiently find the length of the longest increasing subsequence. The time complexity of this approach is O(n^2), and the space complexity is O(n), where n is the length of the nums array.

The same O(n^2) dynamic programming approach in Ruby:

# @param {Integer[]} nums
# @return {Integer}
def length_of_lis(nums)
  return 0 if nums.empty?

  # dp[i] = length of the longest increasing subsequence ending at index i
  dp = Array.new(nums.size, 1)

  (1...nums.size).each do |i|
    (0...i).each do |j|
      dp[i] = [dp[i], dp[j] + 1].max if nums[j] < nums[i]
    end
  end

  dp.max
end

Identifying Problem Isomorphism

“Longest Increasing Subsequence” asks you to find the length of the longest strictly increasing subsequence in the given array nums.

An isomorphic problem to this is the “Russian Doll Envelopes”, which asks you to find the maximum number of envelopes you can Russian doll (i.e., put one inside the other).

The structure of both problems is to find a longest increasing sequence. In “Longest Increasing Subsequence”, it’s about finding a longest increasing sequence in an array. In “Russian Doll Envelopes”, it’s about finding the maximum number of envelopes that can be put one into another, which is essentially finding a longest increasing sequence in the envelope dimensions.

The main difference is in the details of the problems and the added complexity in “Russian Doll Envelopes”. In “Russian Doll Envelopes”, the sequence is based on two dimensions (width and height of the envelopes) and envelopes can only be put into each other if both dimensions of one envelope are strictly smaller than the other, so it requires more complex comparisons and sorting.

“Russian Doll Envelopes” can be considered more complex than “Longest Increasing Subsequence”, as it is a 2D version of the problem, requiring sorting and comparison in two dimensions rather than just one.

“Longest Increasing Subsequence” is simpler as it involves a single dimension (an array of numbers) and requires only one type of comparison (whether one number is larger than another). Start with “Longest Increasing Subsequence”, then move on to “Russian Doll Envelopes”.

from bisect import bisect_left

class Solution:
    def lengthOfLIS(self, nums: list[int]) -> int:
        # arr[k] holds the smallest possible tail value of an increasing
        # subsequence of length k + 1 seen so far
        arr = [nums[0]]
        for n in nums[1:]:
            if n > arr[-1]:                   # n extends the longest subsequence found so far
                arr.append(n)
            else:                             # replace the first tail >= n to keep tails small
                arr[bisect_left(arr, n)] = n

        return len(arr)
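
A quick sanity check against the example array used throughout this write-up:

print(Solution().lengthOfLIS([10, 9, 2, 5, 3, 7, 101, 18]))  # prints 4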

10 Prerequisite LeetCode Problems

Here are ten problems to build up to the “300. Longest Increasing Subsequence” problem:

  1. 344. Reverse String: This is a basic problem that will get you accustomed to manipulating sequences of data.

  2. 27. Remove Element: This problem has you rearranging elements in a sequence, which is a fundamental concept for sequence-based problems.

  3. 167. Two Sum II - Input array is sorted: This problem also involves sequence manipulation but adds a bit of complexity by having you find two numbers that add up to a target.

  4. 283. Move Zeroes: This problem introduces you to the concept of maintaining the relative order of elements, which is crucial in “Longest Increasing Subsequence”.

  5. 26. Remove Duplicates from Sorted Array: A step further in maintaining the relative order of elements; it also emphasizes in-place modification, which is good practice for handling sequence or array-like data.

  6. 88. Merge Sorted Array: This problem involves merging two sorted sequences, which will help you understand the concept of sorted sequences better.

  7. 121. Best Time to Buy and Sell Stock: This problem involves finding a subsequence (a buying day followed by a selling day) with a certain property (maximizes profit), which is conceptually similar to finding a longest increasing subsequence.

  8. 53. Maximum Subarray: While this problem isn’t about subsequences per se, it involves finding a subarray (which is similar to a subsequence) that has the largest sum, which will help you get comfortable with problems about optimizing some property of a subarray or subsequence.

  9. 217. Contains Duplicate: This problem involves scanning through a sequence to find whether there’s a duplicate. The technique of scanning a sequence and keeping track of what you’ve seen will be helpful for “Longest Increasing Subsequence”.

  10. 746. Min Cost Climbing Stairs: This is a simpler dynamic programming problem that will help you get started with the concept of dynamic programming, which is necessary to solve “Longest Increasing Subsequence” efficiently.

Problem Analysis and Key Insights

  1. Subsequence not Subarray: The problem asks for the longest increasing subsequence, not subarray. This is an important distinction because a subsequence doesn’t require the elements to be contiguous in the array, unlike a subarray. This means we can skip over some elements in the array to form an increasing sequence.

  2. Order of Elements Matters: Since we’re asked to find an increasing subsequence, the order of elements in the input array matters. We can’t rearrange the array before finding the subsequence.

  3. Dynamic Programming: The problem has an optimal substructure as the longest increasing subsequence ending at any position i can be calculated based on the calculated longest increasing subsequences of all positions before i. This gives us a hint that we can use dynamic programming to solve this problem.

  4. Binary Search for Optimization: The follow-up question suggests aiming for a time complexity of O(n log n), which implies that an efficient searching mechanism such as binary search might be applicable to optimize the solution further. This might lead us to consider using a data structure such as a list or a tree that is suitable for binary search.

  5. Edge Cases: Since the integers in the array can be negative, and the same integer can appear multiple times, edge cases involving negative and duplicate values should be considered during the implementation.

These insights form the basis of the approach we’re going to use to solve the problem. They help us understand the nature of the problem, the possible strategies we can use, and the potential pitfalls or edge cases we should be aware of.

Problem Boundary

The scope of the “Longest Increasing Subsequence” problem can be considered from several angles:

  1. Input Size: The problem constraints indicate that the input array can have up to 2500 elements, which is a relatively large size. This means that the algorithm needs to handle such large inputs efficiently, which rules out solutions with exponential time complexity.

  2. Data Types: The elements of the input array are integers. They can be positive, negative, or zero, which adds to the complexity of the problem because the sequence can include a mix of negative and positive numbers.

  3. Output: The expected output is a single integer: the length of the longest increasing subsequence. Note that the actual subsequence is not required, only its length.

  4. Problem Domain: The problem belongs to the domain of array manipulation and dynamic programming. Knowledge and understanding of these areas are required to develop an efficient solution.

  5. Algorithmic Complexity: The follow-up question suggests that we should aim for a solution with a time complexity of O(n log n), which is a strong indication that the solution involves more advanced algorithmic concepts, such as binary search in addition to dynamic programming.

So, the scope of this problem involves developing an understanding of array manipulation, dynamic programming, handling large inputs efficiently, and possibly using advanced algorithmic techniques such as binary search.

The boundary of the “Longest Increasing Subsequence” problem is defined by its constraints and requirements, which are:

  1. Input: The input is an array of integers nums where the length of the array (nums.length) is between 1 and 2500, inclusive. Each integer in the array is within the range of -10^4 to 10^4, inclusive. This is the data that your solution will work with.

  2. Output: The output is a single integer that represents the length of the longest strictly increasing subsequence in the nums array. It’s important to note that a subsequence is not required to be contiguous or unique.

  3. Functionality: Your solution must accurately determine the length of the longest increasing subsequence, given the array nums.

  4. Performance: The follow-up question suggests that your solution should have a time complexity of O(n log n) or better. This means your solution needs to be efficient enough to handle the maximum input size within a reasonable time frame.

  5. Side-Effects: The problem does not specify any side-effects, so we assume there should be none. That is, your solution should not modify the input array or change any global or static variables.

  6. Error Handling: The problem does not specify how to handle errors or invalid input. Given the constraints of the problem, we can assume that the input will always be valid.

These are the boundaries within which you need to work when solving this problem. Any solution that stays within these boundaries and correctly solves the problem should be acceptable.

Problem Classification

Domain Classification: This problem falls into the category of dynamic programming (DP) and sequence analysis in computer science. Dynamic programming is a method for solving complex problems by breaking them down into simpler subproblems, solving each of those subproblems just once, and storing their solutions to avoid duplicate work. In this case, the subproblems involve finding the longest increasing subsequences within different sections of the array.

Problem Analysis: The task here is to find the length of the longest strictly increasing subsequence in the given array of integers.

‘What’ Components:

  1. Input: The input is an array of integers nums which can have a length from 1 to 2500. Each integer in the array is between -10^4 and 10^4.

  2. Output: The output is an integer that represents the length of the longest strictly increasing subsequence within the input array.

  3. Objective: The main objective is to find and return the length of the longest strictly increasing subsequence in the given array of integers. A subsequence is a sequence that can be derived from another sequence by deleting some or no elements without changing the order of the remaining elements.

Based on the ‘What’ components, the problem can be further classified into the optimization category, as the task is to optimize (maximize, in this case) the length of a subsequence that meets certain conditions (strictly increasing order). The challenge is to achieve this optimization efficiently, ideally in O(n log n) time complexity.

Constraint Analysis: The constraints mainly focus on the length of the array and the range of the integers. This gives us an idea about the expected performance of the solution. The constraints suggest that the solution needs to handle quite large arrays and thus, needs to be optimized to run within a reasonable time limit. The follow-up question suggests the idea of achieving a solution with a time complexity of O(n log n), which indicates the use of efficient searching algorithm or data structure, like binary search or balanced binary search trees.

Distilling the Problem to Its Core Elements

  1. Fundamental Concept: The Longest Increasing Subsequence (LIS) problem is primarily based upon the concept of Dynamic Programming (DP). DP is a problem-solving paradigm that solves problems by combining the solutions to subproblems. For this problem, the idea is to use previously calculated lengths of increasing subsequences to determine the longest one. It also involves some understanding of sequence analysis, which is fundamental to many fields including computer science and bioinformatics.

  2. Simplest Description: Imagine you have a list of random numbers. You need to find the longest chain of numbers in the list such that every number is strictly larger than the previous one. However, these numbers don’t have to be right next to each other in the list.

  3. Core Problem: The core problem here is determining the longest sequence of numbers in a given list where each number is strictly larger than the one before it. Simplified, we’re trying to find the longest chain of increasing numbers in a list.

  4. Key Components:

    • Understanding the concept of a subsequence in a list of numbers.
    • Identifying and applying the concept of dynamic programming to solve the problem.
    • Using previous calculations of subproblem solutions to solve larger problems.
  5. Minimal Set of Operations:

    • Initialization of a DP table/array to store the maximum length of the subsequence for each position.
    • Iteration through the list, for each element iterating through all previous elements.
    • Comparison of the current element with previous elements to check if current element can extend the increasing subsequence formed by the previous elements.
    • Update of DP table/array, storing the maximum length of the increasing subsequence at each step.
    • Finally, return the maximum length from the DP table, which represents the length of the longest increasing subsequence.

Visual Model of the Problem

Visualizing this problem can be quite useful to understand it better. One common approach is to draw an array representing the given list of numbers, and an additional array representing the maximum length of the increasing subsequence at each position.

Let’s take an example with the array [10, 9, 2, 5, 3, 7, 101, 18]. We initialize another array of the same size with all elements as 1 because a single number itself forms an increasing subsequence of length 1.

We start from the second element in the array and move right. For each number, we look at all the numbers to its left. If the current number is greater than the left number, that means it can form an increasing subsequence. We compare the lengths of the subsequences and keep the longest.

Here’s how we can visualize this process:

| 10 | 9 | 2 | 5 | 3 | 7 | 101 | 18 |
| 1  | 1 | 1 | 1 | 1 | 1 | 1   | 1  |

In the end, the second row becomes:

| 1 | 1 | 1 | 2 | 2 | 3 | 4 | 4 |

This indicates that the length of the longest increasing subsequence is 4.

You can also visualize this process with a graph or a plot where the x-axis represents the indices of the original array and the y-axis represents the numbers at those indices. This will give you a graphical representation of the subsequences. For the array above, you could see a subsequence [2, 5, 7, 101] forming a strictly increasing curve on the plot.
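
A minimal sketch for reproducing the second row above for any input; the helper name lis_dp_table is made up for this illustration:

def lis_dp_table(nums):
    # dp[i] = length of the longest increasing subsequence ending at index i
    dp = [1] * len(nums)
    for i in range(len(nums)):
        for j in range(i):
            if nums[j] < nums[i]:
                dp[i] = max(dp[i], dp[j] + 1)
    return dp

print(lis_dp_table([10, 9, 2, 5, 3, 7, 101, 18]))  # [1, 1, 1, 2, 2, 3, 4, 4]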

Problem Restatement

We are given an array of integers. Our goal is to find the length of the longest sequence within this array where the numbers strictly increase from one element to the next. In other words, for any two consecutive numbers in this sequence, the second number must always be greater than the first.

For instance, if our input array was [10,9,2,5,3,7,101,18], the longest increasing subsequence would be [2,5,7,101] which has a length of 4.

Note that these sequences don’t have to be contiguous or consecutive in the original array. They just need to maintain the same relative order. So, for example, in the sequence [2,5,7,101], even though 5 and 7 aren’t next to each other in the original array, they are part of the same sequence because 5 comes before 7 in the array.

The problem comes with a few constraints:

  • The size of the input array will be between 1 and 2500.
  • Each number in the array can range from -10^4 to 10^4.

As a challenge, the problem encourages us to come up with a solution that works in O(n log(n)) time, where n is the number of elements in the array. However, this isn’t a strict requirement, so solutions with different time complexities could also work.

Abstract Representation of the Problem

We can indeed describe this problem in abstract terms:

We’re given a sequence S of n elements from a totally ordered set. A totally ordered set means that for any two elements, one is greater than, less than, or equal to the other. Our task is to find the longest subsequence of S where every element is strictly larger than its preceding element.

Key elements of the abstract representation are:

  1. Sequence S: This is the input array of integers.

  2. Totally ordered set: This is the mathematical term for the concept that every pair of elements in the set can be compared. In the context of this problem, our totally ordered set is the set of all integers.

  3. Subsequence: This is a sequence that can be derived from another sequence by deleting some or no elements without changing the order of the remaining elements.

  4. Strictly larger: An element is strictly larger than another element if it is greater than and not equal to the other element.

Remember, the elements in the subsequence do not have to be contiguous in the original sequence S. They just have to be in the same relative order.

The output of this abstract problem is a single integer, which is the length of the longest strictly increasing subsequence.

The constraint on the time complexity, O(n log(n)), can also be considered an abstract part of the problem, as it defines a limit on the computational resources available for solving the problem.

Terminology

Here are a few key terms and concepts:

  1. Subsequence: A subsequence is a sequence derived from another sequence by deleting some (or no) elements without changing the order of the remaining elements. It’s a key concept because our problem involves finding a particular type of subsequence.

  2. Increasing subsequence: This is a type of subsequence where each element is larger than the previous one. It’s central to this problem because we’re tasked with finding the longest increasing subsequence.

  3. Dynamic Programming: Dynamic Programming is a problem-solving approach where complex problems are broken down into simpler subproblems, and solutions to these subproblems are stored so they don’t need to be recomputed. This method is often used in problems like this one, especially because the subproblems overlap.

  4. Time complexity: Time complexity describes the computational complexity of an algorithm, or how the time to complete the algorithm scales with the size of the input. In this problem, there’s a constraint that the solution should have a time complexity of O(n log(n)).

  5. Binary search: Binary search is an algorithm used to find a particular element in a sorted list. It halves the search space at each step by comparing the target with the middle element. The relevance of binary search to this problem comes from its role in optimizing the dynamic programming solution to achieve the desired O(n log(n)) time complexity.

  6. Array: An array is a fundamental data structure in computer science that consists of a collection of elements (values or variables), each identified by an array index. In this problem, we are given an array of integers.

Understanding these concepts is crucial for formulating and implementing a solution to the problem.

Problem Simplification and Explanation

In essence, the problem is asking us to identify the longest series of numbers in the given list where each number is larger than the previous one. To simplify this further, imagine the numbers in the list are a series of steps of various heights and in no particular order. Our task is to find the longest sequence of steps where each step is higher than the last, but we’re allowed to skip steps.

So for example, given the steps (numbers) [10,9,2,5,3,7,101,18], one possible sequence is [10, 18], but that’s not the longest one. You could also do [2, 3, 7, 101], which gives you 4 steps, and that’s the longest sequence you can get where each step is higher than the last.

A few important concepts come into play here:

  1. Sequences and Subsequences: As mentioned, a subsequence is any subset of numbers from the original list that maintains the original order of the numbers. In our steps analogy, a subsequence is a series of steps we can traverse while maintaining their original positions.

  2. Increasing Order: A sequence (or subsequence) is in increasing order if each number (or step) is larger (higher) than the previous one. In our problem, we’re specifically interested in subsequences where each step is higher than the previous one.

  3. Longest Increasing Subsequence (LIS): This is simply the longest subsequence that is in increasing order. In terms of our steps analogy, the LIS is the longest path we can take where each step is higher than the previous one.

Understanding these key concepts and how they interact forms the basis for approaching a solution to this problem.

In the context of an array or a string, a subarray (or substring) is a contiguous run of elements, meaning it does not skip elements between its start and end. A subsequence, on the other hand, is not required to be contiguous. It can include elements that are scattered throughout the array or string, skipping some elements in between.

Let’s consider an example to illustrate this. Given an array [1, 2, 3, 4, 5], here are some examples of subarrays and subsequences:

  • Subarrays: [1, 2], [3, 4, 5], [2, 3, 4] (all contiguous runs)
  • Subsequences: [1, 3, 5], [2, 4], [1, 2, 5] (not necessarily contiguous)

So, in the context of the problem of finding the Longest Increasing Subsequence (LIS), we are looking for the longest subset of the array that is in increasing order and not necessarily contiguous.

Constraints

Let’s identify certain characteristics or conditions from the problem statement of the Longest Increasing Subsequence that might help us devise an efficient solution.

  1. Increasing Subsequence: The problem asks for an “increasing” subsequence. This means that we’re looking for a set of numbers where each number is larger than the one before it. This property can be exploited to design an algorithm because once we know a subsequence is increasing, we only need to find a number larger than the last number in the subsequence to extend it.

  2. Length of the Subsequence: The problem asks for the “length” of the longest increasing subsequence, not the subsequence itself. This can simplify our task, because we only need to keep track of the length of the longest subsequence we’ve found so far, rather than remembering all the elements in that subsequence.

  3. Array Length Constraint: The array length is constrained between 1 and 2500. This is too large for a brute-force enumeration of all subsequences, but it hints that a solution with a time complexity of O(n^2) or O(n log n) will be feasible.

  4. Range of Elements: The elements in the array are constrained within a specific range (-10^4 to 10^4). While this may not directly affect our algorithm design, it’s something to keep in mind when considering possible solutions, as it could potentially be useful in certain situations.

Looking at these characteristics, an effective strategy might be to use dynamic programming, which is a method for solving problems by breaking them down into smaller subproblems and reusing solutions to these subproblems to build up solutions to larger ones. This seems fitting here, because the problem of finding the longest increasing subsequence at any point in the array depends on the solutions to the same problem for the earlier points in the array. This realization can significantly reduce the time complexity of our solution.

Analyzing the constraints of a problem gives us valuable insights into the problem’s nature and the kind of solution approaches that could be feasible. In the case of the Longest Increasing Subsequence problem, the constraints guide us in a few key ways:

  1. Array Length: The constraint of the array length being between 1 and 2500 indicates that the problem cannot be efficiently solved with a brute force approach, which would enumerate all subsequences and therefore run in roughly O(2^n) time. An efficient solution to the problem will likely need to have a time complexity of O(n^2) or O(n log n) to be considered effective for the given constraints.

  2. Array Elements Range: The array elements are within the range of -10^4 to 10^4. This is an important insight because if the range of elements was smaller, we could have used counting sort or other similar strategies to our advantage. However, given the fairly large range, such strategies may not provide an optimal solution.

  3. Length of the Subsequence: The problem asks for the length of the longest increasing subsequence, not the sequence itself. This constraint simplifies our problem because we don’t need to store the entire sequence, only the length. This impacts our space complexity, which could be important if we are dealing with large inputs.

From these insights, it seems that a dynamic programming approach might be most suitable. In this context, each subproblem would be to find the longest increasing subsequence ending at each position in the array. This is guided by the realization that the solution at any position in the array depends on the solutions at previous positions, and dynamic programming excels in such scenarios by storing and reusing these subproblem solutions.

Case Analysis

Here are some additional test cases that highlight different aspects of the problem:

  1. Single Element (Edge Case):

    • Input: [5]
    • Output: 1
    • Explanation: The array contains only one element. Therefore, the longest increasing subsequence is that single element itself, so the length is 1.
  2. All Elements Same (Edge Case):

    • Input: [7,7,7,7,7,7,7]
    • Output: 1
    • Explanation: In this case, all the elements in the array are the same. The longest strictly increasing subsequence will be a single element, as including more than one element of the same value won’t make the subsequence strictly increasing.
  3. All Elements in Increasing Order (Ideal Case):

    • Input: [1,2,3,4,5,6,7,8,9]
    • Output: 9
    • Explanation: The entire array is already in strictly increasing order. Therefore, the longest increasing subsequence is the entire array itself.
  4. All Elements in Decreasing Order (Worst Case):

    • Input: [9,8,7,6,5,4,3,2,1]
    • Output: 1
    • Explanation: The array is in strictly decreasing order. Therefore, the longest strictly increasing subsequence can only be one element.
  5. Mixed Order Elements (General Case):

    • Input: [10,9,2,5,3,7,101,18]
    • Output: 4
    • Explanation: The longest increasing subsequence is [2,3,7,101], therefore the length is 4.
  6. Negative Numbers Included:

    • Input: [-1,3,4,5,2,2,2,2]
    • Output: 4
    • Explanation: The longest increasing subsequence here is [-1,3,4,5]. Even though ‘2’ appears 4 times, it doesn’t contribute to a strictly increasing sequence.

By analyzing these test cases, we can see that the problem involves finding a longest increasing subsequence and not necessarily the longest subarray. The increasing subsequence does not need to be contiguous and can skip elements. The sequence must be strictly increasing, meaning equal values do not count towards increasing the length. These insights are critical in developing a correct and efficient solution to the problem.

Analyzing the different cases helps us to understand the constraints of the problem and the key strategies required to solve it. Here are the key insights we can derive:

  1. Nature of Longest Increasing Subsequence (LIS): The LIS does not need to be a contiguous subarray. It can skip elements in between.

  2. Single-element and All-same-element Cases: Even if the array contains a single element or all elements in the array are the same, the longest increasing subsequence is of length 1. This shows that the subsequence needs to be strictly increasing.

  3. All Increasing or All Decreasing Case: When all elements are in increasing order, the entire array is the LIS. In contrast, when all elements are in decreasing order, since no two elements form an increasing sequence, the LIS length is 1. This gives us a hint about the importance of the relative order of elements.

  4. Mixed Order Elements: In general, when the elements are in a mixed order, we need to find the longest sequence of numbers where each number is larger than the previous one. This shows the necessity of a dynamic approach, as the solution depends on the solutions of smaller subproblems (subarrays).

  5. Presence of Negative Numbers or Repeating Numbers: The presence of negative numbers or repeating numbers doesn’t change the core problem. We still need to find the longest strictly increasing subsequence. However, we need to handle these cases properly in our code.

  6. Array Length and Element Size: The constraints of the problem (1 <= nums.length <= 2500 and -10^4 <= nums[i] <= 10^4) indicate that the solution needs to handle a relatively large input size efficiently. An O(n^2) dynamic programming solution is workable at this size, but the follow-up asks for O(n log n), which rules out anything slower than quadratic time.

These insights help shape our understanding of the problem and guide us towards formulating an efficient algorithm to solve it.

The minimum number of test cases to analyze and generate insights for the problem of finding the longest increasing subsequence would be five, as they cover the variety of scenarios we can encounter in this problem:

  1. Single-Element Case: A single element in the array (e.g., [5]). This tests the edge case where the length of the array is at its lower limit.

  2. All Increasing Elements: The array has all elements in increasing order (e.g., [1, 2, 3, 4, 5]). This tests the scenario where the longest increasing subsequence is the entire array.

  3. All Decreasing Elements: The array has all elements in decreasing order (e.g., [5, 4, 3, 2, 1]). This tests the scenario where the longest increasing subsequence length is 1, no matter how large the array is.

  4. Mixed Order Elements: The elements are neither all increasing nor all decreasing (e.g., [10, 9, 2, 5, 3, 7, 101, 18]). This tests the scenario where we need to find the longest increasing subsequence among mixed numbers.

  5. Repeated or Negative Numbers: The array contains repeated or negative numbers (e.g., [1, 2, 2, 3, 3, 3, 4, 4, 4, 4] or [-1, -2, 2, 3, 4]). This tests the handling of repeated or negative numbers, which could affect the length of the longest increasing subsequence.

These five cases should provide a good understanding of the problem and allow us to generate insights about the patterns and the ways to solve it. Of course, testing more cases with different conditions and constraints can provide a better understanding and help ensure the solution is robust and comprehensive.
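
These cases translate directly into a small self-check. The sketch below inlines a compact restatement of the O(n^2) DP (as length_of_lis) so the asserts run on their own; any correct implementation should pass them:

def length_of_lis(nums):
    # O(n^2) DP: dp[i] = length of the LIS ending at index i
    dp = [1] * len(nums)
    for i in range(len(nums)):
        for j in range(i):
            if nums[j] < nums[i]:
                dp[i] = max(dp[i], dp[j] + 1)
    return max(dp) if nums else 0

assert length_of_lis([5]) == 1                               # single element
assert length_of_lis([1, 2, 3, 4, 5]) == 5                   # all increasing
assert length_of_lis([5, 4, 3, 2, 1]) == 1                   # all decreasing
assert length_of_lis([10, 9, 2, 5, 3, 7, 101, 18]) == 4      # mixed order
assert length_of_lis([1, 2, 2, 3, 3, 3, 4, 4, 4, 4]) == 4    # repeated values
assert length_of_lis([-1, -2, 2, 3, 4]) == 4                 # negative numbers
print("all cases pass")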

Identification of Applicable Theoretical Concepts

The problem of finding the longest increasing subsequence in an array can be simplified using the following mathematical and algorithmic concepts:

  1. Dynamic Programming (DP): DP is a method for solving complex problems by breaking them down into simpler subproblems. It is particularly useful when the subproblems overlap, meaning the same subproblem is solved multiple times. The solution to the larger problem is then constructed from the solutions to the subproblems. For the longest increasing subsequence problem, the DP approach would involve breaking down the problem into subproblems that find the longest increasing subsequence ending at each position. The main concept applied here is “optimal substructure”, which means the solution to a problem can be constructed from the solutions to its subproblems.

  2. Binary Search: This algorithm is typically used to find a specific value in a sorted sequence in logarithmic time complexity. In this context, binary search can be used to optimize the DP approach, resulting in a solution with time complexity O(n log n). This is done by maintaining an active list of potential increasing subsequences and using binary search to find the right place to extend or update these subsequences.

  3. Patience Sorting: This is a card game-based algorithm that provides another perspective on this problem. The player flips over cards one by one and places each card on the leftmost pile whose top card is greater than or equal to it; if no such pile exists, a new pile is started to the right. Within a pile, cards therefore decrease from bottom to top. The number of piles at the end equals the length of the longest increasing subsequence, and finding the correct pile for each card is exactly the binary search step used in the O(n log n) method.

Remember, identifying and understanding these concepts is a key part of the problem-solving process. They provide a structured way of approaching the problem and often lead to more efficient and effective solutions.
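
A minimal sketch of the pile-counting view, tracking only the top card of each pile (the helper name patience_lis is introduced purely for illustration):

from bisect import bisect_left

def patience_lis(nums):
    tops = []  # tops[k] = top card of pile k; this list stays sorted
    for card in nums:
        k = bisect_left(tops, card)   # leftmost pile whose top is >= card
        if k == len(tops):
            tops.append(card)         # no such pile: start a new pile on the right
        else:
            tops[k] = card            # place the card on that pile
    return len(tops)                  # number of piles = length of the LIS

print(patience_lis([10, 9, 2, 5, 3, 7, 101, 18]))  # 4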

Simple Explanation

Imagine you’re a teacher and you have a line of students in your class. Each student has a number on their shirt. Now, the students are not arranged in any specific order - they could be standing randomly.

Your task is to find the longest line of students you can form such that the numbers on their shirts go up as you move along the line. But here’s the thing, you can’t change the order of the students. You can only pick the students that are already in increasing order of the numbers on their shirts.

For example, if the numbers on the shirts of the students in the line are [10, 9, 2, 5, 3, 7, 101, 18], the longest line of students you can form with increasing numbers is the line with the students wearing the shirts numbered [2, 3, 7, 101]. There are 4 students in this line.

Even though the students are not standing next to each other in the original line, the numbers on their shirts are in increasing order and that’s what you’re looking for.

Your challenge, as a teacher, is to figure out the longest such line you can form given any random line of students.

In terms of our original problem, the students are the elements of the array, and the numbers on their shirts are the values of these elements. The “longest line” is the longest increasing subsequence in the array.

Problem Breakdown and Solution Methodology

Let’s break down the process of solving this problem step-by-step. For this, we’ll use the analogy of playing a card game.

  1. Initial Setup: Imagine you’re playing a card game, where you need to create the longest stack of cards with the rule that each subsequent card you place on the stack must have a higher number than the previous one. The cards are dealt one by one, and you can choose either to include the card in your existing stacks or start a new stack. Here, the stacks represent subsequences, and the cards represent the numbers in the array.

  2. Process Each Element: You go through the array (or the deck of cards) one element at a time. For each number, you have two choices - either add it to an existing increasing subsequence (stack), or start a new subsequence (stack) with it. The goal is to make the stacks as long as possible.

  3. Decide Which Stack to Add to: If the current number is greater than the last number of an existing subsequence (top card of some stack), you add it to that subsequence. If not, you start a new subsequence with the current number. In the card game, you add the card to the stack where the top card has the number just less than the current card. If there’s no such stack, you start a new stack.

  4. Keep Track of Longest Subsequence: As you go through the array, you keep track of the length of the longest subsequence you’ve found so far (the tallest stack of cards you’ve built).

  5. Adjusting Parameters: If the numbers in the array were to change, this would change which subsequences you can form and therefore potentially change the length of the longest increasing subsequence. Similarly, if the order of the numbers in the array changes, this could also affect the length of the longest increasing subsequence.

Let’s walk through an example with the array [10, 9, 2, 5, 3, 7, 101, 18]. We’ll use the approach above to find the longest increasing subsequence:

  • Start with 10. As it’s the first number, start a new subsequence [10].
  • Next is 9. It doesn’t extend any existing subsequence, so start a new subsequence [9].
  • Next is 2. It also starts a new subsequence, [2].
  • 5 can be added to subsequence [2] to create [2, 5].
  • 3 starts a new subsequence [3].
  • 7 can be added to [2, 5] to create [2, 5, 7].
  • 101 can be added to [2, 5, 7] to create [2, 5, 7, 101].
  • Finally, 18 can be added to [2, 5, 7] to create [2, 5, 7, 18], but this is not as long as [2, 5, 7, 101].

So, the longest increasing subsequence is [2, 5, 7, 101] with length 4.

This step-by-step process would also work with different arrays and is efficient enough to handle the constraints of the problem. However, efficiently deciding which stack to add to (step 3) is a bit challenging and may require additional data structures or techniques. The desired time complexity of O(n log(n)) hints at using binary search or similar approaches for this.

Inference of Problem-Solving Approach from the Problem Statement

Let’s break down the key concepts involved in solving this problem:

  1. Longest Increasing Subsequence (LIS): The LIS is a common dynamic programming problem. Recognizing this problem usually involves looking for keywords such as ’longest’, ‘increasing’ or ‘subsequence’. In this context, a subsequence is a sequence that can be derived from another sequence by deleting some or no elements without changing the order of the remaining elements.

  2. Dynamic Programming: The optimal solution for this problem involves dynamic programming, a method for solving problems by breaking them down into smaller subproblems and using the results of those subproblems to build up the answers to larger ones. This is suggested by the need to keep track of multiple possible increasing subsequences.

  3. Binary Search: The optimal solution to this problem requires a time complexity of O(n log(n)), which is often achieved using binary search. Recognizing that binary search can be used here is crucial for reaching an efficient solution. Binary search involves repeatedly dividing the search space in half, which aligns with the log(n) part of the time complexity.

  4. Subproblem Overlap: One key insight here is recognizing the overlapping subproblems - smaller instances of the problem that need to be solved multiple times. This is a characteristic of many dynamic programming problems. In this case, the subproblems are the longest increasing subsequences ending at each position in the array.

As for visualizing these properties, a table or a 1D array can be used to keep track of the longest increasing subsequence ending at each position in the array. Each entry in the table/array represents a subproblem, and its value represents the solution to that subproblem. For instance, you could have a table as follows:

| Array index | 0  | 1 | 2 | 3 | 4 | 5 | 6   | 7  |
| Array value | 10 | 9 | 2 | 5 | 3 | 7 | 101 | 18 |
| LIS length  | 1  | 1 | 1 | 2 | 2 | 3 | 4   | 4  |

This table represents the array [10, 9, 2, 5, 3, 7, 101, 18] and the length of the longest increasing subsequence ending at each position. The maximum value in the ‘LIS length’ row represents the length of the longest increasing subsequence in the entire array.

The Longest Increasing Subsequence (LIS) problem is a classic problem that can be solved using Dynamic Programming (DP). Here’s how you could infer that from the problem statement:

  1. Optimal Substructure: Dynamic programming is often applicable when a problem has an optimal substructure, meaning the optimal solution to a problem can be constructed from the optimal solutions of its subproblems. In the LIS problem, if you know the longest increasing subsequence up to a certain point, you can use that information to find the longest increasing subsequence up to the next point. This characteristic suggests that dynamic programming could be a useful approach.

  2. Overlapping Subproblems: Another characteristic of problems that can be solved with dynamic programming is overlapping subproblems. This means solving the problem involves solving the same subproblems multiple times. In the case of LIS, to find the longest increasing subsequence ending at each position, you have to solve the subproblem of finding the longest increasing subsequence ending at each previous position. This overlap in subproblems is another hint that dynamic programming could be an effective approach.

  3. Finding the Length of an Optimal Solution: The problem is asking for the length of the longest increasing subsequence, not the subsequence itself. This is another characteristic of many dynamic programming problems, where the goal is to find the “length” or “number” of optimal solutions, rather than the actual solutions.

Remember, recognizing that a problem can be solved using dynamic programming often comes with practice and experience. As you solve more problems and gain familiarity with common problem patterns, it becomes easier to identify when dynamic programming is a good fit for a problem.

Simple Explanation of the Proof

The algorithm for solving the Longest Increasing Subsequence problem works on the principle of dynamic programming, which involves breaking down a complex problem into simpler subproblems and solving each of them only once, storing their solutions in case we need them again.

The main idea behind the proof of correctness for this algorithm is the principle of optimality, which states that an optimal strategy has the property that whatever the initial state and initial decision are, the remaining decisions must constitute an optimal strategy with regard to the state resulting from the first decision. In other words, every subsequence of an optimal sequence is optimal.

Here’s a simplified explanation:

  1. Initialization: We start with an array dp of the same length as the input array nums, and initialize all elements as 1. This is because, at the very least, each number forms an increasing subsequence of length 1 with itself.

  2. Filling up the dp array: We then iteratively fill up the dp array by following these steps for every i from 0 to n-1:

    • For every j less than i, we check if nums[i] > nums[j]. If this condition is met, it means we have found a potential subsequence where the number at i can be appended to the subsequence ending at j.

    • Now, we want to maximize the length of the subsequence ending at i, so we choose the maximum length from all such subsequences. Hence, we update dp[i] = max(dp[i], dp[j] + 1). The +1 corresponds to including nums[i] in the subsequence.

  3. Final answer: Once we’ve filled up our dp array, the answer to our problem is the maximum value in this array, which represents the longest increasing subsequence.

In essence, we’re building up solutions to larger and larger subsequences using the solutions we’ve calculated for smaller subsequences. We can trust these solutions because of the principle of optimality – if we’ve correctly calculated the longest subsequence up to j, then we can use that solution to correctly calculate the longest subsequence up to i.

Stepwise Refinement

  1. Stepwise Refinement of the Approach:
  • Step 1: Initialize an auxiliary array dp of the same size as the input array nums with all elements set to 1. Each element of dp will represent the length of the longest increasing subsequence ending at that index in the nums array.

  • Step 2: For every index i from 1 to n-1 (where n is the length of the nums array), iterate over all the previous indices j from 0 to i-1.

  • Step 3: For every j < i, check if nums[i] > nums[j]. If this is true, it means that the number at index i can be appended to the increasing subsequence ending at index j. In this case, update dp[i] to be the maximum of dp[i] and dp[j] + 1.

  • Step 4: Once the dp array is completely filled, the maximum value in the dp array represents the length of the longest increasing subsequence in the nums array. Traverse the dp array to find this maximum value.

  2. Granular, Actionable Steps:
  • Initialize dp as a list of 1s of size n.
  • For i in range(1, n), do:
    • For j in range(i), do:
      • If nums[i] > nums[j], set dp[i] = max(dp[i], dp[j] + 1).
  • Return max(dp).
  3. Independent Subproblems:
  • Each dp[i] (the length of the longest increasing subsequence ending at index i) depends only on earlier entries dp[j] with j < i, so the subproblems can be solved left to right, each computed exactly once.
  4. Repeatable Patterns:
  • The iterative process of checking each pair of indices (i, j) where j < i and updating the dp[i] value if nums[i] > nums[j] is a pattern that repeats for each index in the nums array.

  • The operation to calculate the maximum length of the increasing subsequence by scanning the dp array is also a repetitive process.

Solution Approach and Analysis

Let’s break this problem down into an easy-to-understand, step-by-step approach.

Suppose you are a teacher and you have a stack of students’ papers, each with different scores written on them. You want to create the longest possible list of scores where each score is strictly higher than the previous one, but the order of the papers cannot be changed. This is essentially the problem at hand.

Here’s the step-by-step solution:

  1. Step 1 (Preparation): Initialize an array, say dp, of the same size as the input array nums and fill it with 1. This is like creating a stack for each student, and each stack initially has one paper.

  2. Step 2 (Finding the potential stacks): We will iterate through each paper (or number in our nums array) from left to right (or from the first to the last element), and for each paper, we will check every previous paper to see if we can put the current paper on top of the stack of the previous paper. Specifically, we can put the current paper on top of the stack of a previous paper if its score is strictly higher (in terms of our problem, if nums[i] > nums[j]).

  3. Step 3 (Updating the stacks): If we can put the current paper on top of the stack of a previous paper, we do so if it gives us a higher stack than our current stack. This means we update dp[i] to be the maximum of dp[i] and dp[j] + 1.

  4. Step 4 (Finding the tallest stack): Once we have iterated over all papers and updated all possible stacks, we simply look for the tallest stack. The height of the tallest stack is the solution to our problem. So, we just need to return the maximum value in the dp array.

Now, let’s see how our solution behaves with different inputs:

  • Example 1: nums = [10,9,2,5,3,7,101,18]

    Here, the longest increasing subsequence is [2,3,7,101], which has a length of 4.

  • Example 2: nums = [0,1,0,3,2,3]

    Here, the longest increasing subsequence is [0,1,2,3], which has a length of 4.

  • Example 3: nums = [7,7,7,7,7,7,7]

    Here, all the elements are the same, so the longest increasing subsequence is just any one element, [7], which has a length of 1.
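
The examples above quote the subsequences themselves. The problem only asks for the length, but as an intuition aid, here is a sketch (with an illustrative helper, reconstruct_lis) that also recovers one longest subsequence by remembering, for each index, which previous index it extended:

def reconstruct_lis(nums):
    n = len(nums)
    dp = [1] * n        # dp[i] = length of the LIS ending at index i
    prev = [-1] * n     # prev[i] = index that i extends, or -1 if it starts a subsequence
    for i in range(n):
        for j in range(i):
            if nums[j] < nums[i] and dp[j] + 1 > dp[i]:
                dp[i] = dp[j] + 1
                prev[i] = j
    # walk back from the index where the longest subsequence ends
    i = max(range(n), key=lambda k: dp[k])
    lis = []
    while i != -1:
        lis.append(nums[i])
        i = prev[i]
    return lis[::-1]

print(reconstruct_lis([10, 9, 2, 5, 3, 7, 101, 18]))  # [2, 5, 7, 101]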

Identify Invariant

In the context of the Longest Increasing Subsequence problem, an important invariant is the property of the dynamic programming (DP) array dp we use to solve the problem.

At any given index i in the dp array, the value dp[i] represents the length of the longest increasing subsequence (LIS) that ends at index i in the input array nums. This property remains invariant throughout the execution of the algorithm.

This invariant is crucial to the dynamic programming approach. The reason is that it allows us to build up the solution iteratively by leveraging previously computed results. Specifically, for every index i, we compute dp[i] by looking at all previous indices j (where j < i), and checking if we can extend the longest increasing subsequence ending at j by including the element at index i (if nums[i] > nums[j]). If we can, we update dp[i] accordingly (dp[i] = max(dp[i], dp[j] + 1)).

Hence, the invariant here is that dp[i] always represents the length of the longest increasing subsequence ending at i in the input array, irrespective of the values at other indices. This holds throughout the execution of the algorithm and allows us to build the DP array iteratively and find the overall longest increasing subsequence.

Identify Loop Invariant

A loop invariant for this problem would be the following condition that holds true before and after each iteration of our main loop:

“For any index i within our current sub-array of nums (the portion of the nums array that we’ve processed so far), dp[i] contains the length of the longest increasing subsequence (LIS) that ends with nums[i].”

This loop invariant is a critical property that allows our dynamic programming solution to function correctly. It ensures that at each step of our loop, we have the correct lengths of the longest increasing subsequences ending at each index within the sub-array we’ve processed so far. These values are what we use to determine the dp value for the next index.

We can say this condition is maintained because in each loop iteration, we are essentially saying:

“Let’s consider the next number in our array (nums[i]). For every index j that comes before i (i.e., j < i), if nums[j] < nums[i], it means we can extend the increasing subsequence that ends at j to include nums[i]. So we compare the length of this new subsequence (dp[j]+1) to the longest increasing subsequence that we’ve found so far that ends at nums[i] (dp[i]), and update dp[i] if the new subsequence is longer.”

So before the start of the next iteration (or after the current iteration), our loop invariant still holds: for any index i within our current sub-array of nums, dp[i] contains the length of the longest increasing subsequence that ends with nums[i]. This continues until we’ve processed the entire array.

Thought Process

This problem is a classic example of Dynamic Programming (DP), a problem-solving paradigm that solves complex problems by breaking them down into simpler subproblems and storing the results of these subproblems to avoid redundant computation.

Here’s the step-by-step process to approach this problem:

  1. Understand the problem: You’re given an array of integers and you need to find the length of the longest increasing subsequence in this array. A subsequence is a sequence that can be derived from an array by deleting some or no elements without changing the order of the remaining elements.

  2. Identify the problem type: This is a type of optimization problem (finding the longest, maximum, etc.) which suggests that it could potentially be solved using dynamic programming. In particular, the problem involves finding an optimal substructure within the given array (i.e., the longest increasing subsequence), which is a key hallmark of problems that can be solved with dynamic programming.

  3. Formulate a DP approach: Dynamic Programming approach works by solving the problem in a bottom-up manner. We’ll create a DP table (usually an array) where the i-th entry represents the solution to the subproblem consisting of the first i elements of the array. In this case, the i-th entry in the DP table will represent the length of the longest increasing subsequence ending at the i-th element.

  4. Initialize the DP table: We start by initializing our DP table with all 1s. This is because a single element itself forms a valid (though trivial) increasing subsequence.

  5. Fill the DP table: We then iterate over our DP table, and for each i, we consider all elements before i. If any element before i is less than the i-th element, that means we can extend the increasing subsequence ending at that element to include the i-th element. We keep track of the longest such subsequence.

  6. Extract the solution: The length of the longest increasing subsequence is then the maximum value in our DP table.

Here’s how you might implement this approach in Python:

def lengthOfLIS(nums):
    if not nums:
        return 0

    dp = [1]*len(nums)  # initialize DP table with 1s
    for i in range(len(nums)):
        for j in range(i):
            if nums[i] > nums[j]:  # if we can extend the increasing subsequence
                dp[i] = max(dp[i], dp[j]+1)  # update DP table with the longest such subsequence

    return max(dp)  # return the maximum value in the DP table

In this problem statement, the key cues are the requirements to find a “longest increasing subsequence,” which suggests a need for an optimization strategy (finding longest), and a sequence-based strategy (since we’re dealing with subsequences). Together, these suggest a dynamic programming approach.

A significant insight here is to consider not just any subsequence, but specifically to look at increasing subsequences that end at each position in the array. This provides the subproblem structure that we need for our dynamic programming solution.

Establishing Preconditions and Postconditions

Let’s define these elements for the Python function lengthOfLIS(nums) from the example provided:

  1. Parameters:

    • Input to the method: nums
    • Type: List of integers
    • The nums parameter represents the given array of integers for which we want to find the length of the longest increasing subsequence.
  2. Preconditions:

    • Before this method is called, we assume that nums is a valid list of integers.
    • Constraints on the input parameters are that the length of nums is between 1 and 2500, and each element is an integer in the range [-10^4, 10^4].
    • No specific state that the program or some part of it must be in.
  3. Method Functionality:

    • The lengthOfLIS method is expected to return the length of the longest strictly increasing subsequence in nums.
    • It does not modify the input list nums. Instead, it creates a dynamic programming (DP) table and uses it to compute the longest increasing subsequence.
  4. Postconditions:

    • After the method has returned, the state of the program or the values of the parameters do not change because the function does not have any side effects.
    • The return value represents the length of the longest increasing subsequence in nums.
    • The method does not have any side effects. It does not modify the input list nums or any other state of the program.
  5. Error Handling:

    • If the precondition is not met, that is, if nums is not a valid list of integers, the Python interpreter will likely raise an exception. For example, if nums is not a sized sequence, the call len(nums) will raise a TypeError. An empty list is handled by the `if not nums` guard, which returns 0; without that guard, max(dp) would raise a ValueError because max() would receive an empty sequence.
    • The method does not include explicit error handling and does not return a special value in case of an error. In production code, you might want to include error handling to check that the input is a non-empty list of integers.
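
A minimal sketch of what such a defensive wrapper might look like, assuming the lengthOfLIS function from the Thought Process section is in scope; the wrapper name and error messages are illustrative only:

def length_of_lis_checked(nums):
    # validate before delegating to lengthOfLIS defined earlier in this write-up
    if not isinstance(nums, list) or not all(isinstance(x, int) for x in nums):
        raise TypeError("nums must be a list of integers")
    if not (1 <= len(nums) <= 2500):
        raise ValueError("nums must contain between 1 and 2500 elements")
    return lengthOfLIS(nums)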

Problem Decomposition

  1. Problem Understanding:

    • The problem is about finding the longest subsequence in a given list of numbers where each number is larger than the previous one. The key requirements are that the numbers in the subsequence must be in the same order as in the original list, and the subsequence should be as long as possible.
  2. Initial Breakdown:

    • The major parts of the problem are: finding all subsequences of the list, checking whether each subsequence is strictly increasing, and keeping track of the longest increasing subsequence encountered.
  3. Subproblem Refinement:

    • Finding all subsequences can be broken down into going through each number in the list and deciding whether to include it or not.
    • Checking whether a subsequence is strictly increasing can be done by comparing each number with the one before it.
    • Keeping track of the longest increasing subsequence involves maintaining a variable that gets updated whenever a longer increasing subsequence is found.
  4. Task Identification:

    • The tasks that are repeated are going through the numbers in the list, and comparing each number with the one before it. These can be generalized into a single task of iterating over the list and processing each number.
  5. Task Abstraction:

    • The task of iterating over the list and processing each number is abstract enough to be clear and reusable. It involves going through each number, deciding whether to include it in the current subsequence, and updating the longest increasing subsequence if necessary.
  6. Method Naming:

    • We can call the task of going through each number and deciding whether to include it process_number. The task of updating the longest increasing subsequence can be called update_longest_increasing_subsequence.
  7. Subproblem Interactions:

    • The tasks need to be performed in the order described: first process each number in the list, then update the longest increasing subsequence. The latter task depends on the former: you can’t update the longest increasing subsequence without first processing the numbers in the list.

Please note that the above breakdown is a naive approach, and the problem can be solved more efficiently using dynamic programming. In a dynamic programming approach, the major parts of the problem would be different: initializing a list to keep track of the longest increasing subsequence ending at each position, and updating this list as you go through the numbers in the original list.

From Brute Force to Optimal Solution

Brute Force Approach:

  1. Generate all possible subsequences of the given list of numbers. This can be done by going through each number and deciding whether to include it in the current subsequence or not. This would give us 2^n subsequences (n is the length of the list), as for each element, we have 2 choices - either to include it or not.

  2. For each subsequence, check whether it is strictly increasing. This can be done by comparing each number with the one before it.

  3. Keep track of the longest increasing subsequence encountered.

  4. Return the length of the longest increasing subsequence.

This brute force solution has a time complexity of O(2^n * n) because there are 2^n subsequences, and for each subsequence, we are checking in O(n) time whether it is increasing.

The space complexity is O(n), which is the space needed to store the current subsequence.
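
To make the brute force concrete, here is a minimal Python sketch (illustrative only; itertools.combinations yields elements in their original order, so each combination is a genuine subsequence):

from itertools import combinations

def lis_brute_force(nums):
    # Try every subsequence and remember the longest strictly increasing one.
    # Exponential time; only usable for tiny inputs.
    best = 0
    for length in range(1, len(nums) + 1):
        for subseq in combinations(nums, length):
            if all(a < b for a, b in zip(subseq, subseq[1:])):
                best = max(best, length)
    return best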

Optimizing the Solution:

The above approach is highly inefficient due to repeated computations. We are solving the same subproblems again and again (finding the longest increasing subsequence ending at each position). This property of the problem suggests that it can be solved using dynamic programming.

Here’s the dynamic programming solution:

  1. Initialize a list (dp) of the same length as the given list of numbers, where dp[i] is the length of the longest increasing subsequence ending at position i. Initially, set all values in dp to 1, because a single number is itself an increasing subsequence of length 1.

  2. Iterate over the list of numbers. For each number at position i, compare it with every earlier number at position j. If nums[i] > nums[j] and dp[i] < dp[j] + 1, update dp[i] to dp[j] + 1.

  3. Return the maximum value in dp, which represents the length of the longest increasing subsequence in the list.

The time complexity of this solution is O(n^2), as we are using two loops: one to go through each number and another to compare each number with all previous numbers. This is a significant improvement over the brute force solution, especially for large lists.

The space complexity is also O(n), which is the space needed to store the dp list. This is the same as the brute force solution, but the optimized solution is much faster.

This problem is a classic example of dynamic programming, where we break down a complex problem into simpler subproblems, solve each subproblem only once, and use the solutions to the subproblems to build up the solution to the original problem.

Code Explanation and Design Decisions

  1. Initial Parameters: The function receives an array of numbers nums. Each number in this array is a potential member of an increasing subsequence. Our goal is to find the length of the longest such subsequence.

  2. Primary Loop/Iteration: We iterate through each element in nums using two nested loops. The outer loop fixes the element that ends the subsequence we are currently considering, and the inner loop scans the earlier elements that could immediately precede it. The inner loop iterates from the beginning of the array up to the current end point.

  3. Conditions/Branches: Inside the inner loop, we have an if condition checking if nums[j] < nums[i] and if dp[i] < dp[j] + 1. This condition is checking if we can extend the subsequence ending at j by including nums[i] to form a longer increasing subsequence. If both conditions are met, we know that nums[i] is greater than nums[j] and that the length of the increasing subsequence ending at i is less than the length of the increasing subsequence ending at j plus 1. So, it’s beneficial to include nums[i] in the subsequence.

  4. Updates/Modifications: When the above condition is true, we update dp[i] to be dp[j] + 1. This is because we have found a longer increasing subsequence ending at i. This reflects our objective of finding the longest increasing subsequence possible.

  5. Invariant: The invariant here is the dp list, which maintains for each index i the length of the longest increasing subsequence ending at i. At the start, all elements are initialized to 1, because a single number is itself a subsequence of length 1. At each step of the iteration, dp[i] only gets updated to a higher value, ensuring that it always represents the length of the longest increasing subsequence ending at i.

  6. Final Output: The function returns the maximum value in dp, which represents the length of the longest increasing subsequence of the input array nums. This satisfies our problem’s requirement because we were tasked to find the length of the longest increasing subsequence in the given array.

Coding Constructs

  1. High-Level Problem-Solving Strategies: This code uses Dynamic Programming (DP), a technique for solving complex problems by breaking them down into simpler subproblems and solving each subproblem only once, storing their results to avoid redundant computation. In this case, DP is used to maintain a record of the longest increasing subsequence up to each point in the array.

  2. Purpose of this Code to a Non-Programmer: Think of this code as trying to find the longest sequence of increasing numbers within a list of numbers. It’s like trying to find the longest sequence of increasing ages in a list of ages of a group of people.

  3. Logical Elements or Constructs: The primary constructs used here are iteration (loops) and conditionals (if-else). There is a nested iteration over the array and a conditional check within the inner iteration to determine if extending the current longest increasing subsequence is possible.

  4. Algorithmic Approach in Plain English: Start from the first number and for each number, check all the previous numbers. If the current number is bigger than a previous number and the sequence ending at the previous number is longer, then extend that sequence by the current number. Keep doing this for all numbers in the list. The longest sequence found during this process is the longest increasing subsequence.

  5. Key Steps or Operations on the Input Data: The key operations are the two nested loops iterating over the array and the update operation within the inner loop. The loops are used to compare each number with all its previous numbers, while the update operation modifies the longest subsequence found so far.

  6. Algorithmic Patterns or Strategies: The primary pattern is Dynamic Programming, specifically the Bottom-Up approach where we build up the solution from smaller subproblems to larger ones. The technique of maintaining a DP table (or list in this case) to store intermediate results for reuse is a hallmark of the DP approach. This allows us to avoid recalculating solutions to subproblems we have already solved, leading to significant time savings.

Language Agnostic Coding Drills

  1. Identification of Distinct Concepts

    • Iteration (Looping): It is a concept of doing something repeatedly until a certain condition is met. It is a fundamental concept in programming that is language-agnostic.

    • Conditional Checks (If-Else): It is the concept of making decisions in the code based on certain conditions.

    • Array/List Manipulation: Handling and manipulating lists or arrays is another crucial coding concept. This includes creating a list/array, accessing elements, modifying elements, etc.

    • Variable Initialization and Updating: The practice of setting up variables with initial values and then updating those values as necessary.

    • Dynamic Programming (DP): An advanced problem-solving technique where we solve complex problems by breaking them down into simpler subproblems, and store the results of these subproblems to avoid solving them again.

  2. Coding Concepts in Order of Increasing Difficulty with Descriptions

    • Iteration (Looping): [Difficulty: Easy] It’s the simplest and most straightforward concept among these. We often start learning programming with loops as they are fundamental to controlling the flow of the program.

    • Conditional Checks (If-Else): [Difficulty: Easy] Conditional checks are also fundamental to programming and not very difficult to understand or implement. They allow us to control the program flow based on certain conditions.

    • Variable Initialization and Updating: [Difficulty: Easy] This is a fundamental concept that involves setting up variables to store and manipulate data in a program. It’s not complex, but it’s crucial to managing state and change in a program.

    • Array/List Manipulation: [Difficulty: Intermediate] Handling arrays/lists involves a bit more complexity. It includes understanding how data is stored and accessed, and can involve various operations like accessing specific elements, updating elements, and handling array/list lengths.

    • Dynamic Programming (DP): [Difficulty: Hard] DP is considered an advanced topic in computer science. It requires understanding how to break down problems into subproblems, how to store and reuse solutions to subproblems, and how to build up a solution to a large problem from its subproblems.

  3. Problem-solving Approach Leading to the Final Solution

    • We start with the simplest concept, which is iteration. We have to iterate over the given array to analyze each number, so we apply a loop for that purpose.

    • Within this iteration, we need another iteration to check each number against all its predecessors. This results in a nested loop.

    • Then comes the conditional check. For each pair of numbers (the current number and one of its predecessors), we need to decide whether we can extend the subsequence ending at the predecessor number with the current number. This decision is based on whether the current number is greater than the predecessor and whether the subsequence at the predecessor is longer than any we’ve seen before.

    • This conditional check leads us to the concept of variable initialization and updating. We initialize a list to hold the length of the longest increasing subsequence ending at each position. This list is updated whenever we find a longer subsequence.

    • The final and most complex concept is dynamic programming (DP). We build up our solution by breaking the problem down into subproblems (finding the longest increasing subsequence ending at each position), solve each subproblem only once, and store their results for reuse. This DP approach leads us to the final solution.

Remember, understanding each of these concepts and how they fit together is crucial to comprehending and solving the problem. Each drill serves as a building block to the overall solution.

Targeted Drills in Python

Let’s create Python-based coding drills for each of the identified concepts. For the purposes of this task, we’re assuming basic Python knowledge, such as syntax and basic operations.

  1. Iteration (Looping)

    for i in range(10):
        print(i)
    

    This basic loop will print the numbers from 0 to 9.

  2. Conditional Checks (If-Else)

    for i in range(10):
        if i % 2 == 0:
            print(f"{i} is even")
        else:
            print(f"{i} is odd")
    

    This code prints whether each number from 0 to 9 is even or odd.

  3. Variable Initialization and Updating

    sum = 0
    for i in range(10):
        sum += i
    print(sum)
    

    This code calculates the sum of the numbers from 0 to 9.

  4. Array/List Manipulation

    values = [i for i in range(10)]
    for i in range(len(values)):
        values[i] *= 2
    print(values)
    

    This code doubles each number in the list.

  5. Dynamic Programming (DP) (Problem-Specific)

    nums = [10, 22, 9, 33, 21, 50, 41, 60, 80]
    dp = [1] * len(nums)
    for i in range(len(nums)):
        for j in range(i):
            if nums[i] > nums[j] and dp[i] < dp[j] + 1:
                dp[i] = dp[j] + 1
    print(max(dp))
    

    This code finds the length of the longest increasing subsequence in a list.

Now, let’s discuss how these drills can be combined to solve our initial problem:

  1. We start by initializing our DP list (Drill 3 and Drill 4).
  2. Then we create our main loop to iterate over the given list (Drill 1).
  3. Within this loop, we create another loop to check each number against its predecessors (Drill 1 again).
  4. Then we make a conditional check to decide whether we can extend the subsequence ending at the predecessor number with the current number (Drill 2).
  5. If the condition is true, we update our DP list (Drill 3).
  6. Finally, after our loops finish executing, we find the maximum value in our DP list, which represents the length of the longest increasing subsequence (Drill 4). This is the answer to our problem.

By breaking down the solution into these drills, we’ve made the problem more manageable and easier to understand. Each drill represents a piece of the overall solution and fits together like a puzzle to solve the problem. This way, even complex problems can become approachable with a little patience and a systematic breakdown.

Q&A

Test case [4,10,4,3,8,9]: my output is 3 (4 8 9 / 3 8 9), but it’s failing because the expected result is 4. How?

I apologize for misunderstanding the requirement. If the sequence must be strictly increasing, then equal elements cannot be in the same subsequence.

In that case, let’s reevaluate the array [4,10,4,3,8,9]:

  1. [4] is our starting point.
  2. 10 is greater than 4, so append it: [4, 10].
  3. 4 is not greater than 10, so we start a new subsequence: [4].
  4. 3 is not greater than 4, so we start a new subsequence: [3].
  5. 8 is greater than 3, so we append it: [3, 8].
  6. 9 is greater than 8, so we append it: [3, 8, 9].

Here, the longest strictly increasing subsequence is [3, 8, 9], which has a length of 3.

The expected result should indeed be 3, not 4. I apologize for the confusion in the previous explanation; a strictly increasing subsequence doesn’t allow equal values. If a judge expects 4 for this input, it is most likely treating the problem as the longest non-decreasing subsequence, in which [4, 4, 8, 9] (length 4) would be allowed. Thank you for your patience.

I think the hardest part of dynamic programming here is the initialization of dp array. I just set dp[0] to 1.

The initialization of the DP array can indeed be tricky in dynamic programming problems because it forms the base of your problem solution.

In this problem, the DP array dp is used to keep track of the length of the longest increasing subsequence ending at each index i.

You’ve correctly initialized dp[0] to 1, because the length of the longest increasing subsequence ending at the first element is always 1 (the element itself).

The rest of the array should also be initialized to 1 because at minimum, each element itself can be a subsequence of length 1. So initializing the DP array as:

dp = [1]*len(nums)

makes sense, because it gives a starting point for the DP solution. It states that before checking any other elements, each element itself forms a strictly increasing subsequence of length 1.

After initialization, the DP array gets updated while iterating over nums. For each element at index i, you compare it with all elements at previous indices j. If nums[i] is greater than nums[j], that means you can extend the subsequence ending at j to i. If this new subsequence is longer than any previously found subsequences ending at i, you update dp[i].

Hence, the initialization of the DP array is an essential step in solving this problem using dynamic programming.

The purpose of my post is to improve my ability to express the key idea of the problem in English, which can be helpful for my interviews. Please let me know if there is anything wrong.

The LIS problem can be solved by dynamic programming. The transition equation is dp[i] = max(dp[j]) + 1, in which nums[j] < nums[i]. For each number, we need to iterate through the previous numbers to find the maximum dp, so overall time complexity is O(n^2).

We can use binary search to get an O(nlog(n)) algorithm. We use an array to store the minimum tail element of each LIS length. For example, tail[i] represents the minimum tail element of LIS whose length is i + 1. To maintain the attribute of the tail array, when we iterate through the nums array, let’s say current number is cur. If tail[i - 1] < cur < tail[i], we can replace the tail[i] with cur because it’s less than tail[i] and it can also form an LIS whose length is i + 1, so we can make sure the numbers stored in tail array are minimum.

We can prove the tail array is increasing. If tail[i - 1] > tail[i], we can replace tail[i - 1] with the second element from tail of LIS whose length is i + 1 to make tail[i - 1] smaller. So we can use binary search to find the appropriate position of current number in tail array.

Your explanation for both the dynamic programming and binary search solutions is correct and well-structured. Good job!

Here are a few minor suggestions to improve your description:

  1. Dynamic Programming: Instead of just saying “The transition equation is dp[i] = max(dp[j]) + 1”, consider adding a bit more context. For instance, you could say, “For every number at index ‘i’, we iterate through all the previous numbers. If the current number is greater than a previous number (let’s call it ‘j’), it means that we can extend the increasing subsequence that ends with the number at ‘j’. Therefore, the transition equation is dp[i] = max(dp[j]) + 1, for all j such that nums[j] < nums[i].”

  2. Binary Search: You’ve explained the tail array well, but you could also explain how the length of this tail array at the end of the process gives the length of the LIS. For example, you could say, “By maintaining the tail array in this way, the final length of the tail array gives us the length of the longest increasing subsequence. This is because each index ‘i’ of the tail array represents an increasing subsequence of length ‘i+1’, and the value at tail[i] is the smallest tail element we could find for such a subsequence.”

These small tweaks can add more clarity and help a listener or reader understand your explanation more quickly and easily. Overall, you’ve done a great job of summarizing these complex algorithms!

Here is a confusion of mine about what dp[i] really is in this method; I hope it helps people who face the same problem.

When you apply the DP/max-traverse method, your steps are like this: first, give a starting point, here ‘[1] * len(nums)’, because each point in the array is, by itself, an increasing subsequence of length 1; then use the DP/max-traverse method to fill in dp[] step by step. Here comes what confused me.

What does dp[i] really represent? You might say, ‘‘dp[i] represents the LIS of array[start..i]’’, but that is only partly right, which is the key point of confusion. It needs an added constraint, so together it should read: ‘‘dp[i] represents the ‘fake LIS’ of array[start..i] that ends with array[i]’’.

Now let me explain with an example array, [2,5,4,3,1], and its related DP array dp[]. If you read it as ‘‘dp[i] represents the LIS of array[start..i]’’, the related dp[] should be [1,2,2,2,2]; if you read it as ‘‘dp[i] represents the ‘fake LIS’ of array[start..i] that ends with array[i]’’, dp[] should be [1,2,2,2,1], and only the latter is really what the code computes.

This ‘fake LIS’ is really important, because storing the length of the increasing subsequence that ends at array[i] lets you use it later to check whether you can build a longer one. The ‘real LIS’ of the prefix cannot be used to store the result needed.

Your understanding is correct, and the distinction you made is key to understanding how this dynamic programming solution works.

As you pointed out, dp[i] does not represent the longest increasing subsequence (LIS) of the array from the start to i. Instead, it represents the length of the longest increasing subsequence that ends at nums[i]. This is an important distinction because it allows us to build upon previously computed values to find the LIS for larger subsequences.

This “building up” is a hallmark of dynamic programming, where we break a problem down into smaller subproblems, solve those subproblems, and then use those solutions to solve the larger, original problem. In this case, by finding the longest increasing subsequences that end at each index i, we can efficiently find the longest increasing subsequence for the entire array.

Let’s look at an example: Given the array nums = [2, 5, 3, 7, 101, 18], the dp array would be [1, 2, 2, 3, 4, 4]. When we’re at index 2 (nums[2] = 3), the longest increasing subsequence ending at index 2 is [2, 3] with length 2 (dp[2] = 2). Then when we move on to index 3 (nums[3] = 7), we can build upon the previous LIS ending at index 2, forming a new LIS: [2, 3, 7] with length 3 (dp[3] = 3). And so on…

The idea is not to keep track of the entire subsequence but rather its length, so we can later find the maximum length among all such subsequences.

So, your confusion is justified, and you’ve reached the right conclusion! Your explanation will likely be helpful for others who are also learning dynamic programming and trying to understand how it’s applied in this problem. Great job!

The first time I saw the O(n log n) solution I really wondered where the design of the auxiliary array comes from and how one may even hypothesize the array to be monotonically increasing. But later I figured out there’s a path to evolve from the O(n^2) solution to the O(n log n) solution.

Step 1: the “naive” DP. The O(n^2) DP maintains an array L, with L[k] storing the length of the longest increasing subsequence (abbreviated as LIS below) ending with the k-th element of the input. So the optimal substructure is L[k] = max {L[j] + 1 | j < k and nums[j] < nums[k]}.

Step 2: DP with a shorter auxiliary array

The key of this step is to examine what may update L[k], and realize: there could be multiple indices j with the same L[j], and the LIS with the smallest tail suffices to produce a candidate LIS of length L[j] + 1.

This implies that the O(n^2) DP is wasting space, and hints that we should only cache the entries that can cause an L[k] to update – they happen to be the LIS with the smallest tail for each length. This gives us a new formulation of the optimal substructure: the minimum tail of an increasing subsequence of length l within nums[1..k], which we denote T[k, l]. It is straightforward to keep the update 1-D by maintaining only the last two rows of T.

Now we get a DP of complexity O(n * maxLen), where maxLen is the length of the LIS. It is already better than the naive DP when the input is long but all its increasing subsequences are short. But there is more room to improve.

Step 3: the O(n log n) solution. A natural question to ask, if we want to optimize the above DP further, is when T[k, l] gets updated to nums[k]. It is easy to see that this happens when:

T[k-1, l-1] < nums[k] < T[k-1, l]

The left “<” ensures nums[k] can be appended after T[k-1, l-1], while the right “<” makes it an actual update.

This leads us to wonder when the updated positions are sparse and easy to search – when the row T[k-1, ·] is monotonic we can do binary search! At this point it is all about conjecturing first and then proving that this is indeed true.

Your thinking process and your evolution from the O(n^2) solution to the O(n log n) solution is excellent! It beautifully illustrates how an efficient solution can be discovered step by step by continuously questioning and refining our understanding of the problem.

The step from “DP with a shorter auxiliary array” to “The O(n log n) solution” is a great insight. Noticing that T[k, l] only updates when T[k-1, l-1] < nums[k] < T[k-1, l], and realizing that if the row T[k-1, ·] is monotonic we can use binary search, is a powerful observation.

The process of refinement you’ve demonstrated here is an essential skill in algorithm design and problem solving. The ability to think about a problem abstractly, come up with a hypothesis, and then test and refine that hypothesis, is a valuable ability in algorithmic problem solving and software development in general.

Your analysis will surely help others who are working through similar problems and trying to understand how to approach dynamic programming problems and improve their solution’s efficiency. Great job!

Let us break it down further:

Step 1 - Naive Dynamic Programming (DP) Approach: In the first step, we start with a very straightforward method. We use dynamic programming to keep track of the longest increasing sequence that ends with each number in our array (nums). We do this by maintaining an array (L) where L[k] is the maximum length of the increasing sequence ending with the k-th number in nums. We can find the value for L[k] by looking at all the previous entries in L where the corresponding number in nums is smaller than the k-th number and choosing the maximum among them.

Step 2 - A More Refined Dynamic Programming Approach: In this step, we make an observation that helps us to improve our algorithm. We notice that when updating L[k], the only previous L[j]s that matter are those that belong to the longest increasing sequences with the smallest end number. Thus, we can save space by only keeping track of these sequences. This leads us to create a new array, T, where T[k, l] is the smallest tail number of an increasing sequence of length l in the first k numbers of nums. Note that by only keeping the last two rows of T, we can reduce the space complexity to linear.

Step 3 - Optimized Approach with Binary Search: In the third and final step, we notice that we only update T[k, l] to nums[k] if T[k-1, l-1] is smaller than nums[k], which is smaller than T[k-1, l]. This is because nums[k] can follow a sequence ending with T[k-1, l-1] and, being smaller than T[k-1, l], it could potentially start a longer increasing sequence. Importantly, we see that each row in T is sorted, or ‘monotonic’. This means we can use binary search, a very fast search algorithm, to find where to place each number. This brings the time complexity down to O(n log n), a significant improvement over the naive approach!

Please note that this is a relatively high-level explanation. Each step involves complex thoughts and careful design. It is quite normal to find this challenging at first, but with practice and study, these concepts will become clearer.

The problem descriptions says:

Given [10, 9, 2, 5, 3, 7, 101, 18]. The longest increasing subsequence is [2, 3, 7, 101], therefore the length is 4.

However, this is incorrect. The provided subsequence is missing the 5 that has mysteriously disappeared. In fact, the longest increasing subsequence here is [3, 7, 101], of length 3.

Am I missing something or is the question phrased inappropriately?

Your confusion might be related to how “increasing subsequence” is defined in the problem.

The longest increasing subsequence (LIS) in this problem doesn’t have to be a consecutive subarray from the original array, but a sequence of numbers where each number is strictly larger than the previous one, and these numbers occur in the same order as in the original array.

So for the given array [10, 9, 2, 5, 3, 7, 101, 18], the longest increasing subsequence is indeed [2, 3, 7, 101] of length 4.

Here’s why:

  • The number 2 is the first element in this subsequence.
  • The number 3 comes after 2 in the original array and is larger than 2.
  • The number 7 comes after 3 in the original array and is larger than 3.
  • The number 101 comes after 7 in the original array and is larger than 7.

So this forms an increasing subsequence. And there is no other subsequence in this array that has more than 4 elements and each subsequent element is larger than the previous one. Therefore, [2, 3, 7, 101] is the longest increasing subsequence.

The sequence [3, 7, 101] that you suggested is also an increasing subsequence, but it is not the longest. This sequence only has 3 elements while the sequence [2, 3, 7, 101] has 4 elements.

what if I ask: Longest Non-decreasing Subsequence?

The problem of finding the Longest Non-decreasing Subsequence (LNDS) is quite similar to the Longest Increasing Subsequence (LIS) problem, but with a slight difference: in LNDS, the next element in the sequence can be equal to or greater than the previous element, whereas in LIS, the next element must be strictly greater than the previous one.

The strategy to find the LNDS remains the same as for the LIS, with the difference being the comparison. In the case of LIS, you update your dynamic programming (DP) table when the current element is strictly greater than an earlier one, whereas for LNDS you update the table when the current element is greater than or equal to an earlier one.

The algorithm would initialize a DP array of the same length as the input array, fill it with 1 (because a single element is a non-decreasing sequence), and then for each pair of indices (i, j) with i < j, if array[j] >= array[i], it would set DP[j] to be max(DP[j], DP[i]+1), trying to extend the longest non-decreasing subsequence ending at i by the element at j.

After you traverse the entire array, the maximum value in the DP table would be the length of the Longest Non-decreasing Subsequence. Note that for this to work, the entire DP array needs to be evaluated, as the LNDS may end at any position in the input array.

The time complexity for this approach would still be O(n^2), where n is the length of the input array, because in the worst case, you are comparing each element with every other element in the array. The space complexity would be O(n), which is for the DP array.

Again, it’s crucial to note that the LNDS doesn’t have to be a contiguous subarray, and it is a subsequence where each element is either equal to or greater than the previous one.
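
Here is a minimal sketch of that change; the only difference from the LIS code is the >= comparison (the function name length_of_lnds is just illustrative):

def length_of_lnds(nums):
    # Longest Non-decreasing Subsequence: same DP, but allow equal neighbors.
    if not nums:
        return 0
    dp = [1] * len(nums)
    for j in range(len(nums)):
        for i in range(j):
            if nums[j] >= nums[i]:  # >= instead of > makes the subsequence non-decreasing
                dp[j] = max(dp[j], dp[i] + 1)
    return max(dp)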

Why is it n*n? “Space complexity: O(n^2), a memo array of size n * n is used.” What does “memo array” mean? Is it the stack used in recursion?

The dynamic programming solution for the Longest Increasing Subsequence (LIS) problem typically uses a one-dimensional DP array of size n, where n is the size of the input array. Each cell in the DP array represents the length of the longest increasing subsequence ending at that position. This approach has a space complexity of O(n).

The time complexity is O(n^2) because, for each element in the array, we potentially compare it with every element before it. This leads to n*(n-1)/2 comparisons in the worst case, which simplifies to O(n^2).

If we talk about a memo array of size n * n, it seems we are discussing a different problem or approach. For instance, in problems involving two sequences or strings (like the Longest Common Subsequence problem), we often use a two-dimensional memo table of size n * m (where n and m are the sizes of the two input sequences). Each cell in this 2D memo table typically represents a subproblem involving a prefix of each sequence.

If you’ve seen an n*n memo array used for the LIS problem, it could possibly be from a variant of the problem or a different solution approach.

So, this question is asking for just the length of the LIS, but how would you go about returning the actual LIS? The solution that everyone seems to have for O(n logn) to just find the length seems to replace some of the values in the LIS array when small values are iterated over. Example: nums: [10, 2, 3, 1] should have LIS of [2, 3], but because you replace the 2 with this method, you would get [1,3] which is not the correct answer in the case of the actual LIS.

In order to reconstruct the actual Longest Increasing Subsequence (LIS), we need to maintain some additional information while performing our dynamic programming solution. Here is a strategy using a 1-D array dp for the lengths and a 1-D array parents that records, for each index, the index of the previous element in its best subsequence.

Here’s a Python solution implementing this approach:

def lengthOfLIS(nums):
    if not nums:
        return []

    n = len(nums)
    dp = [0] * n    # Initialize the array to store the length of LIS ending at each position
    parents = [0] * n    # Array to store the parent index in the LIS

    # Every isolated number can be considered as a LIS of length 1
    dp[0] = 1
    parents[0] = -1    # No parent for the first number

    maxLength = 1
    maxIndex = 0

    # Compute optimized LIS values in a bottom-up manner
    for i in range(1, n):
        dp[i] = 1
        parents[i] = -1    # Initialize parent as -1 for each number

        for j in range(i):
            if nums[i] > nums[j] and dp[i] < dp[j] + 1:
                dp[i] = dp[j] + 1
                parents[i] = j    # Store the parent index for the LIS

        # Update maxLength to ensure it holds the maximum LIS length
        if maxLength < dp[i]:
            maxLength = dp[i]
            maxIndex = i

    # The maxLength is the length of the LIS. Let's reconstruct the LIS.
    lis = []
    current = maxIndex
    while current >= 0:
        lis.append(nums[current])
        current = parents[current]

    return lis[::-1]    # Reverse the list to get the correct order of the LIS

# Test
nums = [10, 9, 2, 5, 3, 7, 101, 18]
print(lengthOfLIS(nums))  # Output: [2, 5, 7, 101]

The overall time complexity of this approach is O(n^2), where n is the size of the input array nums. The space complexity is also O(n), considering the DP array and the parents array. This method is more time and space consuming than the O(nlogn) length-only method, but it allows us to retrieve the actual LIS.

This problem is similar to other problems with a sliding window approach, but the description says:

“Given an unsorted array of integers, find the length of longest increasing subsequence.”

For the first case the input is [10,9,2,5,3,7,101,18]

the longest increasing subsequence of that array is [3,7,101], whose length is 3

But the test case expects 4. And the animation of the solution shows that the input is [10,9,2,5,3,7,101] and the solution is [2,5,3,7,101], of length 4.

This solution does not match the input of the test case and ignores that the number 5 breaks the increasing sequence. The description does not say that it should be “strictly increasing”, so let’s say we are allowed to sort all numbers in the window, so the 5 in the sequence won’t break the condition. But following that approach, you could fit the entire array [10,9,2,5,3,7,101] into that condition, so the answer should be 7.

Is there something that i am missing?

It seems there’s a misunderstanding about the definition of the longest increasing subsequence (LIS). The “increasing subsequence” does not need to be a contiguous subarray. Instead, it’s a subsequence where each element is greater than the previous one. Also, the term “strictly increasing” means each element in the sequence must be larger than the one before it, so equal values cannot both be kept in the subsequence.

For the given input array [10,9,2,5,3,7,101,18], the longest increasing subsequence is [2,5,7,101], which is of length 4.

Here’s the breakdown:

  • Start with 2: [2]
  • 5 > 2, so append 5: [2,5]
  • 3 < 5, so it’s not included. The sequence remains: [2,5]
  • 7 > 5, so append 7: [2,5,7]
  • 101 > 7, so append 101: [2,5,7,101]
  • 18 < 101, so it’s not included. The final sequence is [2,5,7,101]

The sequence [2,5,7,101] is the LIS and its length is 4. Note that the LIS is not necessarily unique. For example, [2,3,7,101] is also a valid LIS for the array.

The sliding window approach typically applies to contiguous subarray problems, but the LIS problem doesn’t fall into this category, hence we use dynamic programming or binary search-based solutions here.

HELP - Tabulation

Hi, can someone please help me? I’m having issues converting this working memoized code to tabulation.

public int lengthOfLIS(int[] nums) {
    int n = nums.length;
    int[][] dp = new int[n][n + 1];
    for (int[] row : dp)
        Arrays.fill(row, -1);
    return lengthOfLIS(nums, n - 1, n, dp);
}

int lengthOfLIS(int[] nums, int i, int maxI, int[][] dp) {
    if (i < 0) return 0;

    if (dp[i][maxI - 1] != -1) return dp[i][maxI - 1];

    if (maxI == nums.length || nums[i] < nums[maxI]) {
        return dp[i][maxI - 1] = Math.max(1 + lengthOfLIS(nums, i - 1, i, dp), lengthOfLIS(nums, i - 1, maxI, dp));
    }
    return dp[i][maxI - 1] = lengthOfLIS(nums, i - 1, maxI, dp);
}

Converting a memoized (top-down) DP solution to a tabulation (bottom-up) DP solution often involves rethinking the flow of your solution. A tabulation DP approach typically fills up a DP table iteratively, starting from the base cases and working its way up to the final solution.

Here is how you can convert your memoized solution to a tabulation solution:

public int lengthOfLIS(int[] nums) {
    int n = nums.length;
    int[] dp = new int[n];
    Arrays.fill(dp, 1);  // each element is a subsequence of length 1

    for (int i = 1; i < n; i++) {
        for (int j = 0; j < i; j++) {
            if (nums[j] < nums[i]) {
                dp[i] = Math.max(dp[i], dp[j] + 1);
            }
        }
    }

    int maxLength = 0;
    for (int i = 0; i < n; i++) {
        maxLength = Math.max(maxLength, dp[i]);
    }

    return maxLength;
}

In the above code:

  • We first initialize all elements of dp to 1 since a single element is always a valid subsequence of length 1.
  • For each number, we iterate through all previous numbers. If the current number nums[i] is greater than the previous number nums[j], we can extend the subsequence ending at j to a longer subsequence ending at i. We update dp[i] to be the maximum of its current value and dp[j] + 1.
  • Finally, we return the maximum value in the dp array, which represents the length of the longest increasing subsequence.

A true brute-force solution would be to enumerate all subsequences, then search through them to find the longest increasing one.

Yes, that’s correct. In a brute force approach to the longest increasing subsequence problem, you would generate all possible subsequences of the given sequence and then check each subsequence to see if it is increasing. This approach is highly inefficient, however, and would take O(2^n * n) time, where n is the length of the sequence. This is because there are 2^n possible subsequences of a sequence (each element can be either included or excluded), and for each subsequence, you would need to check if it’s increasing, which takes O(n) time in the worst case.

For example, if you have the sequence [10, 9, 2, 5, 3, 7, 101, 18], the brute force approach would generate all 256 possible subsequences, then scan through each subsequence to find the longest increasing one.

Given the inefficiency of this approach, dynamic programming or other methods are usually used to solve this problem more efficiently. These methods can solve the problem in O(n^2) or O(n log n) time, which is much faster for large sequences.

Is there a problem here on LeetCode to find the actual longest increasing subsequence (values of the subsequence) instead of the length of LIS? For Example: Input: nums = [10,9,2,5,3,7,101,18] Output: [2,3,7,101]

The problem “Longest Increasing Subsequence” (problem 300 on LeetCode) asks only for the length of the longest increasing subsequence, not the sequence itself.

However, with a modification, the common dynamic programming solution for problem 300 can be adjusted to also return the longest increasing subsequence itself, not just its length. Here’s a sketch of how you could do it:

  1. In addition to maintaining a dp array that tracks the length of the longest increasing subsequence ending at each index, maintain a second array that stores the actual subsequences.

  2. When updating dp[i], also update the second array to store the new longest subsequence ending at i.

  3. At the end, the longest increasing subsequence will be in the array position corresponding to the maximum value in the dp array.

However, note that this approach requires additional space to store the subsequences, and may involve more complex logic if there are multiple subsequences with the maximum length.
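
Here is a minimal Python sketch of that idea (storing one best subsequence per index is just one possible implementation and uses extra memory; the function name find_lis is illustrative):

def find_lis(nums):
    # seqs[i] holds one longest increasing subsequence ending at nums[i].
    if not nums:
        return []
    seqs = [[x] for x in nums]
    for i in range(len(nums)):
        for j in range(i):
            if nums[j] < nums[i] and len(seqs[j]) + 1 > len(seqs[i]):
                seqs[i] = seqs[j] + [nums[i]]
    return max(seqs, key=len)

print(find_lis([10, 9, 2, 5, 3, 7, 101, 18]))  # one valid answer: [2, 5, 7, 101]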

Just curious to understand, can we use a monotonic decreasing stack here ?

No, a monotonic decreasing stack would not be useful for solving the Longest Increasing Subsequence (LIS) problem.

The reason is that the LIS problem requires us to find the longest subsequence where each element is larger than the previous one, not smaller. A monotonic decreasing stack is used when we want to maintain a stack of elements where each successive element is smaller than or equal to the previous one. This is opposite to what we want for the LIS problem.

Moreover, the LIS problem requires us to look at all the previous elements for each current element, not just the most recent one, which is what a stack provides. Therefore, the structure of a stack, whether decreasing or increasing, does not lend itself to a solution for the LIS problem.

Instead, dynamic programming is the standard approach for solving the LIS problem, as it allows us to store and update the lengths of the longest increasing subsequences ending at each index in the array. Alternatively, binary search can be used to achieve a more optimized solution with time complexity O(n log n).

Why is everyone’s memoization solution going from 0 to n? How do I solve this from n to 0 using DP? I’m having a tough time handling prevInd = -1 when I start that way.

Starting from n and going to 0 might indeed cause problems with initializing prevInd = -1. When we start from the beginning (0 to n), it’s easy to set prevInd = -1 because we haven’t encountered any elements yet. But if we start from the end, we would need to have some logic to handle the first iteration differently.

Here’s an alternative approach which might help. In this approach, we start from 0 but instead of considering the subsequence ending at i, we consider the subsequence starting at i. This way, when i equals n-1, we will only have one element left, and the longest increasing subsequence starting at this point is of length 1.

Here’s the code for the bottom-up dynamic programming solution:

def lengthOfLIS(nums):
    n = len(nums)
    dp = [1] * n
    for i in reversed(range(n-1)):
        for j in range(i+1, n):
            if nums[i] < nums[j]:
                dp[i] = max(dp[i], 1 + dp[j])
    return max(dp)

In this code, dp[i] stores the length of the longest increasing subsequence starting at i. We iterate i from n-2 to 0 (since dp[n-1] is trivially 1), and for each i, we iterate j from i+1 to n-1. If nums[i] < nums[j], we update dp[i] to be the maximum of the current dp[i] and 1 + dp[j].

Finally, we return the maximum value in dp as the length of the longest increasing subsequence.

Note that this approach still has a time complexity of O(n^2), so it is not as efficient as the binary search method, which has a time complexity of O(n log n).

int func(int i, int prev, vector<int>& nums) {

    if (i < 0) return 0;

    // not take
    int a = func(i - 1, prev, nums);
    int b = 0;
    // take
    if (prev == -1 || nums[prev] > nums[i]) {
        b = func(i - 1, i, nums) + 1;
    }

    return max(a, b);
}

This is the recursion I wrote and it works fine, but I can’t solve the 2D tabulated version with i looping from 1 to n. Can someone please provide me with the logic to write the tabulated version for the same.

Your recursion logic is perfectly fine: at each index i you decide whether or not to take nums[i], with prev holding the index of the previously taken element (-1 if none). To convert it to tabulation (bottom-up dynamic programming), mirror the recursion with a 2D table where dp[i + 1][prev + 1] stores the value of func(i, prev).

Here’s the corresponding bottom-up dynamic programming (tabulation) code:

int lengthOfLIS(vector<int>& nums) {
    int n = nums.size();
    // dp[i + 1][prev + 1] mirrors func(i, prev): the best LIS length obtainable
    // from indices i down to 0 when 'prev' is the index of the previously taken
    // element (prev == -1 means nothing has been taken yet).
    vector<vector<int>> dp(n + 1, vector<int>(n + 1, 0));
    for (int i = 0; i < n; ++i) {
        for (int prev = -1; prev < n; ++prev) {
            int notTake = dp[i][prev + 1];             // skip nums[i]
            int take = 0;
            if (prev == -1 || nums[prev] > nums[i]) {  // same condition as the recursion
                take = 1 + dp[i][i + 1];               // take nums[i]; it becomes the new 'prev'
            }
            dp[i + 1][prev + 1] = max(notTake, take);
        }
    }
    return dp[n][0];  // corresponds to func(n - 1, -1)
}

In this code, dp[i + 1][prev + 1] plays the role of func(i, prev): it stores the best LIS length we can build from indices i down to 0 when prev is the index of the previously taken element (prev = -1 meaning nothing has been taken yet). The all-zero row dp[0][…] mirrors the base case if (i < 0) return 0;. For each state we either skip nums[i] (keeping dp[i][prev + 1]) or, when prev == -1 or nums[prev] > nums[i], take it (1 + dp[i][i + 1], since i becomes the new previous index). The final answer is dp[n][0], which corresponds to the original call func(n - 1, -1).

Please note that this tabulated solution runs in O(n^2) time and uses O(n^2) space for the table, where n is the size of the input array. The space can be reduced to O(n) by keeping only the last two rows, since row i + 1 depends only on row i.

There are more LISs in the 1st example: [2,5,7,101], [2,3,7,18], [2,5,7,18].

That’s correct! The problem is asking for the length of the longest increasing subsequence (LIS), not for all possible longest increasing subsequences.

In the array [10, 9, 2, 5, 3, 7, 101, 18], the sequences [2, 5, 7, 101], [2, 3, 7, 18], and [2, 5, 7, 18] are indeed all longest increasing subsequences with a length of 4. The solution does not have to specify which of these subsequences is the correct answer, as they are all correct. The primary goal is to find the length of the LIS, which in this case is 4.

It’s important to note that when it comes to longest increasing subsequences, there may be more than one valid sequence for a given array. It all depends on the specific problem statement whether it asks for any one longest subsequence or all such subsequences.

So, while the original problem statement only asks for the length of the longest increasing subsequence, your observation that there are other sequences of the same length is absolutely correct!

Does anybody know about a problem that uses a similar approach to this question, i.e. the N log N approach of creating piles and using binary search to locate where to insert into those piles? I’ve done a similar problem in the past but have been tirelessly searching for it without success. Any help would be appreciated. Thanks!

Yes, there are several problems that use a similar NlogN approach with binary search to find solutions. Here are a few examples:

  1. “Russian Doll Envelopes” (LeetCode 354): This problem requires sorting a list of envelopes by width and height, then finding the longest increasing subsequence based on the heights.

  2. “Minimum Number of Arrows to Burst Balloons” (LeetCode 452): The problem can be solved by sorting the balloons by their end coordinates, then applying a greedy approach similar to finding the longest increasing subsequence.

  3. “Longest Increasing Path in a Matrix” (LeetCode 329): Although this problem uses Depth-First Search (DFS), it also involves finding the longest increasing path in a matrix, which can be thought of as a form of longest increasing subsequence.

  4. “Maximum Length of Pair Chain” (LeetCode 646): This problem requires sorting the pairs and then finding the longest chain, which is similar to finding the longest increasing subsequence.

  5. “Box Stacking” problem: In this classic dynamic programming problem, boxes need to be stacked such that each box is smaller in size than the box below it. This involves sorting and then finding the longest increasing subsequence.

Remember, in all these problems, the key idea remains the same: sorting the input in a meaningful way, and then finding the longest increasing subsequence.

I think I would not be able to solve this problem in interviews. This problem is hard; even the binary search solution is hard to find.

It’s completely normal to find this problem challenging. The Longest Increasing Subsequence (LIS) problem is considered one of the more difficult problems to tackle, especially under the time pressure of an interview.

Here are a few tips to help you prepare for similar problems:

  1. Understand the problem: Ensure you fully understand the problem statement and requirements. In the case of the LIS problem, you are asked to find the longest subsequence (not necessarily contiguous) of an array that is in increasing order.

  2. Break down the problem: Try to break down the problem into smaller parts. For instance, start by understanding how to find an increasing subsequence, before trying to find the longest one.

  3. Start simple: It’s often helpful to start with a simple, albeit less efficient solution (like the O(n^2) Dynamic Programming solution for the LIS problem) before you move on to more optimized solutions.

  4. Practice binary search: The O(n log n) solution to this problem makes use of binary search. Make sure you understand how binary search works and practice it on simpler problems first.

  5. Trace small examples: Work through small examples by hand. This will help you understand the problem better and might give you insight into more efficient solutions.

  6. Practice, practice, practice: The best way to get better at solving such problems is through practice. Use platforms like LeetCode, HackerRank, etc., to practice solving various dynamic programming and binary search problems.

Remember, it’s not about how hard the problem is, but about how you approach it. With the right approach and enough practice, you’ll definitely improve over time. Don’t get discouraged!

Can we do the question using upper_bound()?

The efficient O(n log n) solution does use binary search to find the position of the current element in a list of candidate tails, and in the C++ STL that search can be done with lower_bound() or upper_bound(). For the strictly increasing subsequence asked for here, lower_bound() (the first element not less than nums[i]) is the right call; upper_bound() (the first element strictly greater than nums[i]) is what you would use for the longest non-decreasing variant.

Here’s a simple sketch of this approach in C++ using lower_bound():

vector<int> lis;
for (int i = 0; i < nums.size(); i++) {
    // If the current number is greater than the last number in the 'lis' vector
    // it can extend the current longest increasing subsequence
    if (lis.empty() || nums[i] > lis.back()) {
        lis.push_back(nums[i]);
    }
    // Otherwise it can potentially improve an increasing subsequence of the same length.
    // We find the first number in 'lis' that is greater than or equal to nums[i]
    // and replace it with nums[i], keeping every tail as small as possible.
    else {
        int pos = lower_bound(lis.begin(), lis.end(), nums[i]) - lis.begin();
        lis[pos] = nums[i];
    }
}
// The length of the longest increasing subsequence is the length of 'lis'
return lis.size();

Note that this solution maintains a list of potential candidates for the longest increasing subsequence, rather than the subsequence itself. The actual longest increasing subsequence can be different, but its length will be the same as the length of this candidate list.

Remember that lower_bound() returns an iterator pointing to the first element in the range [first, last) that is not less than val, while upper_bound() returns an iterator to the first element that is strictly greater than val. Both use binary search and work in O(log n) time. If you switch to upper_bound() here (and change the push condition to nums[i] >= lis.back()), the same loop computes the length of the longest non-decreasing subsequence instead.

I hate dp as much as the next guy, but I think this one is the one to really master or memorize one of the strategies off the top of your head. Because I’ve seen so many hard problems this week that are based off this, and it’s kinda annoying if I’ve never figured it out.

It’s understandable to find dynamic programming (DP) challenging. It’s a technique that requires a good understanding of recursion, memoization, and problem decomposition, which can be hard to grasp. But you’re right, many hard problems in computer science and competitive programming can be solved effectively with DP.

The Longest Increasing Subsequence problem is a classic DP problem that serves as the foundation for many other problems. Here are a few tips to help you master this kind of problem:

  1. Understand the problem: Make sure you have a clear understanding of what the problem is asking. In this case, you’re looking for the longest subsequence (not necessarily contiguous) of an array that is in increasing order.

  2. Identify subproblems: Dynamic programming solutions usually break the problem down into smaller subproblems. In this case, the subproblem could be “what’s the longest increasing subsequence ending at each position in the array?”

  3. Define the state and transition: Define the DP state (i.e., the memoization table) to store the results of subproblems. In this case, a 1-D array where dp[i] represents the length of the longest increasing subsequence ending at index i. The transition function is the rule to move from one state to another.

  4. Bottom-up calculation: Start from the simplest subproblems and gradually tackle more complex ones. This often involves nested loops iterating over the DP state.

  5. Trace the answer: The final answer could be in one specific position of the DP table (like dp[n-1] if we have ’n’ states), or we may have to look through our DP table to find the answer.

Practice is key to mastering DP. Start with simpler problems and gradually move on to more complex ones. It may be hard initially, but with enough practice, you’ll get the hang of it.

class Solution {
public:
    void util(vector<int>& nums, vector<int> curr, int i, int& maxLen) {
        if (i == nums.size()) {
            if (maxLen < curr.size()) {
                maxLen = curr.size();
            }
            return;
        }
        if (curr.size() == 0 || nums[i] > curr[curr.size() - 1]) {
            curr.push_back(nums[i]);
            util(nums, curr, i + 1, maxLen);
            curr.pop_back();
            util(nums, curr, i + 1, maxLen);
        }
        else
            util(nums, curr, i + 1, maxLen);
    }
    int lengthOfLIS(vector<int>& nums) {
        int maxLen = INT_MIN;
        util(nums, {}, 0, maxLen);
        return maxLen;
    }
};

Why isn’t maxLen getting updated here, even though it is less than curr.size()?

The problem is the comparison if(maxLen < curr.size()). curr.size() returns an unsigned size_t, so maxLen (an int initialized to INT_MIN) is converted to an unsigned value before the comparison; INT_MIN becomes a huge unsigned number, the condition is never true, maxLen is never updated, and the function returns INT_MIN. Initialize maxLen to 0 (the empty subsequence has length 0) or cast the size, e.g. if (maxLen < (int)curr.size()), and the value will update as expected. Passing maxLen by reference into the util function is fine.

The code is using a backtracking approach to find the longest increasing subsequence (LIS) in a given vector nums. It recursively checks all possible subsequences and updates maxLen whenever it finds a subsequence that is longer than the current maxLen.

Here’s how it works:

  • util(nums,{},0,maxLen) is called from lengthOfLIS.
  • This call checks if i equals nums.size(). If true, it means we’ve reached the end of the array, so it compares maxLen with the current subsequence length curr.size(). If curr.size() is larger, maxLen is updated to curr.size().
  • If the current index i hasn’t reached the end of the array yet, and if the current subsequence is empty or the current element nums[i] is larger than the last element in the subsequence, it appends the current element to the subsequence and makes a recursive call with i+1.
  • Regardless of whether it appends the current element or not, it also makes a recursive call with i+1 without appending the current element, thus exploring all possible subsequences.

If maxLen still isn’t being updated after fixing the signed/unsigned comparison, check the input to your function or look for other conditions preventing the updating branch of the code from executing. Also keep in mind that this backtracking explores an exponential number of subsequences, so it will be far too slow for large inputs even once it returns the right answer.

To decide whether a bigger subsequence can be formed for the current element, I only need to know the last element of every subsequence that has been formed before. Also, we can have multiple subsequences for the same LIS size, so if we know the subsequence that has the minimum last element for each size, it will make our job easier. To do this, we just need to know the minimum last element for every subsequence size.

Yes, you’re correct. While solving the Longest Increasing Subsequence (LIS) problem, if we keep track of the smallest ending element for each length of increasing subsequence that we have seen so far, we can determine whether or not we can extend any of these subsequences with the current element.

Here is a more concrete explanation:

Consider an array ends, where ends[i] represents the smallest ending element of all increasing subsequences of length i+1 that we have seen so far in the array. The array ends is maintained such that it is always sorted.

When processing each element num in the input array:

  1. If num is larger than the last element in ends, it means num can extend the longest subsequence we have seen so far, and we append num to ends.

  2. If num is not larger than the last element in ends, it means num may not be able to extend the current longest subsequence, but it might be used to extend some other subsequences or form a new subsequence. So, we find a number in ends that is larger than or equal to num, and replace it with num. This ensures that ends[i] always represents the smallest ending number of an increasing subsequence of length i+1. We can use binary search in ends to find the position to replace, which makes it more efficient.

The length of LIS will be the length of ends at the end of the process.

This approach leverages the fact that keeping the ending element as small as possible for all increasing subsequences gives us more chances to extend the subsequences later.
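A minimal sketch of that procedure in Python (the function name and comments are mine; it relies only on bisect_left from the standard library):

from bisect import bisect_left
from typing import List

def length_of_lis(nums: List[int]) -> int:
    ends = []  # ends[i] = smallest ending element of any increasing subsequence of length i + 1
    for num in nums:
        pos = bisect_left(ends, num)   # first position whose value is >= num
        if pos == len(ends):
            ends.append(num)           # num extends the longest subsequence seen so far
        else:
            ends[pos] = num            # num becomes a smaller tail for length pos + 1
    return len(ends)

# length_of_lis([10, 9, 2, 5, 3, 7, 101, 18]) -> 4

The array ends stays sorted at all times, which is what makes the binary search valid.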

Will memoization solution overcome TLE in this problem?

Yes, using a memoization (also known as “top-down”) approach for dynamic programming problems can often help to significantly reduce the time complexity and avoid a Time Limit Exceeded (TLE) error.

In the case of the Longest Increasing Subsequence (LIS) problem, the naive approach without optimization is O(n^2), where n is the length of the input list. This is because for each element, you would need to check all the elements that come after it to see if it can form a longer increasing subsequence, leading to a quadratic time complexity.

By using memoization, you store the result of each subproblem (i.e., the length of the longest increasing subsequence ending at each position) in a data structure like an array or a hash table. This way, when you encounter a subproblem that you’ve solved before, you can simply look up the result instead of re-computing it, which can save a lot of time.

However, while memoization can generally help to avoid TLE, it’s not guaranteed in all cases. The efficiency gain depends on the nature of the problem and the specific input data. For instance, if the input data does not have many overlapping subproblems, then the memoization approach may not provide a significant speedup.

In the case of the LIS problem, a more efficient approach would be to use a binary search along with dynamic programming, which can reduce the time complexity to O(n log n). This is usually sufficient to avoid a TLE error for this problem, even for large inputs.
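As a rough sketch (not the code from the question below), one way to structure a memoized top-down solution that stays fast enough is to memoize on the index at which the subsequence ends, giving O(n) states and O(n^2) total work; the helper name lis_ending_at is my own:

import sys
from functools import lru_cache
from typing import List

class Solution:
    def lengthOfLIS(self, nums: List[int]) -> int:
        if not nums:
            return 0
        sys.setrecursionlimit(10000)  # the recursion can go as deep as len(nums)

        @lru_cache(maxsize=None)
        def lis_ending_at(i: int) -> int:
            # length of the longest strictly increasing subsequence ending exactly at index i
            best = 1
            for j in range(i):
                if nums[j] < nums[i]:
                    best = max(best, lis_ending_at(j) + 1)
            return best

        return max(lis_ending_at(i) for i in range(len(nums)))

This does the same work as the bottom-up table, just in recursive form, so for inputs of a few thousand elements it should comfortably fit the usual limits.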

I didn’t get why dp(i) = dp(j) + 1 if nums[i] > nums[j]. For example, let us say we have nums = [1,3,5,2,4,3,2].

For nums[3] = 2, we have dp[3] = 3, since the longest increasing subsequence before nums[3] is [1,3,5].

But at nums[4] = 4, why would we have dp(4) = dp(3) + 1 = 3 + 1 = 4? The longest subsequence up to nums[4] = 4 is still [1,3,5], which has a length of 3, not 3 + 1 = 4.

I am really confused with the DP solution. I will really appreciate it if someone can offer some clarification.

Thanks so much in advance.

It seems there’s a slight misunderstanding regarding the nature of the dynamic programming (DP) solution for the longest increasing subsequence (LIS) problem. The key point is that each dp[i] stores the length of the longest increasing subsequence ending at nums[i].

When you compute dp[i], you are considering all indices j before i such that nums[j] < nums[i]. If nums[i] is greater than nums[j], it means that you can extend the increasing subsequence ending at nums[j] by appending nums[i], and hence dp[i] = dp[j] + 1.

In your example, nums = [1,3,5,2,4,3,2], the computation for dp[4] considers subsequences ending at nums[4] = 4. We see that 4 can extend the subsequences ending at nums[0] = 1, nums[1] = 3, and nums[3] = 2. The subsequence ending at nums[1] is [1, 3] and the one ending at nums[3] is [1, 2]. So the longest subsequence that 4 can extend has length 2 (for example [1, 3]), plus 1 for 4 itself, giving 3. Therefore, dp[4] is 3, not 4.

So you’re correct in your assessment: the longest increasing subsequence among the first five elements still has length 3 (for example [1, 3, 5] or [1, 3, 4]), not 4. The crucial point is that dp[3] is the length of the subsequence ending at nums[3] = 2, which is 2, not the length of the best subsequence seen anywhere before index 3.
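To make this concrete, here is a small throwaway trace of the standard O(n^2) recurrence on that exact input. Note that dp[3] is 2, not 3: dp[i] counts only subsequences that end at index i, which is the source of the confusion above.

nums = [1, 3, 5, 2, 4, 3, 2]
dp = [1] * len(nums)
for i in range(len(nums)):
    for j in range(i):
        if nums[j] < nums[i]:
            dp[i] = max(dp[i], dp[j] + 1)
print(dp)       # [1, 2, 3, 2, 3, 3, 2]
print(max(dp))  # 3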

Dynamic Programming (DP) is suitable for this problem due to its characteristics: it seeks a maximum/minimum solution, and it involves decision-making based on prior decisions, typically found in subsequence-related problems.

In this problem, each element in the input prompts a decision: should we include this number? Using a number may extend an increasing subsequence, but could exclude larger preceding numbers. For instance, in nums = [5, 6, 7, 8, 1, 2, 3], we shouldn’t use 1, 2, or 3, as they would exclude 5, 6, 7, and 8, forming the longest increasing subsequence. DP assists in determining whether or not to use an element.

Dynamic programming problems typically require three key components:

  1. A function or array representing the problem’s solution given a specific state. Here, we use the array ‘dp’, where dp[i] signifies the length of the longest increasing subsequence ending with the ith element. Our state is one-dimensional, represented by index i.

  2. A transition strategy between states, known as a recurrence relation. To find dp[3] from dp[0], dp[1], and dp[2], we extend the subsequence ending at i = 2 by appending nums[3] if nums[3] > nums[2], increasing the length by 1. This applies to nums[0] and nums[1] if nums[3] is larger. The goal is to maximize dp[3] by checking all three, leading to the formal recurrence relation: dp[i] = max(dp[j] + 1) for all j where nums[j] < nums[i] and j < i.

  3. A base case. In this problem, each dp element is initialized to 1, considering each individual element as an increasing subsequence.

We conceptualize the sub-array as a record of the smallest ending value for each sequence length. For instance, for two 3-element sequences [1, 2, 3] and [1, 2, 10], we consider ‘3’ as the smallest ending value for all 3-element sequences. For any subsequent value to extend a 3-element sequence to a 4-element one, it needs to be greater than 3.

In the sub-array, the ith element (1-based) records the smallest ending value for all i-element sequences. As the sequence length increases, so does its ending value.

When a new value is introduced, we check whether it can extend an existing sequence. For instance, if the sub-array has 5 elements, we first compare the new value with the 5th value in the sub-array. If the new value is larger, it can extend the sequence to a 6-element one, and we add this new value as the 6th value in the sub-array. However, if the new value is less than or equal to the 5th value in the sub-array, we try extending the 4-element sequences.

The process repeats, examining decreasing sequence lengths until we find a sequence that the new value can extend. For instance, after comparing with the 5th, 4th, and 3rd values in the sub-array, if the 2nd value is the first one less than the new value, we know the new value can extend a 2-element sequence. This new value will be a smaller ending value for all 3-element sequences, so we update the 3rd value in the sub-array to the new value.

Just curious, with the binary search approach mentioned in this article, for the input [1,3,5,2,6], it will give us sub [1,2,5,6], which is not a subsequence of the original list. Don’t we care about that?

The binary search approach aims to find the length of the longest increasing subsequence, not the actual subsequence itself. The purpose of the sub-array in the binary search approach is to keep track of potential “ends” to the subsequences we’re considering, not to track an actual subsequence.

In your example, the sub-array [1,2,5,6] is a representation of potential ends to subsequences of various lengths, but not an actual subsequence from the original list. The length of this sub-array represents the length of the longest increasing subsequence, which is the ultimate goal of this problem.

If you need to reconstruct the actual longest increasing subsequence, you would need a different approach or additional steps beyond the binary search solution.

can we get the longest subsequence values instead of just length using the O(nlogn) solution?

The binary search solution for this problem as it stands only calculates the length of the longest increasing subsequence, not the subsequence itself. This is because the “tails” array, which holds the smallest ending element for all increasing subsequences of a certain length, does not necessarily contain an actual subsequence from the input.

However, it is possible to modify the O(n log n) approach to also track the actual longest increasing subsequence. To do this, you would need to store the predecessors of each element in the “tails” array as you construct it.

Here’s a rough sketch of how you might do this:

  1. Maintain an additional array to track the predecessors. For each element, when you update the “tails” array, also record the predecessor of this element in the “predecessors” array.

  2. When you finish constructing the “tails” array, use the “predecessors” array to backtrack from the last element in the longest increasing subsequence to the first element. This will give you the actual longest increasing subsequence.

Remember that this approach increases the complexity of the solution as you have to maintain an additional “predecessors” array and use it to reconstruct the longest increasing subsequence. This might not be necessary if you’re only interested in the length of the longest increasing subsequence, not the subsequence itself.
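Here is a rough sketch of that modification (the names tails, tails_idx, and parent are mine, not from any particular editorial):

from bisect import bisect_left
from typing import List

def lis_with_values(nums: List[int]) -> List[int]:
    # Return one longest strictly increasing subsequence (not just its length).
    tails = []                  # tails[k] = smallest tail value of a subsequence of length k + 1
    tails_idx = []              # index in nums of that tail value
    parent = [-1] * len(nums)   # parent[i] = index of the element before nums[i] in its best subsequence

    for i, num in enumerate(nums):
        pos = bisect_left(tails, num)
        if pos == len(tails):
            tails.append(num)
            tails_idx.append(i)
        else:
            tails[pos] = num
            tails_idx[pos] = i
        if pos > 0:
            parent[i] = tails_idx[pos - 1]

    # Walk back from the tail of the longest subsequence found.
    result = []
    i = tails_idx[-1] if tails_idx else -1
    while i != -1:
        result.append(nums[i])
        i = parent[i]
    return result[::-1]

For [1, 3, 5, 2, 6] this returns [1, 3, 5, 6], one valid longest increasing subsequence; the length is still len(tails).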

I find I get stymied on questions like this trying desperately to find a solution better than O(n^2). Often that implies “You don’t have the right solution, O(n^2) is pretty bad”. I don’t yet know when to recognize that an O(n^2) solution is acceptable, or as sometimes happens, literally the best you can do.

For this, the intuition “scan the whole list for each element in the list” is pretty trivial, yet that ends up being the given solution, with some improvement. I thought for sure that couldn’t be right. I understand all the listed solutions, the binary search on building a list is a clever improvement, but it still felt like I had the wrong idea to begin with. Just more practice needed I guess

Your thoughts are valid and resonate with many who are learning and practicing dynamic programming (DP) and algorithmic problems. Here are a few things to consider:

  1. Acceptability of O(n^2): Not all problems can be solved in linear or log-linear time. There are many problems, especially those involving relationships between elements like this one (longest increasing subsequence), where an O(n^2) solution is not just acceptable, but might be the best known solution without additional assumptions or constraints. Understanding when quadratic time is acceptable or the best we can do comes with practice and experience. In this case, as you correctly pointed out, the problem involves comparing each element with all other elements, which is a classic sign that an O(n^2) approach might be required.

  2. Need for better than O(n^2): While it’s good to strive for efficient solutions, it’s important to remember that big-O notation describes worst-case complexity. There can be many problems where an O(n^2) solution performs well on average, or where the input size n is small enough that even a quadratic time solution is fast enough.

  3. Learning from solutions: It’s absolutely okay to not come up with the optimal solution on the first try. The key thing is to learn from the given solutions, understand the principles and techniques used (like DP and binary search in this case), and apply them to future problems. Over time, this will help you develop a better intuition for identifying solution approaches.

  4. Practice: Finally, as you’ve noted, more practice is always helpful. The more problems you solve, the better you’ll get at recognizing patterns and knowing when a certain time complexity is acceptable or the best you can do. Consider solving more problems involving DP and sequence comparison, as these tend to have O(n^2) solutions.

Don’t be too hard on yourself. It’s a learning process, and with practice, you’ll get better over time!

How can we intuitively explain the binary search approach to an interviewer when the resulting sub list is not a correct ordering?

Does anyone know why that answer works in a way that is explainable?

The binary search approach used in the problem to find the longest increasing subsequence might seem counterintuitive because the subsequences generated throughout the process might not actually be subsequences of the original array. However, they’re instrumental in determining the longest possible length of an increasing subsequence. Here’s an intuitive explanation:

  1. The key idea is to keep track of the minimum possible tail element for all increasing subsequences of any length we encounter. This idea is based on the observation that a smaller tail element gives us a higher chance of appending a larger number later, hence prolonging the subsequence.

  2. Now, we maintain a dynamic array or list where the index represents the length of the subsequence and the value at the index represents the smallest tail element possible for that length.

  3. We go through each element in the array. For each element, we determine if we can append it to an existing subsequence (i.e., if it’s greater than the current largest tail element) or if it can serve as a better tail element for an existing subsequence of certain length (i.e., it’s not the largest, but it can replace a larger tail in the list).

  4. We use binary search in this list to find the appropriate position of the current element as it is sorted due to the nature of the problem. If the element is larger than all tail elements, we extend our list. Otherwise, we use the element to replace the first tail element in the list that is larger.

  5. At the end of this process, the length of this list gives us the length of the Longest Increasing Subsequence.

This method might generate arrays which are not true subsequences of the original input. But it ensures that the length of the longest increasing subsequence is correctly determined, which is the main objective of the problem. In other words, the list it maintains records the “potential” tails of increasing subsequences; the list itself may not be a subsequence of the input, but an increasing subsequence of the reported length is always guaranteed to exist in the input.

Can someone help? My top-down solution is not passing; it keeps getting TLE. I’m not sure why it doesn’t work:

class Solution:
    def lengthOfLIS(self, nums: List[int]) -> int:
        memo = {}
        # n = len(nums)

        def dp(idx: int, max_num: float):
            nonlocal memo
            key = (idx, max_num)
            print(key)
            if idx == len(nums):
                return 0
            if memo.get(key):
                return memo[key]
            curr = nums[idx]

            w1 = 0
            if max_num == None or curr > max_num:
                w1 = dp(idx + 1, curr) + 1

            way2 = dp(idx + 1, max_num)

            memo[key] = max(w1, way2)
            return memo[key]

        return dp(0, None)

Your solution is a top-down dynamic programming (DP) approach, and the memoization itself caps the work at O(n^2) states, the same asymptotic cost as the bottom-up table. What kills it is mostly the print(key) call: printing on every recursive call is extremely slow and is enough on its own to cause a Time Limit Exceeded (TLE) error.

There are two smaller issues as well. if memo.get(key): treats a cached value of 0 as a cache miss and recomputes it; use if key in memo: instead. And keying the memo on (idx, max_num) means building and hashing a tuple for every call, which adds a lot of constant-factor overhead in Python compared with the simple array indexing of the bottom-up solution.

Here is an optimized version of your code using Dynamic Programming:

class Solution:
    def lengthOfLIS(self, nums):
        if not nums: return 0
        dp = [1] * len(nums)
        for i in range(len(nums)):
            for j in range(i):
                if nums[i] > nums[j]:
                    dp[i] = max(dp[i], dp[j] + 1)
        return max(dp)

This solution uses a bottom-up approach. It creates a DP table with all elements initialized to 1 (since each individual number is a LIS). For each number, it checks all the numbers before it. If the current number is greater than the previous number, it might be a candidate for being part of an increasing subsequence. The DP table at the current index is updated to be the maximum of its current value or the value of DP at the previous index plus one. Finally, the maximum value in the DP table is returned, which is the length of the longest increasing subsequence. This solution has a time complexity of O(n^2) and a space complexity of O(n).

I can’t for the life of me get the top down solution to work. It keeps getting TLEd. Does anyone know what’s wrong with this code?

class Solution:
    def lengthOfLIS(self, nums: List[int]) -> int:
        memo = {}
        n = len(nums)

        def dp(idx: int, max_num: float):
            nonlocal memo
            key = (idx, max_num)
            if idx == len(nums):
                return 0
            if key in memo:
                return memo[key]
            curr = nums[idx]

            # "with" is a reserved keyword in Python, so it needs another name
            with_curr = 0
            if max_num is None or curr > max_num:
                with_curr = dp(idx + 1, curr) + 1

            without = dp(idx + 1, max_num)

            memo[key] = max(with_curr, without)
            return memo[key]

        return dp(0, None)

First, note that as posted the code won’t run at all: with is a reserved keyword in Python and can’t be used as a variable name (it is renamed to with_curr in the listing above). Beyond that, the memoization does bound the work at O(n^2) states, the same asymptotic cost as the bottom-up table, but the recursion, tuple keys, and dictionary lookups carry a much larger constant factor in Python, and the recursion can get as deep as n frames. Together that overhead is what pushes the solution past the time limit on large inputs.

One way to optimize this is to use a bottom-up dynamic programming approach:

class Solution:
    def lengthOfLIS(self, nums: List[int]) -> int:
        if not nums: return 0
        dp = [1] * len(nums)
        for i in range(len(nums)):
            for j in range(i):
                if nums[i] > nums[j]:
                    dp[i] = max(dp[i], dp[j] + 1)
        return max(dp)

In this approach, we initialize a dp array of size n (the length of nums) with all elements set to 1, because each number individually is an increasing sequence. Then for each number, we check all the numbers before it. If the current number is greater than the previous number, it might be part of an increasing subsequence. So, we update dp[i] to be the maximum of its current value and dp[j] + 1. At the end, we return the maximum value in the dp array, which represents the length of the longest increasing subsequence.

This algorithm also runs in O(n^2) time, but the tight loops and plain list indexing make it much faster in practice than the recursive version.

Why do we set right = mid; instead of right = mid - 1; in binary search?

In binary search, setting right = mid; instead of right = mid - 1; depends on what we are trying to achieve.

If we are trying to find the exact element, we can use right = mid - 1; because in this case, if we’ve found the element, there’s no need to consider it during the next step of the binary search.

However, in the case of finding Longest Increasing Subsequence, we are trying to find a suitable position to place our current element. Even if we have found a position where tails[mid] == nums[i], we want to keep searching in the left part, which might have a smaller value. So we are not trying to discard the mid position right away, thus we use right = mid;.

Using right = mid; makes the search range decrease slower and this is useful when we want to find the first position that satisfies a certain condition, which is what we are doing in this problem. We are finding the first position in tails where the number is greater than or equal to nums[i], so that we can replace it.
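As a small illustration, here is a hypothetical helper written to mirror the explanation above (not code from the article):

def first_at_least(tails, target):
    # Return the first index whose value is >= target; len(tails) if none exists.
    left, right = 0, len(tails)
    while left < right:
        mid = (left + right) // 2
        if tails[mid] >= target:
            right = mid        # mid may still be the answer, so keep it in the range
        else:
            left = mid + 1
    return left

Using right = mid - 1 here would risk discarding the very position we are looking for when tails[mid] equals the target.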

Is brute force even possible for this problem? Wondering what the time complexity would be.

Yes, a brute-force solution is possible for this problem. The idea would be to generate all possible subsequences of the array and then check which one is the longest increasing subsequence.

However, the time complexity of the brute-force solution is exponential. This is because, in the worst case, an array of size n can have 2^n different subsequences (including the empty subsequence). For each subsequence, we need to check if it’s increasing, which takes O(n) time. So in total, the time complexity is O(n * 2^n), which is not practical for large inputs. Therefore, a more efficient approach like dynamic programming or binary search is necessary for this problem.

So while a brute-force approach may be an initial step in understanding the problem, it’s not efficient or practical for larger inputs. The purpose of identifying a brute-force solution is usually to understand the problem better and then find ways to optimize from there.

Maintain an auxiliary array of arrays. You need to find the leftmost array in which you can insert your element. The only catch being that each array must be in non-increasing order.

Yes, that’s a good way to explain the O(n log n) solution for this problem: it is essentially patience sorting, which combines a greedy placement rule with binary search.

Here’s how it works (a sketch that keeps the piles explicitly follows below):

  1. You maintain a list of piles (arrays). Each pile is non-increasing from bottom to top, and the tops of the piles, read left to right, are always in increasing order.

  2. For each number in your input, you find the leftmost pile whose top element is greater than or equal to that number, and place the number on top of that pile.

  3. If no such pile exists (the number is larger than every pile’s top), you start a new pile to the right.

  4. The binary search comes into play when finding that leftmost pile. Since the pile tops are in increasing order, you can binary search over them.

  5. After processing all numbers, the length of the longest increasing subsequence equals the number of piles, not the size of any single pile. There may be multiple longest subsequences, and this method only gives the length.

The reason this works: because each pile is non-increasing, an increasing subsequence can take at most one element from each pile, so the LIS length is at most the number of piles. And whenever a new pile is started, its first number is larger than the current top of the pile to its left, so a chain with one element per pile always exists. Keeping each number on the leftmost admissible pile keeps the pile tops as small as possible, so new piles are only started when genuinely necessary.

This solution is efficient because it processes each number once and uses binary search (which is O(log n)) over the pile tops. Hence, the time complexity is O(n log n).
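Here is the sketch mentioned above. Keeping the full piles is wasteful (the usual implementation keeps only the tops, which is exactly the tails array discussed earlier), but it makes the picture concrete; the names are mine, not from any particular editorial:

from bisect import bisect_left
from typing import List

def lis_length_with_piles(nums: List[int]) -> int:
    piles = []   # each pile is non-increasing from bottom to top
    tops = []    # tops[k] == piles[k][-1]; kept separately so we can binary search it
    for num in nums:
        # leftmost pile whose top is >= num
        k = bisect_left(tops, num)
        if k == len(piles):
            piles.append([num])    # num starts a new pile
            tops.append(num)
        else:
            piles[k].append(num)   # num goes on top of an existing pile
            tops[k] = num
    return len(piles)              # number of piles == length of the LIS

print(lis_length_with_piles([10, 9, 2, 5, 3, 7, 101, 18]))  # 4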

Does anyone have a brute force solution for this problem? I believe that the brute-force solution would have at least 3 for loops, with the first 2 for loops the same as the 2 outer loops in the 1st approach. The inner for loop would run logic similar to the logic that updates dp[] in the 1st approach. I found writing the logic without using dp[] very complicated, but it could help with the intuition to come up with the DP solution.

A brute force solution to this problem would indeed involve generating all subsequences of the given sequence, checking if each one is increasing and tracking the maximum length of such increasing subsequences.

Here is a Python solution which does exactly that:

class Solution:
    def lengthOfLIS(self, nums: List[int]) -> int:
        n = len(nums)
        subsets = 1 << n  # There are 2^n possible subsets (subsequences)
        max_length = 0

        # Iterate over all subsets
        for i in range(subsets):
            subset = []
            for j in range(n):
                # Check if jth bit in the subset's binary representation is set
                if i & (1 << j):
                    subset.append(nums[j])
                    
            # Check if the subset is strictly increasing
            if len(subset) > 1:
                if all(subset[k] < subset[k+1] for k in range(len(subset)-1)):
                    max_length = max(max_length, len(subset))
            elif subset: # if subset has one element
                max_length = max(max_length, len(subset))
        
        return max_length

This code generates all subsequences of the nums list, then checks each subsequence to see if it’s strictly increasing. If it is, it updates max_length with the length of this subsequence if it’s longer than the current max_length.

This solution has a time complexity of O(2^n * n) because it needs to generate all 2^n subsets of the list, and for each subset, we check in O(n) time if it’s increasing. Therefore, it will indeed run into a Time Limit Exceeded error for large inputs.

This brute-force solution, although not practical, illustrates the basic idea of the problem: enumerating subsequences, checking whether each one is increasing, and keeping the longest. It does not, however, provide much insight into the more efficient dynamic programming or binary search solutions.

An improvement to the dynamic programming. We start constructing subarray from the tail. Every time we see a following number with a smaller value but a shorter length of the ascending subarray, we could simply ignore that number in the future (impossible to be the segment of the maximal subarray). How do we manage to delete this number in O(1) time? We could implement a linked list.

Implementing a linked list would not significantly improve the efficiency here, because we would still need to iterate over the entire list of numbers. Also, deleting a node from a linked list is O(1) only once you already hold a reference to that node; finding the node to delete is an O(n) operation unless we maintain some extra data structure that provides direct access to it.

The main problem here is that the optimal subarray ending at one position can change depending on what we see later in the list, and the standard dynamic programming approach already takes this into account. The standard dynamic programming approach has a time complexity of O(n^2), where n is the length of the input list.

If you’re looking for a more efficient approach, the one that uses binary search to find the position to update in the constructed subarray can solve the problem in O(n log n) time. In this approach, we maintain an array of the smallest ending values for all lengths of subsequences we’ve seen so far, and update it as we go through the list.

It’s worth mentioning that it’s often best to focus on conceptually clear and correct solutions first, especially in an interview setting, and only optimize if necessary and if it doesn’t overly complicate the code. A solution that is slightly less efficient but much more understandable and maintainable can often be the better choice.

IMHO, in “If num is greater than any element in sub, then add num to sub”, the phrase “any element” should be replaced by “every element” or “all elements”, not just any one of them.

I think there may be some confusion about the description. In the sentence “If num is greater than any element in sub, then add num to sub,” the word “any” should be understood in the context of the algorithm as meaning “the last element.”

In other words, if the current number, num, is greater than the last element of the auxiliary list sub, then we append num to sub. This is because we’re keeping track of the possible sequences and sub represents the current longest increasing subsequence that we have seen so far, and its last element is the highest value in this sequence.

So in this context, “any” doesn’t mean we compare num to all elements in sub, but specifically to the last element in sub. Since sub is kept sorted, being greater than its last element is the same as being greater than every element in it, so the two readings agree. If num is greater, it means we can extend our current longest sequence, so we append it to sub.

“We need a way to transition between states.” This summary is very important for solving DP problems.

Absolutely, you’re correct. One of the fundamental components of dynamic programming (DP) is the ability to define the relationship between different states, often referred to as the “state transition.”

In the context of DP, a “state” generally represents a subproblem, and the “transition” is how we go from one state (subproblem) to another. It essentially describes the way a solution to a given state is formed using the solutions from other states. Formally defining this relationship often helps in understanding the problem more clearly and implementing the DP solution effectively.

The state transition rule is a crucial part of many DP problems, because it describes how to avoid redundant work and solve each subproblem only once. As you’ve rightly noted, being able to identify and articulate this transition is a key step in mastering dynamic programming.

I tried solving with Longest common subsequence between array nums and sorted(nums) at O(n^2)

Your intuition is actually very good! The problem of finding the longest increasing subsequence (LIS) can be transformed into finding the longest common subsequence (LCS) between the array and a sorted copy of the array. Because the subsequence must be strictly increasing, the sorted copy should have duplicates removed first; the length of the LIS then equals the length of that LCS.

Here is Python code implementing this idea:

class Solution:
    def lengthOfLIS(self, nums: List[int]) -> int:
        sorted_nums = sorted(set(nums))  # remove duplicates and sort
        m, n = len(nums), len(sorted_nums)
        dp = [[0]*(n+1) for _ in range(m+1)]
        
        for i in range(1, m+1):
            for j in range(1, n+1):
                if nums[i-1] == sorted_nums[j-1]:
                    dp[i][j] = 1 + dp[i-1][j-1]
                else:
                    dp[i][j] = max(dp[i-1][j], dp[i][j-1])
                    
        return dp[m][n]

This code first removes duplicates from the sorted list because the longest increasing subsequence should not have duplicates. Then it creates a 2D DP table where dp[i][j] stores the length of the longest common subsequence between the first ‘i’ numbers in ’nums’ and the first ‘j’ numbers in ‘sorted_nums’. It then fills in this table in a bottom-up fashion, comparing numbers from the original list and the sorted list.

While this approach is correct, its time complexity is O(n^2), which might be prohibitive for large inputs. The binary search approach can solve this problem in O(n log n) time, which is faster for large inputs. It is also more likely to be the approach your interviewer is looking for if asked in an interview setting.

In other words, while the LCS approach works and is a great demonstration of how different problems can be transformed and solved using similar techniques, there is a more efficient way to solve this specific problem.

For each element before i, check if that element is smaller than nums[i]. If so, set dp[i] = max(dp[i], dp[j] + 1). // I don’t understand the dp[j] + 1 aspect: why would we increase the value in the array? Can someone please let me know?

The reason we do dp[i] = max(dp[i], dp[j] + 1) is because we’re trying to form the longest increasing subsequence. In the dynamic programming array dp, dp[i] represents the length of the longest increasing subsequence ending at index i.

When we’re at index i, we want to check all previous indices j (where j < i) to see if nums[j] < nums[i]. If so, we can append nums[i] to the increasing subsequence ending at nums[j], effectively creating a new increasing subsequence. When we append nums[i] to the existing subsequence, the length of the subsequence increases by 1. That’s why we do dp[j] + 1.

However, there might be multiple such j indices that meet the condition, and appending nums[i] to different subsequences can result in sequences of different lengths. We want to ensure we’re always choosing the longest one. Therefore, we take the maximum of the current value of dp[i] and dp[j] + 1 to always keep the length of the longest possible increasing subsequence ending at i.

For example, consider the array nums = [1, 3, 2, 5] and let’s say we’re currently at index i = 3 (nums[i] = 5). For all j < i, we check if nums[j] < nums[i]. All of nums[0] = 1, nums[1] = 3, and nums[2] = 2 are less than 5, so we have three potential sequences: [1, 5], [1, 3, 5], and [1, 2, 5]. The longest one is [1, 3, 5], which is what we obtain by doing dp[i] = max(dp[i], dp[j] + 1).

It can be solved in O(1) space and O(n log n) time, as in approach 3, by reusing the already traversed part of nums instead of a new sub-array.

from bisect import bisect_left

def lengthOfLIS(nums):
    i = -1  # index of the last "tail" stored at the front of nums
    for num in nums:
        # reuse nums[0..i] as the sorted tails array
        index = bisect_left(nums, num, 0, i + 1)
        if index == i + 1:
            i += 1
        nums[index] = num
    return i + 1

Yes, you are correct! This approach still maintains the core principle of the binary search solution, but instead of creating a new sub array, it leverages the original array nums to store the “tails” of the sub-sequences.

The variable i is keeping track of the longest increasing subsequence so far. For each num in nums, we use the binary search function bisect_left to find where it fits in the sequence. If the index returned by bisect_left equals i + 1, it means num is larger than all elements in the current longest subsequence, so we increment i by 1.

Then, regardless of whether num extended the sequence or not, we replace nums[index] with num. This operation either extends the subsequence (when num is larger than any element in the current subsequence) or it might decrease the last element of an existing subsequence, which could potentially allow future elements to extend the subsequence.

Although the modification is in place and we don’t need extra space, it’s important to note that this doesn’t preserve the original nums array, which might or might not be a concern depending on the problem context.

Finally, the time complexity remains O(n log n) because for each element in nums, we perform a binary search, and the space complexity is indeed O(1) as the original array is being reused, and we’re not using any additional space proportional to the input size.

The auxiliary space is unnecessary.

import bisect
from typing import List

class Solution:
    def lengthOfLIS(self, nums: List[int]) -> int:
        n = len(nums)
        slow = 0   # nums[0..slow] is reused as the sorted "tails" array
        fast = 1
        while fast < n:
            if nums[fast] > nums[slow]:
                slow += 1
                nums[slow] = nums[fast]            # extend the tails array
            elif nums[fast] < nums[slow]:
                i = bisect.bisect_left(nums, nums[fast], hi=slow + 1)
                nums[i] = nums[fast]               # lower an existing tail
            fast += 1

        return slow + 1

Your approach is quite interesting! You’re using two pointers (slow and fast) and moving through the nums array. You’ve also integrated binary search into your algorithm, which can improve its time complexity to O(n log n). Here’s how your code works:

  1. You initialize slow and fast pointers, where slow represents the last element of the current longest increasing subsequence (LIS), and fast scans through the nums array.

  2. For every nums[fast]:

    • If nums[fast] is larger than nums[slow], it means nums[fast] can extend the current LIS. Therefore, you increment slow and update nums[slow] to nums[fast].
    • If nums[fast] is smaller than nums[slow], it means nums[fast] might potentially start a longer LIS in the future. So, you find the appropriate place to insert nums[fast] in the LIS using bisect_left (which returns the index where to insert nums[fast] to maintain sorted order) and replace the existing element at that index with nums[fast].
  3. After the fast pointer has scanned all elements in nums, slow + 1 gives the length of the LIS.

This approach indeed avoids additional space usage as you are not creating a separate array for LIS. However, it modifies the original nums array. If maintaining the original array’s integrity is crucial, this could be a drawback. But for problems where it’s not a concern, this approach works efficiently and is a clever way to reduce space usage.

Upgrade to Solution 1: using a Sparse Table to find the first minimum element between 0 and i, reducing time and space to O(n log n) on average (the worst case is still O(n^2)).

import math
from typing import List

class Solution:
    def lengthOfLIS(self, nums: List[int]) -> int:
        if len(nums) == 1:
            return 1
        ### sparse table: O(n log n) time & space
        cols = math.ceil(math.log2(len(nums))) + 1
        sp = [[float('inf') for _ in range(cols)] for _ in range(len(nums))]

        # building the sparse table; afterwards the minimum of any range [l, r]
        # can be answered in O(1), and the build itself is done once in O(n log n)
        for i in range(len(sp)):
            sp[i][0] = nums[i]
        for j in range(1, cols):
            i = 0
            while i + 2**j - 1 < len(nums):
                sp[i][j] = min(sp[i][j-1], sp[i + 2**(j-1)][j-1])
                i += 1

        # O(1) - get the minimum of nums[l..r] (inclusive)
        def get_min(l, r):
            k = (r - l + 1).bit_length() - 1   # floor(log2(range length))
            return min(sp[l][k], sp[r - 2**k + 1][k])

        dp = [1] * len(nums)
        for i in range(len(nums)):
            # binary search for the first index idx such that nums[idx..i-1]
            # contains an element smaller than nums[i]
            idx = 0
            l = 0
            r = i - 1
            # log(n)
            while l < r:
                mid = (l + r) // 2
                if nums[i] > get_min(l, mid):
                    idx = mid
                    r = mid
                else:
                    l = mid + 1
            for j in range(idx, i):
                if nums[i] > nums[j]:
                    dp[i] = max(dp[j] + 1, dp[i])
        return max(dp)

Your solution is innovative and more optimized in terms of time complexity. The idea is to construct a Sparse Table which can answer queries about the minimum element in a range [l, r] in O(1) time after pre-processing. Here’s a more detailed explanation:

  1. Initialize the Sparse Table (sp), which is a 2D matrix. Every element is set to infinity, and the size of the table is n x log(n).

  2. Fill the Sparse Table using a bottom-up approach. The base case is when the range length is 1, where the minimum is the number itself. For ranges with length greater than 1, the minimum is calculated as the smaller value between the two overlapping sub-ranges of half the length.

  3. The get_min(l, r) function is used to get the minimum value in the range [l, r]. It uses the Sparse Table to do this in O(1) time.

  4. For each number in nums, you perform a binary search to find the index of the first number that is less than the current number using the get_min(l, r) function. This takes O(log n) time.

  5. Update the dp array with the maximum length of the increasing subsequence ending at i.

  6. The final answer is the maximum value in the dp array.

Your solution does a great job optimizing the time complexity to O(n log n) on average, with a worst-case time complexity of O(n^2) due to the loop that updates the dp array. The space complexity is also O(n log n) due to the Sparse Table. However, it might be a bit complex to explain during an interview, and the Sparse Table concept could be unfamiliar to some interviewers. Make sure you can explain the Sparse Table construction and query process clearly if you choose to use this solution.

Explanation of how to understand if this is a dp problem and how to solve it after that is a really helpful way to learn dp. Thanks!

Dynamic Programming (DP) can indeed be a bit daunting at first, but once you get the hang of it, it becomes a powerful tool to solve complex problems.

In general, you can identify a problem as a potential DP problem if it has two key properties:

  1. Optimal Substructure: This means that an optimal solution to the problem can be constructed efficiently from optimal solutions of its subproblems. In other words, if you have the optimal solutions to the smaller problems, you can use them to find the optimal solution to the larger problem. In the case of the Longest Increasing Subsequence (LIS) problem, we used solutions to smaller LIS problems (those ending at each index before the current one) to build up the solution to the larger problem.

  2. Overlapping Subproblems: This is when a recursive algorithm would visit the same subproblems repeatedly. In such cases, DP is used to store the results of those subproblems in a table so that each one is only solved or computed once. For the LIS problem, if we used a naive recursive approach, we would end up solving the same subproblem (of finding the LIS ending at a certain index) multiple times. By using DP, we avoid this redundant computation by storing the result the first time we compute it.

Understanding these concepts can make it easier to recognize and solve DP problems in the future. Remember, solving DP problems often involves defining the state (what you need to remember), the base case (the simplest subproblem you can solve directly), the transition (how to go from one state to another, or how to use the solutions to smaller subproblems to solve the current one), and the order of iteration (in what order to solve the subproblems).

Keep practicing, and you’ll continue to get better at these types of problems!

For each element before i, check if that element is smaller than nums[i]. If so, set dp[i] = max(dp[i], dp[j] + 1). // I don’t understand the dp[j] + 1 aspect, because my thinking is that by adding dp[j] + 1, we are introducing a value that wasn’t there.

Your confusion is understandable. Let’s clarify what’s going on here.

In the context of this problem, dp[i] represents the length of the longest increasing subsequence (LIS) ending at position i. When we are at a given index i, we consider all indices j before i. If nums[j] < nums[i], it means the sequence can be extended by nums[i] after nums[j].

Remember, dp[j] itself is the length of the longest increasing subsequence ending at j. If nums[i] is greater than nums[j], then we can append nums[i] to the sequence ending at j, thus increasing the length of this sequence by 1. Hence, the new length of the LIS ending at i could potentially be dp[j] + 1.

In other words, dp[j] + 1 represents the length of a new potential sequence ending at i, created by extending the LIS that ended at j with nums[i].

But, we’re not just automatically setting dp[i] = dp[j] + 1. Instead, we’re using the max function because we’re interested in the longest possible subsequence. So, we only update dp[i] with dp[j] + 1 if dp[j] + 1 is greater than the current dp[i].

This is the beauty of dynamic programming: you’re using solutions to smaller subproblems (the longest increasing subsequences ending at each position j before i) to solve the current subproblem (the longest increasing subsequence ending at i).

This is Exercise 15.4-6, Page 397 from CLRS. C++ implementations:

class Solution {
private:
    int find(const vector<int>& c, int right, int x) { // binary search (const reference avoids copying the vector)
        int left = 0, mid;

        while (left <= right) {
            mid = left + (right - left) / 2; // mitigates integer overflows
            if (x > c[mid])
                left = mid + 1;
            else if (x < c[mid])
                right = mid - 1;
            else
                return mid;
        }

        return left;
    }
public:
    int lengthOfLIS(vector<int>& nums) {
        int best = 0, j;
        vector<int> c(nums.size() + 1);
        c[0] = -10001; // sentinel smaller than any value, since nums[i] >= -10^4

        for (int num: nums) {
            j = find(c, best, num);
            c[j] = num;
            if (j > best) best = j;
        }

        return best;
    }

    int lengthOfLIS2(vector<int>& nums) {
        if (nums.size() == 1) return 1;

        vector<int> dp;

        for (int x : nums)
            // add elements in strictly increasing order
            if (dp.empty() || dp.back() < x) dp.push_back(x);

            // find index of smallest number >= x and replace that number with x
            else *lower_bound(dp.begin(), dp.end(), x) = x; // binary search

        return dp.size();
    }
};

Your C++ code is correct and implements the same algorithm that we’ve been discussing. It determines the length of the Longest Increasing Subsequence (LIS) in an array using binary search, which has a time complexity of O(n log n).

Here are a few additional explanations about your code:

  1. find() function: This function implements binary search on the array c. It searches for the index at which x can be placed such that the array remains sorted. If x is already present, it returns the index of x.

  2. lengthOfLIS() function: Here, you initialize c with an array size of nums.size() + 1 and c[0] with -10001 which is a smaller value than any value in nums. Then you iterate over nums and for each number, you find the index j at which it should be placed in c to maintain the sorted order. If j > best, then you update best with j.

  3. lengthOfLIS2() function: This function implements a slightly different approach, it uses std::lower_bound() to find the position at which x can be placed in dp. If dp is empty or x is greater than the last element of dp, you push x into dp. Otherwise, you replace the smallest number that is greater than or equal to x in dp with x.

In lengthOfLIS, the answer is the final value of best (the highest index at which a tail was placed after the sentinel); in lengthOfLIS2, it is the size of the dp vector.

Can someone explain and provide an example why we bother with insertions in the middle of the array? I get why you would always want to replace the last element with a better alternative (so that it’s easier to append a bigger number later). But if you only append an element after comparing to the last element, why bother optimizing the middle elements? If you could avoid replacing the middle elements you could recover the actual sub array, which you can’t when you update the middle elements.

Great question! This is a subtlety in the algorithm that might not be immediately clear. It actually has to do with the potential for future numbers in the array.

When we insert in the middle, we’re saying that we’ve found a smaller number that could potentially create a longer increasing subsequence in the future. In other words, replacing the higher number with the smaller one increases our chances of finding a longer subsequence.

Let’s take an example where the replacement actually matters. Suppose we have the following sequence:

[1, 3, 5, 2, 3, 4, 5]

And our current array is [1, 3, 5]. When we see the 2, it’s not greater than the last element, 5, so it cannot be appended at the end. Instead it replaces the 3 (the first element that is greater than or equal to 2), giving [1, 2, 5].

Then the next 3 replaces the 5, giving [1, 2, 3]. Now the 4 is greater than the last element, so we append it and get [1, 2, 3, 4], and the final 5 extends it to [1, 2, 3, 4, 5]. The reported length is 5, and [1, 2, 3, 4, 5] is indeed the longest increasing subsequence of the input.

Now consider what happens if we refuse to touch the middle and only ever compare against the last element. The array would stay [1, 3, 5]: the 2, the 3, the 4, and the final 5 are all less than or equal to 5, so nothing could ever be appended, and we would report a length of 3 instead of 5. Replacing elements in the middle is what lets the array gradually turn into the tails of a new, eventually longer, run.

So, replacing elements in the middle of the array gives us the flexibility to find the length of the longest increasing subsequence, even though it means we cannot read the actual subsequence off the array. If recovering the actual subsequence is required, a different approach (for example, tracking predecessors) would be needed.
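A quick throwaway trace (assuming the usual bisect_left implementation) shows the middle replacements at work for that example:

from bisect import bisect_left

def trace_tails(nums):
    tails = []
    for num in nums:
        pos = bisect_left(tails, num)
        if pos == len(tails):
            tails.append(num)
        else:
            tails[pos] = num
        print(num, tails)
    return len(tails)

trace_tails([1, 3, 5, 2, 3, 4, 5])
# 1 [1]
# 3 [1, 3]
# 5 [1, 3, 5]
# 2 [1, 2, 5]
# 3 [1, 2, 3]
# 4 [1, 2, 3, 4]
# 5 [1, 2, 3, 4, 5]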

Sure, let’s apply the binary search approach for the given list of numbers nums = [1, 10, 11, 12, 5, 6, 7, 8].

The dp array represents the smallest tail elements for all increasing subsequences of the given list nums. It starts as an empty list.

We start with the first element, 1, and append it to the dp array. So the dp array becomes: [1].

Then we move on to the next element, 10, which is greater than the last element in the dp array, so we append 10 to the dp array. Now dp becomes: [1, 10].

We repeat this process for 11 and 12, so the dp array becomes: [1, 10, 11, 12].

The next number in the list is 5, which is not greater than the last element of the dp array (12). Here, we use binary search to find the correct position to replace in the dp array. The position for 5 should be 1 (considering 0-indexing), since 5 is greater than 1 (at 0-index) and less than 10 (at 1-index). So 5 replaces 10, and the dp array becomes: [1, 5, 11, 12].

We continue this process for 6, 7, and 8, ending up with the dp array: [1, 5, 6, 7, 8].

Note that the dp array is not necessarily one of the longest increasing subsequences, but the length of the dp array gives us the length of the longest increasing subsequence in the list. In this case, the longest increasing subsequence is of length 5.

Please also note that, in general, an array can have more than one longest increasing subsequence of the same length; this method only reports the length, not any particular subsequence.

While the optimal solution is simple to implement, it is not intuitive enough, at least for me. There is a simpler way to think about it. For each element, you can build a sorted list of all previous ones. You need to do an insert, like in insertion sort, so that you can get the current length of the current subsequence. The issue is that insertion is O(n), but if you use a binary search tree, you make it O(log n). This is not better than the 3rd one, but to me it’s easier to think of. It’s harder to implement, but the idea explains itself.

Your intuition is absolutely correct! Your approach is essentially a slight variant of the optimal solution.

Here is how your approach would work:

  1. Initialize an empty list or binary search tree (BST) to hold the elements of the longest increasing subsequence (LIS).

  2. For each element in the input list, use the BST to find the correct position to insert the element. A balanced BST keeps the elements in sorted order and allows for binary search, reducing the time complexity of insertion from O(n) to O(log n).

  3. Insert the element into the BST at the appropriate position. If the element is larger than all elements in the BST, add it to the end. If not, replace the first element in the BST that is larger than or equal to the current element. This ensures that the BST always contains the smallest tail elements of all possible increasing subsequences.

  4. The size of the BST at any time gives the length of the LIS up to that point.

The resulting time complexity is indeed O(n log n), the same as the optimal solution. This is because for each of the n elements, we perform a search and an insert or replace, both of which take O(log n) time in a balanced BST. However, keep in mind that the actual implementation of this BST-based approach could be more complex than the array-based approach.

Regarding intuition, you can think of it like this: for each element, we are trying to extend the longest subsequence we can. However, if we cannot, we try to improve our subsequences so far, by replacing the first element larger than the current one. This keeps our subsequences optimal, as they have the smallest possible tail elements, making it easier to extend them in the future. This approach is somewhat greedy but works in this context.

Python example of dp, slightly different from the accepted one, with comments added for intuition on the solution.

class Solution:
    def lengthOfLIS(self, nums: List[int]) -> int:
        # we have two decisions at each nums value:
        # add it to a subsequence or don't add it
        # we can solve all the subproblems and
        # take the maximum subsequence length found
        dp = [1] * len(nums)
        max_len = 1
        # iterate through all the possible ending positions, the subproblems
        for i in range(1, len(nums)):
            # scan every element before i, extending subsequences that end in a smaller value
            for j in range(i):
                i_val = nums[i]
                j_val = nums[j]
                # use the dp table to decide whether nums[i] can extend the subsequence ending at j
                if i_val > j_val:
                    # nums[i] is greater, so it can extend that subsequence;
                    # keep the longest length found for position i
                    dp[i] = max(dp[i], dp[j] + 1)
                    if dp[i] > max_len:
                        max_len = dp[i]
        # return the maximum subsequence length over all ending positions
        return max_len

Thank you for sharing your version of the dynamic programming solution for this problem. It is quite clear and well explained. Here’s a more detailed breakdown of what’s happening in the code:

  1. Initialize a list dp of length len(nums), and set all elements to 1. This list represents the longest increasing subsequence (LIS) ending at each index. Initially, we assume the LIS is only the number itself, hence all elements in dp are 1.

  2. Initialize a variable max_len to keep track of the maximum length of the LIS found so far.

  3. Iterate over the list of numbers. For each number nums[i] (where i ranges from 1 to len(nums) - 1), we compare it with all the numbers before it.

  4. For each comparison with a number nums[j] (where j ranges from 0 to i - 1), if nums[i] is greater than nums[j], it means nums[i] can extend the LIS ending at nums[j]. Therefore, we update dp[i] to be the maximum of its current value and dp[j] + 1.

  5. If dp[i] is greater than max_len, update max_len to be dp[i].

  6. Finally, return max_len, which is the length of the longest increasing subsequence in the input list.

This solution has a time complexity of O(n^2) due to the two nested loops, where n is the length of the input list. The space complexity is O(n), used by the dp list.

Overall, your code does an excellent job of solving the problem!

I understand the second solution in this way. Imagine you are arranging the queue in ascending order. When a new guy i comes, if he is greater than some previous guys, just branch the queue at the first previous guy j who is greater than i and make a new queue. The two queues share the same prefix. Now this branched queue’s length is not greater than the old one, but it gives the following new guys that are only greater than i but less than j the possibility to be added to the branched queue, and so on. As for replacing guy j with guy i, it is just to maintain a single array instead of two or three, since what we care about is the length: whose length is the longest. Unless there is a new guy greater than the last guy in the old queue, the length of the old queue will not change. In that scenario (a new guy greater than the last guy of the old queue), he is added to the old queue, making the length greater by 1.

Your understanding of the problem is mostly correct and provides a nice analogy. Let’s refine it a bit:

This problem can be visualized as arranging people in a queue based on their height in ascending order. Each person (or number in the array) comes one by one, and you are trying to find the longest queue that maintains this ascending property.

  1. When a new person i comes, if they are taller than everyone already in the queue (their height is greater than all the previous numbers), they stand at the end of the queue. This scenario extends the queue by one.

  2. If the new person is not the tallest (i.e., nums[i] is not the largest number so far), they can potentially start a new queue or fit into an existing one. Specifically, we replace the first person in the queue who is at least as tall as i (the first element greater than or equal to nums[i]) with i. This is equivalent to maintaining the tails array in the second solution, where tails[i] is the smallest tail of all increasing subsequences of length i+1.

    Why replace instead of creating a new queue? Because we’re only interested in the lengths of the queues (or the length of the longest increasing subsequence), not the actual people (numbers) in them. Replacing doesn’t change the length of the queue, but it provides an opportunity for shorter people (smaller numbers) coming next to extend this queue.

This analogy makes it easier to understand why the second solution works and why it is an improvement over the simple dynamic programming solution. It minimizes the operations needed by keeping track of the smallest possible ending number (i) for all increasing subsequences of the same length.

Another way to develop an intuition about the 2nd approach, to “feel” why it’s correct: we try to build the longest sequence, but we don’t really care about the sequence itself, only its length. So in a way, we are free to reuse the same array by building alternative sequences right inside it. In theory, we could maintain multiple arrays for all possible sequences, but since we only care about the longest one, and only about its length, they can all overwrite each other inside a single array.

Yes, exactly! You’ve articulated it well. In the second approach, we’re using a single array tails to keep track of multiple potential increasing subsequences, but the key insight is that we’re only interested in the length of the longest one, not the actual sequences themselves.

This is why we replace the smallest number tails[i] that is larger than our current number nums[j] rather than keeping separate arrays for all potential subsequences. By keeping tails sorted at all times, we ensure that we are always working with the potential longest subsequence at any point in time.

This optimization reduces the space complexity from O(N^2) (where N is the size of the input array), which would be the case if we kept all potential subsequences, down to O(N). At the same time, it enables us to use binary search for each update, reducing the time complexity from O(N^2) to O(N log N).

Let’s say we have another problem with exactly the same description, but this time we want to return any one of those longest increasing subsequences. What would be the optimal approach? The current one won’t work because it does not always construct valid subsequences; only the size always reflects the longest length.

Indeed, the approach described in this thread would not give you a longest increasing subsequence (LIS) itself, just its length. If you want an actual longest increasing subsequence, you will have to adjust the approach slightly.

The idea is to maintain an array of increasing subsequences (like before), but also to keep track of a parent pointer array that will help you to construct the actual longest increasing subsequence. Here is a python approach with O(n log n) time complexity:

from typing import List
import bisect

def lengthOfLIS(nums: List[int]) -> List[int]:
    if not nums:  # guard against empty input
        return []

    tails = [0] * len(nums)
    parents = [-1] * len(nums)  # Parent pointers
    size = 0

    # Subsequences ending at each position
    subsequences = [0] * len(nums)

    for i, num in enumerate(nums):
        # Binary search
        j = bisect.bisect_left(tails, num, 0, size)
        tails[j] = num
        # Update parents
        parents[i] = subsequences[j - 1] if j > 0 else -1
        # Update subsequences
        subsequences[j] = i

        # Check if we found a new longest subsequence
        if j == size:
            size += 1

    # Reconstruct the longest increasing subsequence
    k = subsequences[size - 1]
    lis = []
    while k >= 0:
        lis.append(nums[k])
        k = parents[k]

    return lis[::-1]  # Reverse the LIS

This solution maintains an array tails to track the potential end element of the longest subsequence found so far. The parents array is used to store the index of the predecessor element for nums[i] in its longest subsequence. The subsequences array holds the index in nums of the smallest tail element for all increasing subsequences of length i+1.

Finally, to generate one of the longest increasing subsequences, the code starts from the last element added to tails and works its way backwards using the parents pointers. This generates the longest increasing subsequence in reverse, which is then reversed again to give the final answer.
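As a quick sanity check (a sketch only; when several subsequences have the same length, the one returned can differ), running the function above on the familiar example might look like this:

print(lengthOfLIS([10, 9, 2, 5, 3, 7, 101, 18]))
# one valid answer: [2, 3, 7, 18]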

Isn’t brute force O(N^2)? Loop through every subarray and find the longest increasing subsequence.

The brute force approach would involve generating all possible subsequences of the given array, checking each one to see if it’s increasing, and keeping track of the longest. This is actually exponential, not O(N^2): there are 2^N possible subsequences of an array (including the empty subsequence and the array itself), and checking each one takes up to O(N) time, so the total work is roughly O(N * 2^N).

Each subsequence can be represented by a binary string of length N, where the i-th digit is 1 if the i-th element is included in the subsequence, and 0 if it isn’t. For an array of length N, there are 2^N such binary strings.

For example, with the array [1, 2, 3], we have 2^3 = 8 possible subsequences:

  • []
  • [1]
  • [2]
  • [3]
  • [1, 2]
  • [1, 3]
  • [2, 3]
  • [1, 2, 3]

So you can see why this brute force approach quickly becomes infeasible for large inputs, due to its exponential time complexity. In comparison, the dynamic programming solution is much more efficient with a time complexity of O(N^2), and the optimized solution has a time complexity of O(N log N).
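For concreteness, here is a minimal brute-force sketch using bitmask enumeration (exactly the 2^N subsequences described above); the function name lis_brute_force is just for illustration, and this is only practical for very small inputs:

def lis_brute_force(nums):
    n = len(nums)
    best = 0
    # each bitmask encodes one of the 2^n subsequences
    for mask in range(1 << n):
        seq = [nums[i] for i in range(n) if mask & (1 << i)]
        # keep it only if it is strictly increasing
        if all(a < b for a, b in zip(seq, seq[1:])):
            best = max(best, len(seq))
    return best

print(lis_brute_force([10, 9, 2, 5, 3, 7, 101, 18]))   # 4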

I have a question about the time complexity of the method with binary search. I think the time complexity should be N (loop through all the elements) * log(N) (do the binary search) * N (insertion into a C++ vector at some position, which moves all the elements after the inserted one, up to N of them).

Your understanding is correct if we consider the time complexity of insertion in a dynamic array or C++ vector. However, the method utilizing binary search in the Longest Increasing Subsequence problem does not actually perform the physical insertions.

In the binary search solution, it maintains a vector (let’s call it tails) where tails[i] is the smallest tail element of all increasing subsequences of length i+1 found so far. It doesn’t hold the actual subsequences, just the smallest tail elements.

When it iterates over the input array, it tries to find a position in tails to replace with the current element. This replacement does not require shifting elements in the array, which would indeed be an O(N) operation. Instead, it directly overwrites the value, which is an O(1) operation.

Hence, the overall time complexity of the method is O(N log N), because for each of the N elements in the input array, a binary search operation taking O(log N) time is performed, and then an O(1) replacement operation. So it does not have the extra factor of N you suggested.
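As a small illustration (the tails values below are just for demonstration): replacing an element in tails is a plain index assignment, whereas an insertion would shift the rest of the list:

import bisect

tails = [2, 3, 7, 101]
num = 18

j = bisect.bisect_left(tails, num)   # O(log N) binary search
tails[j] = num                       # O(1) overwrite, no elements are shifted
# bisect.insort(tails, num) would instead shift every element after j, an O(N) step
print(tails)                         # [2, 3, 7, 18]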

If the input is nums = [10,9,2,4,5,3,7,101,18], the resulting sub is [2,3,5,7,18], but this is not the subsequence we are looking for; we were looking for [2,4,5,7,18], even though the final result (the length) is correct. What if the question asked us to return the subsequence itself, what should we do?

Indeed, the optimized approach using dynamic programming and binary search does not directly track the actual longest increasing subsequence. It just keeps track of possible increasing subsequences and yields the length of the longest one.

If the problem were to return the actual longest increasing subsequence, the optimized approach wouldn’t directly work. We’d need to modify our approach slightly to keep track of the actual subsequences.

One approach could involve keeping track of the parent pointers. Here’s an example of how you could modify the dynamic programming approach:

def lengthOfLIS(nums):
    if not nums:
        return []
    
    # Initialize parents array and dp array
    parents = [0]*len(nums)
    dp = [1]*len(nums)

    max_length = 1
    max_length_index = 0

    # Build dp array and keep track of parent pointers
    for i in range(len(nums)):
        parents[i] = i
        for j in range(i):
            if nums[i] > nums[j] and dp[i] < dp[j] + 1:
                dp[i] = dp[j] + 1
                parents[i] = j
                if dp[i] > max_length:
                    max_length = dp[i]
                    max_length_index = i

    # Reconstruct the longest increasing subsequence
    LIS = []
    while len(LIS) < max_length:
        LIS.insert(0, nums[max_length_index])
        max_length_index = parents[max_length_index]

    return LIS

In this code, the parents list keeps track of the previous index for each position in its longest increasing subsequence. After the dp list is built, the longest increasing subsequence is reconstructed by backtracking along the parents list. Note that this approach has a time complexity of O(N^2); combining reconstruction with the binary search optimization requires the extra bookkeeping shown earlier (the tails, parents, and subsequences arrays).
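For example (a sketch: this prints one valid longest increasing subsequence, and ties can produce a different but equally long result):

print(lengthOfLIS([10, 9, 2, 5, 3, 7, 101, 18]))
# e.g. [2, 5, 7, 101]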

Can anyone tell me why we return left for the binary search?

In the binary search method, we return left because left is the position where the current number (num) should be placed according to its order in the sequence.

Let’s understand it a little deeper. Binary search in this context is used to find a position for num such that all the numbers before it are less than num and all the numbers after it are greater than or equal to num. This ensures that we’re building a longest increasing subsequence.

When the binary search ends, left will point to the smallest number which is equal to or greater than num. So, the current number (num) should replace that smallest number (nums[left]) which is not less than num, to maintain the condition of longest increasing subsequence. That’s why we return left as the correct position for num.

This approach effectively maintains the smallest tail for all increasing subsequences’ lengths, which ensures the longest possible increasing subsequence can be found.
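To make the “return left” concrete, here is a minimal sketch of the lower-bound style search described here; the helper name insertion_point is just for illustration, and tails is assumed to be the sorted array of smallest tail values:

def insertion_point(tails, num):
    # find the leftmost index whose value is >= num
    left, right = 0, len(tails)
    while left < right:
        mid = (left + right) // 2
        if tails[mid] < num:
            left = mid + 1          # mid cannot be the answer
        else:
            right = mid             # tails[mid] >= num: mid may still be the answer
    return left                     # left == right: the replacement/insertion position

print(insertion_point([2, 3, 7, 101], 5))   # 2, since 7 is the first value >= 5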

The binary search solution can easily be replaced with a TreeSet:

class Solution {
    public int lengthOfLIS(int[] nums) {
        TreeSet<Integer> set = new TreeSet<Integer>();
        set.add(nums[0]);
        for ( int i = 1 ; i < nums.length; i++ ) {
            int num = nums[i];
            if ( num > set.last() ) {
                set.add(num);
            }
            else {
                int smallestElementGreaterThanNum = set.ceiling(num);
                set.remove(smallestElementGreaterThanNum);
                set.add(num);
            }
        }
        return set.size();
    }
}

Yes, that’s correct! The TreeSet data structure in Java is a red-black tree based NavigableSet implementation. It keeps the elements in the set in sorted ascending order and provides operations like “ceiling” and “floor” in logarithmic time, similar to binary search.

In your solution, you are using the TreeSet to maintain a set of numbers in which the i-th smallest element is the smallest tail of all increasing subsequences of length i+1 found so far. This allows you to maintain the minimum tail for each length of increasing subsequence, which is the key to solving this problem.

Here’s how your code works:

  1. Initialize a TreeSet and add the first number to it.
  2. Iterate over the numbers in the input array:
    • If the current number is greater than the last number in the set, add it to the set.
    • If not, find the smallest number in the set that is greater than or equal to the current number, remove it from the set, and add the current number. This operation maintains the condition that for each i, the number at position i in the set is the smallest tail number of all increasing subsequences with length i+1.
  3. After going through all the numbers, the size of the set is the length of the longest increasing subsequence.

This solution indeed takes advantage of the properties of TreeSet and provides an elegant way to solve this problem. However, it’s worth noting that this solution works only in languages where such a data structure is available, like Java. Also, the extra remove operation adds another O(log n) step per element, but the overall time complexity remains O(n log n).

Can anyone explain, in the binary search which gives the upper_bound, why right = mid works but not right = mid - 1?

The function implemented in the binary search part of the solution isn’t strictly a classical “find an exact match” binary search, but a variant that returns the position at which to insert an element while maintaining sorted order. Despite sometimes being referred to as upper_bound, for the strictly increasing version of this problem it behaves like bisect_left or lower_bound: the returned position is that of the first element greater than or equal to the target number.

The reason right = mid works but right = mid - 1 doesn’t is that we are not certain the element at index mid can be ruled out. If nums[mid] < x, then mid is definitely not the insertion point, and we can safely move past it with left = mid + 1. But if nums[mid] >= x, mid could itself be the insertion point (if nothing to its left is also >= x), so it must stay inside the search range.

In short, right = mid works because we may still want to consider mid as a possible insertion point, while right = mid - 1 would discard a position that could be the correct answer.

With duplicates present, this variant returns the leftmost such position, which is exactly what we want for a strictly increasing subsequence: an equal value replaces the existing tail rather than extending the sequence. If you wanted the rightmost position instead (for a non-decreasing variant), you would adjust the comparison accordingly.

Is it true that for the subsequence generated, the ith (1-based index) position marks the end of an LIS with i elements?

Yes, it is true that in the binary search approach, the ith element (1-based index) of the subsequence is the smallest ending element of any increasing subsequence of length i. It is important to note that this subsequence isn’t necessarily one of the longest increasing subsequences, but a collection of potential tails for the subsequences of various lengths.

Therefore, it’s not a true subsequence in the traditional sense, but a data structure to keep track of the potential tails. This allows the algorithm to extend a potential longest subsequence whenever it encounters a new number that is larger than any number it has encountered before.

If you actually wanted to construct one of the longest increasing subsequences, you would need a different approach, since the binary search solution only keeps track of the lengths of these subsequences, not the subsequences themselves.

It could also be O(1) extra space if we maintain “sub” in place inside the input array.

Yes, it’s possible to use the input array itself to track the minimum ending values of increasing subsequences, thus achieving O(1) additional space complexity. The principle is the same - maintaining the tails of potential increasing subsequences - but it’s implemented directly on the input array.

Here’s how it could look in Python:

import bisect
from typing import List

class Solution:
    def lengthOfLIS(self, nums: List[int]) -> int:
        length = 0
        for num in nums:
            i = bisect.bisect_left(nums, num, 0, length)
            nums[i] = num
            if i == length:
                length += 1
        return length

This modifies the original nums array during processing. length is tracking the length of the longest increasing subsequence found so far. By the end of the loop, nums[:length] will hold the minimum possible ending values for all increasing subsequences of length 1, 2, ..., length.

Remember that nums[:length] does not represent an actual longest increasing subsequence in the original array, but rather a collection of the smallest end values of increasing subsequences of different lengths.

Indeed, you’ve captured the main concept well! The binary search approach to solving the longest increasing subsequence problem maintains the length of the longest subsequence found so far, but doesn’t necessarily keep the correct order of elements for that subsequence.

To expand a bit: the binary search method maintains an array of the smallest possible ending elements for all increasing subsequences of lengths 1 through n encountered so far. We iterate through the input array, and for each element, we find a position in our array to replace an existing value with the current one (if possible). If the current element is larger than all elements in our array, we append it, extending our longest subsequence.

The clever part is that we use binary search to quickly find the correct position to replace. This keeps the method efficient, but as a trade-off, the array of smallest possible ending elements isn’t a valid subsequence in the original input.

The key understanding is that while we’re not preserving the actual longest increasing subsequence, we are preserving its length, which is the information asked for by the problem.

Implemented with a TreeSet

  public int lengthOfLIS(int[] nums) {
    TreeSet<Integer> dp = new TreeSet<>();
    dp.add(nums[0]);
    for(int i = 1; i < nums.length; i++) {
      if(nums[i] > dp.last()) {
        dp.add(nums[i]);
      } else {
        Integer replace = dp.ceiling(nums[i]);
        dp.remove(replace);
        dp.add(nums[i]);
      }
    }
    return dp.size();
  }

The implementation you’ve provided here in Java utilizes a TreeSet, which is a self-balancing binary search tree. This allows for efficient addition, removal, and ceiling operations, each taking approximately O(log n) time. This makes your solution run in approximately O(n log n) time overall, which is very efficient for this problem.

Here’s a brief rundown of what your implementation does:

  1. Initialize the TreeSet dp and add the first element of nums into dp.
  2. Iterate through the remaining elements in nums.
  3. If the current element is larger than the last element in dp, add the current element to dp.
  4. If the current element is smaller or equal, remove the smallest element in dp that is larger or equal to the current element (dp.ceiling(nums[i])), then add the current element into dp. This step essentially replaces the smallest larger number in dp with the current number, ensuring that the elements in dp are in increasing order, and dp is always as long as possible.
  5. Return the size of dp, which represents the length of the longest increasing subsequence.

The intuition behind this solution is similar to the binary search solution, where dp keeps track of the potential end elements of all possible LIS’s encountered so far. It makes use of TreeSet’s self-balancing binary search tree properties to perform efficient additions, removals and searches, thus making the algorithm very efficient.

I think I can explain why the Greedy with Binary Search solution really works. It works because the array is storing at index i, the smallest number that ends a valid increasing subsequence of length i + 1. That’s all that matters since keeping track of the smallest number that ends a valid increasing subsequence opens up the greatest possibilities for extending the subsequence.

Using the same example provided above, let’s run nums = [2, 6, 8, 3, 4, 5, 1] again. Pick the first element: sub = [2].

6 is greater than the previous number, so sub = [2, 6]; 6 extends the longest subsequence, and the minimum an incoming number must exceed to extend a subsequence of length 2 is now 6. 8 is greater than the previous number, so sub = [2, 6, 8]; the minimum needed to extend a subsequence of length 3 is now 8. 3 is less than the previous number, so we can’t extend sub. We find the smallest number >= 3 in sub, which is 6, and overwrite it: sub = [2, 3, 8]. Since [2, 3] is a valid increasing subsequence, we have lowered the smallest number that ends a valid increasing subsequence of length 2 to 3. 4 is larger than 3, the current tail for length 2, so it can end an increasing subsequence of length 3; since [2, 3, 4] is valid, we lower the smallest number that ends a subsequence of length 3 to 4, giving sub = [2, 3, 4]. 5 is greater than 4, so sub = [2, 3, 4, 5]. 1 is not greater than any element in sub, so we overwrite 2 with 1, since [1] is an increasing subsequence of length 1 with a smaller tail than [2]; sub = [1, 3, 4, 5]. Finally, the length of the longest increasing subsequence = len(sub) = 4.

Yes, your understanding of the Greedy with Binary Search approach for finding the longest increasing subsequence (LIS) is absolutely correct! This approach efficiently maintains a list of potential candidates for extending the LIS.

The key insight is that at any point during the algorithm, the sub list at index i maintains the smallest tail element of all increasing subsequences of length i+1 found so far.

This approach cleverly maintains the “smallest possible tail” for all increasing subsequences of a certain length. That’s why when we encounter a new number from the original list, we replace the first number in sub that is greater or equal to it. This ensures that sub continues to maintain the smallest possible tails, which opens up more possibilities for future numbers to extend the subsequences.

This is a smart way of keeping track of all promising subsequences at once, without needing to remember the entire subsequences themselves. It gives us the length of the longest increasing subsequence, without necessarily providing the exact sequence. This approach achieves an optimal time complexity of O(N log N) which is a significant improvement over the naive O(N^2) dynamic programming solution.

Your walkthrough with the provided example illustrates these points perfectly. It’s always good to thoroughly understand not just how an algorithm works, but why it works, as this deepens your understanding and can help you come up with similar solutions to other problems in the future.
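Here is a small sketch that reproduces the trace above by printing sub after each element is processed (the function name lis_with_trace is just for illustration):

from bisect import bisect_left

def lis_with_trace(nums):
    sub = []
    for num in nums:
        if not sub or num > sub[-1]:
            sub.append(num)                    # extend the longest subsequence
        else:
            sub[bisect_left(sub, num)] = num   # lower the smallest tail of some length
        print(num, sub)
    return len(sub)

print(lis_with_trace([2, 6, 8, 3, 4, 5, 1]))   # 4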

I was only able to solve this with the DP solution. Please check my solution.

class Solution {
public:
    int lengthOfLIS(vector<int>& nums) {
       vector<int> temp(nums.size(),1);
        int len = nums.size();
        int ans=1;
        
        for(int i=0;i<len-1;i++){
            for(int j=i+1;j<len;j++){
                if(nums[i]<nums[j] && temp[i]>=temp[j]){
                    temp[j] = temp[i]+1;
                    ans = max(ans,temp[j]);
                }
            }
        }
        
        return ans;
    }
};

Your solution is a standard implementation of dynamic programming for the longest increasing subsequence problem, and it’s correct!

In this solution, you create a temp vector that stores the length of the longest increasing subsequence ending at each index. This vector is initialized with 1s, since a single element is itself a sequence of length 1.

The outer loop traverses the array from left to right, and for each element, the inner loop checks all elements to its right. If the element from the outer loop is smaller than the element from the inner loop, and the longest increasing subsequence ending at the outer element is at least as long as the one recorded for the inner element, then the subsequence can be extended, so temp[j] is updated to temp[i] + 1.

The ans variable keeps track of the maximum length of an increasing subsequence found so far.

Your solution has a time complexity of O(n^2), where n is the size of the input array nums. This is because for each element, you’re checking all elements to its right. It’s not the most efficient solution, but it correctly solves the problem.

If you’re interested in a more efficient solution, you could look into using binary search to reduce the time complexity to O(n log n). But your understanding and implementation of the dynamic programming solution is great!

How to find the longest increasing subsequence with the greatest lexicographical order using a BIT?

In your question, you’re asking how to find the longest increasing subsequence (LIS) with the greatest lexicographical order using Binary Indexed Tree (BIT).

To achieve this, it’s important to remember that a Binary Indexed Tree (BIT), also known as a Fenwick Tree, is a data structure that can efficiently update elements and calculate prefix sums in a table of numbers.

Here is a rough sketch of how you might approach this:

  1. Sort the numbers: You’ll need to sort the numbers but still remember their original indices. You can do this by using a pair of (value, index). Then, sort the pairs.

  2. Use a Binary Indexed Tree: After sorting, iterate over the pairs in decreasing order of their values (to prioritize greater lexicographical order). For each pair, use the BIT to find the current longest subsequence that ends with an index less than the index of the current pair. Add one to this length, then use the BIT again to update the length of the longest subsequence ending at the current index.

  3. Find the maximum length: At the end, the maximum length stored in the BIT is the length of the longest increasing subsequence with the greatest lexicographical order.

Please note that this is a high-level idea and the actual implementation could be more complex depending on the specific details of your problem. Also, handling the lexicographical order might require extra care in your algorithm.
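As a concrete starting point, here is a minimal sketch of the BIT mechanics for the plain LIS length (prefix maxima over coordinate-compressed values); the lexicographic tie-breaking described above is not implemented and would need extra bookkeeping, so treat this as scaffolding rather than a full solution. The names MaxBIT and lis_length_with_bit are illustrative only:

class MaxBIT:
    def __init__(self, n):
        self.n = n
        self.tree = [0] * (n + 1)

    def update(self, i, val):
        # raise the stored maximum at position i (1-based)
        while i <= self.n:
            self.tree[i] = max(self.tree[i], val)
            i += i & (-i)

    def query(self, i):
        # maximum over positions 1..i
        best = 0
        while i > 0:
            best = max(best, self.tree[i])
            i -= i & (-i)
        return best

def lis_length_with_bit(nums):
    # coordinate-compress the values to ranks 1..m
    rank = {v: i + 1 for i, v in enumerate(sorted(set(nums)))}
    bit = MaxBIT(len(rank))
    best = 0
    for num in nums:
        r = rank[num]
        length = bit.query(r - 1) + 1   # best LIS ending on a strictly smaller value
        bit.update(r, length)
        best = max(best, length)
    return best

print(lis_length_with_bit([10, 9, 2, 5, 3, 7, 101, 18]))   # 4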

Please tell me why sorting a copy of the given array and then applying the Longest Common Subsequence length code doesn’t work here?

That’s a good question. For arrays with distinct elements, the Longest Common Subsequence (LCS) of the array and its sorted copy actually does give the length of the Longest Increasing Subsequence (LIS): any common subsequence of the array and its sorted copy must appear in increasing order, and any increasing subsequence of the array is automatically a subsequence of the sorted copy.

The reduction runs into trouble in two ways, though.

First, duplicates break it when a strictly increasing subsequence is required. Consider the array [1, 1]. Its sorted copy is also [1, 1], so the LCS has length 2, but the longest strictly increasing subsequence has length 1. To make the reduction work for strict LIS, you have to deduplicate the sorted copy first.

Second, efficiency. The standard LCS dynamic program takes O(n^2) time and O(n^2) space, which is no better than the direct O(n^2) LIS DP and far worse than the O(n log n) binary search solution; the quadratic memory alone can cause problems on large inputs.

So the LCS-of-sorted-copy idea is conceptually valid, but it needs care with duplicates and is not competitive with the dedicated LIS algorithms.
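A minimal sketch of the duplicate pitfall, using a standard O(n^2) LCS table (the helper name lcs_length is just for illustration):

def lcs_length(a, b):
    # classic LCS dynamic program
    m, n = len(a), len(b)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m):
        for j in range(n):
            if a[i] == b[j]:
                dp[i + 1][j + 1] = dp[i][j] + 1
            else:
                dp[i + 1][j + 1] = max(dp[i][j + 1], dp[i + 1][j])
    return dp[m][n]

nums = [1, 1]
print(lcs_length(nums, sorted(nums)))        # 2: overcounts for a strictly increasing LIS
print(lcs_length(nums, sorted(set(nums))))   # 1: deduplicating the sorted copy fixes it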

My understanding of the method: this method is NOT based on the DP recurrence. Sweep through the array and maintain an array s of INCREASING numbers. For each new element x: if x is larger than s[s.size()-1], append x to s; otherwise, find the first element of s that is >= x (the one whose predecessor is smaller than x) and replace that element with x (this lookup can be done with binary search). Why is this correct? s is NOT the increasing subsequence we’re looking for; however, the length of s is the correct answer. When we replace s[i] with x, we don’t change the length of the answer, but we do improve the best candidate at that position. The idea is to make each position’s number as small as possible. The actual sequence only changes when we append a number; otherwise it’s just a “virtual change”, meaning we don’t change the current sequence, but we make each number smaller so we have a better chance of appending more numbers later.

class Solution {
public:
    int lengthOfLIS(vector<int>& nums) {
        if(nums.size()==0) return 0;
        vector<int> dp;
        dp.push_back(nums[0]);
        for(int i=1; i<nums.size(); i++)
        {
            if(nums[i]>dp.back())
            {
                dp.push_back(nums[i]);
            }
            else
            {
                int pos=lower_bound(dp.begin(), dp.end(), nums[i])-dp.begin();
                dp[pos]=nums[i];
            }
        }
        return dp.size();
    }
};

Your understanding of the method is correct. Let’s break down the approach you described:

  1. Initialize an empty sequence dp.
  2. Iterate through each number num in the input nums.
    • If num is larger than the last element in dp, you append num to dp. This step represents a situation where the current number can extend the longest increasing subsequence found so far.
    • If num is not larger than the last element in dp, you replace the first element in dp that is greater than or equal to num with num. This is achieved by using the lower_bound function, which performs a binary search to find the correct position. The replacement ensures that smaller numbers have higher priority to be included in dp as it increases the possibility of forming a longer increasing subsequence in the future.

The key insight here is that although dp is not necessarily a valid increasing subsequence of the input nums, its length always represents the length of the longest increasing subsequence found so far. The replacement operation doesn’t change the length of dp, but keeps it ready to be extended by future numbers.

Your C++ code correctly implements this approach. The time complexity is O(n log n), where n is the length of the input nums, because each element requires one binary search (lower_bound) taking O(log n) time, followed by an O(1) replacement or push_back. The space complexity is O(n), since you need to store the dp vector.

Similar Problems

Here are 10 problems that require similar problem-solving strategies or use similar underlying concepts as the problem we’ve just solved. The main concept tying these problems together is Dynamic Programming (DP), which is a method for solving complex problems by breaking them down into simpler, overlapping subproblems.

  1. Longest Palindromic Subsequence (LeetCode 516): Similar to our problem, this question requires us to find the longest subsequence with a specific property, in this case, being a palindrome.

  2. Longest Common Subsequence (LeetCode 1143): This problem shares the idea of finding the longest subsequence, but this time between two input sequences. We can apply a similar DP solution.

  3. Maximum Length of Pair Chain (LeetCode 646): Here, we need to find the longest chain of pairs. This is similar to finding the longest increasing subsequence in a list.

  4. Russian Doll Envelopes (LeetCode 354): The problem requires us to find the maximum number of envelopes that can be put one inside another. This is another variation of the longest increasing subsequence problem.

  5. Longest Increasing Path in a Matrix (LeetCode 329): This problem extends the longest increasing subsequence problem to a two-dimensional matrix.

  6. Best Time to Buy and Sell Stock with Cooldown (LeetCode 309): This problem requires us to consider the maximum profit that can be made with a series of buying and selling actions, with a cooldown period between each sell and subsequent buy, which we can solve with a DP approach similar to the one used in the longest increasing subsequence problem.

  7. Coin Change (LeetCode 322): This problem involves finding the fewest number of coins needed to make up a given amount, which can be solved using DP by breaking down the problem into smaller subproblems.

  8. Decode Ways (LeetCode 91): This problem requires us to count the number of ways a coded message can be decoded, which can be solved using a DP approach to keep track of the number of ways to decode up to each position in the input string.

  9. Unique Paths (LeetCode 62): This problem involves finding the number of unique paths from the top-left corner to the bottom-right corner of a grid, which can be solved using DP by considering the number of paths to reach each cell in the grid.

  10. Word Break (LeetCode 139): This problem asks us to determine if a string can be segmented into space-separated words from a given list. A DP approach similar to the one used in the longest increasing subsequence problem can be applied here, with the DP array storing whether the string up to each position can be segmented into words from the list.

Remember, while these problems are all related by the application of Dynamic Programming, they each have their own unique elements and may require modification or expansion of the DP concept in different ways.

Language Agnostic Coding Drills

The Python code aims to solve the problem of finding the length of the longest increasing subsequence (LIS) in a list of numbers. It employs the concept of binary search along with list manipulation in Python. The learning concepts involved can be broken down into:

  1. Basic Concepts: Variables, Lists, and Arithmetic Operations Understand the basics of creating variables and lists and performing operations like addition, subtraction, etc.

  2. Control Flow: If-else Conditional Statements Learn how to use conditional statements to control the flow of your program.

  3. List Manipulation: Appending and Updating Elements Learn how to add elements to a list and update values of list elements.

  4. Binary Search: bisect_left function Understand the concept of binary search and how to use it in Python. In this case, bisect_left function from the bisect module is used.

  5. For Loops Understand the concept of loops for iterating over the elements of a list.

  6. Functions and Methods Understand how to define and call functions. Understand what methods are and how they are used in Object-Oriented Programming (OOP).

  7. Classes in Python Learn the basics of defining classes and creating instances of classes in Python.

  8. Problem Solving and Algorithmic Thinking Understand how to approach a problem, in this case, finding the length of the Longest Increasing Subsequence (LIS) in an array. Learn how the combination of binary search and dynamic programming helps to solve this problem efficiently.

The list is ordered from basic to more complex concepts, and each of these can be practiced and implemented separately before combining into the final solution.

Targeted Drills in Python

  1. Basic Concepts: Variables, Lists, and Arithmetic Operations Task: Create a list of integers and perform arithmetic operations.

    
    # Initialize a list of integers
    nums = [1, 2, 3, 4, 5]
    
    # Perform addition and subtraction
    sum_nums = nums[0] + nums[1]
    sub_nums = nums[2] - nums[3]
    
    print(sum_nums)
    print(sub_nums)
    
  2. Control Flow: If-else Conditional Statements Task: Write a program that checks if a number is greater than 10.

    
    num = 15
    
    if num > 10:
        print("Number is greater than 10.")
    else:
        print("Number is less than or equal to 10.")
    
  3. List Manipulation: Appending and Updating Elements Task: Add an element to a list and update a value of an existing element in the list.

    
    nums = [1, 2, 3]
    
    # Append number 4 to the list
    nums.append(4)
    
    # Update the value at index 2 to 5
    nums[2] = 5
    
    print(nums)
    
  4. Binary Search: bisect_left function Task: Use bisect_left function to find the insertion point for a number in a sorted list.

    
    import bisect
    
    nums = [1, 3, 4, 4, 6, 8]
    
    # Find the insertion point for number 4
    idx = bisect.bisect_left(nums, 4)
    
    print(idx)
    
  5. For Loops Task: Write a program to iterate over the elements in a list and print each element.

    
    nums = [1, 2, 3, 4, 5]
    
    for num in nums:
        print(num)
    
  6. Functions and Methods Task: Define a function that takes a list and a number as arguments and appends the number to the list.

    
    def append_to_list(lst, num):
        lst.append(num)
        return lst
    
    print(append_to_list([1, 2, 3], 4))
    
  7. Classes in Python Task: Define a Python class Numbers with a method sum_nums that takes two numbers and returns their sum.

    
    class Numbers:
        def sum_nums(self, num1, num2):
            return num1 + num2
    
    nums = Numbers()
    print(nums.sum_nums(5, 7))
    
  8. Problem Solving and Algorithmic Thinking Task: Given a list, find the length of the longest increasing subsequence (LIS).

    
    from bisect import bisect_left
    
    def length_of_LIS(nums):
        lis = [nums[0]]
    
        for num in nums:
            if num > lis[-1]:
                lis.append(num)
            else:
                lis[bisect_left(lis, num)] = num
    
        return len(lis)
    
    print(length_of_LIS([10, 9, 2, 5, 3, 7, 101, 18]))
    

Each of these drills helps understand a specific concept that is used in the provided code.

Q&A

I’ve tried two kinds of DP solutions here. The first solution uses recursive + memoization, and the second solution is top-down DP. I think the time complexity for both solutions is O(n^2). However, the second one received a TLE (Time Limit Exceeded). I have faced similar issues when choosing between these two types of solutions.

Could someone tell me how to choose between these two? And what is the time complexity difference between them? Thanks!

First One

def lengthOfLIS(self, nums: List[int]) -> int:
    def dfs(start, prev_val, memo):
        if start in memo.keys():
            return memo[start]
        if start == len(nums):
            return 0
        count = 0
        for i in range(start, len(nums)):
            if nums[i] <= prev_val:
                continue
            count = max(count, dfs(i + 1, nums[i], memo) + 1)
        memo[start] = count
        return count

    return dfs(0, -float("inf"), dict())

Second One

def lengthOfLIS(self, nums: List[int]) -> int:
    def dp(i, prev, memo):
        if i == len(nums):
            return 0
        if (i, prev) in memo.keys():
            return memo[(i, prev)]
        length = 0
        case1 = 0
        case2 = 0
        if prev == -1 or nums[i] > nums[prev]:
            case1 = dp(i + 1, i, memo) + 1
        case2 = dp(i + 1, prev, memo)
        length = max(case1, case2)
        memo[(i, prev)] = length
        return length

    return dp(0, -1, dict())

Both of the given solutions are aimed at solving the Longest Increasing Subsequence (LIS) problem, and they use different forms of Dynamic Programming (DP). Let’s analyze the differences and figure out why the second approach might lead to a Time Limit Exceeded (TLE) error.

  1. First Solution (Recursive + Memoization):

    • It relies on Depth First Search (DFS) and recursion.
    • It uses the current index and the previous value to define the state.
    • You memoize on start alone. That works here because every recursive call passes prev_val = nums[start - 1], so prev_val is implied by start; effectively it is a 1-dimensional DP.
    • With O(n) states and O(n) work per state, the time complexity is O(n^2).
  2. Second Solution (Top-Down DP):

    • It also uses recursion, but the structure is more traditional DP.
    • It uses the current index and the previous index to define the state, making it a 2-dimensional DP with O(n^2) states.
    • Each state does O(1) work, so it is still O(n^2) overall, but the memo is far larger, the keys are tuples, and the recursion is deeper, so the constant factors are much worse.

Why the second one might lead to TLE?

  • The second solution materializes O(n^2) states and memo entries instead of O(n), so even at the same asymptotic complexity it does far more work and allocates far more memory, which is enough to push it over the time limit.
  • The first approach exploits the fact that the previous value is determined by the start index, so it needs far fewer states and runs faster in practice.

How to choose between these two?

  • It often depends on the problem and the constraints.
  • If the 1-dimensional approach covers all necessary states, it is generally preferred since it will be more efficient in both time and space.
  • If the problem requires considering more dimensions, then the 2-dimensional approach or even higher-dimensional approaches might be necessary.

In this particular case, the first approach seems to cover the problem’s requirements with a more compact state space, making it a better choice.

If you wanted to make the second approach more efficient, you could try to reduce the number of states by observing specific characteristics of the problem. In this problem, for example, the previous value in the recursion is always the element chosen immediately before the current start, so a memo keyed on the start index alone is sufficient (which is exactly what the first solution does); alternatively, switching to the tails-plus-binary-search formulation brings the complexity down to O(n log n). Analyzing the problem and reducing the state space is a common way to optimize DP solutions.