Thinking Process for DP Problems

PENDING EDIT.

Step 1. Categorize

Dynamic programming problems often fall into a few distinct categories. Recognizing these categories is crucial as it helps frame a new problem within a familiar context. Framing doesn’t mean copying a solution from another problem. Each dynamic programming problem is unique.

Categorize the problem into one of the following dynamic programming categories.

  • 0/1 Knapsack
  • Unbounded Knapsack
  • Shortest Path (eg: Unique Paths I/II)
  • Fibonacci Sequence (eg: House Thief, Jump Game)
  • Longest Common Substring/Subsequeunce

Example: Target Sum

Category: 0/1 Knapsack

Why? Our ‘Capacity’ is the target we want to reach ‘S’. Our ‘Items’ are the numbers in the input subset and the ‘Weights’ of the items are the values of the numbers. This question follows 0/1 and not unbounded knapsack because we can use each number only ONCE.

What is the variation? The twist on this problem from standard knapsack is that we must add ALL items in the subset to our knapsack. We can reframe the question into adding the positive or negative value of the current number to our knapsack in order to reach the target capacity ‘S’.

Step 2: States

What variables do we need to track in order to achieve the optimal result?

A state is usually defined as the particular condition that something is in at a specific point of time.

Similarly, in Dynamic Programming, a state is defined by a number of necessary variables at a particular instant that are required to calculate the optimal result. So, it is a combination of variables that will keep changing over different point of time. More the number of states, deeper the depth of the recursive solution and more memory required to cache the result of the states. This is because, you are most likely to have the same state at some point later. Two states are same, if all their corresponding variables have the same logical value. Therefore, it very well makes sense to preserve the results of the state to save time. This gives rise to what is known as Dynamic Programming.

In the classic Knapsack problem, you generally need two states:

  1. The remaining capacity of the knapsack.
  2. The number of items you’ve considered so far.

So, you typically require a 2D array to store these states. The array dimensions would be (remaining capacity) x (number of items considered).

This allows you to remember previous calculations so that you don’t redo work, making your solution more efficient.

Determine State Variables

In this example, we need index and current_sum variables.

Why Index? Index represents the index of the input subset we are considering. This tells us what values we have considered, what values we haven’t considered, and what value we are currently considering. As a general rule, index is a required state in nearly all dynamic programming problems, except for shortest paths which is row and column instead of a single index but we’ll get into that in a seperate post.

Why Current Sum? The question asks us if we can sum every item (either the positive or negative value of that item) in the subset to reach the target value. Current Sum gives us the sum of all the values we have processed so far. Our answer revolves around Current Sum being equal to Target.

Step 3 : Decisions

Dynamic Programming is about making the optimal decision. In order to make the optimal decision, we will have to try all decisions first. The MIT lecture on DP (highly recommended) refers to this as the guessing step. My brain works better calling this a decision instead of a guess. Decisions will have to bring us closer to the base case and lead us towards the question we want to answer. Base case is covered in Step 4 but really work in tandem with the decision step.

Question: What decisions do we have to make at each recursive call? Hint: As a general rule, Knapsack problems will require 2 decisions.

Answer: This problem requires we take ALL items in our input subset, so at every step we will be adding an item to our knapsack. Remember, we stated in Step 2 that “The question asks us if we can sum every item (either the positive or negative value of that item) in the subset to reach the target value.” The decision is:

Should we add the current numbers positive value Should we add the current numbers negative value As a note, knapsack problems usually don’t require us to take all items, thus a usual knapsack decision is to take the item or leave the item.

  1. Base Case

Base cases need to relate directly to the conditions required by the answer we are seeking. This is why it is important for our decisions to work towards our base cases, as it means our decisions are working towards our answer.

Let’s revisit the conditions for our answers.

We use all numbers in our input subset. The sum of all numbers is equal to our target ‘S’. Question: Identify the base cases. Hint: There are 2 base cases.

Answer: We need 2 base cases. One for when the current state is valid and one for when the current state is invalid.

Valid: Index is out of bounds AND current sum is equal to target ‘S’ Invalid: Index is out of bounds Why Index is out of bounds? A condition for our answer is that we use EVERY item in our input subset. When the index is out of bounds, we know we have considered every item in our input subset.

Why current sum is equal to target? A condition for our answer is that the sum of using either the positive or negative values of items in our input subet equal to the target sum ‘S’.

If we have considered all the items in our input subset and our current sum is equal to our target, we have successfully met both conditions required by our answer.

On the other hand, if we have considered all the items in our input subset and our current sum is NOT equal to our target, we have only met condition required by our answer. No bueno.

  1. Code it

If you’ve thought through all the steps and understand the problem, it’s trivial to code the actual solution.

  1. Optimize

Once we introduce memoization, we will only solve each subproblem ONCE. We can remove recursion altogether and avoid the overhead and potential of a stack overflow by introducing tabulation. It’s important to note that the top down recursive and bottom up tabulation methods perform the EXACT same amount of work. The only different is memory. If they peform the exact same amount of work, the conversion just requires us to specify the order in which problems should be solved. This post is really long now so I won’t cover these steps here, possibly in a future post.

https://leetcode.com/problems/target-sum/solutions/455024/DP-IS-EASY!-5-Steps-to-Think-Through-DP-Questions/

Let’s dive into the core characteristics of these types of Dynamic Programming problems.

0/1 Knapsack

  1. Characteristics: You have a set of items, each with a weight and a value. You can choose each item only once. The goal is to maximize the value while staying within a weight limit.

  2. Core Logic: Decide for each item whether to include it in the knapsack or not.

  3. Identification: If you have a resource limit (like weight) and you need to optimize some value (like cost), it’s likely a 0/1 Knapsack problem.

Unbounded Knapsack

  1. Characteristics: Similar to 0/1 Knapsack, but you can include each item multiple times.

  2. Core Logic: Decide for each item how many times to include it.

  3. Identification: If you have a resource limit but can use the same item multiple times, you’re dealing with Unbounded Knapsack.

Shortest Path (e.g., Unique Paths I/II)

  1. Characteristics: You have a grid or a graph, and you need to find the shortest path from one point to another.

  2. Core Logic: Calculate the minimum cost to reach each point from the starting point.

  3. Identification: If you need to find the least costly way to get from point A to point B in a grid or network, consider this category.

Fibonacci Sequence (e.g., House Thief, Jump Game)

  1. Characteristics: Problems where each state depends on one or more previous states, like Fibonacci numbers, where each number depends on the previous two.

  2. Core Logic: Store previous results to calculate the current result.

  3. Identification: When you can express the current state in terms of previous states, it’s often in this category.

Longest Common Substring/Subsequence

  1. Characteristics: You have two sequences, and you need to find the longest common subsequence (order matters) or substring (continuous sequence).

  2. Core Logic: Compare elements in both sequences to find common elements and store them in a way that allows backtracking the longest sequence.

  3. Identification: If you’re looking for the longest shared characteristics between two sequences, this is likely your category.

Differentiating Characteristics:

  1. Resource Constraints: 0/1 and Unbounded Knapsack problems have a resource limit (e.g., weight, volume).

  2. Reusability: Unbounded Knapsack allows reusing items; 0/1 does not.

  3. Network or Grid: Shortest Path problems usually involve a grid or network.

  4. Dependence on Previous States: Fibonacci Sequence problems have a direct dependence on previous states.

  5. Multiple Sequences: Longest Common Subsequence/Substring problems typically involve more than one sequence to compare.

Understanding these core aspects will help you identify the type of DP problem.