Matrix-Chain Multiplication

Matrix-chain multiplication is the problem of choosing the order in which to multiply a sequence of matrices so that the total cost is minimized. Matrix multiplication is associative, so every order yields the same product, but the number of scalar multiplications can differ dramatically.

Given n matrices A1, A2, …, An, where matrix Ai has dimensions p[i-1] x p[i], the goal is to find the optimal parenthesization that minimizes the total number of scalar multiplications.
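
For example, with A1 (10 x 20), A2 (20 x 5), and A3 (5 x 50), the two possible orders cost very different amounts:

(A1 A2) A3: 10*20*5 + 10*5*50 = 1000 + 2500 = 3500 scalar multiplications
A1 (A2 A3): 20*5*50 + 10*20*50 = 5000 + 10000 = 15000 scalar multiplications

The product is identical either way; only the amount of work differs.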

This can be solved efficiently with dynamic programming: an optimal parenthesization splits the chain into two sub-chains that must themselves be multiplied optimally, so solutions for longer chains can be built from solutions for shorter ones.
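
Writing dp[i][j] for the minimum cost of multiplying the chain whose dimension boundaries are p[i], …, p[j] (that is, matrices A(i+1) through Aj), this optimal substructure gives the recurrence

dp[i][j] = min over i < k < j of ( dp[i][k] + dp[k][j] + p[i]*p[k]*p[j] )

with dp[i][i+1] = 0, since a single matrix requires no multiplication. The split index k chooses the outermost pair of parentheses. The examples below fill this table in order of increasing chain length.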

Applications include optimizing complex linear algebra operations in fields like machine learning.

Example in Java:

// Dimension boundaries for 3 matrices: A1 is 10x20, A2 is 20x5, A3 is 5x50
int[] p = {10, 20, 5, 50};

// dp[i][j] stores the optimal cost for the chain whose dimension boundaries
// are p[i]..p[j], i.e. matrices A(i+1) through Aj. Java initializes entries
// to 0, which covers the single-matrix base case dp[i][i+1] = 0.
int[][] dp = new int[p.length][p.length];

// Consider chains of increasing length l
for (int l = 2; l < p.length; l++) {
  for (int i = 0; i < p.length - l; i++) {
    int j = i + l;
    dp[i][j] = Integer.MAX_VALUE;

    // Try every split point k strictly between i and j
    for (int k = i + 1; k < j; k++) {
      dp[i][j] = Math.min(dp[i][j], dp[i][k] + dp[k][j] + p[i] * p[k] * p[j]);
    }
  }
}

// Minimum cost for the whole chain
int minCost = dp[0][p.length - 1];
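
For p = {10, 20, 5, 50} this yields minCost = 3500, matching the worked example above: the optimal order multiplies A1 and A2 first, then the 10 x 5 result by A3.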

Example in C++:

// Requires <vector>, <climits> (for INT_MAX), and <algorithm> (for std::min)

// Dimension boundaries for 3 matrices: A1 is 20x10, A2 is 10x5, A3 is 5x50
vector<int> p = {20, 10, 5, 50};
int n = static_cast<int>(p.size());

// dp[i][j] stores the optimal cost for the chain with dimension boundaries
// p[i]..p[j]; entries are value-initialized to 0 (single-matrix base case)
vector<vector<int>> dp(n, vector<int>(n));

for (int l = 2; l < n; l++) {
  for (int i = 0; i < n - l; i++) {
    int j = i + l;
    dp[i][j] = INT_MAX;

    // Try every split point k strictly between i and j
    for (int k = i + 1; k < j; k++) {
      dp[i][j] = min(dp[i][j], dp[i][k] + dp[k][j] + p[i] * p[k] * p[j]);
    }
  }
}

// Minimum cost for the whole chain
int minCost = dp[0][n - 1];
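
With p = {20, 10, 5, 50}, minCost evaluates to 6000, again by multiplying the left pair first: 20*10*5 + 20*5*50 = 1000 + 5000.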

Example in Python:

# Dimension boundaries for 4 matrices: 40x20, 20x30, 30x10, 10x50
p = [40, 20, 30, 10, 50]
n = len(p)

# dp[i][j] stores the optimal cost for the chain with dimension boundaries
# p[i]..p[j]; initializing to 0 covers the single-matrix base case dp[i][i + 1]
dp = [[0] * n for _ in range(n)]

for l in range(2, n):  # chain length
    for i in range(n - l):
        j = i + l
        dp[i][j] = float("inf")
        # Try every split point k strictly between i and j
        for k in range(i + 1, j):
            cost = dp[i][k] + dp[k][j] + p[i] * p[k] * p[j]
            if cost < dp[i][j]:
                dp[i][j] = cost

# Minimum cost for the whole chain
min_cost = dp[0][n - 1]
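
Here min_cost comes out to 34000, corresponding to the parenthesization (A1 (A2 A3)) A4.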

In summary, matrix-chain multiplication uses dynamic programming to find the multiplication order that minimizes the total number of scalar multiplications.

Matrix-Chain Multiplication is a classic optimization problem: given a sequence of matrices (A_1, A_2, …, A_n), find the parenthesization that minimizes the number of scalar multiplications. The snippets above show just the core loop; the complete, runnable programs below wrap the same logic in the 1-indexed formulation common in textbooks, where m[i][j] covers matrices Ai through Aj and the split term is p[i-1] * p[k] * p[j].

Example Code

Java

In Java, dynamic programming can be used to solve the Matrix-Chain Multiplication problem.

public class MatrixChainMultiplication {
  public static int matrixChainOrder(int[] p, int n) {
    // m[i][j]: minimum scalar multiplications for matrices Ai..Aj (1-indexed)
    int[][] m = new int[n][n];
    for (int i = 1; i < n; i++) {
      m[i][i] = 0;  // a single matrix costs nothing
    }
    for (int len = 2; len < n; len++) {       // chain length
      for (int i = 1; i < n - len + 1; i++) {
        int j = i + len - 1;
        m[i][j] = Integer.MAX_VALUE;
        for (int k = i; k < j; k++) {         // split point
          int q = m[i][k] + m[k + 1][j] + p[i - 1] * p[k] * p[j];
          if (q < m[i][j]) {
            m[i][j] = q;
          }
        }
      }
    }
    return m[1][n - 1];
  }

  public static void main(String[] args) {
    int[] arr = {1, 2, 3, 4};
    int n = arr.length;
    System.out.println("Minimum number of multiplications: " +
                       matrixChainOrder(arr, n));  // Output: 18
  }
}

  • m[i][j]: stores the minimum number of scalar multiplications needed to compute the product Ai · … · Aj.

C++

The C++ implementation uses similar logic.

#include <iostream>
#include <climits>
#include <vector>
using namespace std;

int matrixChainOrder(int p[], int n) {
  // m[i][j]: minimum scalar multiplications for matrices Ai..Aj (1-indexed).
  // A vector of vectors avoids the non-standard variable-length array int m[n][n]
  // and is zero-initialized, covering the single-matrix base case.
  vector<vector<int>> m(n, vector<int>(n, 0));
  for (int len = 2; len < n; len++) {       // chain length
    for (int i = 1; i < n - len + 1; i++) {
      int j = i + len - 1;
      m[i][j] = INT_MAX;
      for (int k = i; k < j; k++) {         // split point
        int q = m[i][k] + m[k + 1][j] + p[i - 1] * p[k] * p[j];
        if (q < m[i][j]) {
          m[i][j] = q;
        }
      }
    }
  }
  return m[1][n - 1];
}

int main() {
  int arr[] = {1, 2, 3, 4};
  int n = sizeof(arr) / sizeof(arr[0]);
  cout << "Minimum number of multiplications: "
       << matrixChainOrder(arr, n) << endl;  // Output: 18
}

  • m[i][j]: stores the minimum scalar multiplications needed for Ai through Aj.

Python

Python’s code remains concise yet covers the essential logic.

def matrix_chain_order(p):
    n = len(p)
    # m[i][j]: minimum scalar multiplications for matrices Ai..Aj (1-indexed);
    # initializing to 0 covers the single-matrix base case m[i][i] = 0
    m = [[0 for _ in range(n)] for _ in range(n)]
    for chain_len in range(2, n):  # renamed to avoid shadowing the built-in len()
        for i in range(1, n - chain_len + 1):
            j = i + chain_len - 1
            m[i][j] = float('inf')
            for k in range(i, j):  # split point
                q = m[i][k] + m[k + 1][j] + p[i - 1] * p[k] * p[j]
                if q < m[i][j]:
                    m[i][j] = q
    return m[1][n - 1]

arr = [1, 2, 3, 4]
print("Minimum number of multiplications:", matrix_chain_order(arr))  # Output: 18

  • m[i][j]: stores the minimum number of scalar multiplications for Ai through Aj.

Key Takeaways

  • Matrix-Chain Multiplication aims to find the most efficient parenthesization to minimize scalar multiplications.
  • Dynamic programming provides an effective method for solving this problem.
  • The core logic remains consistent across Java, C++, and Python, involving 2D arrays to store intermediate results.
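  • The standard dynamic-programming solution runs in O(n^3) time and O(n^2) space for a chain of n matrices.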

Why Work Backwards

Working backwards in dynamic programming is a strategy that often makes a problem easier to split into subproblems. Here is how it helps when dividing an array:

Clear Subproblem Identification

When you start at the end of the problem (the last element of an array, for example), the final state is usually well-defined. You can then divide the problem into smaller, earlier parts of the array with greater clarity because you know what you’re aiming for.

Simplified Boundary Conditions

Starting at the end simplifies boundary conditions. You usually know the exact outcome you want for the last element(s), which serves as a clear base case for your recursion or iterative loops. This makes initializing your dynamic programming table more straightforward.

Efficient Use of Computed Solutions

When you work backwards, you naturally move from the end state to initial states in a way that efficiently uses already computed solutions for subproblems. This ensures you’re not doing extra work and that you’re filling in your dynamic programming table in the most efficient way.

Easier to Write Code

Working backwards often leads to cleaner, less error-prone code. With simpler subproblems and boundary conditions, the coding itself becomes more straightforward, and you are less likely to make the indexing and off-by-one mistakes that are common when tackling complex problems from the front.

Example: Matrix-Chain Multiplication

Consider Matrix-Chain Multiplication. Starting from the end goal, the least costly way to multiply all the matrices, you can cleanly partition the problem into subproblems, asking questions like: “What’s the least costly way to multiply the last two matrices?” or “What’s the least costly way to multiply the last three matrices?”
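
This backwards view maps directly onto top-down memoized code. Below is a minimal Python sketch (the function name and structure are illustrative, not taken from the examples above): best(i, j) asks for the final goal first and recurses into smaller chains, the single-matrix base case provides the boundary condition, and lru_cache guarantees each subproblem is computed only once.

from functools import lru_cache

def matrix_chain_cost(p):
    # p holds the dimension boundaries; matrix Ai is p[i-1] x p[i]
    @lru_cache(maxsize=None)
    def best(i, j):
        # Minimum cost for the chain with boundaries p[i]..p[j]
        if j - i <= 1:
            return 0  # a single matrix: the well-defined end state
        # Work backwards: try each outermost split k of the goal chain
        return min(best(i, k) + best(k, j) + p[i] * p[k] * p[j]
                   for k in range(i + 1, j))

    return best(0, len(p) - 1)

print(matrix_chain_cost((10, 20, 5, 50)))  # 3500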

In summary, working backwards gives you a clear and efficient path for dividing the array into subproblems, ensuring that each subproblem is meaningful and contributes directly to the solution of the original problem.

Burst Balloons

This problem is a variant of the classical DP problem covered above, Matrix-Chain Multiplication, in which we find the most efficient way to multiply a given sequence of matrices. The main ideas carry over directly: dynamic programming, divide and conquer, and thinking backwards.
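
As a sketch of that connection (assuming the usual statement of the problem, where bursting balloon k earns coins equal to the product of its value and its current neighbors' values), the interval DP below picks k as the last balloon to burst in an open interval, exactly as matrix-chain multiplication picks the outermost split:

def max_coins(nums):
    # Pad with virtual balloons of value 1 on both ends
    vals = [1] + nums + [1]
    n = len(vals)
    # dp[i][j]: max coins from bursting every balloon strictly between i and j
    dp = [[0] * n for _ in range(n)]
    for length in range(2, n):        # interval width, like chain length
        for i in range(n - length):
            j = i + length
            # Think backwards: choose k as the LAST balloon burst in (i, j)
            dp[i][j] = max(dp[i][k] + dp[k][j] + vals[i] * vals[k] * vals[j]
                           for k in range(i + 1, j))
    return dp[0][n - 1]

print(max_coins([3, 1, 5, 8]))  # 167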