Path Compression

Path compression is an optimization technique for the disjoint set union data structure. It flattens the structure of the tree by making all nodes along the find path point directly to the root.

Java example:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
class DSU {
  int[] parent;

  public int find(int x) {
    if (x == parent[x]) {
      return x; 
    }
    return parent[x] = find(parent[x]); // Path compression
  }  

  public void union(int x, int y) {
    int rootX = find(x);
    int rootY = find(y);
    if (rootX != rootY) {
      parent[rootX] = rootY; 
    }
  }
}  

C++ example:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
struct DSU {
  vector<int> parent;

  int find(int x) {
    if (x == parent[x]) {
      return x;
    }
    return parent[x] = find(parent[x]); // Path compression
  }

  void union(int x, int y) {
    int rootX = find(x);
    int rootY = find(y);
    if (rootX != rootY) {
      parent[rootX] = rootY;
    }
  }
};

Python example:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
class DSU:
  def __init__(self):
    self.parent = list(range(1001))

  def find(self, x):
    if x == self.parent[x]:
      return x
    self.parent[x] = self.find(self.parent[x]) # Path compression
    return self.parent[x]

  def union(self, x, y):
    rootX = find(x)
    rootY = find(y)
    if rootX != rootY:
      self.parent[rootX] = rootY

Path compression significantly improves performance of union-find data structure.

Path compression is another optimization technique for the Disjoint-Set Data Structure. It flattens the structure of the tree by making each node point to the root whenever the find function is called. This results in much faster subsequent find operations.

Algorithm

  1. Initialization: Standard initialization of parent array.
  2. Union: Standard union operation, possibly optimized with Union by Rank.
  3. Find: This operation finds the root and also compresses the path for each node along the way.

Java Code

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
public class DisjointSet {
  private int[] parent;
  
  public DisjointSet(int size) {
    parent = new int[size];
    for (int i = 0; i < size; i++) parent[i] = i;
  }

  public int find(int x) {
    if (parent[x] != x) parent[x] = find(parent[x]);
    return parent[x];
  }

  public void union(int x, int y) {
    int rootX = find(x);
    int rootY = find(y);
    if (rootX != rootY) {
      parent[rootX] = rootY;
    }
  }
}

C++ Code

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
#include <vector>

class DisjointSet {
public:
  std::vector<int> parent;
  
  DisjointSet(int size) : parent(size) {
    for (int i = 0; i < size; ++i) parent[i] = i;
  }
  
  int find(int x) {
    if (parent[x] != x) parent[x] = find(parent[x]);
    return parent[x];
  }

  void unionSets(int x, int y) {
    int rootX = find(x);
    int rootY = find(y);
    if (rootX != rootY) {
      parent[rootX] = rootY;
    }
  }
};

Python Code

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
class DisjointSet:
    def __init__(self, size):
        self.parent = [i for i in range(size)]

    def find(self, x):
        if self.parent[x] != x:
            self.parent[x] = self.find(self.parent[x])
        return self.parent[x]

    def union(self, x, y):
        root_x = self.find(x)
        root_y = self.find(y)
        if root_x != root_y:
            self.parent[root_x] = root_y

After implementing path compression, the time complexity for find and union operations becomes almost (O(1)), making it extremely efficient.