Weighted-Union Heuristic

The weighted-union heuristic is an optimization for the disjoint-set data structure. It merges sets by rank and path compresses using node size rather than depth.

The steps are:

  1. Merge sets by linking smaller tree under larger
  2. Path compress during find using subtree size

This balances the trees by total size rather than depth.

Java example:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
class DSU {
  int[] size; // Store tree sizes

  public int find(int x) {
    if (x == parent[x]) { 
      return x;
    }
    return parent[x] = find(parent[x]); // Path compression
  }

  public void union(int x, int y) {
    int rootX = find(x);
    int rootY = find(y);
    if (rootX != rootY) {
      if (size[rootX] < size[rootY]) {
        parent[rootX] = rootY;
        size[rootY] += size[rootX];
      } else {
        parent[rootY] = rootX;
        size[rootX] += size[rootY];  
      }
    }
  }
}

C++ example:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
struct DSU {
  vector<int> size;

  int find(int x) {
    if (x == parent[x]) {
      return x; 
    }
    return parent[x] = find(parent[x]);
  }

  void union(int x, int y) {
    int rootX = find(x);
    int rootY = find(y);
    if (rootX != rootY) {
      if (size[rootX] < size[rootY]) {
        parent[rootX] = rootY; 
        size[rootY] += size[rootX];
      } else {
        parent[rootY] = rootX;
        size[rootX] += size[rootY];
      }
    }
  }
};

Python example:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
class DSU:
  def __init__(self):
    self.parent = {}
    self.size = {}

  def find(self, x):
    if x not in self.parent:
      self.parent[x] = x
      self.size[x] = 1
    
    if x != self.parent[x]:
      self.parent[x] = self.find(self.parent[x])

    return self.parent[x]

  def union(self, x, y):
    rootX = self.find(x)
    rootY = self.find(y)
    if rootX != rootY:
      if self.size[rootX] < self.size[rootY]:
        self.parent[rootX] = rootY
        self.size[rootY] += self.size[rootX]
      else:
        self.parent[rootY] = rootX
        self.size[rootX] += self.size[rootY]

The weighted-union heuristic improves efficiency of union operations.

In data structures like disjoint-set, the weighted-union heuristic helps optimize the “union” operation. Here, the size (or weight) of each tree is tracked, and when uniting two sets, the smaller set is attached as a subtree to the root of the larger set. This reduces the height of the resulting tree, making future operations faster.

In the context of disjoint-set data structures, the weighted-union heuristic is a technique to keep the tree balanced when performing union operations. In this heuristic, the size (or weight) of each set (or tree) is kept track of, and when uniting two sets, the smaller set is attached to the root of the larger set. This helps in minimizing the tree height, leading to more efficient operations.

Textual Representation:

Let’s assume we have these sets initially:

  • Set A: {1, 2} with size 2
  • Set B: {3} with size 1
  • Set C: {4, 5, 6} with size 3

Graphical Representation:

  A(2)      B(1)      C(3)
   |                  /  \
  1,2                4----5
                              \
                               6

Here, the numbers in parentheses next to each set name are the sizes (or weights) of those sets.

Union Operations:

  1. Union(A, B):

    • The root of the smaller tree (B) is attached to the root of the larger tree (A).
    • The new weight of A becomes 3 (2 + 1).

    New Graphical Representation:

       A(3)
       /  \
     1,2--B
            \
             3
    
  2. Union(A, C):

    • Here, C is the larger tree.
    • So, the root of A is attached to the root of C.
    • The new weight of C becomes 6 (3 + 3).

    New Graphical Representation:

         C(6)
       /  |  \
      /   |   \
     4----5----A
                 \
                  B
                    \
                     3
    

The weighted-union heuristic ensures that the height of the trees is minimized, making find and union operations faster.

Solution

Java

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
class WeightedUnion {
    private int[] parent, size;

    public WeightedUnion(int n) {
        parent = new int[n];
        size = new int[n];
        for (int i = 0; i < n; i++) {
            parent[i] = i;
            size[i] = 1;
        }
    }

    public int find(int x) {
        if (parent[x] == x) return x;
        return parent[x] = find(parent[x]);
    }

    public void unite(int x, int y) {
        int rootX = find(x);
        int rootY = find(y);

        if (rootX == rootY) return;

        if (size[rootX] < size[rootY]) {
            parent[rootX] = rootY;
            size[rootY] += size[rootX];
        } else {
            parent[rootY] = rootX;
            size[rootX] += size[rootY];
        }
    }
}

C++

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
class WeightedUnion {
private:
    vector<int> parent, size;

public:
    WeightedUnion(int n) : parent(n), size(n, 1) {
        for (int i = 0; i < n; ++i) parent[i] = i;
    }

    int find(int x) {
        if (parent[x] == x) return x;
        return parent[x] = find(parent[x]);
    }

    void unite(int x, int y) {
        int rootX = find(x);
        int rootY = find(y);

        if (rootX == rootY) return;

        if (size[rootX] < size[rootY]) {
            parent[rootX] = rootY;
            size[rootY] += size[rootX];
        } else {
            parent[rootY] = rootX;
            size[rootX] += size[rootY];
        }
    }
};

Python

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
class WeightedUnion:
    def __init__(self, n):
        self.parent = list(range(n))
        self.size = [1] * n

    def find(self, x):
        if self.parent[x] == x:
            return x
        self.parent[x] = self.find(self.parent[x])
        return self.parent[x]

    def unite(self, x, y):
        root_x = self.find(x)
        root_y = self.find(y)

        if root_x == root_y:
            return

        if self.size[root_x] < self.size[root_y]:
            self.parent[root_x] = root_y
            self.size[root_y] += self.size[root_x]
        else:
            self.parent[root_y] = root_x
            self.size[root_x] += self.size[root_y]

Key Takeaways

  • Weighted-union heuristic improves the efficiency of the disjoint-set data structure.
  • It keeps track of the size of each set to minimize the height of trees.
  • The unite function joins two sets based on their sizes, making future find operations faster.