Carson Tang
It is assumed that the reader already has some form of understanding of DSUs, the first portion of cp-algorithms is a good resource to read.
Do note that from the typical implementation of DSU, the only improvement is being able to perform Find in rather than . The trade-off being, more complicated code which can be more prone to error and increased memory usage.
The structure of each Node is as follows,
class Node { int val, size; Node *next, *head, *tail; Node(int val) : val(val), size(1), next(nullptr), head(nullptr), tail(nullptr); }
The size of the set is maintained by the head of the set that the node belongs to. Since each node has a pointer to the head of its set, we can access the head in . For two sets being merged, it is only necessary to traverse the smaller set to merge the two sets, which will be explained below.
Each node has a next pointer to the next node so that traversing the LinkedList of the nodes is possible.
The value at the head of each node will be the representative of each set.
As mentioned earlier, it is required to traverse the smaller set as we are merging two sets. In order to be able to traverse the entire set, it is important to maintain the tail of each set as the node belonging to the set that is to be merged may not necessarily be the tail. If it is not the tail, then it is not possible to traverse through the entire set.
After sets are merged the only conditions that will be maintained (will be true) for the nodes in the set are:
This is because when the smaller set is merged with the larger set, the nodes in the larger set are not iterated through, this way we can perform in .
class DSU { int n; Node** nodeList; DSU(int n) : n(n) { nodeList = new Node*[n]; for (int i = 0; i < n; i++) { Node* nd = new Node(i); nd->head = nd; nd->tail = nd; nodeList[i] = nd; } } int Find(int nd){ return nodeList[nd]->head->val; } void Union(int a, int b) { int parA = Find(a); int parB = Find(b); if (parA == parB) return; Node* larger_hd = nodeList[a]->head; Node* larger_tl = nodeList[a]->head->tail; Node* smaller_hd = nodeList[b]->head; Node* smaller_tl = nodeList[b]->head->tail; if (larger_hd->size < smaller_hd->size) { std::swap(larger_hd, smaller_hd); std::swap(larger_tl, smaller_tl); } larger_hd->size = larger_hd->size + smaller_hd->size; larger_hd->tail = smaller_tl; smaller_hd->next = larger_tl; while (smaller_tl->val != larger_tl->val) { smaller_tl->head = larger_hd; smaller_tl = smaller_tl->next; } } }
The DSU is first setup by having each node be its own set, meaning the head and tail of the node should point back to itself.
Maintaining the fact that the head pointer is the representative of the set, can be done in by returning the value of the head of the node whose set representative we want to determine.
We first check for whether the two nodes are already in the same set. Which is done by simply comparing the value of for both nodes.
int parA = Find(a); int parB = Find(b); if (parA == parB) return;
If the two nodes are not in the same set, then they need to be merged. We keep track of the head and tail of both the larger and smaller sets.
Node* larger_hd = nodeList[a]->head; Node* larger_tl = nodeList[a]->head->tail; Node* smaller_hd = nodeList[b]->head; Node* smaller_tl = nodeList[b]->head->tail;
Before we begin the traversal, we ensure that the size of the set from is indeed larger than the size of the set from , if not we can simply swap them.
if (larger_hd->size < smaller_hd->size) { std::swap(larger_hd, smaller_hd); std::swap(larger_tl, smaller_tl); }
Recall, the conditions that must be maintained,
We achieve this by updating the size of the head of the larger set to increase by the size of the smaller set.
larger_hd->size = larger_hd->size + smaller_hd->size; larger_hd->tail = smaller_tl;
We iterate through the smaller set from the tail and set the head of each node in the smaller set to be the head of the larger set.
while (smaller_tl->val != larger_tl->val) { smaller_tl->head = larger_hd; smaller_tl = smaller_tl->next; }
We must also point the next node of the head of the smaller set to be the tail of the larger set so that the entire set can be traversed if it is the smaller set of a future merge.
smaller_hd->next = larger_tl;
For other uses and implementations for a DSU refer to here.