Small-To-Large Merging

Authors: Michael Cao, Benjamin Qi

Contributor: Neo Wang

A way to merge two sets efficiently.

Resources
		CPH	18.4 - Merging Data Structures
		CF	Arpa - Sack (DSU on Tree)
		CF	tuwuna - Explaining DSU on Trees

Merging Data Structures

Obviously linked lists can be merged in $\mathcal{O}(1)$ time. But what about sets or vectors?

Distinct Colors

CSES - Easy

Focus Problem – try your best to solve this problem before continuing!

Let's consider a tree rooted at node $1$, where each node has a color.

For each node, let's store a set containing only that node's color. We want to merge the sets in each node's subtree together so that every node ends up with a set consisting of all colors in its subtree. Doing this allows us to solve a variety of problems, such as querying the number of distinct colors in each subtree.

Naive Solution

Suppose that we want to merge two sets $a$ and $b$ of sizes $n$ and $m$, respectively. One possibility is the following:

for (int x : b) a.insert(x);

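which runs in $\mathcal{O}(m\log (n+m))$ time. Applied at every node of a tree with $N$ nodes, this yields a runtime of $\mathcal{O}(N^2\log N)$ in the worst case (for example, when the tree is a path and each child's set is inserted into its parent's). If we instead maintain $a$ and $b$ as sorted vectors, we can merge them in $\mathcal{O}(n+m)$ time, but $\mathcal{O}(N^2)$ is also too slow.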
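As an illustration of the sorted-vector alternative, here is a minimal sketch that assumes both inputs are sorted and contain no duplicates. The helper name merge_sorted is hypothetical; std::set_union from the standard library does the linear-time work.

```cpp
#include <algorithm>
#include <iterator>
#include <vector>

// Hypothetical helper: merges two sorted, duplicate-free vectors into one
// sorted, duplicate-free vector in O(n + m) time.
std::vector<int> merge_sorted(const std::vector<int>& a, const std::vector<int>& b) {
	std::vector<int> result;
	result.reserve(a.size() + b.size());
	// std::set_union keeps a single copy of a value that appears in both inputs
	std::set_union(a.begin(), a.end(), b.begin(), b.end(), std::back_inserter(result));
	return result;
}
```

Each call is linear in the combined size, so repeating it up a path-shaped tree still costs quadratic time overall.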

Better Solution

With just one additional line of code, we can significantly speed this up.

if (a.size() < b.size()) swap(a, b);
for (int x : b) a.insert(x);

Note that swap exchanges two sets in $\mathcal{O}(1)$ time. Thus, merging a smaller set of size $m$ into the larger one of size $n$ takes $\mathcal{O}(m\log n)$ time.
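The two lines above can be wrapped in a small reusable function. This is only a sketch: the name merge_into and the choice to clear the smaller set afterwards are assumptions, not part of the original code.

```cpp
#include <set>
#include <utility>

// Hypothetical helper: after the call, `a` holds the union of both sets
// and `b` is empty.
void merge_into(std::set<int>& a, std::set<int>& b) {
	// Make sure `a` is the larger set; swapping two std::set objects is O(1)
	if (a.size() < b.size()) std::swap(a, b);
	// Only the elements of the smaller set are moved
	for (int x : b) a.insert(x);
	b.clear();
}
```

For example, calling it with a = {1} and b = {1, 2, 3} swaps the two sets first and then inserts a single element.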

Claim: The solution runs in $\mathcal{O}(N\log^2N)$ time.

Proof: When merging two sets, you move elements from the smaller set to the larger set. If the size of the smaller set is $X$, then the size of the resulting set is at least $2X$. Thus, an element that has been moved $Y$ times will be in a set of size at least $2^Y$, and since the maximum size of a set is $N$ (the root), each element will be moved at most $\mathcal{O}(\log N)$ times. There are $N$ elements and each move is a set insertion costing $\mathcal{O}(\log N)$, so the total running time is $\mathcal{O}(N\log^2N)$.
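To see the effect of the swap concretely, the following sketch counts insertions when the sets of a path $1 - 2 - \dots - n$ (rooted at $1$, all colors distinct) are merged bottom-up. The function name count_moves_on_path and the path-shaped input are assumptions chosen for illustration.

```cpp
#include <set>
#include <utility>

// Counts element insertions when merging the sets of a path bottom-up.
// With small_to_large == false, the child's set is always inserted into
// the parent's one-element set.
long long count_moves_on_path(int n, bool small_to_large) {
	long long moves = 0;
	std::set<int> child = {n};  // set of the deepest node
	for (int v = n - 1; v >= 1; v--) {
		std::set<int> cur = {v};  // set of node v before merging
		if (small_to_large && cur.size() < child.size()) std::swap(cur, child);
		for (int x : child) {
			cur.insert(x);
			moves++;
		}
		child = std::move(cur);  // v's finished set is the next child set
	}
	return moves;
}
```

On a path of $1000$ nodes, the version without the swap performs $499500$ insertions, while the version with the swap performs $999$, well within the $N\log N$ bound on moves from the proof.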

Full Code

#include <bits/stdc++.h>

using namespace std;

const int MAX_N = 2e5;

// nodes will be 1-indexed like in the problem
vector<int> adj[MAX_N + 1];

set<int> colors[MAX_N + 1];
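The listing above stops after the declarations. As a hedged sketch of how the rest of a solution might look (not necessarily the authors' code), the function below computes the answer for every node. The name distinct_colors, its parameters, and the iterative traversal (used to avoid deep recursion on long paths) are all assumptions.

```cpp
#include <set>
#include <utility>
#include <vector>

// Returns ans where ans[v] is the number of distinct colors in v's subtree.
// Nodes are 1-indexed and the tree is rooted at node 1; index 0 is unused.
std::vector<int> distinct_colors(int n, const std::vector<int>& color,
                                 const std::vector<std::pair<int, int>>& edges) {
	std::vector<std::vector<int>> adj(n + 1);
	for (const auto& e : edges) {
		adj[e.first].push_back(e.second);
		adj[e.second].push_back(e.first);
	}
	// Iterative DFS that records a preorder and each node's parent
	std::vector<int> parent(n + 1, 0), order;
	std::vector<int> stack = {1};
	while (!stack.empty()) {
		int v = stack.back();
		stack.pop_back();
		order.push_back(v);
		for (int u : adj[v]) {
			if (u != parent[v]) {
				parent[u] = v;
				stack.push_back(u);
			}
		}
	}
	std::vector<std::set<int>> colors(n + 1);
	std::vector<int> ans(n + 1, 0);
	// Reverse preorder visits every node after all of its descendants
	for (int i = n - 1; i >= 0; i--) {
		int v = order[i];
		colors[v].insert(color[v]);
		ans[v] = (int)colors[v].size();
		int p = parent[v];
		if (p == 0) continue;  // the root has no parent to merge into
		// Small-to-large: keep the bigger set at the parent
		if (colors[p].size() < colors[v].size()) std::swap(colors[p], colors[v]);
		for (int c : colors[v]) colors[p].insert(c);
		colors[v].clear();
	}
	return ans;
}
```

A full submission would still need to read $n$, the colors, and the edges from standard input and print ans[1] through ans[n].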

Generalizing

We can also merge other standard library data structures such as std::map or std::unordered_map in the same way. However, std::swap does not always run in $\mathcal{O}(1)$ time. For example, swapping std::arrays takes time linear in the sum of the sizes of the arrays, and the same goes for GCC policy-based data structures such as __gnu_pbds::tree or __gnu_pbds::gp_hash_table.

To swap two policy-based data structures a and b in $\mathcal{O}(1)$ time, use a.swap(b) instead. Note that for standard library data structures, swap(a,b) is equivalent to a.swap(b).
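As a small sketch of the same idea on another container, the function below merges two std::map frequency tables (color to count) and uses the member form a.swap(b). The name merge_counts is hypothetical, and adding up the counts is just one possible way to combine values.

```cpp
#include <map>

// Hypothetical helper: merges frequency map `b` into `a`, summing counts.
// Afterwards `a` holds the combined counts and `b` is empty.
void merge_counts(std::map<int, int>& a, std::map<int, int>& b) {
	// Member swap is O(1) for std::map
	if (a.size() < b.size()) a.swap(b);
	for (const auto& entry : b) a[entry.first] += entry.second;
	b.clear();
}
```

Writing the swap as a.swap(b) means the line would not need to change if the maps were later replaced with a policy-based container.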

Problems

Source	Problem Name	Difficulty	Tags
CF	Lomsat gelral	Normal	Merging
Platinum	Promotion Counting	Normal	Indexed Set, Merging
Platinum	Disruption	Normal	Merging
POI	2011 - Tree Rotations	Normal	Indexed Set, Merging
IOI	2011 - Race	Normal	Centroid, Merging
JOI	2020 - Joitter	Hard	Merging
COI	2009 - Loza	Hard	Merging
JOI	2019 - Virus	Very Hard	Merging, SCC

Optional: Faster Merging

It's easy to merge two sets of sizes $n\ge m$ in $\mathcal{O}(n+m)$ or $\mathcal{O}(m\log n)$ time, but sometimes $\mathcal{O}\left(m\log \left(1+\frac{n}{m}\right)\right)$ can be significantly better than both of these. Check "Advanced - Treaps" for more details. Also see this link regarding merging segment trees.
