CF - Connected Components?
Author: Daniel Qiu
Solution 1 - Faster DFS
Doing a standard DFS would be too slow. There can be at most edges, which would result in time complexity. To speed it up, we have to do two optimizations.
Optimization 1
We can store whether edges are non-existent using an array of hashsets instead of an adjacency list. This way we can find if there is no edge between two nodes in time. Using an adjacency matrix would take too much memory.
Optimization 2
Rather than using a standard boolean array to check if a node has not yet been visited, we can use a set of unvisited nodes. This way we only loop through the nodes that have not been visited yet and can find the next unvisited node in . We need to use upper_bound
to find
the next unvisited node after a DFS - an iterator may be invalidated due to the
node it is pointing to being destroyed when we remove it from the unvisited set.
Proof
We know that a node can either be skipped or visited. We also know that once a node is visited, it cannot be revisited. Therefore we will always visit nodes once. We also know that the sum of skips can be at most (each non-edge skips two nodes). Therefore the overall complexity will be .
Implementation
Time Complexity:
C++
#include <bits/stdc++.h>using namespace std;const int MAX_N = 200000;unordered_set<int> adj[MAX_N];set<int> unvis; // the set of nodes that have not been visitedint sz[MAX_N]; // sz[i] = size of the ith connected componentint cur = 0; // current amount of groupsvoid dfs(int x) {sz[cur]++;
Solution 2 - Selective DFS
Let the number of nodes in the graph, an arbitrary node, and the number of neighboring nodes (nodes that are directly connected to ).
We know that the size of the component containing node must be at least ( doesn't include ).
If and , and must intersect, otherwise the graph must have more than nodes.
Given this, we can just put all nodes with into the same group in time.
Now we can just process all nodes with , with DFS.
Proof
We can handle all nodes with in time.
Let there be nodes that have at most edges. Then, they also must have at least non-edges. Therefore, there must be at least non-edges (each edge accounts for two nodes).
The input states that the number of edges is less than , therefore , therefore .
Doing a simple DFS and checking whether two nodes intersect would be , which would fit under the time limit. The comes from processing input. Therefore, this runs under the time limit.
Implementation
Time Complexity:
C++
#include <bits/stdc++.h>using namespace std;using vi = vector<int>;#define pb push_backconst int MAX_N = 200000;unordered_set<int> adj[MAX_N];// group[i] = the group of the ith node// group[i] = 0 if the ith node hasn't been assigned a group yetint group[MAX_N];
Join the USACO Forum!
Stuck on a problem, or don't understand a module? Join the USACO Forum and get help from other competitive programmers!