PrevNext
Rare
 0/10

(Optional) Bitsets

Author: Benjamin Qi

Prerequisites

  • Errichto - Bitwise Operations Pt 1

Examples of how bitsets give some unintended solutions on recent USACO problems.

Tutorial

tl;dr some operations are around 32-64x faster compared to a boolean array. See the C++ Reference for the operations you can perform.

Resources
CPHspeeding up Hamming distance
CF

Knapsack

Focus Problem – read through this problem before continuing!

Of course, the first step is to generate the sizes of each connected component.

1#include <bits/stdc++.h>
2using namespace std;
3
4struct DSU {
5 vector<int> e; void init(int N) { e = vector<int>(N,-1); }
6 int get(int x) { return e[x] < 0 ? x : e[x] = get(e[x]); }
7 bool sameSet(int a, int b) { return get(a) == get(b); }
8 int size(int x) { return -e[get(x)]; }
9 bool unite(int x, int y) { // union by size
10 x = get(x), y = get(y); if (x == y) return 0;

A naive knapsack solution would be as follows. For each 0icomps.size()0\le i\le \texttt{comps.size()}, let dp[i][j]=1\texttt{dp}[i][j]=1 if there exists a subset of the first ii components whose sizes sum to jj. Then the answer will be stored in dp[i]\texttt{dp}[i]. This runs in O(N2)O(N^2) and is too slow if implemented naively, but we can use bitset to speed it up!

Note: you can't store all NN bitsets in memory at the same time (more on that below).

Full Solution

Challenge: This solution runs in 0.3s\approx 0.3\text{s} when N=105N=10^5 and there are no edges. Find a faster solution which can also be sped up with bitset (my solution runs in 0.03s).

Cowpatibility (Gold)

Focus Problem – read through this problem before continuing!

Label the cows from 0N10\ldots N-1. For two cows xx and yy set adj[x][y]=1 if they share a common flavor. Then the number of pairs of cows that are compatible (counting each pair where xx and yy are distinct twice) is equal to the sum of adj[x].count() over all xx. It remains to compute adj[x] for all xx.

Unfortunately, storing NN bitsets each with NN bits takes up 500002324=312.5106\frac{50000^2}{32}\cdot 4=312.5\cdot 10^6 bytes of memory, which is greater than USACO's 256256 megabyte limit. We can reduce the memory usage by half in exchange for a slight increase in time by first computing the adjacency bitsets for all x[0,N/2)x\in [0,N/2), and then for all x[N/2,N)x\in [N/2,N) afterwards.

First, we read in all of the flavors.

1#include <bits/stdc++.h>
2using namespace std;
3
4typedef long long ll;
5typedef bitset<50000> B;
6const int HALF = 25000;
7
8int N;
9B adj[HALF];
10vector<int> flav[1000001];

Then for each flavor, we can look at all pairs of cows that share that flavor and update the adjacency lists for those x[0,HALF)x\in [0,HALF).

1int main() {
2 input();
3 for (int i = 1; i <= 1000000; ++i)
4 for (int x: flav[i]) if (x < HALF)
5 for (int y: flav[i]) adj[x][y] = 1;
6 for (int i = 0; i < HALF; ++i) ans += adj[i].count();
7}

adj[i].count() runs quickly enough since its runtime is divided by the bitset constant. However, looping over all cows in flav[i] is too slow if say, flav[i] contains all cows. Then the nested loop could take Θ(N2)\Theta(N^2) time! Of course, we can instead write the nested loop in a way that takes advantage of fast bitset operations once again.

1for (int i = 1; i <= 1000000; ++i) if (flav[i].size() > 0) {
2 B b; for (int x: flav[i]) b[x] = 1;
3 for (int x: flav[i]) if (x < HALF) adj[x] |= b;
4}

The full main function is as follows:

Full Solution

Apparently no test case contains more than 2500025000 distinct colors, so we don't actually need to split the calculation into two halves.

Lots of Triangles

Focus Problem – read through this problem before continuing!

First, we read in the input data. cross(a,b,c) is positive iff c lies to the left of the line from a to b.

1#include <bits/stdc++.h>
2using namespace std;
3
4typedef long long ll;
5typedef pair<ll,ll> P;
6
7#define f first
8#define s second
9
10ll cross(P a, P b, P c) {

There are O(N3)O(N^3) possible lots. Trying all possible lots and counting the number of trees that lie within each in O(N)O(N) for a total time complexity of O(N4)O(N^4) should solve somewhere between 2 and 5 test cases. Given a triangle t[0], t[1], t[2] with positive area, tree x lies within it iff x is to the left of each of sides (t[0],t[1]),(t[1],t[2]), and (t[2],t[0]).

1int main() {
2 input();
3 vector<int> res(N-2);
4 for (int i = 0; i < N; ++i) for (int j = i+1; j < N; ++j)
5 for (int k = j+1; k < N; ++k) {
6 vector<int> t = {i,j,k};
7 if (cross(v[t[0]],v[t[1]],v[t[2]]) < 0) swap(t[1],t[2]);
8 int cnt = 0;
9 for (int x = 0; x < N; ++x) {
10 if (cross(v[t[0]],v[t[1]],v[x]) <= 0) continue;

The analysis describes how to count the number of trees within a lot in O(1)O(1), which is sufficient to solve the problem. However, O(N)O(N) is actually sufficient as long as we divide by the bitset constant. Let b[i][j][k]=1 if k lies to the left of side (i,j). Then x lies within triangle (t[0],t[1],t[2]) as long as b[t[0]][t[1]][x]=b[t[1]][t[2]][x]=b[t[2]][t[0]][x]=1. We can count the number of x such that this holds true by taking the bitwise AND of the bitsets for all three sides and then counting the number of bits in the result.

Fast Solution

Knapsack Again

(GP of Bytedance 2020 F)

Given nn (n2104n\le 2\cdot 10^4) positive integers a1,,ana_1,\ldots,a_n (ai2104a_i\le 2\cdot 10^4), find the max possible sum of a subset of a1,,ana_1,\ldots,a_n whose sum does not exceed cc.

Consider the case when aic\sum a_i\ge c. The intended solution runs in O(nmax(ai))O(n\cdot \max(a_i)); see here for more information. However, we'll solve it with bitset instead.

As with the first problem in this module, let dp[i][j]=1\texttt{dp}[i][j]=1 if there exists a subset of the first numbers components that sums to jj. This solution runs in O(nai)O(n\cdot \sum a_i) time, which is too slow even if we use bitset.

Taking inspiration from this CF blog post, we'll first shuffle the integers randomly and perform the DP with the following modification:

  • If cinjX\left|\frac{ci}{n}-j\right| \ge X for some XX that we choose, then set dp[i][j]=0\texttt{dp}[i][j]=0.

Since we only need to keep track of 2X+12X+1 values for each ii, this solution runs in O(nX)O(nX) time, which runs in time with X=5105X=5\cdot 10^5 using bitset.

Intuitively, the random shuffle reduces the optimal subset to some random walk which should have variance at most maxaiN\max a_i\cdot \sqrt N, so it suffices to take XmaxaiNX\approx \max a_i\cdot \sqrt N. (Though I'm not completely convinced that this works, does anyone know how to bound the failure probability of this algorithm precisely?)

Solution

Other Applications

Use to speed up the following:

  • Gaussian Elimination in O(N3)O(N^3)
  • Bipartite matching in O(N3)O(N^3) (dense graph with NN vertices on each side)
  • BFS through dense graph in O(N2)O(N^2)
  • In general, passing solutions with an additional factor of NN.

Operations such as _Find_first() and _Find_next() mentioned in Errichto's blog can be helpful.

Resources
GFGonly resource I could find :P

A comment regarding the last two applications:

StatusSourceProblem NameDifficultyTagsSolution
CSAHard
Show Tags

DSU

Check CSA

In USACO Camp, a similar problem appeared with N105N\le 10^5 and a 66 second time limit (presumably to allow O(Nlog2N)O(N\log ^2N) solutions to pass). I had already done this problem but forgot how I had solved it decided to try something new. Try to guess what I did!

Problems

StatusSourceProblem NameDifficultyTagsSolution
PlatNormal
Show Tags

Bitset, Sliding Window

External Sol
Baltic OINormal
Show Tags

Bitset

External Sol
Baltic OINormal
Show Tags

Knapsack, Bitset

IZhOHard
Show Tags

Knapsack, Bitset

View Solution
Baltic OIHard
Show Tags

Knapsack, Bitset

COCIHard
Show Tags

Bitset

View Solution

Module Progress:

Give Us Feedback on (Optional) Bitsets!

PrevNext