Has Not Appeared
 0/6

Randomness

Author: Ryan Fu

Using randomized algorithms to solve problems.

Edit This Page

Sometimes, we can use randomized algorithms to solve tasks quite difficult for deterministic algorithms.

Example - Ghd

Focus Problem – try your best to solve this problem before continuing!

Let the answer be kk. The key observation for this problem is that a randomly selected integer in aa has at least a 12\frac{1}{2} chance to be divisible by kk.

Thus, we can repeat the following procedure SS times to find the optimal answer with probability 112S1-\frac{1}{2^S}:

  • Let xx be a randomly selected value in aa.
  • For each divisor dxd \mid x, determine how many numbers dd divides into in aa.
  • Take our answer to be the minimum of all dd such that n2\geq \frac{n}{2} numbers in aa are divisible by dd

If we take S=15S = 15, and estimate the number of testcases in Codeforces to be 500500, our probability of success is (11215)500.984(1-\frac{1}{2^{15}})^{500} \approx .984, which is acceptable.

Implementation

Our approach has a time complexity of O(S(NlogA+d(A)2+A))\mathcal{O}\left(S \cdot \left(N\log A + d(A)^2 + \sqrt{A}\right)\right) , with the log factor occurring due to GCD computation, and the sqrt due to factoring.

#include <bits/stdc++.h>
using namespace std;
const int MAX_N = 1e6 + 5;
const int S = 15;
mt19937 rng(chrono::steady_clock::now().time_since_epoch().count());
long long a[MAX_N], dv[MAX_N], cnt[MAX_N];
int main() {

Example - Count on a Tree II Striking Back

Focus Problem – try your best to solve this problem before continuing!

Trying to solve this problem directly - explicitly finding values for f(a,b)f(a,b) and f(c,d)f(c,d) - is quite difficult. Supporting online queries of range count distinct along with updates is quite difficult to achieve on an array in reasonable complexity, and the fact that this problem occurs on a tree, with relative large bound for n500n \leq 500 000000 makes it clear that solving for the value of ff is not intended.

Instead, we can attempt to use the property given in the statement that either f(a,b)2f(c,d)f(a,b) \geq 2f(c,d) or f(c,d)2f(a,b)f(c,d) \geq 2f(a,b). This inspires the following thought - if we map each color to a random number, the path minimum along the path with more distinct colors will probably be a smaller value.

We attempt to formalize this approach, and push the probability to an acceptable bound. Notably, the probability of failure is too high when we do the aforementioned random process once. To remedy this, we proceed as follows:

  • First we fix SS to be the number of "simulations" that we run.
  • Next, for each possible color 1cn1 \leq c \leq n, we define vcjv_{cj} as an independent random value within [0,1][0,1], for each 1jS1 \leq j \leq S.
  • Then, we construct SS trees TiT_i, of the same structure as the initial tree, but with vcuiv_{c_u i} marked on a node uu.
  • For each query comparing a,b,c,da,b,c,d, we output Yes\texttt{Yes} if
iSPathMin(Ti,a,b)<iSPathMin(Ti,c,d)\sum_{i\leq S} \text{PathMin}(T_i,a,b) < \sum_{i\leq S} \text{PathMin}(T_i,c,d)

and No\texttt{No} otherwise.

This approach should "combine" the probabilities in a way that makes the more likely event (the path with more distinct values achieving a smaller value) exponentially more likely. For some intuition on this, if we flip a coin weighted with 23\frac{2}{3} probability to be heads once, there is a 13\frac{1}{3} probability it comes up tails more than heads. However, if we flip it 9999 times, the probability it comes up more tails more than heads becomes

k=5099(99k)(13)k(23)99k.000309\sum_{k=50}^{99} \binom{99}{k} \left(\frac{1}{3}\right)^k \left(\frac{2}{3}\right)^{99-k} \approx .000309

Calculating the exact probability that our approach works is more difficult than the coin toss example, but follows the same principle of failure becoming exponentially more unlikely across several trials. We can use some assumptions (notably the central limit theorem) to calculate a rough approximation of the probability of success. Here is a graph of the probability of passing the problem given a certain value of SS. Note that the probability of success remains is largely the same across different counts of distinct colors along the path.

Graph

The graph was generated for the probability of answering 41044\cdot10^4 queries correctly in a row. This assumption is reasonable because the test data consists of only 1 large test case.

The python code used to generate the estimates is shown below:

import math
from scipy.stats import norm
def clt_probability(k, S):
"""
Computes the probability that the sum of S minima from 2k random draws
is less than the sum of S minima from k random draws, as approximated by the CLT.
The CLT approximation is given by:

Implementation

By supporting path queries with HLD, we can solve this problem in O(S(n+mlog2n))\mathcal{O}(S\cdot(n + m\log^2n)).

#include "bits/stdc++.h"
using namespace std;
using us = unsigned short;
const int MAX_N = 5e5 + 1;
const int S = 150;
const us INF = 65535;
us va[MAX_N][S];

Problems

Some of the listed problems do not require random algorithms as a solution, but can be made significantly easier through using random algorithms.

StatusSourceProblem NameDifficultyTags
MHCEasy
Show TagsRandom
CFNormal
Show TagsRandom
CFNormal
Show TagsRandom
CFNormal
Show TagsRandom

Module Progress:

Join the USACO Forum!

Stuck on a problem, or don't understand a module? Join the USACO Forum and get help from other competitive programmers!