# Custom Comparators and Coordinate Compression

Authors: Darren Yao, Siyong Huang, Michael Cao, Benjamin Qi, Nathan Chen

Using a custom comparator to sort custom objects or values in a non-default order, and compressing values from a large range to a smaller one.

Resources | ||||
---|---|---|---|---|

IUSACO | partially based off this | |||

CPH | short overview of what this module will cover |

## Example - Wormhole Sort

Focus Problem – try your best to solve this problem before continuing!

View Internal SolutionWe won't discuss the full solution here, as some of the concepts necessary for solving this problem will be introduced later in Silver. However, many solutions to this problem start by sorting the edges in nondecreasing order of weight. For example, the sample contains the following edges:

1 2 9 1 3 7 2 3 10 2 4 3

After sorting, it should look like

2 4 3 1 3 7 1 2 9 2 3 10

With C++, the easiest method is to use a `vector`

of nested `pair`

s:

#include <bits/stdc++.h>using namespace std;#define f first#define s secondint main() {int M = 4;vector<pair<int, pair<int, int>>> v;for (int i = 0; i < M; ++i) {

or a `vector`

of `array<int,3>`

s or `vector<int>`

s:

int main() {int M = 4;vector<array<int, 3>> v; // or vector<vector<int>>for (int i = 0; i < M; ++i) {int a, b, w;cin >> a >> b >> w;v.push_back({w, a, b});}sort(begin(v), end(v));for (auto e : v) cout << e[1] << " " << e[2] << " " << e[0] << "\n";}

In Python, we can use a list of lists.

But in Java, we can't sort an `ArrayList`

of `ArrayList`

s without writing some
additional code. What should we do?

- If we only stored the edge weights and sorted them, we would have a sorted list of edge weights, but it would be impossible to tell which weights corresponded to which edges.
- However, if we create a
**class**representing the edges and define a**custom comparator**to sort them by weight, we can sort the edges in ascending order while also keeping track of their endpoints.

## Classes

First, we need to define a **class** that represents what we want to sort. In
our example we will define a class `Edge`

that contains the two endpoints of the
edge and the weight.

C++

### C++

A C++ `struct`

is the same as a `class`

in C++, but all members are public by
default.

#include <bits/stdc++.h>using namespace std;struct Edge {int a, b, w;};/* alternatively,class Edge {public:

Java

import java.util.*;public class Sol {static class Edge {int a, b, w;public Edge(int _a, int _b, int _w) {a = _a;b = _b;w = _w;}

Python

class Edge:def __init__(self, a, b, w):self.a = aself.b = bself.w = wv = []M = 4for i in range(M):a, b, w = map(int, input().split())v.append(Edge(a, b, w))for e in v:print(e.a, e.b, e.w)

## Comparators

Normally, sorting functions rely on moving objects with a lower value in front of objects with a higher value if sorting in ascending order, and vice versa if in descending order. This is done through comparing two objects at a time.

C++

What a comparator does is compare two objects as follows, based on our comparison criteria:

- If object $x$ is less than object $y$, return
`true`

- If object $x$ is greater than or equal to object $y$, return
`false`

Essentially, the comparator determines whether object $x$ belongs to the left of object $y$ in a sorted ordering.

### Warning!

A comparator **must** return false for two equal objects (not doing so results
in undefined behavior and potentially a verdict of wrong answer or runtime
error).

In addition to returning the correct answer, comparators should also satisfy the following conditions:

- The function must be consistent with respect to reversing the order of the arguments: if $x \neq y$ and
`compare(x, y)`

is`true`

, then`compare(y, x)`

should be`false`

and vice versa. - The function must be transitive. If
`compare(x, y)`

is true and`compare(y, z)`

is true, then`compare(x, z)`

should also be true. If the first two compare functions both return`false`

, the third must also return`false`

.

### Method 1 - Overloading the Less Than Operator

This is the easiest to implement. However, it only works for objects (not primitives) and it doesn't allow you to define multiple ways to compare the same type of class.

In the context of Wormhole Sort (note the use of const Edge&):

#include <bits/stdc++.h>using namespace std;struct Edge {int a, b, w;bool operator<(const Edge &y) { return w < y.w; }};int main() {int M = 4;

We can also overload the operator outside of the class:

struct Edge {int a, b, w;};bool operator<(const Edge &x, const Edge &y) { return x.w < y.w; }

or within it using friend:

struct Edge {int a, b, w;friend bool operator<(const Edge &x, const Edge &y) { return x.w < y.w; }};

### Method 2 - Comparison Function

This works for both objects and primitives, and you can declare many different comparators for the same object.

#include <bits/stdc++.h>using namespace std;struct Edge {int a, b, w;};bool cmp(const Edge &x, const Edge &y) { return x.w < y.w; }int main() {

We can also use lambda expressions in C++11 or above:

sort(begin(v), end(v), [](const Edge &x, const Edge &y) { return x.w < y.w; });

Java

What a `Comparator`

does is compare two objects as follows, based on our
comparison criteria:

- If object $x$ is less than object $y$, return a negative integer.
- If object $x$ is greater than object $y$, return a positive integer.
- If object $x$ is equal to object $y$, return 0.

In addition to returning the correct number, comparators should also satisfy the following conditions:

- The function must be consistent with respect to reversing the order of the arguments: if
`compare(x, y)`

is positive, then`compare(y, x)`

should be negative and vice versa. - The function must be transitive. If
`compare(x, y) > 0`

and`compare(y, z) > 0`

, then`compare(x, z) > 0`

. Same applies if the compare functions return negative numbers. - Equality must be consistent. If
`compare(x, y) = 0`

, then`compare(x, z)`

and`compare(y, z)`

must both be positive, both negative, or both zero. Note that they don't have to be equal, they just need to have the same sign.

Java has default functions for comparing `int`

, `long`

, `double`

types. The
`Integer.compare()`

, `Long.compare()`

, and `Double.compare()`

functions take two
arguments $x$ and $y$ and compare them as described above.

There are two ways of implementing this in Java: `Comparable`

, and `Comparator`

.
They essentially serve the same purpose, but `Comparable`

is generally easier
and shorter to code. `Comparable`

is a function implemented within the class
containing the custom object, while `Comparator`

is its own class.

### Method 1 - Comparable

We'll need to put `implements Comparable<Edge>`

into the heading of the class.
Furthermore, we'll need to implement the `compareTo`

method. Essentially,
`compareTo(x)`

is the `compare`

function that we described above, with the
object itself as the first argument, or `compare(self, x)`

.

When using Comparable, we can just call `Arrays.sort(arr)`

or
`Collections.sort(list)`

on the array or list as usual.

import java.util.*;public class Sol {static class Edge implements Comparable<Edge> {int a, b, w;public Edge(int _a, int _b, int _w) {a = _a;b = _b;w = _w;}

### Method 2 - Comparator

If instead we choose to use `Comparator`

, we'll need to declare a second class
that implements `Comparator<Edge>`

:

import java.util.*;public class Sol {static class Edge {int a, b, w;public Edge(int _a, int _b, int _w) {a = _a;b = _b;w = _w;}

When using `Comparator`

, the syntax for using the built-in sorting function
requires a second argument: `Arrays.sort(arr, new Comp())`

, or
`Collections.sort(list, new Comp())`

.

Python

### Defining Less Than Operator

class Edge:def __init__(self, a, b, w):self.a = aself.b = bself.w = wdef __lt__(self, other): # lt means less thanreturn self.w < other.w

### Key Function

This method maps an object to another comparable datatype with which to be sorted. This is the preferred method if you are only sorting something once. In this case we map edges to their weights.

class Edge:def __init__(self, a, b, w):self.a = aself.b = bself.w = wv = []M = 4for i in range(M):a, b, w = map(int, input().split())v.append(Edge(a, b, w))v.sort(key=lambda x: x.w)for e in v:print(e.a, e.b, e.w)

### Comparison Function

A comparison function in Python must satisfy the same properties as a comparator
in Java. Note that old-style cmp functions are
no longer supported,
so the comparison function must be converted into a key function with
`cmp_to_key`

. Most of the
time, it is better to use the key function, but in the rare case that the
comparison function is not easily represented as a key function, we can use
this.

from functools import cmp_to_keyclass Edge:def __init__(self, a, b, w):self.a = aself.b = bself.w = w

## Variations

### Sorting in Decreasing Order of Weight

We can replace all occurrences of `x.w < y.w`

with `x.w > y.w`

in our C++ code.
Similarly, we can replace all occurrences of `Integer.compare(x, y)`

with
`-Integer.compare(x, y)`

in our Java code. In Python, we can pass the parameter
`reverse=True`

to the `sort`

or `sorted`

function.

### Sorting by Two Criteria

Now, suppose we wanted to sort a list of `Edge`

s in ascending order, primarily
by weight and secondarily by first vertex (`a`

). We can do this quite similarly
to how we handled sorting by one criterion earlier. What the comparator function
needs to do is to compare the weights if the weights are not equal, and
otherwise compare first vertices.

C++

struct Edge {int a, b, w;bool operator<(const Edge &y) {if (w != y.w) return w < y.w;return a < y.a;}};

Java

static class Edge implements Comparable<Edge> {int a, b, w;public Edge(int _a, int _b, int _w) {a = _a;b = _b;w = _w;}public int compareTo(Edge y) {if (w != y.w) return Integer.compare(w, y.w);return Integer.compare(a, y.a);}}

Python

In Python, tuples have a natural order based on their elements in order. We can take advantage of this to write a comparator:

class Edge:def __init__(self, a, b, w):self.a = aself.b = bself.w = wdef __lt__(self, other): # lt means less thanreturn (self.w, self.a) < (other.w, other.a)

This also gives an easy way to write a key function to sort in this way:

edges: list[Edge]edges.sort(key=lambda edge: (edge.w, edge.a))

Sorting by an arbitrary number of criteria is done similarly.

C++

Java

With Java, we can implement a comparator for arrays of arbitrary length (although this might be more confusing than creating a separate class).

import java.util.*;public class Sol {static class Comp implements Comparator<int[]> {public int compare(int[] a, int[] b) {for (int i = 0; i < a.length; ++i)if (a[i] != b[i]) return Integer.compare(a[i], b[i]);return 0;}}

Python

## Coordinate Compression

Coordinate compression describes the process of mapping each value in a list to its index if that list was sorted. For example, the list $\{7, 3, 4, 1\}$ would be compressed to $\{3, 1, 2, 0\}$. Notice that $1$ is the least value in the first list, so it becomes $0$, and $7$ is the greatest value, so it becomes $3$, the largest index in the list.

When we have values from a large range, but we only care about their relative
order (for example, if we have to know if one value is above another),
coordinate compression is a simple way to help with implementation. For example,
if we have a set of integers ranging from $0$ to $10^9$, we can't use them as
array indices because we'd have to create an array of size $10^9$, which would
surely cause a `Memory Limit Exceeded`

verdict. However, if there are only
$N \leq 10^6$ such integers, we can coordinate-compress their values, which
guarantees that the values will all be in the range from $0$ to $N-1$, which
*can* be used as array indices.

Focus Problem – try your best to solve this problem before continuing!

View Internal Solution### Example 1

A good example of coordinate compression in action is in the solution of *USACO
Rectangular Pasture.* Again, we won't delve into the full solution but instead
discuss how coordinate compression is applied. Since the solution uses
2D-prefix sums (another Silver topic), it is helpful if
all point coordinates are coordinate-compressed to the range $0$ to $N-1$ so
they can be used as array indices. Without coordinate compression, creating a
large enough array would result in a `Memory Limit Exceeded`

verdict.

Below you will find the solution to Rectangular Pasture, which uses coordinate compression at the beginning. Observe how a custom comparator is used to sort the points:

C++

#include <algorithm>#include <iostream>using namespace std;typedef pair<int, int> Point;bool ycomp(Point p, Point q) { return p.second < q.second; }const int MAX_N = 2500;int pref[MAX_N + 1][MAX_N + 1];

Java

import java.util.Arrays;import java.util.Comparator;import java.util.Scanner;public class RectangularPasture {static int[][] sums;static int getSum(int fromX, int toX, int fromY, int toY) {return sums[toX][toY] - sums[fromX - 1][toY] - sums[toX][fromY - 1] +sums[fromX - 1][fromY - 1];

Python

### Warning!

Due to Python's constant factor, the below code will TLE on a couple test cases despite having the correct time complexity.

import sysinput = sys.stdin.readlinen = int(input().strip())points = [list(map(int, input().strip().split())) for _ in range(n)]points.sort(key=lambda i: i[1])for i in range(n):points[i][1] = i + 1

The solution to Rectangular Pasture directly replaces coordinates with their compressed values, and forgets the real values of the coordinates because they are unnecessary. However, there may be problems for which we need to also remember the original values of coordinates that we compress.

### Example 2

Focus Problem – try your best to solve this problem before continuing!

This problem will require prefix sums and coordinate compression. However, the
implementation of coordinate compression in this solution will also require
remembering values in addition to compressing them (as opposed to just replacing
the original values, as in the last problem). If you just want to focus on the
implementation of coordinate compression and how it can switch between
compressed indices and original values, see the contracted code below. `indices`

is a list of values that need to be compressed. After it gets sorted and has
duplicate values removed, it is ready to use. The method `getCompressedIndex`

takes in an original value, and binary searches for its
position in `indices`

to get its corresponding compressed index. To go from a
compressed index to an original value, the code can just access that index in
`indices`

.

We also provide a more detailed explanation:

Detailed Explanation

C++

#include <bits/stdc++.h>using namespace std;typedef long long ll;ll difference_array[400005];// difference_array[i] = the difference of the values between special intervals// i-1 and i

Java

import java.io.*;import java.util.*;public class StaticRangeQueries {// custom class that stores a few integers (used for directly storing the// queries and updates given in the input)static class Query {int l, r, v;public Query(int l, int r, int v) {this.l = l;

## Problems

### Pro Tip

Many of the problems below may use other Silver concepts, such as prefix sums.

Status | Source | Problem Name | Difficulty | Tags | |
---|---|---|---|---|---|

CSES | Easy | ## Show TagsPrefix Sums, Sorting | |||

Silver | Normal | ## Show TagsPrefix Sums, Sorting | |||

Silver | Normal | ## Show TagsPrefix Sums, Sorting | |||

Silver | Normal | ## Show TagsSorting | |||

Silver | Normal | ## Show TagsSorting | |||

Gold | Normal | ## Show TagsPrefix Sums, Sorting | |||

CF | Normal | ## Show TagsSorting | |||

CF | Normal | ## Show TagsPrefix Sums, Sorting | |||

CF | Normal | ## Show TagsSorting | |||

Silver | Hard | ## Show TagsSorting | |||

Silver | Hard | ## Show TagsSorting | |||

Silver | Very Hard | ## Show TagsSorting |

## Quiz

What would the list $\{40, 21, -4, 1000, 5\}$ be coordinate compressed to?

### Module Progress:

### Join the USACO Forum!

Stuck on a problem, or don't understand a module? Join the USACO Forum and get help from other competitive programmers!