PrevNext
Not Frequent
 0/13

More Operations on Sorted Sets

Authors: Darren Yao, Benjamin Qi, Andrew Wang

Contributors: Aadit Ambadkar, Jeffrey Hu, Jason Sun

Finding the next element smaller or larger than a specified key in a sorted set, and using iterators with sorted sets.

C++

Resources
IUSACO

module is based off this

CP2

see decription of BSTs and heaps

Java

Resources
IUSACO

module is based off this

CP2

see decription of BSTs and heaps


In sets and maps where keys (or elements) are stored in sorted order, accessing or removing the next key higher or lower than some input key k is supported.

Keep in mind that insertion and deletion will take O(logN)\mathcal{O}(\log N) time for sorted sets, which is more than the average O(1)\mathcal{O}(1) insertion and deletion for hashsets, but less than the worst case O(N)\mathcal{O}(N) insertion and deletion for hashsets.

Using Iterators

In Bronze, we avoided discussion of any set operations involving iterators.

C++

Java

In Java, iterators are helpful for looping through sets.

Iterators used with HashSet would yield the elements in random order:

Set<Integer> set = new HashSet<Integer>();
set.add(1);
set.add(3);
set.add(0);
set.add(-2);
Iterator it = set.iterator();
while (it.hasNext()) {
Integer i = (Integer)it.next();
System.out.print(i + " "); // returns some random order
}

But with TreeSet the elements are in sorted order:

Set<Integer> set = new TreeSet<Integer>();
set.add(1);
set.add(3);
set.add(0);
set.add(-2);
Iterator it = set.iterator();
while (it.hasNext()) {
Integer i = (Integer)it.next();
System.out.print(i + " "); // returns -2 0 1 3
}

Instead of creating an iterator and looping with it like in C++, Java provides a for-each loop which creates a hidden iterator and loops with it automatically:

Set<Integer> set = new TreeSet<Integer>();
set.add(1);
set.add(3);
set.add(0);
set.add(-2);
for (int i : set) {
System.out.print(i + " "); // returns -2 0 1 3
}

Warning!

You shouldn't modify sets when traversing it with set iterators like in any other iterators for Collections (this INCLUDES when using a for-each loop). The only modification possible is using the iterator remove() method which can only be used once before calling the next() method.

Python

In Python, we can use iter() to obtain the iterator object of any iterable. Using next() on the iterator lets you iterate through the iterable. Below, a dictionary is used instead of a set because dictionaries keep order.

Iterating through a dictionary representation of an ordered set:

nums = {0: None, 1: None, 2: None, 4: None, 7: None}
iterator = iter(nums)
print(next(iterator)) # 0
print(next(iterator)) # 1
print(next(iterator)) # 2
print(next(iterator)) # 4
print(next(iterator)) # 7

Warning!

As of Python 3.6, dictionaries are ordered by insertion order. For older versions, you can use an OrderedDict from collections. Keep in mind that the above representation is of an ordered set, not a sorted set, because Python does not have sorted sets in its standard library, as you will see in the next section.

Python's iterators are fundamental to iteration and are used in its for loops and tuple unpacking. This is useful when you want more control over your iteration. It can also be simply used in cases where you just want the first or any element.

Sorted Sets

C++

The sorted std::set also supports:

  • lower_bound: returns an iterator to the least element greater than or equal to some element k.
  • upper_bound: returns an iterator to the least element strictly greater than some element k.
set<int> s;
s.insert(1); // [1]
s.insert(14); // [1, 14]
s.insert(9); // [1, 9, 14]
s.insert(2); // [1, 2, 9, 14]
cout << *s.upper_bound(7) << '\n'; // 9
cout << *s.upper_bound(9) << '\n'; // 14
cout << *s.lower_bound(5) << '\n'; // 9
cout << *s.lower_bound(9) << '\n'; // 9
cout << *s.begin() << '\n'; // 1
auto it = s.end();
cout << *(--it) << '\n'; // 14
s.erase(s.upper_bound(6)); // [1, 2, 14]

Warning!

Suppose that we replace s.upper_bound(7) with upper_bound(begin(s),end(s),7), which was the syntax that we used for vectors in the prerequisite module. This will still output the expected results, but its time complexity is linear in the size of the set s rather than logarithmic, so make sure to avoid it!

Java

TreeSets in Java allow for a multitude of additional operations:

  • first(): returns the lowest element in the set
  • last(): returns the greatest element in the set
  • lower(E v): returns the greatest element strictly less than v
  • floor(E v): returns the greatest element less than or equal to v
  • higher(E v): returns the least element strictly greater than v
  • ceiling(E v): returns the least element greater than or equal to v
TreeSet<Integer> set = new TreeSet<Integer>();
set.add(1); // [1]
set.add(14); // [1, 14]
set.add(9); // [1, 9, 14]
set.add(2); // [1, 2, 9, 14]
System.out.println(set.higher(7)); // 9
System.out.println(set.higher(9)); // 14
System.out.println(set.lower(5)); // 2
System.out.println(set.first()); // 1
System.out.println(set.last()); // 14
set.remove(set.higher(6)); // [1, 2, 14]
System.out.println(set.higher(23)); // ERROR, no such element exists

Python

Python does not have a sorted set nor a sorted map, so see the C++ or Java if you want an implementation from the standard library. However, if you are still curious regarding the Python implementation, you can find an AVL tree representation of a sorted set below.

Resources
DSA Python

definition and implementation of AVL Trees in Python

Warning!

You are not expected to know how to create an AVL tree in USACO Silver, nor is it recommended to do so for a representation of a sorted set since other languages have sorted sets built-in.

Since some online judges include additional libraries, an implementation of sorted sets from the sortedcontainers library (which is not included in most online judges like USACO) is shown below. All operations shown below are O(logN)\mathcal{O}(\log N) time, except for its O(NlogN)\mathcal{O}(N \log N) initialization.

from sortedcontainers import SortedSet
sorted_set = SortedSet([5, 1, 3, 2])
print(sorted_set) # SortedSet([1, 2, 3, 4, 7])
# Add elements
sorted_set.add(4)
sorted_set.add(6)
print(sorted_set) # SortedSet([1, 2, 3, 4, 5, 6])

One limitation of sorted sets is that we can't efficiently access the kthk^{th} largest element in the set, or find the number of elements in the set greater than some arbitrary xx. In C++, these operations can be handled using a data structure called an order statistic tree.

Sorted Maps

C++

The ordered map also allows:

  • lower_bound: returns the iterator pointing to the lowest entry not less than the specified key
  • upper_bound: returns the iterator pointing to the lowest entry strictly greater than the specified key respectively.
map<int, int> m;
m[3] = 5; // [(3, 5)]
m[11] = 4; // [(3, 5); (11, 4)]
m[10] = 491; // [(3, 5); (10, 491); (11, 4)]
cout << m.lower_bound(10)->first << " " << m.lower_bound(10)->second << '\n'; // 10 491
cout << m.upper_bound(10)->first << " " << m.upper_bound(10)->second << '\n'; // 11 4
m.erase(11); // [(3, 5); (10, 491)]
if (m.upper_bound(10) == m.end()) {
cout << "end" << endl; // Prints end
}

Java

The ordered map additionally supports firstKey / firstEntry and lastKey / lastEntry, returning the lowest key/entry and the highest key/entry, as well as higherKey / higherEntry and lowerKey / lowerEntry, returning the lowest key/entry strictly higher than the specified key, or the highest key/entry strictly lower than the specified key.

TreeMap<Integer, Integer> map = new TreeMap<Integer, Integer>();
map.put(3, 5); // [(3, 5)]
map.put(11, 4); // [(3, 5); (11, 4)]
map.put(10, 491); // [(3, 5); (10, 491); (11, 4)]
System.out.println(map.firstKey()); // 3
System.out.println(map.firstEntry()); // (3, 5)
System.out.println(map.lastEntry()); // (11, 4)
System.out.println(map.higherEntry(4)); // (10, 491)
map.remove(11); // [(3, 5); (10, 491)]
System.out.println(map.lowerKey(4)); // 3
System.out.println(map.lowerKey(3)); // ERROR

Python

Sorted maps in Python can be created by adding a dictionary to a sorted set, where each element in the sorted set is a key in the dictionary and values can be assigned with the dictionary. This is the most straightforward implementation, and the ground up implementation of a sorted set can be found in the above section.

Additionally, you can implement a SortedDict with the sortedcontainers library. All operations shown below are O(logN)\mathcal{O}(\log N) time, except for an O(NlogN)\mathcal{O}(N \log N) initialization and O(N)\mathcal{O}(N) time to get all items, keys, or values.

from sortedcontainers import SortedDict
sorted_map = SortedDict({1: "one", 4: "four", 3: "three"})
print(sorted_map) # SortedDict({1: 'one', 3: 'three', 4: 'four'})
# Add elements
sorted_map[2] = "two"
sorted_map[5] = "five"
# Output SortedDict({1: 'one', 2: 'two', 3: 'three', 4: 'four', 5: 'five'})

Multisets

A multiset is a sorted set that allows multiple copies of the same element.

C++

In addition to all of the regular set operations,

  • the count() method returns the number of times an element is present in the multiset. However, this method takes time linear in the number of matches so you shouldn't use it in a contest.
  • To remove a value once, use ms.erase(ms.find(val)).
  • To remove all occurrences of a value, use ms.erase(val).

Warning!

Using ms.erase(val) erases all instances of val from the multiset. To remove one instance of val, use ms.erase(ms.find(val))!

multiset<int> ms;
ms.insert(1); // [1]
ms.insert(14); // [1, 14]
ms.insert(9); // [1, 9, 14]
ms.insert(2); // [1, 2, 9, 14]
ms.insert(9); // [1, 2, 9, 9, 14]
ms.insert(9); // [1, 2, 9, 9, 9, 14]
cout << ms.count(4) << '\n'; // 0
cout << ms.count(9) << '\n'; // 3
cout << ms.count(14) << '\n'; // 1
ms.erase(ms.find(9));
cout << ms.count(9) << '\n'; // 2
ms.erase(9);
cout << ms.count(9) << '\n'; // 0

Java

While there is no multiset in Java, we can implement one using the TreeMap from values to their respective frequencies. We declare the TreeMap implementation globally so that we can write functions for adding and removing elements from it. The first, last, higher, and lower operations still function as intended; just use firstKey, lastKey, higherKey, and lowerKey respectively.

static TreeMap<Integer, Integer> multiset = new TreeMap<Integer, Integer>();
public static void main(String[] args) { ... }
static void add(int x) {
if (multiset.containsKey(x)) {
multiset.put(x, multiset.get(x) + 1);
} else {
multiset.put(x, 1);
}

Introductory Problems

StatusSourceProblem NameDifficultyTags
CSESEasy
Show TagsSorted Set
YSEasy
Show TagsSorted Set
CSESEasy
Show TagsSorted Set
CSESEasy
Show TagsBinary Search, Greedy, LIS, Sorted Set

Harder Example - Bit Inversions

Focus Problem – try your best to solve this problem before continuing!

View Internal Solution

Solution

We'll use iterators extensively.

Let the bit string be s=s0s1s2,sn1s=s_0s_1s_2\ldots,s_{n-1}. In the set dif, we store all indices ii such that sisi1s_i\neq s_{i-1} (including i=0i=0 and i=ni=n). If the elements of dif are 0=dif1<dif2<<difk=n0=dif_1<dif_2<\cdots<dif_k=n, then the longest length is equal to

max(dif2dif1,dif3dif2,,difkdifk1).\max(dif_2-dif_1,dif_3-dif_2,\ldots,dif_k-dif_{k-1}).

We can store each of these differences in a multiset ret; after each inversion, we'll need to output the maximum element of ret.

Inverting a bit at a 0-indexed position x corresponds to inserting x into dif if it not currently present or removing x if it is, and then doing the same with x+1. Whenever we insert or remove an element of dif, we should update ret accordingly.

C++

#include <bits/stdc++.h>
using namespace std;
#define sz(x) (x).size()
string s;
int m;
set<int> dif;
multiset<int> ret;

Java

Warning!

Java solutions are too slow for the CSES. Use C++ instead to get AC.

import java.io.*;
import java.util.*;
class BitInversion {
public static TreeMap<Integer, Integer> ret = new TreeMap<Integer, Integer>();
public static String s;
public static int m;
public static Set<Integer> dif = new TreeSet<Integer>();
public static void modify(int x) {

Note that multiset has a high constant factor, so replacing ret with a priority queue and an array that stores the number of times each integer occurs in the priority queue reduces the runtime by a factor of 2.

C++

#include <bits/stdc++.h>
using namespace std;
#define sz(x) (int)(x).size()
string s;
int m;
set<int> dif;
priority_queue<int> ret;
int cnt[200005];

Java

import java.io.*;
import java.util.*;
class BitInversion {
public static PriorityQueue<Integer> pq =
new PriorityQueue<Integer>(Collections.reverseOrder());
public static String s;
public static int m;
public static TreeSet<Integer> dif = new TreeSet<Integer>();
public static int cnt[];

Harder Problems

StatusSourceProblem NameDifficultyTags
SilverNormal
Show TagsSorted Set
CFNormal
Show TagsGreedy, Multiset, Sorting
CSESNormal
Show TagsGreedy, Sorted Set, Sorting
GoldNormal
Show TagsLinked List, Sorted Set
CFHard
Show TagsGreedy, Sorted Set, Sorting
CFHard
Show TagsSorted Set
GoldHard
Show TagsSorted Set
GoldHard
Show TagsGreedy, Sorted Set, Sorting

Module Progress:

Join the USACO Forum!

Stuck on a problem, or don't understand a module? Join the USACO Forum and get help from other competitive programmers!

PrevNext