
Introduction to Sets & Maps

Authors: Darren Yao, Benjamin Qi, Allen Li, DRGSH

Maintaining collections of distinct elements in sorted order with ordered sets.

Resources

  • IUSACO: module is based off this
  • CPH: covers similar material

Both Java and C++ contain two versions of sets and maps; one using sorting and the other using hashing. We'll only introduce the former version in this module.

Sets

Focus Problem – read through this problem before continuing!

A set is a collection of objects that contains no duplicates. In ordered sets, the entries are sorted in order of key. Insertions, deletions, and searches are all $\mathcal{O}(\log N)$, where $N$ is the number of elements in the set.

C++

The operations on a C++ set include:

  • insert, which adds an element to the set if not already present.
  • erase, which deletes an element if it exists.
  • count, which returns 1 if the set contains the element and 0 if it doesn't.
set<int> s;
s.insert(1); // [1]
s.insert(4); // [1, 4]
s.insert(2); // [1, 2, 4]
s.insert(1); // [1, 2, 4]
// the insert call did nothing because 1 was already in the set
cout << s.count(1) << endl; // 1
s.erase(1); // [2, 4]
cout << s.count(5) << endl; // 0
s.erase(0); // [2, 4]

Java

The operations on a TreeSet are add, which adds an element to the set if not already present, remove, which deletes an element if it exists, and contains, which checks whether the set contains that element.

Set<Integer> set = new TreeSet<Integer>();
set.add(1); // [1]
set.add(4); // [1, 4]
set.add(2); // [1, 2, 4]
set.add(1); // [1, 2, 4]
// the add method did nothing because 1 was already in the set
System.out.println(set.contains(1)); // true
set.remove(1); // [2, 4]
System.out.println(set.contains(5)); // false
set.remove(0); // [2, 4]

Python

Warning!

Ordered sets and maps are not built into Python. The Python OrderedDict stores keys in the same order as they were inserted in, not in sorted order.

The built-in Python set is unordered (hash-based) and supports:

  • add(): Adds an element to the set
  • remove(): Removes an element from the set, raising a KeyError if it is absent
  • discard(): Removes an element from the set if it is present
  • x in s: Checks whether element x is in the set s
s = set()
s.add(1) # {1}
s.add(4) # {1, 4}
s.add(2) # {1, 4, 2}
s.add(1) # {1, 4, 2}
# the add method did nothing because 1 was already in the set
print(1 in s) # True
s.remove(1) # {4, 2}
print(5 in s) # False
s.discard(0) # {4, 2}
# discard does nothing if the element does not exist; remove(0) would raise a KeyError
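
Because Python has no built-in ordered set, one rough way to approximate the C++ and Java examples above is to keep a plain list sorted with the standard bisect module. This is only a sketch: the helper names sorted_insert, sorted_contains, and sorted_erase are made up for illustration, and insertion and deletion cost O(N) here because the list has to shift elements, so this is not a drop-in replacement for a tree-based set.

```python
import bisect


def sorted_insert(items, x):
    """Insert x into the sorted list `items` unless it is already present."""
    i = bisect.bisect_left(items, x)
    if i == len(items) or items[i] != x:
        items.insert(i, x)  # O(N) shift, unlike a tree-based set


def sorted_contains(items, x):
    """Binary-search the sorted list for x in O(log N)."""
    i = bisect.bisect_left(items, x)
    return i < len(items) and items[i] == x


def sorted_erase(items, x):
    """Remove x if it is present; do nothing otherwise."""
    i = bisect.bisect_left(items, x)
    if i < len(items) and items[i] == x:
        items.pop(i)


s = []
for x in (1, 4, 2, 1):
    sorted_insert(s, x)
print(s)  # [1, 2, 4]
sorted_erase(s, 1)
sorted_erase(s, 0)  # 0 is absent, so nothing happens
print(s)  # [2, 4]
```

The list stays sorted at all times, so iterating over it visits elements in increasing order, which is the property the hash-based set lacks.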

Additional functions that sets support are discussed in the Silver module.

Solution - Distinct Numbers

This problem asks us to calculate the number of distinct values in a given list.

Method 1 - Set

This is probably the easier of the two methods, but requires knowledge of sets. Because sets only store one copy of each value, we can insert all the numbers into a set, and then print out the size of the set.

C++

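#include <bits/stdc++.h>
using namespace std;
int main() {
    int n;
    cin >> n;
    set<int> distinctNumbers;
    for (int i = 0; i < n; i++) {
        int x; cin >> x;
        distinctNumbers.insert(x); // duplicates are ignored by the set
    }
    cout << distinctNumbers.size() << '\n';
}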

Java

// Source: Daniel
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.util.HashSet;
import java.util.StringTokenizer;
public class DistinctNumbers {

Python

n = int(input()) # unused
nums = [int(x) for x in input().split()]
distinct_nums = set(nums)
print(len(distinct_nums))

We can do this more efficiently by skipping the creation of the list and using a set comprehension directly:

n = int(input()) # unused
distinct_nums = {int(x) for x in input().split()}
print(len(distinct_nums))

Method 2 - Sorting

Check out the sorting solution.
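
The linked solution is not reproduced here. As a rough sketch of how a sorting-based approach could work (an assumption about the idea, not the linked code): after sorting, equal values sit next to each other, so it is enough to count the positions where a value differs from its predecessor. The function name count_distinct_sorted is hypothetical.

```python
def count_distinct_sorted(nums):
    """Count distinct values by sorting and comparing neighbours."""
    nums = sorted(nums)
    distinct = 0
    for i, x in enumerate(nums):
        # a value starts a new group if it is first or differs from the previous one
        if i == 0 or x != nums[i - 1]:
            distinct += 1
    return distinct


print(count_distinct_sorted([2, 3, 2, 2, 3]))  # 2
```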

Maps

Focus Problem – read through this problem before continuing!

A map is a set of ordered pairs, each containing a key and a value. In a map, all keys are required to be unique, but values can be repeated. Maps have three primary methods:

  • one to add a specified key-value pairing
  • one to retrieve the value for a given key
  • one to remove a key-value pairing from the map

Insertions, deletions, and searches are all $\mathcal{O}(\log N)$, where $N$ is the number of elements in the map.

C++

In a C++ map m:

  • The assignment m[key] = value assigns a value to a key and places the key and value pair into the map. The expression m[key] returns the value associated with the key. If the key is not present in the map, then m[key] inserts it with a default value (0 for numeric types such as int).
  • The count(key) method returns the number of times the key is in the map (which is either one or zero), and therefore checks whether a key exists in the map.
  • Lastly, erase(key) removes the map entry associated with the specified key.
map<int, int> m;
m[1] = 5; // [(1, 5)]
m[3] = 14; // [(1, 5); (3, 14)]
m[2] = 7; // [(1, 5); (2, 7); (3, 14)]
m[0] = -1; // [(0, -1); (1, 5); (2, 7); (3, 14)]
m.erase(2); // [(0, -1); (1, 5); (3, 14)]
cout << m[1] << '\n'; // 5
cout << m.count(7) << '\n' ; // 0
cout << m.count(1) << '\n' ; // 1

Java

In a TreeMap, the put(key, value) method assigns a value to a key and places the key and value pair into the map. The get(key) method returns the value associated with the key. The containsKey(key) method checks whether a key exists in the map. Lastly, remove(key) removes the map entry associated with the specified key. All of these operations are $\mathcal{O}(\log N)$, since a TreeMap keeps its keys in a balanced search tree rather than a hash table.

Map<Integer, Integer> map = new TreeMap<Integer, Integer>();
map.put(1, 5); // [(1, 5)]
map.put(3, 14); // [(1, 5); (3, 14)]
map.put(2, 7); // [(1, 5); (2, 7); (3, 14)]
map.remove(2); // [(1, 5); (3, 14)]
System.out.println(map.get(1)); // 5
System.out.println(map.containsKey(7)); // false
System.out.println(map.containsKey(1)); // true

Python

In Python, maps are called dicts. They are hash maps, so insertion, deletion, and search take $\mathcal{O}(1)$ time on average, but the keys are not kept in sorted order.

d = {}
d[1] = 5 # {1: 5}
d[3] = 14 # {1: 5, 3: 14}
d[2] = 7 # {1: 5, 2: 7, 3: 14}
del d[2] # {1: 5, 3: 14}
print(d[1]) # 5
print(7 in d) # False
print(1 in d) # True
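
One difference from the C++ map is worth a small illustration: reading d[key] for a missing key raises a KeyError in Python instead of inserting a default. A minimal sketch of a common workaround uses dict.get with a default value; the function name count_occurrences and the sample data are made up for this example.

```python
def count_occurrences(values):
    """Map each value to the number of times it appears."""
    counts = {}
    for v in values:
        # get returns 0 for a missing key without inserting it
        counts[v] = counts.get(v, 0) + 1
    return counts


counts = count_occurrences([3, 1, 3, 3, 2])
print(counts)            # {3: 3, 1: 1, 2: 1}
print(counts.get(7, 0))  # 0
print(7 in counts)       # False, because get did not add the key
```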

Iterating Over Maps

C++

To iterate over maps, you can use a for loop.

for (pair<int,int> x : m) {
cout << x.first << " " << x.second << '\n';
}
for (auto x : m) {
cout << x.first << " " << x.second << '\n';
}
/* both output the following:
0 -1
1 5
3 14
*/

The map stores pairs in the form {key, value}. The auto keyword suffices to iterate over any type of pair. You can use these pairs normally, as introduced in this module.

Additionally, you can pass by reference when iterating over a map, like this:

for (auto& x : m) {
x.second = 3;
}
for (pair<int,int> x : m) {
cout << x.first << " " << x.second << '\n';
}
/*
0 3
1 3
3 3
*/

This allows you to modify the values of the pairs stored in the map.

Java

To iterate over maps, you can use a for loop over the keys.

// map is the TreeMap from the earlier snippet; keys are visited in sorted order
for (int k : map.keySet()) {
    System.out.println(k + " " + map.get(k));
}

Python

To iterate over dicts, there are three options. Since Python 3.7, dicts are guaranteed to iterate in insertion order (CPython 3.6 already behaved this way as an implementation detail). You can iterate over the keys:

for key in d:
    print(key)

You can iterate over the values:

for value in d.values():
    print(value)

You can iterate over the key-value pairs:

for key, value in d.items():
    print(key, value)
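
Since a dict iterates in insertion order rather than key order, a loop that should mimic the sorted traversal of a C++ map or Java TreeMap has to sort the keys first. The following is a small sketch under that assumption; the helper name items_in_key_order is hypothetical, and each call costs O(N log N) for the sort.

```python
def items_in_key_order(d):
    """Return the (key, value) pairs of d sorted by key."""
    return [(key, d[key]) for key in sorted(d)]


d = {3: 14, 1: 5, 0: -1}  # inserted out of key order on purpose
for key, value in items_in_key_order(d):
    print(key, value)
# 0 -1
# 1 5
# 3 14
```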

Inserting / Deleting Keys While Iterating

While you are free to change the values in a map when iterating over it (as demonstrated above), be careful about inserting and deleting keys while iterating.

Python

This code will give a runtime error (although similar code will create a map with 11 entries in C++):

def iterate_insert():
    d = {0:0}
    for key in d:
        if key == 10:
            break
        d[key] = 5
        d[key+1] = 0
    print("ENTRIES:")
    for key,value in d.items():
        print(key,value)
iterate_insert()
Traceback (most recent call last):
  File "test.py", line 17, in <module>
    iterate_insert()
  File "test.py", line 7, in iterate_insert
    for key in d:
RuntimeError: dictionary changed size during iteration
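
One possible way to get the 11-entry result in Python is to avoid iterating over the dict at all and instead drive the loop from a separate list of keys that still need processing. This is a sketch of that idea, not the only fix; the name iterate_insert_worklist and the pending list are introduced here for illustration.

```python
def iterate_insert_worklist():
    """Insert keys 0..10 without iterating over the dict being modified."""
    d = {0: 0}
    pending = [0]  # keys that still need to be processed
    while pending:
        key = pending.pop()
        if key == 10:
            break
        d[key] = 5
        d[key + 1] = 0
        pending.append(key + 1)  # the newly inserted key is processed next
    return d


result = iterate_insert_worklist()
print(len(result))  # 11
```

Keys 0 through 9 end up with the value 5 and key 10 keeps the value 0.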

If you want to remove every third entry from a map, one way is to just create a new map.

d = {i: i for i in range(10)}
d_new = dict(item for i, item in enumerate(d.items()) if i % 3 != 2)
print("new dict:", d_new)
# new dict: {0: 0, 1: 1, 3: 3, 4: 4, 6: 6, 7: 7, 9: 9}

Another is to maintain a list of all the keys you want to delete and remove them after the iteration finishes:

d = {i: i for i in range(10)}
to_remove = {key for i, key in enumerate(d) if i % 3 == 2}
for key in to_remove:
    del d[key]
print("new dict:", d)
# new dict: {0: 0, 1: 1, 3: 3, 4: 4, 6: 6, 7: 7, 9: 9}
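
A third option, sketched here as an illustration, is to iterate over a snapshot of the keys made with list(d). The loop then walks the copy, so deleting from the dict itself does not trigger the RuntimeError. The function name remove_every_third is hypothetical.

```python
def remove_every_third(d):
    """Delete every third entry of d in place and return d."""
    for i, key in enumerate(list(d)):  # list(d) copies the keys up front
        if i % 3 == 2:
            del d[key]
    return d


print(remove_every_third({i: i for i in range(10)}))
# {0: 0, 1: 1, 3: 3, 4: 4, 6: 6, 7: 7, 9: 9}
```

The snapshot costs O(N) extra memory, which is usually acceptable at this problem size.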

C++

This code will work (adding keys while iterating over a map):

void iterate_insert() {
    map<int,int> m; m[0] = 0; // starts with a single key
    for (auto& p: m) { // adds keys in the loop until the key 10
        if (p.first == 10) break;
        p.second = 5;
        m[p.first+1] = 0;
    }
    cout << "ENTRIES:\n";
    for (pair<int,int> p: m)
        cout << p.first << " " << p.second << "\n";
}

However, consider the following code, which attempts to remove every third entry from a map.

void iterate_remove_bad() {
    map<int,int> m; for (int i = 0; i < 10; ++i) m[i] = i;
    int cnt = 0;
    for (auto it = begin(m); it != end(m); ++it) {
        cout << "CURRENT KEY: " << it->first << "\n";
        cnt ++;
        if (cnt%3 == 0) m.erase(it); // invalidates it, so the following ++it is undefined behavior
    }
    cout << "REMAINING ENTRIES:\n";
    for (pair<int,int> p: m)
        cout << p.first << " " << p.second << "\n";
}

We would expect the keys 2, 5, and 8 to be removed from the map, but this is not the case. 2 is correctly removed, but the next key removed is 4, not 5! And it seems that some keys are appearing more than once during the iteration.

As the documentation for erase mentions, "iterators, pointers and references referring to elements removed by the function are invalidated." So incrementing it after it has been erased from the map might not produce the intended result. If you're lucky, this will produce a segmentation fault. Unfortunately, sometimes (as in this case) the code will run without appearing to produce an error.

If we compile using -D_GLIBCXX_DEBUG and run the above, then

g++ -D_GLIBCXX_DEBUG whoops.cpp -o whoops && ./whoops

gives an error, as expected.

CURRENT KEY: 0
CURRENT KEY: 1
CURRENT KEY: 2
/usr/local/Cellar/gcc/10.1.0/include/c++/10.1.0/debug/safe_iterator.h:328:
In function:
    __gnu_debug::_Safe_iterator<_Iterator, _Sequence, _Category>&
    __gnu_debug::_Safe_iterator<_Iterator, _Sequence,
    _Category>::operator++() [with _Iterator =
    std::_Rb_tree_iterator<std::pair<const int, int> >; _Sequence =
    std::__debug::map<int, int>; _Category = std::forward_iterator_tag]

Error: attempt to increment a singular iterator.

Objects involved in the operation:
    iterator "this" @ 0x0x7ffee963c870 {
      type = std::_Rb_tree_iterator<std::pair<int const, int> > (mutable iterator);
      state = singular;
      references sequence with type 'std::__debug::map<int, int, std::less<int>, std::allocator<std::pair<int const, int> > >' @ 0x0x7ffee963c8b0
    }
zsh: abort      ./whoops

Similarly, in Java,

If the map is modified while an iteration over the collection is in progress (except through the iterator's own remove operation), the results of the iteration are undefined.

As suggested by this StackOverflow post, the following code produces the intended results.

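void iterate_remove_ok() {
    map<int,int> m; for (int i = 0; i < 10; ++i) m[i] = i;
    int cnt = 0;
    for (auto it = begin(m), next_it = it; it != end(m); it = next_it) {
        ++next_it; // advance before erasing so next_it stays valid
        cout << "CURRENT KEY: " << it->first << "\n";
        ++cnt;
        if (cnt%3 == 0) {
            m.erase(it);
        }
    }
    cout << "REMAINING ENTRIES:\n";
    for (pair<int,int> p: m)
        cout << p.first << " " << p.second << "\n";
}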

You could also just create a new map instead of removing from the old one.

void iterate_remove_ok_2() {
    map<int,int> m, M; for (int i = 0; i < 10; ++i) m[i] = i;
    int cnt = 0;
    for (pair<int,int> p: m) {
        ++cnt;
        if (cnt%3 != 0) M[p.first] = p.second; // keep everything except every third entry
    }
    swap(m,M);
    cout << "REMAINING ENTRIES:\n";
    for (pair<int,int> p: m)
        cout << p.first << " " << p.second << "\n";
}

Java

Modifying a Collection (Set, Map, etc.) in the middle of a for-each loop is unlikely to work, as it will probably cause a ConcurrentModificationException. See the following snippet for an example:

void iterate_remove_set_BAD() {
    Set<Integer> s = new TreeSet<Integer>();
    s.add(0); s.add(1); s.add(2);
    for(Integer a : s) {
        // the removal itself succeeds, but the loop's next step throws
        // ConcurrentModificationException
        s.remove(a);
    }
}

One work-around is to use Iterator and the .remove() method to remove elements while looping over them, like in the next code snippet:

void iterate_remove_set() {
    Set<Integer> s = new TreeSet<Integer>();
    // s starts as {0, 1, 2}
    s.add(0); s.add(1); s.add(2);
    Iterator<Integer> iter = s.iterator();
    while(iter.hasNext()) {
        int key = iter.next();
        if(key == 0 || key == 2)
            iter.remove(); // removes the element last returned by next()
    }
    // s is now {1}
}

However, Iterator is not commonly seen in Java, so in most cases the best option for removing or inserting multiple elements at once is to use your Collection's .addAll(c) or .removeAll(c) methods. Put all the elements you want to remove (or add) in a new Collection, then pass that Collection to the .removeAll(c) (or .addAll(c)) method called on your original Collection. See the following code snippet for an example (it behaves the same as the code above):

void iterate_remove_set_good() {
    Set<Integer> s = new TreeSet<Integer>();
    // s starts as {0, 1, 2}
    s.add(0); s.add(1); s.add(2);
    Set<Integer> toRemove = new TreeSet<Integer>();
    for(Integer a : s) {
        if(a == 0 || a == 2) toRemove.add(a);
    }
    s.removeAll(toRemove); // s is now {1}
}
}

Problems

Some of these problems can be solved by sorting alone, though sets or maps could make their implementation easier.

Source | Difficulty | Tags
CSES | Easy | Map
Bronze | Easy | Set
Bronze | Normal | Set, Simulation
Bronze | Normal | Map
Bronze | Normal | Map, Sorting
Silver | Normal | Map
CF | Normal | Prefix Sums, Set
