Not Frequent
0/10

# Introduction to Sets & Maps

Authors: Darren Yao, Benjamin Qi, Allen Li, Jesse Choe

Maintaining collections of distinct elements with sorted sets.

### Prerequisites

Resources
IUSACO

module is based off this

CPH

covers similar material

Both Java and C++ contain two versions of sets and maps; one using sorting and the other using hashing. For Java and C++, we'll only introduce the former version in this module. For Python, we'll introduce the built-in sets and maps but keep in mind that they do not store elements in sorted order.

## Sets

Focus Problem – read through this problem before continuing!

View Internal Solution

A set is a collection of elements that contains no duplicates.

C++

In sorted sets, the elements are sorted in order of element. Insertions, deletions, and searches are all $\mathcal{O}(\log N)$, where $N$ is the number of elements in the set.

std::set implements sorted sets in C++. Some operations on an std::set named s include:

• s.insert(x), which adds the element x to s if not already present.
• s.erase(x), which removes the element x from s if present.
• s.count(x), which returns 1 if s contains x and 0 if it doesn't.

You can iterate through a set in sorted order using a for-each loop.

set<int> s;s.insert(1); // [1]s.insert(4); // [1, 4]s.insert(2); // [1, 2, 4]
// the insert method did nothing because 1 was already in the setcout << s.count(1) << endl; // 1s.erase(1); // [2, 4]cout << s.count(5) << endl; // 0s.erase(0); // [2, 4]

Java

In sorted sets, the pairs are sorted in order of key. Insertions, deletions, and searches are all $\mathcal{O}(\log N)$, where $N$ is the number of elements in the set.

Some operations on a TreeSet named set include:

• set.add(x), which adds the element x to set if not already present.
• set.remove(x), which removes the element x from set if present.
• set.contains(x), which checks whether set contains the element x.

You can iterate through a set in sorted order using a for-each loop.

Set<Integer> set = new TreeSet<>();set.add(1); // [1]set.add(4); // [1, 4]set.add(2); // [1, 2, 4]set.add(1); // [1, 2, 4]// the add method did nothing because 1 was already in the setSystem.out.println(set.contains(1)); // trueset.remove(1); // [2, 4]System.out.println(set.contains(5)); // falseset.remove(0); // [2, 4]

Python

Python's built-in set uses hashing to support $\mathcal{O}(1)$ insertion, deletion, and searches. Some operations on a Python set named s include:

• s.add(x): adds element x to s if not already present
• s.remove(x): removes an element x from set if present
• x in s: checks whether s contains the element x
s = set()s.add(1) # {1}s.add(4) # {1, 4}s.add(2) # {1, 4, 2}s.add(1) # {1, 4, 2}# the add method did nothing because 1 was already in the setprint(1 in s) # Trues.remove(1) # {4, 2}print(5 in s) # Falses.remove(0) # {4, 2}# if the element to be removed does not exist, nothing happens

Additional functions that sets support are discussed in the Silver module.

### Solution - Distinct Numbers

This problem asks us to calculate the number of distinct values in a given list.

#### Method 1 - Set

This is probably the easier of the two methods, but requires knowledge of sets. Because sets only store one copy of each value, we can insert all the numbers into a set, and then print out the size of the set.

C++

#include <bits/stdc++.h>
using namespace std;
int main() {	int n;	cin >> n;
set<int> distinctNumbers;	for (int i = 0; i < n; i++) {

Java

// Source: Daniel
import java.io.*;import java.util.*;
public class DistinctNumbers {	public static void main(String[] args) throws IOException {		Kattio io = new Kattio();		int n = io.nextInt();		Set<Integer> set = new HashSet<>();

Python

n = int(input()) # unusednums = [int(x) for x in input().split()]distinct_nums = set(nums)print(len(distinct_nums))

We can do this more efficiently by skipping the creation of the list, and use a set comprehension directly:

n = int(input()) # unuseddistinct_nums = {int(x) for x in input().split()}print(len(distinct_nums))

#### Method 2 - Sorting

Check out the solution involving sorting.

## Maps

Focus Problem – read through this problem before continuing!

A map is a set of entries, each consisting of a key and a value. In a map, all keys are required to be unique, but values can be repeated. Maps have three primary methods:

• one to add a specified key-value pairing
• one to retrieve the value for a given key
• one to remove a key-value pairing from the map

C++

In sorted maps, the pairs are sorted in order of key. Insertions, deletions, and searches are all $\mathcal{O}(\log N)$, where $N$ is the number of pairs in the map.

std::map implements sorted maps in C++. Some operations on an std::map named m include:

• m[key], which returns a reference to the value associated with the key key.
• If key is not present in the map, then the value associated with key is constructed using the default constructor of the value type. For example, if the value type is int, then calling m[key] for a key not within the map sets the value associated with that key to 0. As another example, if the value type is std::string, then calling m[key] for a key not within the map sets the value associated with that key to the empty string. More discussion regarding what happens in this case can be found here.
• Alternatively, m.at(key) behaves the same as m[key] if key is contained within m but throws an exception otherwise.
• m[key] = value will assign the value value to the key key.
• m.count(key), which returns the number of times the key is in the map (either one or zero), and therefore checks whether a key exists in the map.
• m.erase(key), which removes the map entry associated with the specified key if the key was present in the map.
map<int, int> m;m[1] = 5; // [(1, 5)]m[3] = 14; // [(1, 5); (3, 14)]m[2] = 7; // [(1, 5); (2, 7); (3, 14)]m[0] = -1; // [(0, -1); (1, 5); (2, 7); (3, 14)]m.erase(2); // [(0, -1); (1, 5); (3, 14)]cout << m[1] << endl; // 5cout << m.count(7) << endl; // 0cout << m.count(1) << endl; // 1cout << m[2] << endl; // 0

Java

In sorted maps, the pairs are sorted in order of key. Insertions, deletions, and searches are all $\mathcal{O}(\log N)$, where $N$ is the number of pairs in the map.

In a TreeMap, the put(key, value) method assigns a value to a key and places the key and value pair into the map. The get(key) method returns the value associated with the key. The containsKey(key) method checks whether a key exists in the map. Lastly, remove(key) removes the map entry associated with the specified key.

Map<Integer, Integer> map = new TreeMap<Integer, Integer>();map.put(1, 5); // [(1, 5)]map.put(3, 14); // [(1, 5); (3, 14)]map.put(2, 7); // [(1, 5); (2, 7); (3, 14)]map.remove(2); // [(1, 5); (3, 14)]System.out.println(map.get(1)); // 5System.out.println(map.containsKey(7)); // falseSystem.out.println(map.containsKey(1)); // true

Python

Colloquially, maps are referred to as dicts in python. They act as hash maps, so they have $\mathcal{O}(1)$ insertion, deletion, and searches.

d = {}d[1] = 5 # {1: 5}d[3] = 14 # {1: 5, 3: 14}d[2] = 7 # {1: 5, 2: 7, 3: 14}del d[2] # {1: 5, 3: 14}print(d[1]) # 5print(7 in d) # Falseprint(1 in d) # True

### Iterating Over Maps

C++

An std::map stores entries as pairs in the form {key, value}. To iterate over maps, you can use a for loop. The auto keyword suffices to iterate over any type of pair (here, auto substitutes for pair<const int, int>).

for (const auto& x : m) {	cout << x.first << " " << x.second << endl;}for (auto x : m) {	cout << x.first << " " << x.second << endl;}
/* both output the following:0 -11 53 14*/

The first method (iterating over const references) is generally preferred over the second because the second will make a copy of each element that it iterates over. Additionally, you can pass by reference when iterating over a map, allowing you to modify the values (but not the keys) of the pairs stored in the map:

for (auto& x : m) {	x.second = 3;}for (pair<int,int> x : m) {	cout << x.first << " " << x.second << endl;}
/*0 31 33 3*/

Java

To iterate over maps, you can use a for loop over the keys.

for (int k : m.keySet()) {	System.out.println(k + " " + m.get(k));}

Python

To iterate over dicts, there are three options. Dicts will be returned in the same order of insertion in Python 3.6+. You can iterate over the keys:

for key in d:	print(key)

You can iterate over the values:

for value in d.values():	print(value)

You can iterate over the key-value pairs:

for key, value in d.items():	print(key, value)

While you are free to change the values in a map when iterating over it (as demonstrated above), it is generally a bad idea to insert or remove elements of a map while iterating over it.

Python

For example, the following code attempts to remove every third entry from a map, but results in a runtime error.

d = {i: i for i in range(10)}cnt = 0for i in d:	cnt += 1	if cnt%3 == 0:		del d[i]
Traceback (most recent call last):
File "test.py", line 3, in <module>
for i in d:
RuntimeError: dictionary changed size during iteration

One way is to get around this is to create a new map.

d = {i: i for i in range(10)}d_new = dict(item for i, item in enumerate(d.items()) if i % 3 != 2)
print("new dict:", d_new)# new dict: {0: 0, 1: 1, 3: 3, 4: 4, 6: 6, 7: 7, 9: 9}

Another is to maintain a list of all the keys you want to remove and remove them after the iteration finishes:

d = {i: i for i in range(10)}to_remove = {key for i, key in enumerate(d) if i % 3 == 2}
for key in to_remove:	del d[key]print("new dict:", d)# new dict: {0: 0, 1: 1, 3: 3, 4: 4, 6: 6, 7: 7, 9: 9}

C++

For example, the following code attempts to remove every third entry from a map, but results in a segmentation fault.

map<int, int> m;for (int i = 0; i < 10; ++i)	m[i] = i;int cnt = 0;for (auto &it : m) {	cout << "Current Key: " << it.first << endl;	++cnt;	if (cnt % 3 == 0)		m.erase(it.first);}cout << "Entries:" << endl;for (const auto &it : m)	cout << it.first << " " << it.second << endl;

The reason is due to "iterators, pointers and references referring to elements removed by the function [being] invalidated" (as stated in the documentation for erase), though iterators are beyond the scope of this module.

One way to get around this is to just create a new map instead of removing from the old one.

map<int, int> m, M;for (int i = 0; i < 10; ++i)	m[i] = i;int cnt = 0;for (const auto &it : m) {	++cnt;	if (cnt % 3 != 0)		M[it.first] = it.second;}swap(m, M);

Another is to maintain a list of all the keys you want to erase and erase them after the iteration finishes.

map<int, int> m;for (int i = 0; i < 10; ++i)	m[i] = i;vector<int> to_erase;int cnt = 0;for (const auto &it : m) {	++cnt;	if (cnt % 3 == 0)		to_erase.push_back(it.first);}for (int key : to_erase)	m.erase(key);cout << "Remaining entries:" << endl;for (const auto &it : m)	cout << it.first << " " << it.second << endl;

Java

Modifying a Collection (Set, Map, etc.) in the middle of a for-each loop will cause a ConcurrentModificationException. See the following snippet for an example:

Set<Integer> s = new TreeSet<>();// s starts as {0, 1, 2}s.add(0); s.add(1); s.add(2);for (Integer a : s) {	s.remove(a); // ConcurrentModificationException thrown!!}

One work-around is to use Iterator and the .remove() method to remove elements while looping over them, like in the next code snippet:

Set<Integer> s = new TreeSet<>();// s starts as {0, 1, 2}s.add(0); s.add(1); s.add(2);
Iterator<Integer> iter = s.iterator();while (iter.hasNext()) {	int key = iter.next();	if (key == 0 || key == 2) {		iter.remove();	}

However, Iterator is outside the scope of this module.

The easiest option (in most cases) if you want to remove/insert mutiple elements at once is to use your Container's .addAll(c) or .removeAll(c) methods. That means that you should put all the elements you want to remove (or add) in a new Collection, and then use that new Collection as the parameter of the .addAll(c) or .removeAll(c) method that you call on your original Collection. See the following code snippet for an example (it works equivalently to the code above):

Set<Integer> s = new TreeSet<>();// s starts as {0, 1, 2}s.add(0); s.add(1); s.add(2);
Set<Integer> toRemove = new TreeSet<>();for (Integer a : s) {	if (a == 0 || a == 2) {		toRemove.add(a);	}}

## Problems

Some of these problems can be solved by sorting alone, though sets or maps could make their implementation easier.

StatusSourceProblem NameDifficultyTags
CSESEasy
Show TagsMap
BronzeEasy
Show TagsSet
BronzeNormal
Show TagsSet, Simulation
BronzeNormal
Show TagsMap
BronzeNormal
Show TagsMap, Sorting
SilverNormal
Show TagsMap
CFNormal
Show TagsPrefix Sums, Set
ACHard
Show TagsSet

What is the time complexity of insertions, deletions, and searches in a sorted set of size $N$?