PrevNext

You're not signed in!

Sign in to save your progress and sync your settings across devices.

Not Frequent
 0/7

Introduction to Sets & Maps

Authors: Darren Yao, Benjamin Qi, Allen Li

Sorting, and maintaining collections of distinct elements with ordered sets.

Resources
IUSACOmodule is based off this
CPHcovers similar material

Both Java and C++ contain two versions of sets and maps; one using sorting and the other using hashing. We'll only introduce the former version in this module.

Sets

Focus Problem – read through this problem before continuing!

A set is a collection of objects that contains no duplicates. In ordered sets, the entries are sorted in order of key. Insertions, deletions, and searches are all O(logN)O(\log N), where NN is the number of elements in the set.

C++

The operations on a C++ set include:

  • insert, which adds an element to the set if not already present.
  • erase, which deletes an element if it exists.
  • count, which returns 1 if the set contains the element and 0 if it doesn't.
1set<int> s;
2s.insert(1); // [1]
3s.insert(4); // [1, 4]
4s.insert(2); // [1, 2, 4]
5s.insert(1); // [1, 2, 4]
6// the add method did nothing because 1 was already in the set
7cout << s.count(1) << endl; // 1
8s.erase(1); // [2, 4]
9cout << s.count(5) << endl; // 0
10s.erase(0); // [2, 4]

Java

The operations on a TreeSet are add, which adds an element to the set if not already present, remove, which deletes an element if it exists, and contains, which checks whether the set contains that element.

1Set<Integer> set = new TreeSet<Integer>();
2set.add(1); // [1]
3set.add(4); // [1, 4]
4set.add(2); // [1, 2, 4]
5set.add(1); // [1, 2, 4]
6// the add method did nothing because 1 was already in the set
7System.out.println(set.contains(1)); // true
8set.remove(1); // [2, 4]
9System.out.println(set.contains(5)); // false
10set.remove(0); // [2, 4]

Python

Warning!

Ordered sets and maps are not built into Python. The Python OrderedDict stores keys in the same order as they were inserted in, not in sorted order.

The built in python unsorted set supports:

  • add(): Adds element to set
  • remove(): Removes element from set
  • x in set: Checks if element x is in the set
1set = set()
2set.add(1) # [1]
3set.add(4) # [1, 4]
4set.add(2) # [1, 4, 2]
5set.add(1) # [1, 4, 2]
6# the add method did nothing because 1 was already in the set
7print(1 in set) # True
8set.remove(1) # [4, 2]
9print(5 in set) # False
10set.remove(0); # [4, 2]

Solution - Distinct Numbers

This problem asks us to calculate the number of distinct values in a given list.

Method 1 - Set

This is probably the easier of the two methods, but requires knowledge of sets. Because sets only store one copy of each value, we can insert all the numbers into a set, and then print out the size of the set.

C++

1#include <bits/stdc++.h>
2
3using namespace std;
4
5int main() {
6
7 int n;
8 cin >> n;
9
10 set<int> distinctNumbers;

Java

1// Source: Daniel
2
3import java.io.BufferedReader;
4import java.io.IOException;
5import java.io.InputStreamReader;
6import java.util.HashSet;
7import java.util.StringTokenizer;
8
9public class DistinctNumbers {
10

Python

1n = int(input())
2nums = [int(x) for x in input().split()]
3distinctNumbers = set(nums)
4print(len(distinctNumbers))

Method 2 - Sorting the Array

This problem is also solvable without sets. Sort a vector containing all the numbers. The answer is the number of times two adjacent numbers are not equal; more formally, the number of times aiai+1a_i \neq a_{i + 1}, plus one.

C++

1#include <bits/stdc++.h>
2
3using namespace std;
4
5int main() {
6
7 int n;
8 cin >> n;
9
10 int A[n];

Java

1import java.io.*;
2import java.util.*;
3
4public class DistinctNumbers {
5
6 public static void main(String[] args) throws IOException {
7 BufferedReader br = new BufferedReader(new InputStreamReader(System.in));
8 int n = Integer.parseInt(br.readLine());
9 StringTokenizer st = new StringTokenizer(br.readLine());
10 int[] arr = new int[n];

Python

1n = int(input())
2arr = [int(x) for x in input().split()]
3arr = sorted(arr)
4
5ans = 1
6for i in range(n - 1):
7 if arr[i] != arr[i + 1]:
8 ans += 1
9
10print(ans)

Maps

Focus Problem – read through this problem before continuing!

A map is a set of ordered pairs, each containing a key and a value. In a map, all keys are required to be unique, but values can be repeated. Maps have three primary methods:

  • one to add a specified key-value pairing
  • one to retrieve the value for a given key
  • one to remove a key-value pairing from the map

Insertions, deletions, and searches are all O(logN)O(\log N), where NN is the number of elements in the map.

C++

In a C++ map m:

  • the m[key] = value operator assigns a value to a key and places the key and value pair into the map. The operator m[key] returns the value associated with the key. If the key is not present in the map, then m[key] is set to 0.
  • The count(key) method returns the number of times the key is in the map (which is either one or zero), and therefore checks whether a key exists in the map.
  • Lastly, erase(key) removes the map entry associated with the specified key.
1map<int, int> m;
2m[1] = 5; // [(1, 5)]
3m[3] = 14; // [(1, 5); (3, 14)]
4m[2] = 7; // [(1, 5); (2, 7); (3, 14)]
5m[0] = -1; // [(0, -1); (1, 5); (2, 7); (3, 14)]
6m.erase(2); // [(0, -1); (1, 5); (3, 14)]
7cout << m[1] << '\n'; // 5
8cout << m.count(7) << '\n' ; // 0
9cout << m.count(1) << '\n' ; // 1

Java

In a TreeMap, the put(key, value) method assigns a value to a key and places the key and value pair into the map. The get(key) method returns the value associated with the key. The containsKey(key) method checks whether a key exists in the map. Lastly, remove(key) removes the map entry associated with the specified key. All of these operations are O(1)O(1), but again, due to the hashing, this has a high constant factor.

1Map<Integer, Integer> map = new TreeMap<Integer, Integer>();
2map.put(1, 5); // [(1, 5)]
3map.put(3, 14); // [(1, 5); (3, 14)]
4map.put(2, 7); // [(1, 5); (2, 7); (3, 14)]
5map.remove(2); // [(1, 5); (3, 14)]
6System.out.println(map.get(1)); // 5
7System.out.println(map.containsKey(7)); // false
8System.out.println(map.containsKey(1)); // true

Python

Colloquially, maps are referred to as dicts in python. They act as hash maps, so they actually have O(1)O(1) insertion, deletion, and searches.

1dict = {}
2dict[1] = 5 # [(1, 5)]
3dict[3] = 14 # [(1, 5); (3, 14)]
4dict[2] = 7 # [(1, 5); (2, 7); (3, 14)]
5del dict[2] # [(1, 5); (3, 14)]
6print(dict[1]) # 5
7print(7 in dict.keys()) # False
8print(1 in dict.keys()) # True

Iterating Over Maps

C++

To iterate over maps, you can use a for loop.

1for (pair<int,int> x : m) {
2 cout << x.first << " " << x.second << '\n';
3}
4for (auto x : m) {
5 cout << x.first << " " << x.second << '\n';
6}
7
8/* both output the following:
90 -1
101 5

The map stores pairs in the form {key, value}. The auto keyword suffices to iterate over any type of pair. You can use these pairs normally, as introduced in this module.

Additionally, you can pass by reference when iterating over a map, like this:

1for (auto& x : m) {
2 x.second = 3;
3}
4for (pair<int,int> x : m) {
5 cout << x.first << " " << x.second << '\n';
6}
7
8/*
90 3
101 3

This allows you to modify the values of the pairs stored in the map.

Java

To iterate over maps, you can use a for loop over the keys.

1for (int k : m.keySet()){
2 System.out.println(k + " " + m.get(k));
3}

Python

To iterate over dicts, you can use a for loop over the keys of the dict.

1for key in dict.keys():
2 print(f'{key} {dict[key]}')

Inserting / Deleting Keys While Iterating

While you are free to change the values in a map when iterating over it (as demonstrated above), be careful about inserting and deleting keys while iterating. This code will work:

1void iterate_insert() {
2 map<int,int> m;
3 for (int i = 0; i < 10; ++i) m[i] = i;
4 for (auto& p: m) {
5 if (p.f == 10) break;
6 p.s = 5;
7 m[p.f+1] = 0;
8 }
9 cout << "ENTRIES:\n";
10 for (pair<int,int> p: m)

However, consider the following code, which attempts to remove every third entry from a map.

1void iterate_remove_bad() {
2 map<int,int> m; for (int i = 0; i < 10; ++i) m[i] = i;
3 int cnt = 0;
4 for (auto it = begin(m); it != end(m); ++it) {
5 cout << "CURRENT KEY: " << it->f << "\n";
6 cnt ++;
7 if (cnt%3 == 0) m.erase(it);
8 }
9 cout << "REMAINING ENTRIES:\n";
10 for (pair<int,int> p: m)

However, we would expect the keys 2, 5, and 8 to be removed from the map, but this is not the case. 2 is correctly removed, but the next key removed is 4, not 5! And it seems that some keys are appearing more than once during the iteration.

As the documentation for erase mentions, "iterators, pointers and references referring to elements removed by the function are invalidated." So incrementing it after it has been erased from the map might not produce the intended result. If you're lucky, this will produce a segmentation fault. Unfortunately, sometimes (as in this case) the code will run without appearing to produce an error.

If we compile using -D_GLIBCXX_DEBUG and run the above, then

g++ -D_GLIBCXX_DEBUG whoops.cpp -o whoops && ./whoops

gives an error, as expected.

CURRENT KEY: 0
CURRENT KEY: 1
CURRENT KEY: 2
/usr/local/Cellar/gcc/10.1.0/include/c++/10.1.0/debug/safe_iterator.h:328:
In function:
    __gnu_debug::_Safe_iterator<_Iterator, _Sequence, _Category>& 
    __gnu_debug::_Safe_iterator<_Iterator, _Sequence, 
    _Category>::operator++() [with _Iterator = 
    std::_Rb_tree_iterator<std::pair<const int, int> >; _Sequence = 
    std::__debug::map<int, int>; _Category = std::forward_iterator_tag]

Error: attempt to increment a singular iterator.

Objects involved in the operation:
    iterator "this" @ 0x0x7ffee963c870 {
      type = std::_Rb_tree_iterator<std::pair<int const, int> > (mutable iterator);
      state = singular;
      references sequence with type 'std::__debug::map<int, int, std::less<int>, std::allocator<std::pair<int const, int> > >' @ 0x0x7ffee963c8b0
    }
zsh: abort      ./whoops

Similarly, in Java,

If the map is modified while an iteration over the collection is in progress (except through the iterator's own remove operation), the results of the iteration are undefined.

As suggested by this StackOverflow post, the following code produces the intended results.

1void iterate_remove_ok() {
2 map<int,int> m; for (int i = 0; i < 10; ++i) m[i] = i;
3 int cnt = 0;
4 for (auto it = begin(m), next_it = it; it != end(m); it = next_it) {
5 ++next_it;
6 cout << "CURRENT KEY: " << it->f << "\n";
7 ++cnt;
8 if (cnt%3 == 0) {
9 m.erase(it);
10 }

You could also just create a new map instead of removing from the old one.

1void iterate_remove_ok_2() {
2 map<int,int> m, M; for (int i = 0; i < 10; ++i) m[i] = i;
3 int cnt = 0;
4 for (pair<int,int> p: m) {
5 ++cnt;
6 if (cnt%3 != 0) M[p.f] = p.s;
7 }
8 swap(m,M);
9 cout << "REMAINING ENTRIES:\n";
10 for (pair<int,int> p: m)

Problems

Some of these problems can be solved by sorting alone.

StatusSourceProblem NameDifficultyTagsSolution
BronzeEasyExternal Sol
CSESEasy
BronzeEasyExternal Sol
BronzeNormalExternal Sol
SilverNormal

Module Progress:

Give Us Feedback on Introduction to Sets & Maps!

PrevNext