Understanding Vulnerabilities of Unordered Map

Unordered map is presumed to have O(1) time complexity per operation, assuming minimal collisions. The assumption is valid for random input data but becomes precarious with non-random or adversarial inputs during hacking attempts. If the hash function is known, attackers can craft inputs causing O(n^2) collisions, compromising the expected constant-time complexity of unordered map operations.

Optimization for Unordered Map O(1) operations

C++ provides convenient data structures like std::set and std::map, tree-based structures with operations taking O(log n) time. With C++11, hash-based alternatives were introduced: std::unordered_set and std::unordered_map. However, concerns arise regarding potential vulnerabilities and efficiency in real-world scenarios. In this article, we’re going to explore strategies to optimize unordered maps for O(1) operations and address security issues.

Table of Content

  • Implementation of Unordered Map
  • Understanding Vulnerabilities of Unordered Map
  • Vulnerable Implementation for Unordered Map
  • Secure Unordered Map from Collision Attacks
  • Better Implementation of Unordered Map

Similar Reads

Implementation of Unordered Map:

To comprehend the vulnerabilities, it is very important to know the implementation details of std::unordered_map. The GCC implementation reveals the use of _Mod_range_hashing and _Prime_rehash_policy. The hash function involves hashing the input and applying modulo with a prime number to determine the position in the hash table. The _Prime_rehash_policy contains a list of primes stored in __prime_list. These prime numbers are used to modulo the input and get the correct position in the hash table. Also, before taking the modulo of the input with these prime numbers, Unordered map uses a hash funtion to hash the input values. The hash used by Unordered map is std::hash which is simply the identity function for integers. Depending on the compiler version, specific primes trigger resizing during operations. For instance, 126271 works for GCC 6 or earlier, and 107897 is suitable for GCC 7 or later....

Understanding Vulnerabilities of Unordered Map:

Unordered map is presumed to have O(1) time complexity per operation, assuming minimal collisions. The assumption is valid for random input data but becomes precarious with non-random or adversarial inputs during hacking attempts. If the hash function is known, attackers can craft inputs causing O(n^2) collisions, compromising the expected constant-time complexity of unordered map operations....

Vulnerable Implementation for Unordered Map:

A simple code snippet illustrates the vulnerability by measuring execution time for different primes:...

Secure Unordered Map from Collision Attacks:

To secure unordered map from collision attacks, a custom hash function can be used. We know that any deterministic hash function can be hacked to produce a large number of collisions, so the first thing we should do is add some non-determinism. The standard hash function can be modified to introduce non-determinism, making it harder to hack:...

Better Implementation of Unordered Map:

The custom hash function can be integrated into the unordered map or other hash tables like gp_hash_table. Below is a better implementation of the unordered map which almost guarantees O(1) time for all operations like Insertion, deletion, etc....

Conclusion:

Making unordered map operations efficient (O(1)) requires dealing with security issues and creating strong hash functions. By ensuring security and adding randomness, developers can use unordered map with confidence, achieving good performance and avoiding vulnerabilities to hacks....