Hashing in Data Structure
Hashing is a fundamental data structure that efficiently stores and retrieves data in a way that allows for quick access. It involves mapping data to a specific index in a hash table using a hash function that enables fast retrieval of information based on its key. This method is commonly used in databases, caching systems, and various programming applications to optimize search and retrieval operations.
Table of Content
- What is Hashing in Data Structure?
- Hash Table in Data Structure
- Hash Function
- What is a Hash Collision?
- Collision Resolution Techniques
- Applications of Hashing
- Basics of Hashing in Data Structure
- Easy Problem on Hashing
- Medium Problem on Hashing
- Hard Problem on Hashing
What is Hashing in Data Structure?
Hashing is a technique used in data structures to store and retrieve data efficiently. It involves using a hash function to map data items to a fixed-size array which is called a hash table.
How it works:
- Hash Function: You provide your data items into the hash function.
- Hash Code: The hash function crunches the data and give a unique hash code.
- Hash Table: The hash code then points you to a specific location within the hash table.
Hash Table in Data Structure
A hash table is also known as a hash map. It is a data structure that stores key-value pairs. It uses a hash function to map keys to a fixed-size array, called a hash table. This allows in faster search, insertion, and deletion operations.
Hash Function
The hash function is a function that takes a key and returns an index into the hash table. The goal of a hash function is to distribute keys evenly across the hash table, minimizing collisions (when two keys map to the same index).
Common hash functions include:
- Division Method: Key % Hash Table Size
- Multiplication Method: (Key * Constant) % Hash Table Size
- Universal Hashing: A family of hash functions designed to minimize collisions
What is a Hash Collision?
A hash collision occurs when two different keys map to the same index in a hash table. This can happen even with a good hash function, especially if the hash table is full or the keys are similar.
Causes of Hash Collisions:
- Poor Hash Function: A hash function that does not distribute keys evenly across the hash table can lead to more collisions.
- High Load Factor: A high load factor (ratio of keys to hash table size) increases the probability of collisions.
- Similar Keys: Keys that are similar in value or structure are more likely to collide.
Collision Resolution Techniques
There are two types of collision resolution techniques:
- Open Addressing:
- Linear Probing: Search for an empty slot sequentially
- Quadratic Probing: Search for an empty slot using a quadratic function
- Closed Addressing:
- Chaining: Store colliding keys in a linked list or binary search tree at each index
- Cuckoo Hashing: Use multiple hash functions to distribute keys
Applications of Hashing
Hash tables are used in a wide variety of applications, including:
- Databases: Storing and retrieving data based on unique keys
- Caching: Storing frequently accessed data for faster retrieval
- Symbol Tables: Mapping identifiers to their values in programming languages
- Network Routing: Determining the best path for data packets
Basics of Hashing in Data Structure
Easy Problem on Hashing
- Find whether an array is subset of another array
- Union and Intersection of two linked lists
- Given an array A[] and a number x, check for pair in A[] with sum as x
- Maximum distance between two occurrences of same element in array
- Count maximum points on same line
- Most frequent element in an array
- Find the only repetitive element between 1 to n-1
- How to check if two given sets are disjoint?
- Non-overlapping sum of two sets
- Check if two arrays are equal or not
- Find missing elements of a range
- Minimum number of subsets with distinct elements
- Remove minimum number of elements such that no common element exist in both array
- Find pairs with given sum such that elements of pair are in different rows
- Count pairs with given sum
- Count quadruples from four sorted arrays whose sum is equal to a given value x
- Sort elements by frequency
- Find all pairs (a, b) in an array such that a % b = k
- Group words with same set of characters
- k-th distinct (or non-repeating) element in an array.
Medium Problem on Hashing
- Find Itinerary from a given list of tickets
- Find number of Employees Under every Employee
- Longest subarray with sum divisible by k
- Find the largest subarray with 0 sum
- Longest Increasing consecutive subsequence
- Count distinct elements in every window of size k
- Design a data structure that supports insert, delete, search and getRandom in constant time
- Find subarray with given sum | Set 2 (Handles Negative Numbers)
- Implementing our Own Hash Table with Separate Chaining in Java
- Implementing own Hash Table with Open Addressing Linear Probing in C++
- Minimum insertions to form a palindrome with permutations allowed
- Maximum possible difference of two subsets of an array
- Sorting using trivial hash function
- Smallest subarray with k distinct numbers
Hard Problem on Hashing
- Clone a Binary Tree with Random Pointers
- Largest subarray with equal number of 0s and 1s
- All unique triplets that sum up to a given value
- Palindrome Substring Queries
- Range Queries for Frequencies of array elements
- Elements to be added so that all elements of a range are present in array
- Cuckoo Hashing – Worst case O(1) Lookup!
- Count subarrays having total distinct elements same as original array
- Maximum array from two given arrays keeping order same
- Find Sum of all unique sub-array sum for a given array.
- Recaman’s sequence
- Length of longest strict bitonic subsequence
- Find All Duplicate Subtrees
- Find if there is a rectangle in binary matrix with corners as 1
Quick Links :
- ‘Practice Problems’ on Hashing
- Top 20 Hashing Technique based Interview Questions
- ‘Quizzes’ on Hashing
- ‘Videos’ on Hashing
Recommended: