(a) What is hashing function? Explain any four hashing functions with examples. (b) Write notes on the following: (i) Overflow handling in hashing (ii) Static tree tables
Q.) (a) What is hashing function? Explain any four hashing functions with examples. (b) Write notes on the following: (i) Overflow handling in hashing (ii) Static tree tables
Subject: Data Structure and Algorithm(a) What is a Hashing Function?
A hashing function is an algorithm that takes an input (or 'message') and returns a fixed-size string of bytes. The output is typically a 'hash code' or 'hash value'. Hashing functions are designed to be a one-way function, which means it should be computationally infeasible to reverse the operation or find two different inputs that produce the same hash value (a condition known as a collision). Hashing functions are widely used in computer science for various applications such as data retrieval, cryptographic security, and data integrity verification.
Four Hashing Functions with Examples:
MD5 (Message Digest Algorithm 5):
- Purpose: Originally designed for cryptographic purposes but now considered broken and unsuitable for further use because it's vulnerable to hash collisions.
- Output Size: 128-bit hash value.
- Example:
plaintext Input: "Hello, World!" MD5 Hash: "65a8e27d8879283831b664bd8b7f0ad4"
SHA-1 (Secure Hash Algorithm 1):
- Purpose: Designed by the National Security Agency (NSA) to be part of the Digital Signature Algorithm. Like MD5, SHA-1 is also no longer considered secure against well-funded attackers.
- Output Size: 160-bit hash value.
- Example:
plaintext Input: "Hello, World!" SHA-1 Hash: "2ef7bde608ce5404e97d5f042f95f89f1c232871"
SHA-256 (Secure Hash Algorithm 256-bit):
- Purpose: Part of the SHA-2 family, designed by the NSA, and widely used for secure applications including in blockchain and SSL certificates.
- Output Size: 256-bit hash value.
- Example:
plaintext Input: "Hello, World!" SHA-256 Hash: "a591a6d40bf420404a011733cfb7b190d62c65bf0bcda32b57b277d9ad9f146e"
CRC32 (Cyclic Redundancy Check):
- Purpose: A checksum hashing function used to detect accidental changes to raw data in digital networks and storage devices.
- Output Size: 32-bit hash value.
- Example:
plaintext Input: "Hello, World!" CRC32 Hash: "1c291ca3"
(b) Notes on the following:
(i) Overflow Handling in Hashing
Overflow occurs in hashing when a hash function generates the same index for more than one key, which is known as a collision. Handling collisions is critical to maintain the efficiency of a hash table. There are several methods to handle overflow:
Chaining:
- Each slot of the hash table contains a linked list of entries that hash to the same index.
- When a collision occurs, the item is added to the list at that index.
Open Addressing:
- All elements are stored in the hash table itself.
- When a collision occurs, the table is probed sequentially or by another hashing function to find an empty slot.
Double Hashing:
- A form of open addressing with a secondary hash function.
- If a collision occurs, the secondary hash function is used to calculate the interval between probes.
Rehashing:
- The hash table is resized and all existing items are rehashed and inserted into the new table.
- This is often done when the load factor of the table exceeds a certain threshold.
(ii) Static Tree Tables
Static tree tables refer to a data structure that uses a tree-like model to organize static data in a hierarchical manner. The term 'static' implies that the structure of the tree and the data it contains do not change over time. Binary search trees, AVL trees, and B-trees are examples of static tree tables.
Characteristics of Static Tree Tables:
- Ordered: The tree maintains data in a sorted order, allowing for efficient search operations.
- Balanced: Some static trees (like AVL trees) maintain balance to ensure that operations such as insertions, deletions, and searches can be performed in logarithmic time.
- Immutable: Once constructed, the structure does not change, which simplifies the design and can improve performance for read-only access patterns.
Applications of Static Tree Tables:
- Database Indexing: Used to index records for efficient retrieval.
- File Systems: The directory structure can be represented as a static tree.
- Lookup Tables: Used in applications where fast lookup of static data is required.
Static tree tables are particularly useful when the dataset does not change, or changes infrequently, and when there is a need for fast search and retrieval operations.