Explain Hashing procedure. State the properties of good hashing functions.



Subject: data structure

Hashing Procedure

Hashing is a technique for mapping keys drawn from a large set of possible values to a much smaller, fixed range of values, typically the slot indices of a table. The mapping is done by a hash function, which computes a hash value from each key. Because the range is smaller than the set of possible keys, hash values are not unique: two different keys can receive the same hash value, which is called a collision. A well-chosen hash function spreads the keys evenly over the range, so that collisions are rare and insertion, search, and deletion take constant time on average.

The hashing procedure can be summarized as follows:

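  1. Choose a hash function $h: X \rightarrow Y$, where $X$ is the set of possible keys and $Y = \{0, 1, \ldots, m-1\}$ is the set of slot indices of a hash table with $m$ slots.
  2. To insert an element with key $x$, compute the hash value $h(x)$ and store the element in slot $h(x)$ of the hash table.
  3. To search for or delete $x$, compute $h(x)$ again and examine only that slot.
  4. If two keys hash to the same slot (a collision), resolve it, for example by chaining (each slot holds a list of elements) or by open addressing (probing for another free slot).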
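
The procedure above can be sketched as a small Python class. This is a minimal illustration, not a production implementation: the class name `ChainedHashTable` and its methods are hypothetical, Python's built-in `hash()` stands in for $h$, and collisions are handled by separate chaining, which is one common choice and an assumption here.

```python
class ChainedHashTable:
    """Toy hash table; separate chaining is an illustrative assumption."""

    def __init__(self, num_slots=8):
        self.num_slots = num_slots
        # One list ("chain") per slot; keys that collide share a chain.
        self.slots = [[] for _ in range(num_slots)]

    def _index(self, key):
        # Apply the hash function and reduce the result to a slot index.
        return hash(key) % self.num_slots

    def put(self, key, value):
        chain = self.slots[self._index(key)]
        for i, (existing_key, _) in enumerate(chain):
            if existing_key == key:
                chain[i] = (key, value)  # key already present: overwrite
                return
        chain.append((key, value))

    def get(self, key, default=None):
        # Only the one chain selected by the hash value is searched.
        for existing_key, value in self.slots[self._index(key)]:
            if existing_key == key:
                return value
        return default


table = ChainedHashTable(num_slots=4)
table.put("apple", 1)
table.put("pear", 2)
print(table.get("apple"))    # 1
print(table.get("missing"))  # None
```

With only 4 slots, some keys will share a chain once more than a few are stored; lookups still work because each chain is searched by key.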

Properties of Good Hashing Functions

A good hashing function should have the following properties:

  • Uniformity: The hash function should distribute the elements of the input set evenly across the range of possible output values. This means that each output value should be equally likely to be the hash value of any given input value.
  • Collision-resistance: A collision occurs when two different input values produce the same hash value. Collisions cannot be eliminated when there are more possible inputs than output values, but a good hash function keeps them rare, so that it is unlikely that two different elements of the input set will be stored in the same location in the hash table.
  • Determinism: The same input value must always produce the same hash value. Otherwise an element stored in the hash table could not be found again.
  • Efficiency: The hash function should be efficient to compute. This is important because the hash function is applied on every insertion, search, and deletion, so a slow hash function slows down every operation on the hash table.
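
To make uniformity and collisions concrete, the sketch below compares two simple string hash functions on the same keys. Both functions and the helper `slot_loads` are hypothetical names chosen for illustration; the table size of 8 and the base 31 are arbitrary assumptions.

```python
def char_sum_hash(text, num_slots):
    # Weak: ignores character order, so anagrams always collide.
    return sum(ord(ch) for ch in text) % num_slots


def polynomial_hash(text, num_slots, base=31):
    # Better: the position of each character affects the result.
    value = 0
    for ch in text:
        value = value * base + ord(ch)
    return value % num_slots


def slot_loads(keys, hash_fn, num_slots):
    """Count how many keys land in each slot under hash_fn."""
    loads = [0] * num_slots
    for key in keys:
        loads[hash_fn(key, num_slots)] += 1
    return loads


keys = ["ab", "ba", "abc", "cab", "bca"]
print(slot_loads(keys, char_sum_hash, 8))    # [0, 0, 0, 2, 0, 0, 3, 0]
print(slot_loads(keys, polynomial_hash, 8))  # [1, 1, 1, 0, 1, 0, 0, 1]
```

Both functions are deterministic and cheap to compute, but only the second spreads these keys over different slots. On other inputs the polynomial hash can still collide; the point is only that it does not collide systematically on anagrams.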

Common Hashing Functions

There are many different hashing functions that can be used, each with its own advantages and disadvantages. Some of the most common hashing functions include:

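  • Division method: The hash value is the remainder of the key $k$ divided by the table size $m$, that is, $h(k) = k \bmod m$. This method is easy to implement and fast, but its quality depends on $m$. A prime not close to a power of 2 is the usual choice, because a value such as $m = 2^p$ makes the hash depend only on the lowest $p$ bits of the key, so keys with a regular pattern collide often.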
  • Multiplication method: The key $k$ is multiplied by a constant $A$ with $0 < A < 1$, the fractional part of $kA$ is taken, and the result is scaled by the table size $m$: $h(k) = \lfloor m \, (kA \bmod 1) \rfloor$. The choice of $m$ is much less critical than in the division method (a power of 2 works well), but the method is slightly more complex to implement.
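  • Universal hashing: Instead of using one fixed function, universal hashing chooses a hash function at random from a carefully designed family when the table is created. For any two different keys, the probability that they collide is at most $1/m$ for a table of size $m$, regardless of which keys are stored. This protects against input sets that would be a worst case for any single fixed function, but it is more complex to implement than the division and multiplication methods.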
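
The three methods can be sketched for non-negative integer keys as follows. The function names are hypothetical. The constant $A = (\sqrt{5} - 1)/2$ is a commonly suggested choice rather than a requirement, and the universal family shown, $h_{a,b}(k) = ((ak + b) \bmod p) \bmod m$ with a prime $p$ larger than every key, is one standard construction assumed here for illustration.

```python
import math
import random


def division_hash(key, m):
    # h(k) = k mod m; m is usually a prime not close to a power of 2.
    return key % m


def multiplication_hash(key, m, A=(math.sqrt(5) - 1) / 2):
    # h(k) = floor(m * (k*A mod 1)), with 0 < A < 1.
    fractional = (key * A) % 1.0
    return math.floor(m * fractional)


def make_universal_hash(m, p=2_147_483_647, rng=random):
    # p = 2**31 - 1 is prime; keys are assumed to lie in [0, p).
    a = rng.randrange(1, p)  # a is drawn from 1 .. p-1
    b = rng.randrange(0, p)  # b is drawn from 0 .. p-1
    return lambda key: ((a * key + b) % p) % m


print(division_hash(100, 7))               # 2
print(multiplication_hash(123456, 16384))  # 67

h = make_universal_hash(10)
print(h(42) == h(42))  # True: once chosen, the function is fixed
```

A new call to `make_universal_hash` generally yields a different function, which is what makes it hard to construct a set of keys that collides badly for every run.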

Applications of Hashing

Hashing is used in a wide variety of applications, including:

  • Databases: Hashing is used to index data in databases, which allows for fast retrieval of data.
  • Caching: A cache uses the hash of a key, such as a URL or a query, to locate frequently accessed data quickly, which improves the performance of applications.
  • Load balancing: Hashing is used to distribute traffic across multiple servers, which helps to improve the overall performance and reliability of a system.
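  • Digital signatures: A cryptographic hash function condenses a message into a short digest, and the digest is signed instead of the whole message. Verifying the signature confirms the origin and integrity of the data. This use requires much stronger collision-resistance than a hash table does.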
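
Two of these applications can be sketched with Python's standard `hashlib` module. The helper names `stable_hash`, `pick_server`, and `fingerprint` and the server names are hypothetical. SHA-256 is used for both tasks only to keep the example short, and the second helper computes a digest only; it does not create a signature.

```python
import hashlib


def stable_hash(text):
    # Unlike built-in hash(), this value is the same in every process.
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def pick_server(client_id, servers):
    # Load balancing: the same client is always sent to the same server.
    return servers[stable_hash(client_id) % len(servers)]


def fingerprint(data):
    # Integrity check: any change to the data changes the digest.
    return hashlib.sha256(data).hexdigest()


servers = ["server-a", "server-b", "server-c"]
print(pick_server("client-17", servers) == pick_server("client-17", servers))  # True
print(fingerprint(b"hello") == fingerprint(b"hellp"))                          # False
```

Simple modulo assignment reshuffles most clients when the number of servers changes; schemes such as consistent hashing address that, but they are beyond this sketch.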