Write brief notes on any four of the following:


Subject: Energy, Environment and Society
  1. Database Normalization:
  • Normalization is a process of organizing data in a database to reduce redundancy and improve data integrity.
  • It involves decomposing a table into smaller tables based on functional dependencies, with foreign keys linking the resulting tables to preserve referential integrity.
  • Normalization forms include:
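    • First Normal Form (1NF): Each column holds a single atomic value, with no repeating groups or multi-valued attributes.
    • Second Normal Form (2NF): The table is in 1NF and each non-key attribute is fully functionally dependent on the whole primary key, not on just part of a composite key.
    • Third Normal Form (3NF): The table is in 2NF and no non-key attribute is transitively dependent on the primary key, that is, non-key attributes do not depend on other non-key attributes.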
  • Normalization helps in:
    • Reducing data redundancy, which saves storage space.
    • Improving data integrity by minimizing data inconsistencies.
    • Avoiding insertion, update, and deletion anomalies, which makes data manipulation more reliable (queries may need more joins as a result).
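As a rough illustration of the decomposition described above, the sketch below uses Python's built-in sqlite3 module. The table and column names (students, enrollments, student_id, and so on) and the sample rows are hypothetical and only serve the example. In the flat rows, the name depends only on student_id rather than on the full (student_id, course) key, which is the kind of dependency 2NF removes.

```python
import sqlite3

# Hypothetical unnormalized rows: the student's name is repeated per course.
flat_rows = [
    (1, "Asha", "DBMS"),
    (1, "Asha", "Networks"),
    (2, "Ravi", "DBMS"),
]

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.execute(
    "CREATE TABLE students (student_id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
)
conn.execute(
    "CREATE TABLE enrollments ("
    " student_id INTEGER NOT NULL REFERENCES students(student_id),"
    " course TEXT NOT NULL,"
    " PRIMARY KEY (student_id, course))"
)

# Decompose: the name moves to its own table and is stored once per student.
for student_id, name, course in flat_rows:
    conn.execute("INSERT OR IGNORE INTO students VALUES (?, ?)", (student_id, name))
    conn.execute("INSERT INTO enrollments VALUES (?, ?)", (student_id, course))

student_count = conn.execute("SELECT COUNT(*) FROM students").fetchone()[0]

# A join reconstructs the original rows, so no information was lost.
rejoined = conn.execute(
    "SELECT s.student_id, s.name, e.course"
    " FROM students s JOIN enrollments e ON e.student_id = s.student_id"
    " ORDER BY s.student_id, e.course"
).fetchall()
```

Renaming a student now touches a single row in students, whereas the flat layout would require updating every enrollment row for that student.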
  2. Data Warehousing:
  • A data warehouse is a central repository of data extracted from various sources to support decision-making processes.
  • It integrates data from multiple sources, cleanses it, and transforms it into a consistent format.
  • Data warehouses are often used for:
    • Business intelligence and reporting.
    • Data analysis and mining.
    • Historical analysis and forecasting.
  • Key characteristics of data warehouses include:
    • Subject-oriented: Data is organized based on business subjects rather than system structures.
    • Integrated: Data from different sources is integrated into a consistent format.
    • Time-variant: Data is stored over time, allowing historical analysis.
    • Non-volatile: Once loaded, data is not normally updated or deleted; new data is appended periodically, so historical records stay stable.
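The integrate, cleanse, and transform steps can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: the two source systems, their field names, and the helper functions (from_crm, from_shop, load) are all hypothetical, and a plain list stands in for a warehouse table.

```python
from datetime import date

# Hypothetical extracts from two source systems with inconsistent formats.
crm_rows = [{"customer": " Alice ", "amount": "120.50"}]
shop_rows = [{"cust_name": "BOB", "total_cents": 9900}]


def from_crm(row):
    """Cleanse and convert a CRM record to the common warehouse format."""
    return {"customer": row["customer"].strip().title(),
            "amount": round(float(row["amount"]), 2)}


def from_shop(row):
    """Cleanse and convert a web-shop record to the common warehouse format."""
    return {"customer": row["cust_name"].strip().title(),
            "amount": round(row["total_cents"] / 100, 2)}


def load(warehouse, rows, load_date):
    """Append rows stamped with a load date; existing rows are never modified."""
    for row in rows:
        warehouse.append({**row, "load_date": load_date.isoformat()})


warehouse = []  # stand-in for a warehouse fact table
staged = [from_crm(r) for r in crm_rows] + [from_shop(r) for r in shop_rows]
load(warehouse, staged, date(2024, 1, 31))
```

Stamping each row with a load date and only ever appending reflects the time-variant and non-volatile characteristics listed above.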
  3. Hashing:
  • Hashing is a technique for mapping data of arbitrary size to a fixed-size value, called a hash value or hash code.
  • It is used for indexing and searching data efficiently.
  • Hashing algorithms take an input value and produce a hash value through a mathematical function.
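  • Common cryptographic hashing algorithms include:
    • MD5 (Message-Digest Algorithm 5): Produces a 128-bit hash value; no longer considered secure because collisions can be generated easily.
    • SHA-1 (Secure Hash Algorithm 1): Produces a 160-bit hash value; also deprecated for security use after practical collision attacks.
    • SHA-256 (a member of the SHA-2 family): Produces a 256-bit hash value and is still widely used.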
  • Hashing is useful for:
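    • Compact identification of data (fingerprinting); distinct inputs can occasionally collide, so hash values are not guaranteed to be unique.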
    • Fast data retrieval from a large dataset.
    • Ensuring data integrity by detecting changes.
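The points above can be tried out with Python's standard hashlib module. The sketch below checks the digest sizes of the algorithms listed above, maps a key to a bucket as a hash-based index might, and verifies integrity by recomputing a digest. The helper names bucket_for and is_unchanged and the sample data are hypothetical.

```python
import hashlib

data = b"brief notes on hashing"

# Digest sizes in bits for the algorithms listed above.
digest_bits = {
    name: hashlib.new(name, data).digest_size * 8
    for name in ("md5", "sha1", "sha256")
}


def bucket_for(key, bucket_count):
    """Map a string key to a bucket index, as a hash-based index would."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % bucket_count


def is_unchanged(content, expected_hex):
    """Integrity check: recompute the SHA-256 digest and compare."""
    return hashlib.sha256(content).hexdigest() == expected_hex


checksum = hashlib.sha256(data).hexdigest()
```

Using SHA-256 for bucket selection keeps the example to one module; real hash tables typically rely on faster non-cryptographic hash functions.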
  4. Clustering:
  • Clustering is a technique for grouping data into clusters based on their similarities.
  • It is used for data analysis, pattern recognition, and machine learning.
  • Clustering algorithms group data points based on various distance or similarity measures.
  • Common clustering algorithms include:
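    • K-means clustering: Partitions data into a specified number of clusters (k) by assigning each point to the nearest cluster centroid.
    • Hierarchical clustering: Builds a hierarchy of nested clusters by repeatedly merging (agglomerative) or splitting (divisive) clusters based on similarity.
    • Density-based clustering (e.g., DBSCAN): Finds clusters as dense regions of points separated by sparser regions.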
  • Clustering is useful for:
    • Identifying patterns and structures in data.
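    • Summarizing large datasets by representing each cluster with a representative point, such as its centroid.
    • Supporting other machine learning tasks, for example as a preprocessing step or for detecting outliers.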
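To make K-means concrete, here is a minimal pure-Python sketch of the usual assign-then-update loop (Lloyd's algorithm) using Euclidean distance. It is a simplified illustration, not a production implementation: seeding the centroids with the first k points is an assumption made to keep the run deterministic, and the function name k_means and the sample points are hypothetical.

```python
import math


def k_means(points, k, iterations=100):
    """Lloyd's algorithm; the first k points seed the centroids (a simplification)."""
    centroids = [tuple(p) for p in points[:k]]
    labels = []
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        updated = []
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                updated.append(
                    tuple(sum(axis) / len(members) for axis in zip(*members))
                )
            else:
                updated.append(centroids[c])  # keep an empty cluster's centroid
        if updated == centroids:
            break  # converged: centroids stopped moving
        centroids = updated
    return centroids, labels


points = [(1.0, 1.0), (1.5, 2.0), (8.0, 8.0), (9.0, 9.5), (1.0, 0.5), (8.5, 9.0)]
centroids, labels = k_means(points, k=2)
```

On this sample the three points near (1, 1) end up in one cluster and the three near (8.5, 9) in the other. Practical implementations usually choose initial centroids more carefully, since the result depends on the starting positions.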