Syllabus - Data Mining & Warehousing (CD503 (A))


CSE-Data Science/Data Science

Data Mining & Warehousing (CD503 (A))

V semester

Unit 1

Data Warehousing

Introduction, Delivery Process, Data Warehouse Architecture, Data Preprocessing: Data cleaning, Data integration and transformation, Data reduction. Data Warehouse Design: Data warehouse schema, Partitioning strategy, Data warehouse implementation, Data marts, Metadata, Example of a multidimensional data model, Introduction to pattern warehousing.

Unit 2

OLAP Systems

Basic concepts, OLAP queries, Types of OLAP servers, OLAP operations, etc. Data Warehouse Hardware and Operational Design: Security, Backup and Recovery.

Unit 3

Introduction to Data & Data Mining

Data types, Quality of data, Data preprocessing, Similarity measures, Summary statistics, Data distributions, Basic data mining tasks, Data mining vs. knowledge discovery in databases, Issues in data mining, Introduction to fuzzy sets and fuzzy logic.

Unit 4

Supervised Learning (Classification)

Statistical-based algorithms, Distance-based algorithms, Decision tree-based algorithms, Neural network-based algorithms, Rule-based algorithms, Probabilistic Classifiers

Unit 5

Clustering & Association Rule mining

Hierarchical algorithms, Partitional algorithms, Clustering large databases: BIRCH, DBSCAN, CURE algorithms. Association rules: Parallel and distributed algorithms such as Apriori and FP-growth algorithms.

Course Objective

Students should understand the value of historical data and data mining in solving real-world problems. Students should become fluent with the basic supervised and unsupervised learning algorithms commonly used in data mining. Students should develop skill in using data mining to solve real-world problems.

Course Outcome

  • CO1. Understand the need for designing enterprise data warehouses and approach business problems analytically by identifying opportunities to derive business.

  • CO2. Compare and contrast various methods for storing and retrieving data from different data sources/repositories.

  • CO3. Ascertain the application of data mining in various areas, and preprocess and visualize the given data for a given application or data exploration/mining task.

  • CO4. Apply supervised learning methods, such as classification and its various types, to given data sets.

  • CO5. Apply unsupervised learning methods, such as clustering and its various types, to given data sets.

  • CO6. Apply association rule mining to various domains.

Practicals

Reference Books

  • Pang-Ning Tan, Steinbach & Kumar, “Introduction to Data Mining”, Pearson Edu., 2019.

  • Jiawei Han, Micheline Kamber, “Data Mining: Concepts and Techniques”, Morgan Kaufmann Publishers.

  • Margaret H. Dunham, “Data Mining: Introductory and Advanced Topics”, Pearson Edu., 2009.

  • Anahory & Murray, “Data Warehousing in the Real World”, Pearson Edu., 2009.