Syllabus - Data Mining (IT 603(B))


Information Technology

Data Mining (IT 603(B))

Semester VI

Unit I

Data Warehousing

Need for data warehousing, Basic elements of data warehousing, Data Mart, Data Warehouse Architecture, Extract and Load Process, Clean and Transform Data, Star, Snowflake and Galaxy Schemas for Multidimensional Databases, Fact and Dimension Data, Partitioning Strategy: Horizontal and Vertical Partitioning, Data Warehouse and OLAP Technology, Multidimensional Data Models and different OLAP Operations, ROLAP, MOLAP, Data Warehouse Implementation, Efficient OLAP Server: Computation of Data Cubes, Processing of OLAP Queries, Indexing Data.

Unit II

Data Mining

Data Preprocessing, Data Integration and Transformation, Data Reduction, Discretization and Concept Hierarchy Generation, Basics of Data Mining, Data Mining Techniques, KDP (Knowledge Discovery Process), Applications and Challenges of Data Mining.

Unit III

Mining Association Rules in Large Databases

Association Rule Mining, Single-Dimensional Boolean Association Rules, Multi-Level Association Rules, Apriori Algorithm, FP-Growth Algorithm, Association Rule Mining in Time Series, Latest Trends in Association Rule Mining.

Unit IV

Classification and Clustering

Distance Measures, Types of Clustering Algorithms, K-Means Algorithm, Decision Tree, Bayesian Classification, Other Classification Methods, Prediction, Classifier Accuracy, Categorization of methods, Outlier Analysis.

Unit V

Introduction of Web Mining and its types, Spatial Mining, Temporal Mining, Text Mining, Security Issue, Privacy Issue, Ethical Issue.

Course Objective

  • To introduce the data warehouse and its components.

  • To introduce the knowledge discovery process, data mining and its functionalities.

  • To develop an understanding of various algorithms for association rule mining and their differences.

  • To introduce various classification techniques.

  • To introduce various clustering algorithms.

Course Outcome

  • Demonstrate an understanding of the importance of data warehousing and OLAP technology.

  • Organize and prepare the data needed for data mining using preprocessing techniques.

  • Implement appropriate data mining methods, such as classification, clustering or frequent pattern mining, on various data sets.

  • Define and apply metrics to measure the performance of various data mining algorithms.

  • Demonstrate an understanding of data mining on various types of data, such as web data and spatial data.

Practicals

Reference Books

  • Arun K. Pujari, “Data Mining Techniques”, Universities Press

  • Han, Kamber, “Data Mining Concepts & Techniques”

  • M. Kaufman, P. Ponniah, “Data Warehousing Fundamentals”

  • M. H. Dunham, “Data Mining Introductory & Advanced Topics”

  • Ralph Kimball, “The Data Warehouse Lifecycle Toolkit”

  • E. G. Mallach, “The Decision Support & Data Warehouse Systems”