Syllabus - Data Analytics (CS-503 (A))


Computer Science and Engineering

Data Analytics (CS-503 (A))

V-Semester

UNIT-I

DESCRIPTIVE STATISTICS

Probability Distributions, Inferential Statistics, Inferential Statistics through hypothesis tests, Regression & ANOVA, Regression, ANOVA (Analysis of Variance).

UNIT-II

INTRODUCTION TO BIG DATA

Big Data and its Importance, Four V’s of Big Data, Drivers for Big Data, Introduction to Big Data Analytics, Big Data Analytics applications.

BIG DATA TECHNOLOGIES

Hadoop’s Parallel World, Data discovery, Open source technology for Big Data Analytics, Cloud and Big Data, Predictive Analytics, Mobile Business Intelligence and Big Data, Crowd Sourcing Analytics, Inter- and Trans-Firewall Analytics, Information Management.

UNIT-III

PROCESSING BIG DATA

Integrating disparate data stores, Mapping data to the programming framework, Connecting and extracting data from storage, Transforming data for processing, Subdividing data in preparation for Hadoop MapReduce.

UNIT-IV

HADOOP MAPREDUCE

Employing Hadoop MapReduce, Creating the components of Hadoop MapReduce jobs, Distributing data processing across server farms, Executing Hadoop MapReduce jobs, Monitoring the progress of job flows, The Building Blocks of Hadoop MapReduce, Distinguishing Hadoop daemons, Investigating the Hadoop Distributed File System, Selecting appropriate execution modes: local, pseudo-distributed, fully distributed.

UNIT-V

BIG DATA TOOLS AND TECHNIQUES

Installing and Running Pig, Comparison with Databases, Pig Latin, User-Defined Functions, Data Processing Operators, Installing and Running Hive, HiveQL, Querying Data, User-Defined Functions, Oracle Big Data.

Practicals

Reference Books

  • Michael Minelli, Michele Chambers, Ambiga Dhiraj, “Big Data, Big Analytics: Emerging Business Intelligence and Analytic Trends for Today’s Businesses”, 1st Edition, Wiley CIO Series, 2013.

  • Arvind Sathi, “Big Data Analytics: Disruptive Technologies for Changing the Game”, 1st Edition, IBM Corporation, 2012.

  • Rajaraman, A., Ullman, J. D., Mining of Massive Datasets, Cambridge University Press, United Kingdom, 2012

  • Berman, J.J., Principles of Big Data: Preparing, Sharing and Analyzing Complex Information, Morgan Kaufmann, 2014

  • Barlow, M., Real-Time Big Data Analytics: Emerging Architecture, O’Reilly, 2013

  • Mayer-Schönberger, V., Cukier, K., Big Data, John Murray Publishers, 2013

  • Bill Franks, “Taming the Big Data Tidal Wave: Finding Opportunities in Huge Data Streams with Advanced Analytics”, 1st Edition, Wiley and SAS Business Series, 2012.