Syllabus - Introduction to Toolkits for Data Science (CD503 (C))
CSE-Data Science/Data Science
V semester
Unit 1
Python for Data Science
Review of NumPy, Pandas, and Scikit-learn. Supervised learning packages/toolkits for regression and classification: decision trees, Naive Bayes classification, support vector machines, random forests, neural networks, ensemble methods, ordinary least squares regression, logistic regression, etc. Unsupervised learning and clustering: k-means, adaptive hierarchical clustering, Gaussian mixtures, optimization using evolutionary techniques, etc.
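As an illustration of the ordinary least squares topic listed above, the following is a minimal sketch using only NumPy. The synthetic data, the random seed, and the variable names are assumptions made for this example; the syllabus does not prescribe them.

```python
import numpy as np

# Synthetic data (illustrative assumption): y = 2x + 1 plus small Gaussian noise.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 10.0, 50)
y = 2.0 * x + 1.0 + rng.normal(scale=0.1, size=x.size)

# Design matrix with an intercept column; lstsq minimizes ||X b - y||^2.
X = np.column_stack([np.ones_like(x), x])
beta, residuals, rank, _ = np.linalg.lstsq(X, y, rcond=None)
intercept, slope = beta

print(f"intercept={intercept:.2f}, slope={slope:.2f}")
```

The fitted coefficients should land close to the values used to generate the data. Scikit-learn's `LinearRegression` solves the same least squares problem behind an estimator interface.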
Unit 2
R for Data Science
Basics of R and RStudio. R data structures: vectors, factors, lists, arrays, matrices, and data frames. Working with data: importing data into R and visualizing data. Data analytics software: Weka, Orange, RapidMiner, Minitab, Power BI, GitHub, Google Colab.
Unit 3
Introduction to Deep Learning
Basics of TensorFlow and Keras. Basics of PyTorch: performing style transfer from one image to another, text generation, and sentiment analysis with PyTorch. Neural networks that recognize objects; improving the accuracy of object recognition using CNNs; using pre-trained models to build state-of-the-art classifiers; saving and loading models; time series forecasting with RNNs and LSTMs.
Unit 4
Introduction to Time Series Analysis
Time series regression and exploratory data analysis toolkits: ARMA/ARIMA models, model identification/estimation/linear operators, Fourier analysis, spectral estimation, and state-space models.
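As an illustration of the Fourier analysis and spectral estimation topics listed above, the following is a minimal sketch of a raw periodogram computed with NumPy. The synthetic series (a sinusoid with period 12 plus noise), the seed, and the variable names are assumptions made for this example.

```python
import numpy as np

# Synthetic series (illustrative assumption): period-12 sinusoid plus noise.
rng = np.random.default_rng(0)
n = 240
t = np.arange(n)
series = np.sin(2 * np.pi * t / 12) + 0.3 * rng.normal(size=n)

# Raw periodogram: squared magnitude of the FFT of the demeaned series.
centered = series - series.mean()
spectrum = np.abs(np.fft.rfft(centered)) ** 2 / n
freqs = np.fft.rfftfreq(n, d=1.0)  # frequencies in cycles per time step

# Locate the frequency carrying the most power.
peak = freqs[np.argmax(spectrum)]
print(f"dominant period ~ {1 / peak:.1f} time steps")
```

The raw periodogram is a noisy estimator of the spectrum; smoothing or averaging across segments is a common refinement.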
Unit 5
Cloud Computing for Data Science
Implementation of machine learning and deep learning on the AWS/Azure platforms. Version control tools for data science projects. Case studies of data science projects.
Practicals
Reference Books
- Brockwell & Davis (2016), Introduction to Time Series and Forecasting, 3rd edition, Springer.
- Cryer & Chan (2008), Time Series Analysis with Applications in R, Springer.
- Prado & West (2010), Time Series: Modeling, Computation, and Inference, Chapman & Hall.
- Petris, Petrone & Campagnoli (2009), Dynamic Linear Models with R, Springer.
- Ruppert & Matteson (2016), Statistics and Data Analysis for Financial Engineering with R Examples, 2nd edition, Springer.
- R for Data Science: Import, Tidy, Transform, Visualize, and Model Data, 1st edition, O'Reilly.