Syllabus
Lab Component:
(In Python, using Kaggle) Using Pandas DataFrames; Visualization and plots – seaborn
Data Preparation: Cleaning – Missing data; Data Reduction – PCA; Data Transformation – Normalization, Binning; Distance measures, similarity
Association mining; Regression – Linear
Naïve Bayes classifier, Decision tree, KNN; K-Means, Hierarchical clustering
Unit I
Introduction: Introduction to Data Mining – Types of Data and Patterns Mined – Technologies – Applications – Major Issues in Data Mining. Introduction to Data Warehousing: Basic Concepts and Techniques.
Unit II
Knowing about Data – Data Preprocessing: Cleaning – Integration – Reduction – Data Transformation and Discretization.
Unit III
Mining Frequent Patterns: Basic Concepts – Frequent Itemset Mining Methods – Apriori and FP-Growth Algorithms – Mining Association Rules
Unit IV
Classification and Prediction: Issues – Algorithms – Decision Tree Induction – Bayesian Classification – k-Nearest Neighbor – Prediction – Accuracy – Precision and Recall
Unit V
Clustering: Overview of Clustering – Types of Data in Cluster Analysis – K-Means and K-Medoids, Hierarchical Clustering Algorithms
Text Books / References
1) Jiawei Han, Micheline Kamber and Jian Pei, “Data Mining: Concepts and Techniques”, Third Edition, Elsevier, 2006.
2) K. P. Soman, Shyam Diwakar and V. Ajay, “Insight into Data Mining: Theory and Practice”, Prentice Hall of India, 2006.
3) William H. Inmon, “Building the Data Warehouse”, Fourth Edition, Wiley, 2005.