
Course Detail

Course Name Data Mining
Course Code 24ASD513
Program M.Sc. in Applied Statistics and Data Analytics
Semester 2
Credits 4
Campus Coimbatore , Kochi

Syllabus

Unit I

Introduction to Data Mining: Introduction, What is Data Mining, Definition, KDD, Challenges, Data Mining Tasks, Data Preprocessing, Data Cleaning, Missing Data, Dimensionality Reduction, Feature Subset Selection, Discretization and Binarization, Data Transformation; Measures of Similarity and Dissimilarity: Basics.

Unit II

Association Rules: Problem Definition, Frequent Item Set Generation, The Apriori Principle, Support and Confidence Measures, Association Rule Generation; Apriori Algorithm. Bayesian Belief Networks and Additional Topics Regarding Classification.
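As an illustration of the support and confidence measures named in this unit, the following is a minimal sketch rather than course material. The transactions and the helper names `support` and `confidence` are hypothetical, invented for this example.

```python
# Hypothetical toy transactions, invented purely for illustration.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Support of (antecedent and consequent together) divided by support of antecedent."""
    both = set(antecedent) | set(consequent)
    return support(both, transactions) / support(antecedent, transactions)


# Support of the itemset {bread, milk} and confidence of the rule bread -> milk.
bread_milk_support = support({"bread", "milk"}, transactions)
bread_to_milk_conf = confidence({"bread"}, {"milk"}, transactions)
```

Here {bread, milk} appears in 2 of the 4 transactions and bread in 3 of them, so the rule bread -> milk has support 0.5 and confidence 2/3.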

Unit III

Clustering: Problem Definition, Clustering Overview, Evaluation of Clustering Algorithms, Partitioning Clustering: K-Means Algorithm, K-Means Additional Issues, PAM Algorithm; Hierarchical Clustering: Agglomerative and Divisive Methods, Key Issues in Hierarchical Clustering, Strengths and Weaknesses.
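The K-Means algorithm listed in this unit can be sketched as below. This is a minimal illustration, assuming 2-D points, squared Euclidean distance, and caller-supplied starting centroids so that the run is deterministic. The function name `kmeans` and the sample points are hypothetical.

```python
def kmeans(points, centroids, iterations=10):
    """Lloyd's algorithm on 2-D points with caller-supplied starting centroids."""
    centroids = list(centroids)
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            distances = [(p[0] - c[0]) ** 2 + (p[1] - c[1]) ** 2 for c in centroids]
            clusters[distances.index(min(distances))].append(p)

        # Update step: move each centroid to the mean of its cluster.
        new_centroids = []
        for cluster, old in zip(clusters, centroids):
            if cluster:
                mean_x = sum(x for x, _ in cluster) / len(cluster)
                mean_y = sum(y for _, y in cluster) / len(cluster)
                new_centroids.append((mean_x, mean_y))
            else:
                new_centroids.append(old)  # keep an empty cluster's centroid in place

        if new_centroids == centroids:  # no centroid moved, so the run has converged
            break
        centroids = new_centroids
    return centroids, clusters


# Hypothetical sample data: two visually separated groups of three points.
points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (8.0, 8.0), (9.0, 8.0), (8.0, 9.0)]
centroids, clusters = kmeans(points, [(0.0, 0.0), (10.0, 10.0)])
```

In practice the starting centroids are usually chosen at random or by a seeding heuristic, which is one of the additional K-Means issues the unit refers to.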

Unit IV

Outlier Detection: Outliers and Outlier Analysis: What Are Outliers?, Types of Outliers, Challenges of Outlier Detection, Outlier Detection Methods, Statistical Approaches, Parametric Methods, Nonparametric Methods, Proximity-Based Approaches, Clustering-Based Approaches, Classification-Based Approaches, Mining Contextual and Collective Outliers.
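One simple parametric statistical approach of the kind this unit covers is a z-score test. The sketch below is an illustration only: it assumes roughly normal univariate data, and the threshold of 2.0, the function name `zscore_outliers`, and the sample readings are arbitrary choices made for this example.

```python
import statistics


def zscore_outliers(values, threshold=2.0):
    """Return the values whose absolute z-score exceeds `threshold`.

    Assumes the data are roughly normal; the default threshold is an
    illustrative choice, not a recommended setting.
    """
    mean = statistics.mean(values)
    std = statistics.pstdev(values)  # population standard deviation
    if std == 0:
        return []  # all values identical, so nothing stands out
    return [v for v in values if abs(v - mean) / std > threshold]


# Hypothetical readings with one value far from the rest.
readings = [10, 11, 9, 10, 12, 10, 11, 50]
outliers = zscore_outliers(readings)
```

Because the extreme value itself inflates the mean and standard deviation, a small sample like this one would not flag 50 at the more common threshold of 3, which illustrates one of the challenges of outlier detection.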

Unit V

Dimensionality Reduction: Principal-Component Analysis, Singular-Value Decomposition, and CUR Decomposition. Link Analysis: PageRank, Efficient Computation of PageRank, Topic-Sensitive PageRank, Link Spam, Hubs and Authorities. Recommendation Systems: A Model for Recommendation Systems, Content-Based Recommendations, and the Netflix Challenge.
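For the link-analysis topics in this unit, PageRank can be sketched as a power iteration over a small link graph. This is a minimal illustration under stated assumptions: a damping factor of 0.85, a fixed number of iterations, dead-end pages spreading their rank evenly, and a hypothetical three-page graph. The function name `pagerank` is also invented for this example.

```python
def pagerank(links, damping=0.85, iterations=100):
    """Power-iteration PageRank.

    `links` maps each page to the list of pages it links to; every
    linked page must also appear as a key.
    """
    pages = list(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page receives the "random jump" share first.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page, outgoing in links.items():
            if outgoing:
                # Split this page's damped rank equally among its out-links.
                share = damping * rank[page] / len(outgoing)
                for target in outgoing:
                    new_rank[target] += share
            else:
                # Dead end: spread its damped rank evenly over all pages.
                for target in pages:
                    new_rank[target] += damping * rank[page] / n
        rank = new_rank
    return rank


# Hypothetical graph: A links to B and C, B links to C, C links back to A.
graph = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}
ranks = pagerank(graph)
```

The ranks sum to 1, and in this graph C ends up ranked highest because it receives links from both A and B.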

Objectives and Outcomes

Course Outcomes:

CO1: Become familiar with basic data mining concepts and understand association rule mining.

CO2: Implement clustering techniques on unlabeled data.

CO3: Implement various approaches for dealing with outliers.

CO4: Implement dimensionality reduction techniques on massive datasets.

CO5: Understand the working of recommendation systems.

CO-PO Mapping:

 

PO1 PO2 PO3 PO4 PO5 PO5 PO6 PO7 PO8 PO9 PO10 PO11 PO12

CO1: 2 3 2 2 2 2 2 - 2 2

CO2: 2 2 2 2 2 2 2 - 2 2

CO3: 2 2 2 2 2 3 2 - 2 2

CO4: 2 2 1 2 2 2 2 - 1 2

CO5: 1 2 1 1 1 2 2 - 1 2

Text Books / References

Text Books / Reference Books and Websites:

  1. Han, J., Pei, J., & Kamber, M. (2011). Data mining: concepts and techniques. Elsevier.
  2. Rajaraman, A., & Ullman, J. D. (2011). Mining of massive datasets. Cambridge University Press.
  3. https://nptel.ac.in/courses/106/105/106105174/
  4. https://nptel.ac.in/content/storage2/nptel_data3/html/mhrd/ict/text/110105083/lec52.pdf
  5. Ngo, T. (2011). Data mining: practical machine learning tools and technique, by Ian H. Witten, Eibe Frank, Mark A. Hall. ACM SIGSOFT Software Engineering Notes, 36(5), 51-52.

