Course Syllabus
Concept of Machine Learning: Approaches to Modelling – Importance of Words in Documents – Hash Functions – Indexes – Secondary Storage – The Base of Natural Logarithms – Power Laws – MapReduce. Finding Similar Items: Shingling – LSH – Distance Measures. Mining Data Streams: Stream Data Model – Sampling Data – Filtering Streams. Link Analysis: PageRank – Link Spam.
Frequent Item Sets: Market Basket Analysis – A-Priori Algorithm – PCY Algorithm. Clustering: Hierarchical Clustering – K-Means – Clustering in Non-Euclidean Spaces – BFR – CURE. Recommendation Systems: Utility Matrix – Content-Based – Collaborative Filtering – UV Decomposition. Mining Social Network Graphs: Social Networks as Graphs – Clustering – Partitioning – SimRank. Dimensionality Reduction: Eigenvalue Decomposition – PCA – SVD.
Large-Scale Machine Learning: Neural Networks – Support Vector Machines and the use of kernels to produce separable data and non-linear classification boundaries. Overview: Deep Learning – Tools for Data Ingestion – Analytics and Visualization.