Course Name | Big Data Analytics |
Course Code | 23AID302 |
Program | B.Tech in Artificial Intelligence and Data Science |
Semester | 5 |
Credits | 3 |
Campus | Coimbatore, Amritapuri, Faridabad, Bengaluru, Amaravati |
Unit 1
Introduction to Big Data Analytics: Definition, characteristics, and importance of big data, tools and technologies for big data analytics, state-of-the-art computing paradigms/platforms, Hadoop ecosystem in brief, Mapper, Reducer.
Unit 2
Introduction to Functional Programming (FP), FP concepts in Scala programming, mutable and immutable data structures, Scala collections, type hierarchy, higher-order functions, closures, ConsList, tail recursion, object-oriented programming in Scala, introduction to concurrency.
Unit 3
Basic entity classes and objects in Scala, Spark architecture, Spark cluster, Resilient Distributed Datasets (RDDs), Spark Transformations and Actions APIs, DataFrames and Datasets in Spark, basic operations on RDDs and DataFrames, lazy evaluation and optimization, Directed Acyclic Graph (DAG).
Unit 4
Introduction to Machine Learning with Spark, MLlib and its algorithms, building a Machine Learning Pipeline in Spark, case studies in healthcare, finance, etc.
Course Objectives
Course Outcomes
After completing this course, students will be able to:
CO1 | Implement functional and object-oriented programs in Scala, including using higher-order functions, pattern matching, and type classes |
CO2 | Create and maintain a Spark deployment, including cluster configuration, resource allocation, and job monitoring |
CO3 | Deploy Spark for various use cases, such as ETL, data warehousing, and real-time analytics |
CO4 | Analyze real-world data sets and extract meaningful insights using statistical and machine learning techniques |
CO-PO Mapping
CO \ PO/PSO | PO1 | PO2 | PO3 | PO4 | PO5 | PO6 | PO7 | PO8 | PO9 | PO10 | PO11 | PO12 | PSO1 | PSO2 | PSO3 |
CO1 | 3 | 3 | 2 | 2 | 3 | – | – | – | 3 | 2 | 3 | 3 | – | – | – |
CO2 | 3 | 3 | 3 | 3 | 3 | – | – | – | 3 | 2 | 3 | 3 | – | – | – |
CO3 | 3 | 2 | 3 | 3 | 3 | – | – | – | 3 | 2 | 3 | 3 | – | – | – |
CO4 | 3 | 3 | 3 | 2 | 3 | – | – | – | 3 | 2 | 3 | 3 | – | – | – |
Evaluation Pattern
Assessment | Internal/External | Weightage (%) |
Assignments (Minimum 3) | Internal | 30 |
Quiz (Minimum 2) | Internal | 20 |
Mid-Term Examination | Internal | 20 |
Term project/End semester examination | External | 30 |
Text Books / References
‘Learning Spark: Lightning-Fast Big Data Analysis’, Holden Karau, Andy Konwinski, Patrick Wendell and Matei Zaharia, O'Reilly; 1st edition, 2015
‘Programming in Scala: A Comprehensive Step-by-Step Guide’, Martin Odersky, Lex Spoon and Bill Venners, Artima Inc, 2008
‘High Performance Spark: Best Practices for Scaling and Optimizing Apache Spark’, Holden Karau and Rachel Warren, O'Reilly; 1st edition, 2017
‘Scala for the Impatient’, Cay S. Horstmann, Addison-Wesley; 2nd edition, 2017