Course Detail

Course Name Big Data Analytics
Course Code 24BUS377
Program BBA (Bachelor of Business Administration)
Credits 3
Campus Mysuru

Syllabus

Discipline Specific Electives: Business Analytics

Unit 1

Introduction to Big Data and Big Data Programming Models – Massively Parallel Processing (MPP) Database Systems – In-Memory Database Systems – MapReduce Systems – Bulk Synchronous Parallel (BSP) Systems, Big Data and Transactional Systems, Scaling of Database

Unit 2

Introduction to Hadoop, Components of Hadoop – Hadoop Distributed File System (HDFS), Hadoop 3.0 – Components of YARN, HDFS High Availability, Hadoop Program: Word Count in Local Mode versus Cluster Mode, Hadoop Administration: Hadoop Configuration Files, Configuring Hadoop Daemons, Precedence of Hadoop Configuration Files, Cluster Administration Utilities, Command Line HDFS Administration, Rebalancing HDFS Data – Copying Large Amounts of Data from HDFS, Components of a MapReduce Program, Basics of MapReduce Development: Hadoop and Data Processing, Working with Large Datasets: Preparing the Development Environment – Preparing the Hadoop System – Word Count Implementation Using MapReduce – Introduction to Hadoop I/O, Hadoop Input/Output: Compression Schemes: What Can Be Compressed? – Compression Schemes

Hadoop in the Cloud – Economics – Self-Hosted Cluster – Cloud-Hosted Cluster – Elasticity – On Demand – Bid Pricing – Hybrid Cloud – Logistics: Ingress/Egress – Data Retention – Security – Cloud Usage Models – Cloud Providers – Amazon Web Services, Microsoft Azure – Choosing a Cloud Vendor – Case Study: Amazon Web Services – Elastic MapReduce – Elastic Compute Cloud

Unit 3

HBase, Architecture and role of HBase, HBase schema design, Basic programming for HBase, Combining the capabilities of HBase and HDFS, Log file Analysis.

Unit 4

Hive Architecture and Concepts, Data Definition Language, Data Manipulation Language, External Interfaces, Hive Scripts – Performance, MapReduce Integration, Creating Partitions – User-Defined Functions – HiveQL Compiler Details.

Unit 5

Data Processing Using Pig: An Introduction to Pig, Running Pig, Executing a Pig Script – Embedded Java Program, Pig Latin: Comments in a Pig Script – Execution of Pig Statements – Pig Commands, User-Defined Functions: Eval Functions Invoked in the Mapper – Eval Functions Invoked in the Reducer – Writing and Using a Custom FilterFunc, Comparison of Pig versus Hive, Understanding Automated Data Processing with Oozie

Objectives and Outcomes

Objective:

To expose students to Big Data technologies and environments.

Course Outcomes:

CO1: To gain knowledge of Big Data technologies.

CO2: To understand the framework of Big Data.

CO3: Ability to interact with a Big Data environment and analyze the data.

CO4: Knowledge of various tools related to Big Data analysis.

CO/PO PO1 PO2 PO3 PO4 PO5 PO6 PO7 PO8 PO9 PO10 PO11 PO12
CO1 3 2 3 2 1 1 2 2 2 3 3 2
CO2 3 3 3 2 2 1 2 2 2 3 3 2
CO3 2 3 3 2 2 2 2 2 2 3 3 3
CO4 3 2 3 2 1 1 2 2 2 3 3 2

Text Books / References

TEXTBOOK:

  • Pro Apache Hadoop, 2nd Edition, Jason Venner, Sameer Wadkar, and Madhu Siddalingaiah
  • Big Data Analytics with R and Hadoop, Vignesh Prajapati
