Course Detail

Course Name: Parallel and Distributed Data Management
Course Code: 24AI741
Program: M. Tech. in Artificial Intelligence
Credits: 3
Campus: Amritapuri, Coimbatore

Syllabus

Introduction: Parallel and Distributed architectures, models, complexity measures, Communication aspects, A Taxonomy of Distributed Systems – Models of computation: shared memory and message passing systems, synchronous and asynchronous systems, Global state and snapshot algorithms.

Distributed and Parallel Databases: Centralized versus Distributed Systems, Parallel versus Distributed Systems, Distributed Database Architectures – Shared disk, shared nothing, Distributed Database Design – Fragmentation and Allocation, Optimization.

Query Processing and Optimization – Parallel/Distributed Sorting, Parallel/Distributed Join, Parallel/Distributed Aggregates, Network Partitions, Replication, Publish/Subscribe Systems – Case study on Apache Kafka distributed publish/subscribe messaging. Hadoop and MapReduce – Data storage and analysis, Design and concepts of HDFS, YARN, MapReduce workflows and features, Setting up a Hadoop cluster.

Objectives and Outcomes

Preamble

Improvements in Database Management System (DBMS) technology have resulted in significant developments in distributed computing and parallel processing technologies. These have led to distributed and parallel database management systems, which are now the dominant data management tools for highly data-intensive applications. In addition to introducing parallel and distributed database architectures and their implementation features, this course covers advanced query processing and optimization approaches for parallel and distributed systems. Students also learn to set up a distributed database application using the latest technologies.

Course Objectives

  • To provide an understanding of distributed and parallel database architectures, so that students can choose between them when implementing a distributed application.
  • To learn how a distributed database can be implemented for an application.
  • To gain practice in distributed query processing and optimization for various distributed or parallel database applications.


Course Outcomes

  • CO1: Understand the need for different distributed and parallel database architectures and study their characteristics.
  • CO2: Design algorithms for distributed and parallel data processing.
  • CO3: Understand the concepts of fragmentation and allocation algorithms.
  • CO4: Implement optimized parallel and distributed queries for such a system.
  • CO5: Design and build an application using one of the latest distributed or parallel database technologies.

Prerequisites

  • DBMS
  • Algorithms and Data Structures
  • Advanced Java, Apache Spark

CO-PO Mapping

 

COs

Description

PO1

PO2

PO3

PO4

PO5

CO1

Understand the need for different distributed and parallel database architectures and study their characteristics.

3

CO2

Design algorithms for distributed and parallel data processing.

3

2

2

2

CO3

Understand the concepts of fragmentation and allocation algorithms.

3

2

CO4

Implement optimized parallel and distributed queries for such a system.

3

2

CO5

Design and build an application using one of the latest distributed or parallel database technologies.

3

3

3

3

3

Evaluation Pattern – 70:30

  • Midterm Exam – 30%
  • Continuous Evaluation – 40%
  • End Semester Exam – 30%

Text Books / References

  1. M. Tamer Özsu and Patrick Valduriez, “Principles of Distributed Database Systems”, 4th Edition, Springer, 2020.
  2. Dimitri P. Bertsekas and John N. Tsitsiklis, “Parallel and Distributed Computation: Numerical Methods”, 3rd Edition, 2020.
  3. Andrew S. Tanenbaum and Maarten van Steen, “Distributed Systems: Principles and Paradigms”, Third Edition, Prentice Hall, October 2017.
  4. Ajay D. Kshemkalyani and Mukesh Singhal, “Distributed Computing: Principles, Algorithms, and Systems”, Cambridge University Press, 2011.
  5. Vijay K. Garg, “Elements of Distributed Computing”, Wiley-IEEE Press, May 2002.
  6. David DeWitt and Jim Gray, “Parallel Database Systems: The Future of High Performance Database Systems”, CACM, 1992.
  7. Tom White, “Hadoop: The Definitive Guide”, 4th ed., O’Reilly, 2015.
