PROFESSIONAL ELECTIVES
Electives in Artificial Intelligence
Course Name | Reinforcement Learning |
Course Code | 23CSE477 |
Program | B. Tech. in Computer Science and Engineering (CSE) |
Credits | 3 |
Campus | Amritapuri, Coimbatore, Bengaluru, Amaravati, Chennai |
Introduction to Reinforcement Learning, Markov Decision Process (MDP) – Markov Process, Markov Reward Process, Markov Decision Process and Bellman Equations, Partially Observable MDPs, Planning by Dynamic Programming (DP) – Policy Evaluation, Value Iteration, Policy Iteration, DP Extensions, Model-free Prediction and Control.
Integrating Planning with Learning – Model-based RL, Integrated Architecture and Simulation-based Search, Monte-Carlo (MC) Learning, Exploration and Exploitation – Multi-armed Bandits, Contextual Bandits and MDP Extensions, Integrating AI Search and Learning – Classical Games: Combining Minimax Search and RL.
Hierarchical RL – Semi-Markov Decision Process, Learning with Options, Deep RL – Proximal Policy Optimization (PPO), Deep Deterministic Policy Gradient (DDPG), Double Q-Learning, Multi-agent RL – Cooperative vs. Competitive Settings, Mixed Setting.
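Among the planning topics listed above, Value Iteration is compact enough to sketch in a few lines. The following is a minimal illustration, not course material: the function name `value_iteration`, the nested-list layout `P[s][a][s2]` and `R[s][a]`, and the two-state toy MDP are all assumptions made for this example.

```python
def value_iteration(P, R, gamma=0.9, tol=1e-8, max_iters=10_000):
    """Tabular value iteration (illustrative sketch).

    P[s][a][s2] is the probability of moving from state s to s2 under action a.
    R[s][a] is the expected immediate reward for taking action a in state s.
    Returns the estimated optimal state values and a greedy policy.
    """
    n_states = len(R)
    n_actions = len(R[0])

    def q_value(V, s, a):
        # One-step lookahead: immediate reward plus discounted expected next value.
        return R[s][a] + gamma * sum(P[s][a][s2] * V[s2] for s2 in range(n_states))

    V = [0.0] * n_states
    for _ in range(max_iters):
        # Bellman optimality backup applied to every state.
        V_new = [max(q_value(V, s, a) for a in range(n_actions))
                 for s in range(n_states)]
        delta = max(abs(x - y) for x, y in zip(V_new, V))
        V = V_new
        if delta < tol:
            break

    # Greedy policy with respect to the final value estimate.
    policy = [max(range(n_actions), key=lambda a: q_value(V, s, a))
              for s in range(n_states)]
    return V, policy


# Hypothetical toy MDP: action 0 = "stay", action 1 = "move" (deterministic).
P = [
    [[1.0, 0.0], [0.0, 1.0]],  # from state 0: stay -> 0, move -> 1
    [[0.0, 1.0], [1.0, 0.0]],  # from state 1: stay -> 1, move -> 0
]
R = [
    [0.0, 1.0],  # state 0: stay earns 0, move earns 1
    [2.0, 0.0],  # state 1: stay earns 2, move earns 0
]

V, policy = value_iteration(P, R, gamma=0.5)
print(V, policy)
```

In this toy problem the greedy policy moves out of state 0 and then stays in state 1, since staying in state 1 collects the largest reward at every step.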
Course Objectives
Course Outcomes
CO1: Understand Markov decision processes and reinforcement learning.
CO2: Apply AI search, planning, and learning.
CO3: Apply hierarchical learning techniques.
CO4: Analyze Q-learning and multi-agent systems.
CO-PO Mapping
CO | PO1 | PO2 | PO3 | PO4 | PO5 | PO6 | PO7 | PO8 | PO9 | PO10 | PO11 | PO12 | PSO1 | PSO2 |
CO1 | 3 | 2 | 3 | 3 | 2 | 0 | 0 | 2 | 2 | 2 | 0 | 0 | 3 | 3 |
CO2 | 3 | 2 | 3 | 3 | 3 | 0 | 0 | 2 | 2 | 2 | 0 | 0 | 3 | 3 |
CO3 | 3 | 2 | 3 | 3 | 3 | 0 | 0 | 2 | 2 | 2 | 0 | 0 | 3 | 3 |
CO4 | 3 | 2 | 3 | 3 | 3 | 0 | 0 | 2 | 2 | 2 | 0 | 0 | 3 | 3 |
Evaluation Pattern: 70:30
Assessment | Internal | End Semester |
Midterm | 20 | |
Continuous Assessment – Theory (*CAT) | 10 | |
Continuous Assessment – Lab (*CAL) | 40 | |
**End Semester | | 30 (50 Marks; 2-hour exam) |
*CAT – Can be Quizzes, Assignments, and Reports
*CAL – Can be Lab Assessments, Project, and Report
**End Semester can be a theory examination, a lab-based examination, or a project presentation
Textbook(s)
Richard S. Sutton and Andrew G. Barto; “Reinforcement Learning: An Introduction”; 2nd Edition, MIT Press, 2018.
Reference(s)
Dimitri P. Bertsekas; “Reinforcement Learning and Optimal Control”; 1st Edition, Athena Scientific, 2019.
Dimitri P. Bertsekas; “Dynamic Programming and Optimal Control (Vol. I and Vol. II)”; 4th Edition, Athena Scientific, 2017.
Csaba Szepesvári; “Algorithms for Reinforcement Learning (Synthesis Lectures on Artificial Intelligence and Machine Learning)”; Morgan & Claypool Publishers, 2010.