News
- Course page on Ariel
- The RL course will run from January 15 to March 19, 2024. Classes will be held in Via Celoria on Mondays and Tuesdays from 16:30 to 18:30.
Goals
This course introduces the theoretical and algorithmic foundations of Reinforcement Learning, the subfield of Machine Learning studying adaptive agents that take actions and interact with an unknown environment. Reinforcement learning is a powerful paradigm for the study of autonomous AI systems, and has been applied to a wide range of tasks including autonomous driving, industrial automation, conversational agents (including those based on large language models), trading and finance, game playing, and healthcare.
Syllabus
- Introduction (version Jan 20, 2023)
- What is reinforcement learning
- Markov decision processes
- Evaluation criteria: finite horizon, infinite horizon, discounted horizon
- Markov policies and their properties
- Finite horizon (version Jan 19, 2023)
- State-value function
- Action-value function
- Bellman optimality equations for finite horizon
- Discounted horizon (version Jan 20, 2023)
- Bellman optimality equations for discounted horizon
- Value iteration
- Policy iteration
- Linear programming interpretation
- Model-based reinforcement learning
- Model-free reinforcement learning (version Feb 4, 2023)
- Q-learning
- SARSA
- Temporal difference algorithms (version Feb 9, 2023)
- TD(0)
- TD(λ)
- Equivalence between forward and backward view
- Value Function Approximation
- Linear Value Function Approximation
- Monte Carlo Value Function Approximation
- TD Learning with Value Function Approximation
- Value Function Approximation for Policy Evaluation
- Control using Value Function Approximation
- Action-Value Function Approximation
- Non-Linear and Deep Neural Network Approximation
- Model-Free Control with General Function Approximation
- Q-Learning with Value Function Approximation
- Policy Gradient
- Policy Gradient Theorem
- Off-Policy Policy Gradients
- Monte-Carlo Policy Gradient (REINFORCE)
- Actor-critic algorithms
- Deep Q-learning algorithm (DQN)
- Case Study: RL in Classic Games
- Formalize Word Problem as MDP
- Choice of the Algorithms
- Problem KPIs
- Coding and implementation
Reference material
- Lecture notes (linked to the syllabus).
- Additional material (Prof. Ferrara)
- Suggested reading: Shie Mannor, Yishay Mansour, and Aviv Tamar. RL: Foundations (in progress).
Exam
The exam consists of developing an experimental project and writing a report, which will be discussed in an oral exam. The discussion will also include questions on the theory covered in the course. The final grade will take into account both the project and the oral exam.
Course calendar
Browse the calendar pages to find out what was covered in each class.