machine learning with experts and bandits
4 credits (20 hours), PhD program in Computer Science. Instructor: Nicolò Cesa-Bianchi
Abstract
The course covers topics at the interface between machine learning and game theory. This area, also known as online learning, is concerned with the design of efficient algorithms for big data using local optimization techniques. We will start by introducing the Hedge algorithm for solving a class of decision problems known as prediction with expert advice. Hedge will then be analyzed in the wider setting of sequential decision problems with partial feedback (multiarmed bandits). We will describe two applications of multiarmed bandits: online auctions and recommender systems. In the second part of the course we will show that Hedge is a special case of a larger family of algorithms for online optimization known as Mirror Descent, whose analysis will be carried out using convex analysis tools. Finally, we will show how certain instances of Mirror Descent can learn data sources that change over time in an arbitrary fashion.
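As a rough illustration of the starting point of the course, the following is a minimal sketch of Hedge (exponential weights) for prediction with expert advice. It is not taken from the course material: the function name `hedge`, the argument names `loss_rounds` and `eta`, and the assumption that every loss lies in [0, 1] are choices made here for illustration only.

```python
import math


def hedge(loss_rounds, eta):
    """Run Hedge on a sequence of loss vectors (one loss per expert per round).

    Assumptions (illustrative, not from the course page): losses lie in
    [0, 1] and eta > 0 is a fixed learning rate.
    Returns the algorithm's total expected loss and the cumulative loss
    of each expert.
    """
    n_experts = len(loss_rounds[0])
    cum_loss = [0.0] * n_experts
    expected_loss = 0.0
    for losses in loss_rounds:
        # Weights are exponential in the negative cumulative loss; subtracting
        # the minimum leaves the distribution unchanged and avoids underflow.
        best = min(cum_loss)
        weights = [math.exp(-eta * (c - best)) for c in cum_loss]
        total = sum(weights)
        probs = [w / total for w in weights]
        # Expected loss of sampling an expert from the current distribution.
        expected_loss += sum(p * l for p, l in zip(probs, losses))
        # Full feedback: the loss of every expert is observed.
        cum_loss = [c + l for c, l in zip(cum_loss, losses)]
    return expected_loss, cum_loss


# Hypothetical usage: three experts with fixed losses over T rounds.
T = 100
rounds = [[0.0, 1.0, 0.5] for _ in range(T)]
eta = math.sqrt(8 * math.log(3) / T)
alg_loss, expert_loss = hedge(rounds, eta)
regret = alg_loss - min(expert_loss)
```

With losses in [0, 1], the standard analysis bounds the regret by ln(N)/eta + eta*T/8 for N experts and T rounds, which is why `eta` is set to sqrt(8 ln(N)/T) in the usage lines. In the bandit setting only the loss of the chosen action is observed, so the full-feedback update above would have to be replaced by one based on loss estimates.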
Syllabus
References
Exam
To pass the exam you must write a research note summarizing one or more papers on a topic agreed in advance with the lecturer. Your note must include: