The information on this page about teaching, research and institutional assignments is certified by the University; further information, prepared by the professor, is available on the personal web page and in the curriculum vitae listed on this page.
Information on professor
Professor: Restelli Marcello
Qualification: Associate professor, full time
Department: Dipartimento di Elettronica, Informazione e Bioingegneria
Scientific-Disciplinary Sector: ING-INF/05 - Information Processing Systems
Curriculum Vitae: Download CV (357.26Kb - 02/12/2019)

Professor's office hours
DEI, Wednesday, from 11:00 to 13:00
4015
It is recommended to make an appointment with the professor by email.
Professor's personal website: http://home.deib.polimi.it/restelli/

Data source: RE.PUBLIC@POLIMI - Research Publications at Politecnico di Milano

List of publications and research products for the year 2022
No product yet registered in the year 2022

List of publications and research products for the year 2021
Type Title of the Publication/Product
Abstract in Journal
Abstract PO-065: Artificial intelligence to improve selection for NSCLC patients treated with immunotherapy
Conference proceedings
Conservative Online Convex Optimization
Inferring Functional Properties from Fluid Dynamics Features
Journal Articles
Dealing with multiple experts and non-stationarity in inverse reinforcement learning: an application to real-life problems
MushroomRL: Simplifying Reinforcement Learning Research
Policy space identification in configurable environments
Quantum compiling by deep reinforcement learning
Safe policy iteration: A monotonically improving approximate policy iteration approach

List of publications and research products for the year 2020
Type Title of the Publication/Product
Conference proceedings
A Data-Based Approach for the Prediction of Stuck-Pipe Events in Oil Drilling Operations
A Novel Confidence-Based Algorithm for Structured Bandits
An Asymptotically Optimal Primal-Dual Incremental Algorithm for Contextual Linear Bandits
An Intrinsically-Motivated Approach for Learning Highly Exploring and Fast Mixing Policies
Balancing Learning Speed and Stability in Policy Gradient via Adaptive Exploration
Control Frequency Adaptation via Action Persistence in Batch Reinforcement Learning
Dealing with Transaction Costs in Portfolio Optimization: Online Gradient Descent with Momentum
Driving exploration by maximum distribution in gaussian process bandits
Gradient-Aware Model-Based Policy Search
Inverse Reinforcement Learning from a Gradient-based Learner
Model-Free Non-Stationarity Detection and Adaptation in Reinforcement Learning
Option Hedging with Risk Averse Reinforcement Learning
Risk-Averse Trust Region Optimization for Reward-Volatility Reduction
Sequential Transfer in Reinforcement Learning with a Generative Model
Sharing Knowledge in Multi-Task Deep Reinforcement Learning
Truly Batch Model-Free Inverse Reinforcement Learning about Multiple Intentions
Journal Articles
Combining reinforcement learning with rule-based controllers for transparent and general decision-making in autonomous driving
Importance Sampling Techniques for Policy Optimization
On the use of the policy gradient and Hessian in inverse reinforcement learning
Sliding-Window Thompson Sampling for Non-Stationary Settings

List of publications and research products for the year 2019
Type Title of the Publication/Product
Conference proceedings
Dealing with interdependencies and uncertainty in multi-channel advertising campaigns optimization
Exploiting Action-Value Uncertainty to Drive Exploration in Reinforcement Learning
Exploration Driven by an Optimistic Bellman Equation
Feature Selection via Mutual Information: New Theoretical Insights
IDIL: Exploiting Interdependence to Optimize Multi-Channel Advertising Campaigns
Optimistic Policy Optimization via Multiple Importance Sampling
Propagating Uncertainty in Reinforcement Learning via Wasserstein Barycenters
Reinforcement Learning Based Control of Coherent Transport by Adiabatic Passage of Spin Qubits
Reinforcement Learning in Configurable Continuous Environments
Transfer of Samples in Policy Search via Multiple Importance Sampling
Journal Articles
Coherent transport of quantum states by deep reinforcement learning

List of publications and research products for the year 2018
Type Title of the Publication/Product
Conference proceedings
A Combinatorial-Bandit Algorithm for the Online Joint Bid/Budget Optimization of Pay-per-Click Advertising Campaigns
An upper limb Functional Electrical Stimulation controller based on Reinforcement Learning: A feasibility case study.
Configurable Markov Decision Processes
Does Reinforcement Learning outperform PID in the control of FES-induced elbow flex-extension?
Importance Weighted Transfer of Samples in Reinforcement Learning
Improving Multi-Armed Bandit Algorithms for Pricing
Online Follower's Behaviour Identification in Leadership Games
Online Joint Bid/Budget Optimization of Pay-per-click Advertising Campaigns
Policy optimization via importance sampling
Reinforcement Learning Control of Functional Electrical Stimulation of the upper limb: a feasibility study.
Stochastic Variance-Reduced Policy Gradient
Targeting Optimization for Internet Advertising by Learning from Logged Bandit Feedback
Transfer of Value Functions via Variational Methods
When Gaussian Processes Meet Combinatorial Bandits: GCB
Journal Articles
Improving multi-armed bandit algorithms in online pricing settings