Markov Decision Processes

Optimistic PAC Reinforcement Learning: the Instance-Dependent View

Optimistic algorithms have been extensively studied for regret minimization in episodic tabular Markov Decision Processes (MDPs), both …

Andrea Tirinzoni, Aymen Al Marjani, Emilie Kaufmann

Near Instance-Optimal PAC Reinforcement Learning for Deterministic MDPs

In probably approximately correct (PAC) reinforcement learning (RL), an agent is required to identify an $\epsilon$-optimal policy with …

Andrea Tirinzoni, Aymen Al Marjani, Emilie Kaufmann

Navigating to the Best Policy in Markov Decision Processes

We investigate the classical active pure exploration problem in Markov Decision Processes, where the agent sequentially selects actions …

Aymen Al Marjani, Aurélien Garivier, Alexandre Proutiere

Adaptive Sampling for Best Policy Identification in Markov Decision Processes

We investigate the problem of best-policy identification in discounted Markov Decision Processes (MDPs) when the learner has access to …

Aymen Al Marjani, Alexandre Proutiere