
Deep Exploration via Randomized Value Functions


Benjamin Van Roy
Stanford University

January 26, 2018 at 2:30 PM
McConnell Engineering Room 103

An important challenge in reinforcement learning concerns how an agent can simultaneously explore and generalize in a reliably efficient manner. It is difficult to claim that one can produce a robust artificial intelligence without tackling this fundamental issue. This talk will present a systematic approach to exploration that induces judicious probing through randomization of value function estimates and operates effectively in tandem with common reinforcement learning algorithms, such as least-squares value iteration and temporal-difference learning, that generalize via parameterized representations of the value function. Theoretical results offer assurances with tabular representations of the value function, and computational results suggest that the approach remains effective with generalizing representations.