Benjamin Van Roy

Professor of Electrical Engineering, of Management Science and Engineering and, by courtesy, of Computer Science

Bio

Benjamin Van Roy is a Professor at Stanford University, where he has served on the faculty since 1998. His current research focuses on reinforcement learning. Beyond academia, he leads a DeepMind Research team in Mountain View, and has also led research programs at Unica (acquired by IBM), Enuvis (acquired by SiRF), and Morgan Stanley.

He is a Fellow of INFORMS and IEEE and has served on the editorial boards of Machine Learning; Mathematics of Operations Research, for which he co-edited the Learning Theory Area; Operations Research, for which he edited the Financial Engineering Area; and the INFORMS Journal on Optimization. He received the SB in Computer Science and Engineering and the SM and PhD in Electrical Engineering and Computer Science, all from MIT, where his doctoral research was advised by John N. Tsitsiklis. He has been a recipient of the MIT George C. Newton Undergraduate Laboratory Project Award, the MIT Morris J. Levin Memorial Master's Thesis Award, the MIT George M. Sprowls Doctoral Dissertation Award, the National Science Foundation CAREER Award, the Stanford Tau Beta Pi Award for Excellence in Undergraduate Teaching, the Management Science and Engineering Department's Graduate Teaching Award, and the Lanchester Prize. He was the plenary speaker at the 2019 Allerton Conference on Communication, Control, and Computing. He has held visiting positions as the Wolfgang and Helga Gaul Visiting Professor at the University of Karlsruhe, the Chin Sophonpanich Foundation Professor and the InTouch Professor at Chulalongkorn University, a Visiting Professor at the National University of Singapore, and a Visiting Professor at the Chinese University of Hong Kong, Shenzhen.

Academic Appointments

Professor, Electrical Engineering
Professor, Management Science and Engineering
Professor (By courtesy), Computer Science
Member, Bio-X
Member, Institute for Computational and Mathematical Engineering (ICME)

Honors & Awards

Fellow, INFORMS (2015)
Fellow, IEEE (2019)
Lanchester Prize, INFORMS (2022)

Professional Education

BS, Massachusetts Institute of Technology, Computer Science and Engineering (1993)
MS, Massachusetts Institute of Technology, Electrical Engineering and Computer Science (1995)
PhD, Massachusetts Institute of Technology, Electrical Engineering and Computer Science (1998)

2025-26 Courses

- Aligning Superintelligence
  CS 338, MS&E 338 (Spr)
- Markov Decision Processes
  EE 283, MS&E 235A (Aut)
- Reinforcement Learning: Behaviors and Applications
  EE 383, MS&E 235B (Win)
Independent Studies (20)
- Advanced Reading and Research
  CS 499 (Aut, Win, Spr, Sum)
- Advanced Reading and Research
  CS 499P (Aut, Win, Spr, Sum)
- Curricular Practical Training
  CS 390A (Aut, Win, Spr, Sum)
- Curricular Practical Training
  CS 390B (Aut, Win, Spr, Sum)
- Curricular Practical Training
  CS 390C (Aut, Win, Spr, Sum)
- Directed Reading and Research
  MS&E 408 (Aut, Win, Spr, Sum)
- Independent Project
  CS 399 (Aut, Win, Spr, Sum)
- Independent Project
  CS 399P (Aut, Win, Spr, Sum)
- Independent Work
  CS 199 (Aut, Win, Spr, Sum)
- Independent Work
  CS 199P (Aut, Win, Spr, Sum)
- Master's Thesis and Thesis Research
  EE 300 (Aut, Win, Spr, Sum)
- Part-time Curricular Practical Training
  CS 390D (Aut, Win, Spr, Sum)
- Programming Service Project
  CS 192 (Aut, Win, Spr, Sum)
- Senior Project
  CS 191 (Aut, Win, Spr)
- Special Studies and Reports in Electrical Engineering
  EE 191 (Aut, Win, Spr, Sum)
- Special Studies and Reports in Electrical Engineering
  EE 391 (Aut, Win, Spr, Sum)
- Special Studies and Reports in Electrical Engineering (WIM)
  EE 191W (Aut, Win, Spr, Sum)
- Special Studies or Projects in Electrical Engineering
  EE 190 (Aut, Win, Spr, Sum)
- Special Studies or Projects in Electrical Engineering
  EE 390 (Aut, Win, Spr, Sum)
- Writing Intensive Senior Research Project
  CS 191W (Aut, Win, Spr)
Prior Year Courses
2023-24 Courses
- Aligning Superintelligence
  MS&E 338 (Spr)
- Bandit Learning: Behaviors and Applications
  EE 277, MS&E 237A (Aut)
- Reinforcement Learning: Behaviors and Applications
  EE 370, MS&E 237B (Win)
2022-23 Courses
- Reinforcement Learning: Behaviors and Applications
  EE 277, MS&E 237 (Aut)
- Reinforcement Learning: Frontiers
  MS&E 338 (Spr)

Stanford Advisees

Doctoral Dissertation Reader (AC)
Mahdi Al-Husseini, Bella Hofflich, Samuel Liu, William Overman, Sahasrajit Sarmasarkar, Keertana Veeramony Chidambaram
Postdoctoral Faculty Sponsor
Alex Infanger
Doctoral Dissertation Advisor (AC)
Henrik Marklund, Wanqiao Xu, Yifan Zhu
Master's Program Advisor
Mahmood Alhusseini, Tyler Huang, Arya Marwaha, Roya Meykadeh, Xi Wang, Shatong Zhu
Doctoral (Program)
Semyon Lomasov, Henrik Marklund, Yifan Zhu

All Publications

Deciding What to Learn: A Rate-Distortion Approach. Arumugam, D., Van Roy, B. Edited by Meila, M., Zhang, T. JMLR-Journal Machine Learning Research. 2021.
Web of Science ID: 000683104600035

Deep Exploration via Randomized Value Functions. Osband, I., Van Roy, B., Russo, D. J., Wen, Z. Journal of Machine Learning Research. 2019; 20.
Web of Science ID: 000487068900008

Information-Theoretic Confidence Bounds for Reinforcement Learning. Lu, X., Van Roy, B. Edited by Wallach, H., Larochelle, H., Beygelzimer, A., d'Alche-Buc, F., Fox, E., Garnett, R. Neural Information Processing Systems (NIPS). 2019.
Web of Science ID: 000534424302046

A Tutorial on Thompson Sampling. Russo, D. J., Van Roy, B., Kazerouni, A., Osband, I., Wen, Z. Foundations and Trends in Machine Learning. 2018; 11 (1): 1–96.
DOI: 10.1561/2200000070
Web of Science ID: 000438444300001

Scalable Coordinated Exploration in Concurrent Reinforcement Learning. Dimakopoulou, M., Osband, I., Van Roy, B. Edited by Bengio, S., Wallach, H., Larochelle, H., Grauman, K., Cesa-Bianchi, N., Garnett, R. Neural Information Processing Systems (NIPS). 2018.
Web of Science ID: 000461823304025

An Information-Theoretic Analysis for Thompson Sampling with Many Actions. Dong, S., Van Roy, B. Edited by Bengio, S., Wallach, H., Larochelle, H., Grauman, K., Cesa-Bianchi, N., Garnett, R. Neural Information Processing Systems (NIPS). 2018.
Web of Science ID: 000461823304019

Learning to Optimize via Information-Directed Sampling. Russo, D., Van Roy, B. Operations Research. 2018; 66 (1): 230–52.
DOI: 10.1287/opre.2017.1663
Web of Science ID: 000426081800015

Conservative Contextual Linear Bandits. Kazerouni, A., Ghavamzadeh, M., Abbasi-Yadkori, Y., Van Roy, B. Edited by Guyon, I., Luxburg, U. V., Bengio, S., Wallach, H., Fergus, R., Vishwanathan, S., Garnett, R. Neural Information Processing Systems (NIPS). 2017.
Web of Science ID: 000452649403094

Ensemble Sampling. Lu, X., Van Roy, B. Edited by Guyon, I., Luxburg, U. V., Bengio, S., Wallach, H., Fergus, R., Vishwanathan, S., Garnett, R. Neural Information Processing Systems (NIPS). 2017.
Web of Science ID: 000452649403032

An Information-Theoretic Analysis of Thompson Sampling. Russo, D., Van Roy, B. Journal of Machine Learning Research. 2016; 17.
Web of Science ID: 000391522800001