Kunle Olukotun

Cadence Design Systems Professor, Professor of Electrical Engineering and of Computer Science

Bio

Kunle Olukotun is the Cadence Design Professor of Electrical Engineering and Computer Science at Stanford University. Olukotun is a pioneer in multicore processor design and the leader of the Stanford Hydra chip multiprocessor (CMP) research project. He founded Afara Websystems to develop high-throughput, low-power multicore processors for server systems. The Afara multicore processor, called Niagara, was acquired by Sun Microsystems and now powers Oracle's SPARC-based servers. In 2017, Olukotun co-founded SambaNova Systems, a machine learning and artificial intelligence company, where he continues to serve as Chief Technologist.

Olukotun is the Director of the Pervasive Parallel Lab and a member of the Data Analytics for What's Next (DAWN) Lab, developing infrastructure for usable machine learning. He is a member of the National Academy of Engineering, an ACM Fellow, and an IEEE Fellow for contributions to multiprocessors on a chip design and the commercialization of this technology. He also received the Harry H. Goode Memorial Award.

Olukotun received his Ph.D. in Computer Engineering from The University of Michigan.

Academic Appointments

Professor, Electrical Engineering
Professor, Computer Science
Faculty Affiliate, Institute for Human-Centered Artificial Intelligence (HAI)
Member, Wu Tsai Neurosciences Institute

Honors & Awards

Eckert-Mauchly Award, ACM-IEEE (2023)
Member, American Academy of Arts and Sciences (2022)
Member, National Academy of Engineering (2021)
Harry H. Goode Memorial Award, IEEE (2018)
Fellow, ACM (2007)
Fellow, IEEE (2007)

Professional Education

PhD, Michigan (1991)

Contact

Academic
kunle@stanford.edu
University - Faculty Department: Electrical Engineering Position: Professor
- GATES BLDG 3A-302
- Stanford, California 94305-9030
- (650) 725-3713 (office)
University - Faculty Department: Computer Science Position: Professor
- (650) 725-3713 (office)

Alternate Contact Angelica Teaupa avteaupa@stanford.edu

Additional Info

Mail Code: 9030
ORCID:
https://orcid.org/0000-0002-8779-0636

2025-26 Courses

Digital Systems Design Lab
EE 109 (Spr)
Hardware Accelerators for Machine Learning
CS 217 (Win)
Parallel Computing
CS 149 (Aut)
Independent Studies (20)
- Advanced Reading and Research
  CS 499 (Aut, Win, Spr, Sum)
- Advanced Reading and Research
  CS 499P (Aut, Win, Spr, Sum)
- Curricular Practical Training
  CS 390A (Aut, Win, Spr, Sum)
- Curricular Practical Training
  CS 390B (Aut, Win, Spr, Sum)
- Curricular Practical Training
  CS 390C (Aut, Win, Spr, Sum)
- Independent Project
  CS 399 (Aut, Win, Spr, Sum)
- Independent Project
  CS 399P (Aut, Win, Spr, Sum)
- Independent Work
  CS 199 (Aut, Win, Spr, Sum)
- Independent Work
  CS 199P (Aut, Win, Spr, Sum)
- Master's Thesis and Thesis Research
  EE 300 (Aut, Win, Spr, Sum)
- Part-time Curricular Practical Training
  CS 390D (Aut, Win, Spr, Sum)
- Programming Service Project
  CS 192 (Aut, Win, Spr, Sum)
- Senior Project
  CS 191 (Aut, Win, Spr)
- Special Studies and Reports in Electrical Engineering
  EE 191 (Aut, Win, Spr, Sum)
- Special Studies and Reports in Electrical Engineering
  EE 391 (Aut, Win, Spr, Sum)
- Special Studies and Reports in Electrical Engineering (WIM)
  EE 191W (Aut, Win, Spr, Sum)
- Special Studies or Projects in Electrical Engineering
  EE 190 (Aut, Win, Spr, Sum)
- Special Studies or Projects in Electrical Engineering
  EE 390 (Aut, Win, Spr, Sum)
- Supervised Undergraduate Research
  CS 195 (Spr)
- Writing Intensive Senior Research Project
  CS 191W (Aut, Win, Spr)
Prior Year Courses
2024-25 Courses
- Digital Systems Design Lab
  EE 109 (Spr)
- Parallel Computing
  CS 149 (Aut)
2023-24 Courses
- Digital Systems Design Lab
  EE 109 (Spr)
2022-23 Courses
- Digital Systems Design Lab
  EE 109 (Spr)
- Hardware Accelerators for Machine Learning
  CS 217 (Win)
- Parallel Computing
  CS 149 (Aut)

Stanford Advisees

Doctoral Dissertation Reader (AC)
Avanika Narayan, Rohan Yadav
Postdoctoral Faculty Sponsor
Olivia Hsu
Doctoral Dissertation Advisor (AC)
Gina Sohn, Qizheng Zhang
Master's Program Advisor
Yamilett Estrada-Reyes, Berk Gokmen, Kristen Guernsey, Suchen He, Quan Ho, Cici Hou, Jinhyo Huh, Mohamed Ismail, Eric Li, Rupert Lu, Anastasiya Masalava, Scott Milner, Konstantin Papkovskiy, Janhavi Purkar, Joseph Rejive, Daniel Song, Egan Tardif, Ronan Wallace, Miaoya Zhong
Doctoral Dissertation Co-Advisor (AC)
Rubens Lacouture
Doctoral (Program)
Konstantin Hossfeld, Wonsuk Jang, Jungwoo Kim, Taeyoung Kong, Rubens Lacouture, Louis Le Coeur, Leo Liu, Nathan Sobotka, Gina Sohn, Genghan Zhang, Qizheng Zhang

All Publications

Adaptive Self-improvement LLM Agentic System for ML Library Development Zhang, G., Liang, W., Hsu, O., Olukotun, K. edited by Singh, A., Fazel, M., Hsu, D., Lacoste-Julien, S., Berkenkamp, F., Maharaj, T., Wagstaff, K., Zhu, J. JMLR-JOURNAL MACHINE LEARNING RESEARCH. 2025: 75427-75452

View details for Web of Science ID 001693172600012
SSM-RDU: A Reconfigurable Dataflow Unit for Long-Sequence State-Space Models Kos, S., Olukotun, K. IEEE COMPUTER SOC. 2025: 626-629

View details for DOI 10.1109/ICCD65941.2025.00095

View details for Web of Science ID 001684940200088
LowRA: Accurate and Efficient LoRA Fine-Tuning of LLMs under 2 Bits Zhou, Z., Zhang, Q., Kumbong, H., Olukotun, K. edited by Singh, A., Fazel, M., Hsu, D., Lacoste-Julien, S., Berkenkamp, F., Maharaj, T., Wagstaff, K., Zhu, J. JMLR-JOURNAL MACHINE LEARNING RESEARCH. 2025: 79570-79594

View details for Web of Science ID 001693172600187
Revet: A Language and Compiler for Dataflow Threads Rucker, A. C., Sundram, S., Smith, C., Vilim, M., Prabhakar, R., Kjolstad, F., Olukotun, K. IEEE COMPUTER SOC. 2024: 61-74

View details for DOI 10.1109/HPCA57654.2024.00016

View details for Web of Science ID 001207751400005
The Dataflow Abstract Machine Simulator Framework Zhang, N., Lacouture, R., Sohn, G., Mure, P., Zhang, Q., Kjolstad, F., Olukotun, K. IEEE COMPUTER SOC. 2024: 532-547

View details for DOI 10.1109/ISCA59077.2024.00046

View details for Web of Science ID 001290320700036
CARAVAN: Practical Online Learning of In-Network ML Models with Labeling Agents Zhang, Q., Imran, A., Bardhi, E., Swamy, T., Zhang, N., Shahbaz, M., Olukotun, K. ASSOC COMPUTING MACHINERY. 2024: 17-20

View details for DOI 10.1145/3704742.3704964

View details for Web of Science ID 001414855500004
Computing Systems in the Foundation Model Era Olukotun, K. IEEE COMPUTER SOC. 2024: 889

View details for DOI 10.1109/IPDPS57955.2024.00083

View details for Web of Science ID 001270389600047
CARAVAN: Practical Online Learning of In-Network ML Models with Labeling Agents Zhang, Q., Imran, A., Bardhi, E., Swamy, T., Zhang, N., Shahbaz, M., Olukotun, K. USENIX ASSOC. 2024: 325-345

View details for Web of Science ID 001270877200018
Mosaic: An Interoperable Compiler for Tensor Algebra PROCEEDINGS OF THE ACM ON PROGRAMMING LANGUAGES-PACMPL Bansal, M., Hsu, O., Olukotun, K., Kjolstad, F. 2023; 7 (PLDI)

View details for DOI 10.1145/3591236

View details for Web of Science ID 001005701900018
BaCO: A Fast and Portable Bayesian Compiler Optimization Framework Hellsten, E., Souza, A., Lenfers, J., Lacouture, R., Hsu, O., Ejjeh, A., Kjolstad, F., Steuwer, M., Olukotun, K., Nardi, L. edited by Aamodt, T. M., Jerger, N. E., Swift, M. ASSOC COMPUTING MACHINERY. 2023: 19-42

View details for DOI 10.1145/3623278.3624770

View details for Web of Science ID 001161547900002
Sigma: Compiling Einstein Summations to Locality-Aware Dataflow Zhao, T., Rucker, A., Olukotun, K. edited by Aamodt, T. M., Jerger, N. E., Swift, M. ASSOC COMPUTING MACHINERY. 2023: 718-732

View details for DOI 10.1145/3575693.3575694

View details for Web of Science ID 001074472300050
Global Perspectives of Diversity, Equity, and Inclusion COMMUNICATIONS OF THE ACM Barroso, L., Choudhury, T., Gupta, M., Olukotun, O., Popa, R., Song, D., Patterson, D. A. 2022; 65 (12): 30-31

View details for DOI 10.1145/3548454

View details for Web of Science ID 000887945400010
Taurus: A Data Plane Architecture for Per-Packet ML Swamy, T., Rucker, A., Shahbaz, M., Gaur, I., Olukotun, K. edited by Falsafi, B., Ferdman, M., Lu, S., Weinisch, T. ASSOC COMPUTING MACHINERY. 2022: 1099-1114

View details for DOI 10.1145/3503222.3507726

View details for Web of Science ID 000810486300077
Accelerating SLIDE: Exploiting Sparsity on Accelerator Architectures Ko, S., Rucker, A., Zhang, Y., Mure, P., Olukotun, K. IEEE COMPUTER SOC. 2022: 663-670

View details for DOI 10.1109/IPDPSW55747.2022.00116

View details for Web of Science ID 000855041000083
Compilation of Sparse Array Programming Models PROCEEDINGS OF THE ACM ON PROGRAMMING LANGUAGES-PACMPL Henry, R., Hsu, O., Yadav, R., Chou, S., Olukotun, K., Amarasinghe, S., Kjolstad, F. 2021; 5

View details for DOI 10.1145/3485505

View details for Web of Science ID 000731569200032
Chopping off the Tail: Bounded Non-Determinism for Real-Time Accelerators IEEE COMPUTER ARCHITECTURE LETTERS Rucker, A., Shahbaz, M., Olukotun, K. 2021; 20 (2): 110-113

View details for DOI 10.1109/LCA.2021.3102224

View details for Web of Science ID 000685885600002
Aurochs: An Architecture for Dataflow Threads Vilim, M., Rucker, A., Olukotun, K. IEEE COMPUTER SOC. 2021: 402-415

View details for DOI 10.1109/ISCA52012.2021.00039

View details for Web of Science ID 000702275600030
Bayesian Optimization with a Prior for the Optimum Souza, A., Nardi, L., Oliveira, L. B., Olukotun, K., Lindauer, M., Hutter, F. edited by Oliver, N., PerezCruz, F., Kramer, S., Read, J., Lozano, J. A. SPRINGER INTERNATIONAL PUBLISHING AG. 2021: 265-296

View details for DOI 10.1007/978-3-030-86523-8_17

View details for Web of Science ID 000713413200017
High Performance Lattice Regression on FPGAs via a High Level Hardware Description Language Zhang, N., Feldman, M., Olukotun, K. IEEE. 2021: 78-87

View details for DOI 10.1109/ICFPT52863.2021.9609893

View details for Web of Science ID 000792703100011
SARA: Scaling a Reconfigurable Dataflow Accelerator Zhang, Y., Zhang, N., Zhao, T., Vilim, M., Shahbaz, M., Olukotun, K. IEEE COMPUTER SOC. 2021: 1041-1054

View details for DOI 10.1109/ISCA52012.2021.00085

View details for Web of Science ID 000702275600076
Elastic RSS: Co-Scheduling Packets and Cores Using Programmable NICs Rucker, A., Swamy, T., Shahbaz, M., Olukotun, K. ASSOC COMPUTING MACHINERY. 2019: 71–77

View details for DOI 10.1145/3343180.3343184

View details for Web of Science ID 000505066500011
Scalable Interconnects for Reconfigurable Spatial Architectures Zhang, Y., Rucker, A., Vilim, M., Prabhakar, R., Hwang, W., Olukotun, K. ASSOC COMPUTING MACHINERY. 2019: 615–28

View details for DOI 10.1145/3307650.3322249

View details for Web of Science ID 000521059600048
TensorFlow to Cloud FPGAs: Tradeoffs for Accelerating Deep Neural Networks Hadjis, S., Olukotun, K. edited by Sourdis, Bouganis, C. S., Alvarez, C., Toledo, L., Valero, P., Martorell IEEE. 2019: 360–66

View details for DOI 10.1109/FPL.2019.00064

View details for Web of Science ID 000518670300054
Polystore++: Accelerated Polystore System for Heterogeneous Workloads Singhal, R., Zhang, N., Nardi, L., Shahbaz, M., Olukotun, K. IEEE COMPUTER SOC. 2019: 1641–51

View details for DOI 10.1109/ICDCS.2019.00163

View details for Web of Science ID 000565234200152
Exploring the Utility of Developer Exhaust. Proceedings of the Second Workshop on Data Management for End-to-End Machine Learning. Workshop on Data Management for End-to-End Machine Learning (2nd : 2018 : Houston, Tex.) Zhang, J., Lam, M., Wang, S., Varma, P., Nardi, L., Olukotun, K., Re, C. 2018; 2018

Abstract

Using machine learning to analyze data often results in developer exhaust - code, logs, or metadata that do not define the learning algorithm but are byproducts of the data analytics pipeline. We study how the rich information present in developer exhaust can be used to approximately solve otherwise complex tasks. Specifically, we focus on using log data associated with training deep learning models to perform model search by predicting performance metrics for untrained models. Instead of designing a different model for each performance metric, we present two preliminary methods that rely only on information present in logs to predict these characteristics for different architectures. We introduce (i) a nearest neighbor approach with a hand-crafted edit distance metric to compare model architectures and (ii) a more generalizable, end-to-end approach that trains an LSTM using model architectures and associated logs to predict performance metrics of interest. We perform model search optimizing for best validation accuracy, degree of overfitting, and best validation accuracy given a constraint on training time. Our approaches can predict validation accuracy within 1.37% error on average, while the baseline achieves 4.13% by using the performance of a trained model with the closest number of layers. When choosing the best performing model given constraints on training time, our approaches select the top-3 models that overlap with the true top-3 models 82% of the time, while the baseline only achieves this 54% of the time. Our preliminary experiments hold promise for how developer exhaust can help learn models that can approximate various complex tasks efficiently.

View details for DOI 10.1145/3209889.3209895

View details for PubMedID 31131381
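The nearest-neighbor method described in the abstract above can be illustrated with a minimal sketch. It assumes each architecture is written as a sequence of layer names and substitutes a plain unit-cost Levenshtein distance for the paper's hand-crafted metric; the function names, architectures, and accuracy values are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence, Tuple


def edit_distance(a: Sequence[str], b: Sequence[str]) -> int:
    """Unit-cost Levenshtein distance between two layer sequences."""
    prev = list(range(len(b) + 1))
    for i, layer_a in enumerate(a, start=1):
        curr = [i]
        for j, layer_b in enumerate(b, start=1):
            cost = 0 if layer_a == layer_b else 1
            curr.append(min(prev[j] + 1,          # delete layer_a
                            curr[j - 1] + 1,      # insert layer_b
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def predict_accuracy(query: Sequence[str],
                     trained: List[Tuple[Sequence[str], float]]) -> float:
    """Return the logged accuracy of the closest trained architecture."""
    _, accuracy = min(trained, key=lambda item: edit_distance(query, item[0]))
    return accuracy


# Hypothetical log of already-trained models: (architecture, validation accuracy).
trained_models = [
    (["conv", "relu", "fc"], 0.91),
    (["fc", "fc"], 0.80),
]
untrained = ["conv", "relu", "pool", "fc"]
estimate = predict_accuracy(untrained, trained_models)
```

Here the untrained architecture is one insertion away from the first logged model, so the estimate is that model's accuracy. A learned or weighted distance would replace `edit_distance` without changing the lookup.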
Plasticine: A Reconfigurable Accelerator for Parallel Patterns IEEE MICRO Prabhakar, R., Zhang, Y., Koeplinger, D., Feldman, M., Zhao, T., Hadjis, S., Pedram, A., Kozyrakis, C., Olukotun, K. 2018; 38 (3): 20–31

View details for Web of Science ID 000432316500004
LevelHeaded: A Unified Engine for Business Intelligence and Linear Algebra Querying Aberger, C. R., Lamb, A., Olukotun, K., Re, C. IEEE. 2018: 449–60

View details for DOI 10.1109/ICDE.2018.00048

View details for Web of Science ID 000492836500040
EmptyHeaded: A Relational Engine for Graph Processing Aberger, C. R., Lamb, A., Tu, S., Noetzli, A., Olukotun, K., Re, C. ASSOC COMPUTING MACHINERY. 2017

View details for DOI 10.1145/3129246

View details for Web of Science ID 000419302700001
Mind the Gap: Bridging Multi-Domain Query Workloads with EmptyHeaded PROCEEDINGS OF THE VLDB ENDOWMENT Aberger, C. R., Lamb, A., Olukotun, K., Re, C. 2017; 10 (12): 1849–52

View details for Web of Science ID 000416494000024
Understanding and Optimizing Asynchronous Low-Precision Stochastic Gradient Descent De Sa, C., Feldman, M., Re, C., Olukotun, K. ASSOC COMPUTING MACHINERY. 2017: 561–74

Abstract

Stochastic gradient descent (SGD) is one of the most popular numerical algorithms used in machine learning and other domains. Since this is likely to continue for the foreseeable future, it is important to study techniques that can make it run fast on parallel hardware. In this paper, we provide the first analysis of a technique called Buckwild! that uses both asynchronous execution and low-precision computation. We introduce the DMGC model, the first conceptualization of the parameter space that exists when implementing low-precision SGD, and show that it provides a way to both classify these algorithms and model their performance. We leverage this insight to propose and analyze techniques to improve the speed of low-precision SGD. First, we propose software optimizations that can increase throughput on existing CPUs by up to 11×. Second, we propose architectural changes, including a new cache technique we call an obstinate cache, that increase throughput beyond the limits of current-generation hardware. We also implement and analyze low-precision SGD on the FPGA, which is a promising alternative to the CPU for future SGD systems.

View details for PubMedID 29391770

View details for PubMedCentralID PMC5789782
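The low-precision side of the abstract above can be illustrated with a small single-threaded sketch. This is not Buckwild! itself: it has no asynchronous execution, and the 8-bit fixed-point format, the unbiased stochastic rounding, the one-parameter regression problem, and every name and constant below are assumptions chosen for illustration.

```python
import math
import random


def stochastic_round(value: float, scale: float, rng: random.Random) -> float:
    """Round value to a multiple of `scale`, up or down at random so the
    result is unbiased, then clamp to the signed 8-bit range."""
    q = value / scale
    low = math.floor(q)
    q_rounded = low + (1 if rng.random() < q - low else 0)
    return max(-128, min(127, q_rounded)) * scale


def low_precision_sgd(data, lr=0.05, scale=1 / 64, epochs=200, seed=0):
    """Fit y = w * x by SGD, keeping w on an 8-bit fixed-point grid."""
    rng = random.Random(seed)
    w = 0.0
    for _ in range(epochs):
        for x, y in data:
            grad = 2 * (w * x - y) * x  # gradient of the squared error (w*x - y)^2
            w = stochastic_round(w - lr * grad, scale, rng)
    return w


# Hypothetical noiseless data drawn from y = 1.5 * x.
samples = [(x, 1.5 * x) for x in (0.5, 1.0, 1.5, 2.0)]
w_hat = low_precision_sgd(samples)
```

The weight only ever takes values that are multiples of `scale`, so the step size of the grid bounds how closely the result can track the true slope.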
EmptyHeaded: A Relational Engine for Graph Processing. Proceedings. ACM-Sigmod International Conference on Management of Data Aberger, C. R., Tu, S., Olukotun, K., Ré, C. 2016; 2016: 431-446

Abstract

There are two types of high-performance graph processing engines: low- and high-level engines. Low-level engines (Galois, PowerGraph, Snap) provide optimized data structures and computation models but require users to write low-level imperative code, hence ensuring that efficiency is the burden of the user. In high-level engines, users write in query languages like datalog (SociaLite) or SQL (Grail). High-level engines are easier to use but are orders of magnitude slower than the low-level graph engines. We present EmptyHeaded, a high-level engine that supports a rich datalog-like query language and achieves performance comparable to that of low-level engines. At the core of EmptyHeaded's design is a new class of join algorithms that satisfy strong theoretical guarantees but have thus far not achieved performance comparable to that of specialized graph processing engines. To achieve high performance, EmptyHeaded introduces a new join engine architecture, including a novel query optimizer and data layouts that leverage single-instruction multiple data (SIMD) parallelism. With this architecture, EmptyHeaded outperforms high-level approaches by up to three orders of magnitude on graph pattern queries, PageRank, and Single-Source Shortest Paths (SSSP) and is an order of magnitude faster than many low-level baselines. We validate that EmptyHeaded competes with the best-of-breed low-level engine (Galois), achieving comparable performance on PageRank and at most 3× worse performance on SSSP.

View details for DOI 10.1145/2882903.2915213

View details for PubMedID 28077912
Ensuring Rapid Mixing and Low Bias for Asynchronous Gibbs Sampling. JMLR workshop and conference proceedings De Sa, C., Olukotun, K., Ré, C. 2016; 48: 1567-1576

Abstract

Gibbs sampling is a Markov chain Monte Carlo technique commonly used for estimating marginal distributions. To speed up Gibbs sampling, there has recently been interest in parallelizing it by executing asynchronously. While empirical results suggest that many models can be efficiently sampled asynchronously, traditional Markov chain analysis does not apply to the asynchronous case, and thus asynchronous Gibbs sampling is poorly understood. In this paper, we derive a better understanding of the two main challenges of asynchronous Gibbs: bias and mixing time. We show experimentally that our theoretical results match practical outcomes.

View details for PubMedID 28344730
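For readers unfamiliar with the technique named in the abstract above, the following is a minimal sketch of ordinary sequential Gibbs sampling used to estimate marginals. It does not implement the asynchronous variant the paper analyzes; the two-variable model, its parameters, and the function name are hypothetical.

```python
import math
import random


def gibbs_marginals(coupling: float, bias: float, steps: int, seed: int = 0):
    """Estimate P(x_i = 1) for two binary variables with
    p(x0, x1) proportional to exp(coupling * [x0 == x1] + bias * (x0 + x1))."""
    rng = random.Random(seed)
    x = [0, 0]
    counts = [0, 0]
    for _ in range(steps):
        for i in (0, 1):
            other = x[1 - i]
            # Unnormalized log-weights of x_i = 1 and x_i = 0 given the other variable.
            w1 = coupling * (1 if other == 1 else 0) + bias
            w0 = coupling * (1 if other == 0 else 0)
            p1 = 1.0 / (1.0 + math.exp(w0 - w1))
            x[i] = 1 if rng.random() < p1 else 0
        counts[0] += x[0]
        counts[1] += x[1]
    return [c / steps for c in counts]


marginals = gibbs_marginals(coupling=1.0, bias=0.5, steps=20000)
```

Each sweep resamples one variable at a time from its conditional distribution; the running averages of the samples approximate the marginals. An asynchronous sampler would instead let several workers update variables concurrently, possibly using stale values of `other`.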
Taming the Wild: A Unified Analysis of Hogwild!-Style Algorithms. Advances in neural information processing systems De Sa, C., Zhang, C., Olukotun, K., Ré, C. 2015; 28: 2656-2664

Abstract

Stochastic gradient descent (SGD) is a ubiquitous algorithm for a variety of machine learning problems. Researchers and industry have developed several techniques to optimize SGD's runtime performance, including asynchronous execution and reduced precision. Our main result is a martingale-based analysis that enables us to capture the rich noise models that may arise from such techniques. Specifically, we use our new analysis in three ways: (1) we derive convergence rates for the convex case (Hogwild!) with relaxed assumptions on the sparsity of the problem; (2) we analyze asynchronous SGD algorithms for non-convex matrix problems including matrix completion; and (3) we design and analyze an asynchronous SGD algorithm, called Buckwild!, that uses lower-precision arithmetic. We show experimentally that our algorithms run efficiently for a variety of problems on modern hardware.

View details for PubMedID 27330264
Rapidly Mixing Gibbs Sampling for a Class of Factor Graphs Using Hierarchy Width. Advances in neural information processing systems De Sa, C., Zhang, C., Olukotun, K., Ré, C. 2015; 28: 3079-3087

Abstract

Gibbs sampling on factor graphs is a widely used inference technique, which often produces good empirical results. Theoretical guarantees for its performance are weak: even for tree structured graphs, the mixing time of Gibbs may be exponential in the number of variables. To help understand the behavior of Gibbs sampling, we introduce a new (hyper)graph property, called hierarchy width. We show that under suitable conditions on the weights, bounded hierarchy width ensures polynomial mixing time. Our study of hierarchy width is in part motivated by a class of factor graph templates, hierarchical templates, which have bounded hierarchy width, regardless of the data used to instantiate them. We demonstrate a rich application from natural language processing in which Gibbs sampling provably mixes rapidly and achieves accuracy that exceeds human volunteers.

View details for PubMedID 27279724
Beyond Parallel Programming with Domain Specific Languages ACM SIGPLAN NOTICES Olukotun, K. 2014; 49 (8): 179-179

View details for DOI 10.1145/2555243.2557966

View details for Web of Science ID 000349142100016
Delite: A Compiler Architecture for Performance-Oriented Embedded Domain-Specific Languages ACM TRANSACTIONS ON EMBEDDED COMPUTING SYSTEMS Sujeeth, A. K., Brown, K. J., Lee, H., Rompf, T., Chafi, H., Odersky, M., Olukotun, K. 2014; 13

View details for DOI 10.1145/2584665

View details for Web of Science ID 000341390100017
Surgical Precision JIT Compilers 35th ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI) Rompf, T., Sujeeth, A. K., Brown, K. J., Lee, H., Chafi, H., Olukotun, K. ASSOC COMPUTING MACHINERY. 2014: 41–52

View details for DOI 10.1145/2594291.2594316

View details for Web of Science ID 000344455800008
Forge: Generating a High Performance DSL Implementation from a Declarative Specification ACM SIGPLAN NOTICES Sujeeth, A. K., Gibbons, A., Brown, K. J., Lee, H., Rompf, T., Odersky, M., Olukotun, K. 2014; 49 (3): 145-154

View details for DOI 10.1145/2517208.2517220

View details for Web of Science ID 000338625500017
Optimizing Data Structures in High-Level Programs New Directions for Extensible Compilers based on Staging ACM SIGPLAN NOTICES Rompf, T., Sujeeth, A. K., Amin, N., Brown, K. J., Jovanovic, V., Lee, H., Jonnalagedda, M., Olukotun, K., Odersky, M. 2013; 48 (1): 497-510

View details for DOI 10.1145/2480359.2429128

View details for Web of Science ID 000318629900042
High Performance Embedded Domain Specific Languages ACM SIGPLAN NOTICES Olukotun, K. 2012; 47 (9): 139-139

View details for DOI 10.1145/2398856.2364548

View details for Web of Science ID 000311296000014
Green-Marl: A DSL for Easy and Efficient Graph Analysis ACM SIGPLAN NOTICES Hong, S., Chafi, H., Sedlar, E., Olukotun, K. 2012; 47 (4): 349-362

View details for Web of Science ID 000209339300029
IMPLEMENTING DOMAIN-SPECIFIC LANGUAGES FOR HETEROGENEOUS PARALLEL COMPUTING IEEE MICRO Lee, H., Brown, K. J., Sujeeth, A. K., Chafi, H., Olukotun, K., Rompf, T., Odersky, M. 2011; 31 (5): 42-52

View details for Web of Science ID 000295883700006
Accelerating CUDA Graph Algorithms at Maximum Warp ACM SIGPLAN NOTICES Hong, S., Kim, S. K., Oguntebi, T., Olukotun, K. 2011; 46 (8): 267-276

View details for Web of Science ID 000296264900027
A Domain-Specific Approach To Heterogeneous Parallelism ACM SIGPLAN NOTICES Chafi, H., Sujeeth, A. K., Brown, K. J., Lee, H., Atreya, A. R., Olukotun, K. 2011; 46 (8): 35-45

View details for Web of Science ID 000296264900005
Hardware Acceleration of Transactional Memory on Commodity Systems ACM SIGPLAN NOTICES Casper, J., Oguntebi, T., Hong, S., Bronson, N. G., Kozyrakis, C., Olukotun, K. 2011; 46 (3): 27-38

View details for DOI 10.1145/1961296.1950372

View details for Web of Science ID 000290854400004
Implementing Domain-Specific Languages for Heterogeneous Parallel Computing IEEE Micro: Special Issue on CPU, GPU, and Hybrid Computing Lee, H., Brown, K. J., Sujeeth, A. K., Chafi, H., Rompf, T., Odersky, M., Olukotun, O. A. 2011
Building-Blocks for Performance Oriented DSLs Rompf, T., Sujeeth, A. K., Lee, H., Brown, K. J., Chafi, H., Odersky, M., Olukotun, O. A. 2011
OptiML: An Implicitly Parallel Domain-Specific Language for Machine Learning Sujeeth, A. K., Lee, H., Brown, K. J., Rompf, T., Chafi, H., Wu, M., Olukotun, O. A. 2011
Efficient Parallel Graph Exploration on Multi-Core CPU and GPU Hong, S., Oguntebi, T., Olukotun, K. 2011
A Heterogeneous Parallel Framework for Domain-Specific Languages Brown, K. J., Sujeeth, A. K., Lee, H., Rompf, T., Chafi, H., Odersky, M., Olukotun, O. A. 2011
Language Virtualization for Heterogeneous Parallel Computing Conference on Object Oriented Programming Systems, Languages and Applications/SPLASH 2010 Chafi, H., DeVito, Z., Moors, A., Rompf, T., Sujeeth, A. K., Hanrahan, P., Odersky, M., Olukotun, K. ASSOC COMPUTING MACHINERY. 2010: 835–47

View details for DOI 10.1145/1932682.1869527

View details for Web of Science ID 000286595800051
A Practical Concurrent Binary Search Tree ACM SIGPLAN NOTICES Bronson, N. G., Casper, J., Chafi, H., Olukotun, K. 2010; 45 (5): 257-268

View details for Web of Science ID 000280548100024
UBIQUITOUS PARALLEL COMPUTING FROM BERKELEY, ILLINOIS, AND STANFORD IEEE MICRO Catanzaro, B., Fox, A., Keutzer, K., Patterson, D., Su, B., Snir, M., Olukotun, K., Hanrahan, P., Chafi, H. 2010; 30 (2): 41-55

View details for Web of Science ID 000276473900006
A Large-scale Architecture for Restricted Boltzmann Machines Kim, S. K., McMahon, P. L., Olukotun, K. 2010
FARM: A Prototyping Environment for Tightly-Coupled, Heterogeneous Architectures Oguntebi, T., Hong, S., Casper, J., Bronson, N., Kozyrakis, C., Olukotun, K. 2010
Implementing and Evaluating Nested Parallel Transactions in Software Transactional Memory Baek, W., Bronson, N., Kozyrakis, C., Olukotun, K. 2010
Transactional Predication: High-Performance Concurrent Sets and Maps for STM Bronson, N. G., Casper, J., Chafi, H., Olukotun, K. 2010
EigenBench: A Simple Exploration Tool for Orthogonal TM Characteristics Hong, S., Oguntebi, T., Casper, J., Bronson, N., Kozyrakis, C., Olukotun, K. 2010
CCSTM: A Library-Based STM for Scala Bronson, N. G., Chafi, H., Olukotun, K. 2010
Making Nested Parallel Transactions Practical using Lightweight Hardware Support Baek, W., Bronson, N., Kozyrakis, C., Olukotun, K. 2010
Language Virtualization for Heterogeneous Parallel Computing Chafi, H., DeVito, Z., Moors, A., Rompf, T., Sujeeth, A. K., Hanrahan, P., Olukotun, O. A. 2010
Implementing and Evaluating a Model Checker for Transactional Memory Systems Baek, W., Bronson, N. G., Kozyrakis, C., Olukotun, K. 2010
A Highly Scalable Restricted Boltzmann Machine FPGA Implementation Kim, S. K., McAfee, L. C., McMahon, P. L., Olukotun, K. 2009
Feedback-Directed Barrier Optimization in a Strongly Isolated STM ACM SIGPLAN NOTICES Bronson, N. G., Kozyrakis, C., Olukotun, K. 2009; 44 (1): 213-225

View details for Web of Science ID 000272013800020
Improving Software Concurrency with Hardware-assisted Memory Snapshot 20th ACM Symposium on Parallelism in Algorithms and Architectures Chung, J., Seo, J., Baek, W., Minh, C. C., McDonald, A., Kozyrakis, C., Olukotun, K. ASSOC COMPUTING MACHINERY. 2008: 363–363

View details for Web of Science ID 000266217200050
STAMP: Stanford Transactional Applications for Multi-Processing IEEE International Symposium on Workload Characterization Minh, C. C., Chung, J., Kozyrakis, C., Olukotun, K. IEEE. 2008: 31–42

View details for Web of Science ID 000263063500004
ASeD: Availability, Security, and Debugging Support using Transactional Memory 20th ACM Symposium on Parallelism in Algorithms and Architectures Chung, J., Baek, W., Bronson, N. G., Seo, J., Kozyrakis, C., Olukotun, K. ASSOC COMPUTING MACHINERY. 2008: 366–366

View details for Web of Science ID 000266217200053
Transactional memory: The hardware-software interface IEEE MICRO McDonald, A., Carlstrom, B. D., Chung, J., Minh, C. C., Chafi, H., Kozyrakis, C., Olukotun, K. 2007; 27 (1): 67-76

View details for Web of Science ID 000246455000009
An Effective Hybrid Transactional Memory System with Strong Isolation Guarantees 34th Annual International Symposium on Computer Architecture Minh, C. C., Trautmann, M., Chung, J., McDonald, A., Bronson, N., Casper, J., Kozyrakis, C., Olukotun, K. ASSOC COMPUTING MACHINERY. 2007: 69–80

View details for Web of Science ID 000265786200007
Transactional Collection Classes ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming Carlstrom, B. D., McDonald, A., Carbin, M., Kozyrakis, C., Olukotun, K. ASSOC COMPUTING MACHINERY. 2007: 56–67

View details for Web of Science ID 000266870900006
A Practical FPGA-based Framework for Novel CMP Research 15th ACM/SIGDA International Symposium on Field-Programmable Gate Arrays Wee, S., Casper, J., Njoroge, N., Teslyar, Y., Ge, D., Kozyrakis, C., Olukotun, K. ASSOC COMPUTING MACHINERY. 2007: 116–125

View details for Web of Science ID 000268330100013
Towards Soft Optimization Techniques for Parallel Cognitive Applications 19th Annual Symposium on Parallelism in Algorithms and Architectures Baek, W., Chung, J., Minh, C. C., Kozyrakis, C., Olukotun, K. ASSOC COMPUTING MACHINERY. 2007: 59–60

View details for Web of Science ID 000266371200009
A scalable, non-blocking approach to transactional memory 13th International Symposium on High-Performance Computer Architecture Chafi, H., Casper, J., Carlstrom, B. D., McDonald, A., Minh, C. C., Baek, W., Kozyrakis, C., Olukotun, K. IEEE COMPUTER SOC. 2007: 97–108

View details for Web of Science ID 000245463100010
ATLAS: A chip-multiprocessor with Transactional Memory support Design, Automation and Test in Europe Conference and Exhibition (DATE 07) Njoroge, N., Casper, J., Wee, S., Teslyar, Y., Ge, D., Kozyrakis, C., Olukotun, K. IEEE. 2007: 3–8

View details for Web of Science ID 000252175700001
Executing Java programs with transactional memory OOPSLA Workshop on Synchronization and Concurrency in Object-Oriented Languages Carlstrom, B. D., Chung, J., Chafi, H., McDonald, A., Minh, C. C., Hammond, L., Kozyrakis, C., Olukotun, K. ELSEVIER SCIENCE BV. 2006: 111–29

View details for DOI 10.1016/j.scico.2006.05.006

View details for Web of Science ID 000241921200002
Tradeoffs in transactional memory virtualization ACM SIGPLAN NOTICES Chung, J., Minh, C. C., McDonald, A., Skare, T., Chafi, H., Carlstrom, B. D., Kozyrakis, C., Olukotun, K. 2006; 41 (11): 371-381

View details for Web of Science ID 000202972600035
The ATOMOΣ transactional programming language ACM SIGPLAN NOTICES Carlstrom, B. D., McDonald, A., Chafi, H., Chung, J., Minh, C. C., Kozyrakis, C., Olukotun, K. 2006; 41 (6): 1-13

View details for Web of Science ID 000202972100001
The Atomos Transactional Programming Language Carlstrom, B. D., McDonald, A., Chafi, H., Chung, J., Minh, C. C., Kozyrakis, C., Olukotun, O. A. 2006
Architectural semantics for practical Transactional Memory 33rd International Symposium on Computer Architecture McDonald, A., Chung, J., Carlstrom, B. D., Minh, C. C., Chafi, H., Kozyrakis, C., Olukotun, K. IEEE COMPUTER SOC. 2006: 53–64

View details for Web of Science ID 000238976500005
The common case transactional behavior of multithreaded programs 12th International Symposium on High-Performance Computer Architecture Chung, J., Chafi, H., Minh, C. C., McDonald, A., Carlstrom, B., Kozyrakis, C., Olukotun, K. IEEE COMPUTER SOC. 2006: 271–282

View details for Web of Science ID 000237200400026
The Common Case Transactional Behavior of Multithreaded Programs Chung, J., Chafi, H., Minh, C. C., McDonald, A., Carlstrom, B. D., Kozyrakis, C., Olukotun, O. A. 2006
Architectural Semantics for Practical Transactional Memory McDonald, A., Chung, J., Carlstrom, B. D., Minh, C. C., Chafi, H., Kozyrakis, C., Olukotun, O. A. 2006
The Software Stack for Transactional Memory: Challenges and Opportunities Carlstrom, B. D., Chung, J., Kozyrakis, C., Olukotun, K. 2006
Tradeoffs in Transactional Memory Virtualization Chung, J., Minh, C. C., McDonald, A., Chafi, H., Carlstrom, B. D., Skare, T., Olukotun, O. A. 2006
Niagara: A 32-way multithreaded SPARC processor IEEE MICRO Kongetira, P., Aingaran, K., Olukotun, K. 2005; 25 (2): 21-29

View details for Web of Science ID 000228487000004
The Future of Microprocessors ACM QUEUE Magazine Olukotun, K., Hammond, L. 2005
Maximizing CMP throughput with mediocre cores PACT 2005: 14TH INTERNATIONAL CONFERENCE ON PARALLEL ARCHITECTURES AND COMPILATION TECHNIQUES Davis, J. D., Laudon, J., Olukotun, K. 2005: 51-62

View details for Web of Science ID 000233637100005
A new approach to programming and prototyping parallel systems HIGH PERFORMANCE COMPUTING - HIPC 2005, PROCEEDINGS Olukotun, K. 2005; 3769: 4-4

View details for Web of Science ID 000235801700003
Characterization of TCC on chip-multiprocessors 14th International Conference on Parallel Architectures and Compilation Techniques McDonald, A., Chung, J. W., Chafi, H., Minh, C. C., Carlstrom, B. D., Hammond, L., Kozyrakis, C., Olukotun, K. IEEE COMPUTER SOC. 2005: 63–74

View details for Web of Science ID 000233637100006
Maximizing CMP Throughput with Mediocre Cores Davis, J. D., Laudon, J., Olukotun, K. 2005
TAPE: A Transactional Application Profiling Environment Chafi, H., Minh, C. C., McDonald, A., Carlstrom, B. D., Chung, J., Hammond, L., Olukotun, O. A. 2005
Article about Kunle Olukotun's Niagara processor: Sun's Big Splash IEEE Spectrum Magazine Olukotun, K., Geppert, L. 2005
Transactional Execution of Java Programs Carlstrom, B. D., Chung, J., Chafi, H., McDonald, A., Minh, C. C., Hammond, L., Olukotun, O. A. 2005
Exposing Speculative Thread Parallelism in SPEC2000 Prabhu, M., Olukotun, K. 2005
Characterization of TCC on Chip-Multiprocessors McDonald, A., Chung, J., Chafi, H., Minh, C. C., Carlstrom, B. D., Hammond, L., Olukotun, O. A. 2005
Transactional coherence and consistency: Simplifying parallel hardware and software IEEE MICRO Hammond, L., Carlstrom, B. D., Wong, V., Chen, M., Kozyrakis, C., Olukotun, K. 2004; 24 (6): 92-103

View details for Web of Science ID 000226365900013
Programming with transactional coherence and consistency (TCC) 11th International Conference on Architectural Support for Programming Languages and Operating Systems Hammond, L., Carlstrom, B. D., Wong, V., Hertzberg, B., Chen, M., Kozyrakis, C., Olukotun, K. ASSOC COMPUTING MACHINERY. 2004: 1–13

View details for Web of Science ID 000228341700003
Transactional Coherence and Consistency: Simplifying Parallel Hardware and Software Micro's Top Picks, IEEE Micro Hammond, L., Carlstrom, B. D., Wong, V., Chen, M., Kozyrakis, C., Olukotun, K. 2004; 24 (6)
Transactional memory coherence and consistency 31st Annual International Symposium on Computer Architecture Hammond, L., Wong, V., Chen, M., Carlstrom, B. D., Davis, J. D., Hertzberg, B., Prabhu, M. K., Wijaya, H., Kozyrakis, C., Olukotun, K. IEEE COMPUTER SOC. 2004: 102–113

View details for Web of Science ID 000222915900009
Niagara: A 32-Way Multithreaded SPARC Processor IEEE MICRO Magazine, March-April 2005, and presented at Hot Chips Kongetira, P., Aingaran, K., Olukotun, K. 2004
Transactional Memory Coherence and Consistency Hammond, L., Wong, V., Chen, M., Hertzberg, B., Carlstrom, B. D., Davis, J. D., Olukotun, O. A. 2004
Programming with Transactional Coherence and Consistency (TCC) Hammond, L., Carlstrom, B. D., Wong, V., Hertzberg, B., Chen, M., Kozyrakis, C., Olukotun, O. A. 2004
The Jrpm system for dynamically parallelizing sequential Java programs IEEE MICRO Chen, M. K., Olukotun, K. 2003; 23 (6): 26-35

View details for Web of Science ID 000188257700006
Using thread-level speculation to simplify manual parallelization 9th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming Prabhu, M. K., Olukotun, K. ASSOC COMPUTING MACHINERY. 2003: 1–12

View details for Web of Science ID 000187366900001
Using Thread-Level Speculation to Simplify Manual Parallelization Prabhu, M., Olukotun, K. 2003
The Jrpm system for dynamically parallelizing Java programs 30TH ANNUAL INTERNATIONAL SYMPOSIUM ON COMPUTER ARCHITECTURE, PROCEEDINGS Chen, M. K., Olukotun, K. 2003: 434-445

View details for Web of Science ID 000183763700037
TEST: A tracer for extracting speculative threads CGO 2003: INTERNATIONAL SYMPOSIUM ON CODE GENERATION AND OPTIMIZATION Chen, M., Olukotun, K. 2003: 301-312

View details for Web of Science ID 000182316800026
The Jrpm System for Dynamically Parallelizing Java Programs Chen, M., Olukotun, K. 2003
TEST: A Tracer for Extracting Speculative Threads Chen, M., Olukotun, K. 2003
Targeting dynamic compilation for embedded environments USENIX ASSOCIATION PROCEEDINGS OF THE 2ND JAVA(TM) VIRTUAL MACHINE RESEARCH AND TECHNOLOGY SYMPOSIUM Chen, M., Olukotun, K. 2002: 151-164

View details for Web of Science ID 000178400500013
Efficient state representation for symbolic simulation 39TH DESIGN AUTOMATION CONFERENCE, PROCEEDINGS 2002 Bertacco, V., Olukotun, K. 2002: 99-104

View details for Web of Science ID 000177213300018
High bandwidth on-chip cache design IEEE TRANSACTIONS ON COMPUTERS Wilson, K. M., Olukotun, K. 2001; 50 (4): 292-307

View details for Web of Science ID 000168145500002
The Stanford Hydra CMP IEEE MICRO Hammond, L., Hubbert, B. A., Siu, M., Prabhu, M. K., Chen, M., Olukotun, K. 2000; 20 (2): 71-84

View details for Web of Science ID 000086194900013
A single chip multiprocessor integrated with high density DRAM IEICE TRANSACTIONS ON ELECTRONICS Yamauchi, T., Hammond, L., Olukotun, O. A., Arimoto, K. 1999; E82C (8): 1567-1577

View details for Web of Science ID 000082243400030
REMARC: Reconfigurable multimedia array coprocessor IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS Miyamori, T., Olukotun, K. 1999; E82D (2): 389-397

View details for Web of Science ID 000079040600006
The Stanford Hydra CMP IEEE MICRO Magazine, March-April 2000, and presented at Hot Chips Hammond, L., Hubbert, B., Siu, M., Prabhu, M., Chen, M., Olukotun, K. 1999
Improving the Performance of Speculatively Parallel Applications on the Hydra CMP Olukotun, K., Hammond, L., Willey, M. 1999
Data speculation support for a chip multiprocessor ACM SIGPLAN NOTICES Hammond, L., Willey, R., Olukotun, K. 1998; 33 (11): 58-69

View details for Web of Science ID 000076778700008
Considerations in the Design of Hydra: A Multiprocessor-on-a-Chip Microarchitecture Stanford University Computer Systems Lab Technical Report CSL-TR-98-749 Hammond, L., Olukotun, K. 1998
Digital system simulation: Methodologies and examples 35th Design Automation Conference Olukotun, K., Heinrich, M., Ofelt, D. ASSOC COMPUTING MACHINERY. 1998: 658–663

View details for Web of Science ID 000077273700118
Exploiting method-level parallelism in single-threaded Java programs International Conference on Parallel Architectures and Compilation Techniques Chen, M. K., Olukotun, K. IEEE COMPUTER SOC. 1998: 176–184

View details for Web of Science ID 000076611700022
DCP: an algorithm for datapath/control partitioning of synthesizable RTL models International Conference on Computer Design: VLSI in Computers and Processors Lam, V. J., OLUKOTUN, K. A. I E E E, COMPUTER SOC PRESS. 1998: 442–449

View details for Web of Science ID 000076796900070
Data Speculation Support for a Chip Multiprocessor Hammond, L., Willey, M., Olukotun, K. 1998
Exploiting Method-Level Parallelism in Single-Threaded Java Programs Chen, M., Olukotun, K. 1998
Multilevel optimization of pipelined caches IEEE TRANSACTIONS ON COMPUTERS Olukotun, K., Mudge, T. N., Brown, R. B. 1997; 46 (10): 1093-1102

View details for Web of Science ID A1997YB64800004
A single-chip multiprocessor COMPUTER NAYFEH, B. A., Olukotun, K. 1997; 30 (9): 79-?

View details for Web of Science ID A1997XU01900018
A Single Chip Multiprocessor Integrated with DRAM Yamauchi, T., Hammond, L., Olukotun, K. 1997
Java as a specification language for hardware-software systems 1997 IEEE/ACM International Conference on Computer-Aided Design (ICCAD 97) HELAIHEL, R., Olukotun, K. I E E E, COMPUTER SOC PRESS. 1997: 690–697

View details for Web of Science ID A1997BK01U00099
Verifying correct pipeline implementation for microprocessors 1997 IEEE/ACM International Conference on Computer-Aided Design (ICCAD 97) LEVITT, J., Olukotun, K. I E E E, COMPUTER SOC PRESS. 1997: 162–169

View details for Web of Science ID A1997BK01U00026
Designing high bandwidth on-chip caches 24th Annual International Symposium on Computer Architecture Wilson, K. M., Olukotun, K. ASSOC COMPUTING MACHINERY. 1997: 121–132

View details for Web of Science ID A1997BH95B00011
A Single-Chip Multiprocessor IEEE Computer Special Issue on "Billion-Transistor Processors" Hammond, L., Nayfeh, B. A., Olukotun, K. 1997
Software and Hardware for Exploiting Speculative Parallelism with a Multiprocessor Stanford University Computer Systems Lab Technical Report CSL-TR-97-715 Oplinger, J., Heine, D., Liao, S., Nayfeh, B. A., Lam, M., Olukotun, K. 1997
The case for a single-chip multiprocessor ACM SIGPLAN NOTICES Olukotun, K., NAYFEH, B. A., Hammond, L., Wilson, K., Chang, K. Y. 1996; 31 (9): 2-11

View details for Web of Science ID A1996VM12800003
The Case for a Single-Chip Multiprocessor Olukotun, K., Nayfeh, B. A., Hammond, L., Wilson, K., Chang, K. 1996
A scalable formal verification methodology for pipelined microprocessors 33rd Design Automation Conference LEVITT, J., Olukotun, K. ASSOC COMPUTING MACHINERY. 1996: 558–563

View details for Web of Science ID A1996BF92A00111
The impact of shared-cache clustering in small-scale shared-memory multiprocessors 2nd International Symposium on High-Performance Computer Architecture (HPCA-2) NAYFEH, B. A., Olukotun, K., Singh, J. P. I E E E, COMPUTER SOC PRESS. 1996: 74–84

View details for Web of Science ID A1996BF28H00007
Evaluation of design alternatives for a multiprocessor microprocessor 23rd Annual International Symposium on Computer Architecture Nayfeh, B. A., Hammond, L., Olukotun, K. ASSOC COMPUTING MACHINERY. 1996: 67–77

View details for Web of Science ID A1996BF68U00007
Emulation and prototyping of digital systems NATO Advanced Study Institute on Hardware/Software Co-Design HELAIHEL, R., Olukotun, K. SPRINGER. 1996: 339–366

View details for Web of Science ID A1996BF04R00014
Increasing cache port efficiency for dynamic superscalar microprocessors 23rd Annual International Symposium on Computer Architecture Wilson, K. M., Olukotun, K., Rosenblum, M. ASSOC COMPUTING MACHINERY. 1996: 147–157

View details for Web of Science ID A1996BF68U00014
Evaluation of Design Alternatives for a Multiprocessor Microprocessor Nayfeh, B. A., Hammond, L., Olukotun, K. 1996
The benefits of clustering in shared address space multiprocessors: An applications-driven investigation 1995 ACM/IEEE Supercomputing Conference (SC 95) Erlichson, A., NAYFEH, B. A., Singh, J. P., Olukotun, K. ASSOC COMPUTING MACHINERY. 1995: 1674–1704

View details for Web of Science ID A1995BH56H00055
A general method for compiling event driven simulations 32nd Design Automation Conference French, R. S., Lam, M. S., LEVITT, J. R., Olukotun, K. ASSOC COMPUTING MACHINERY. 1995: 151–156

View details for Web of Science ID A1995BD41Y00026
A SOFTWARE-HARDWARE COSYNTHESIS APPROACH TO DIGITAL SYSTEM SIMULATION IEEE MICRO OLUKOTUN, K. A., HELAIHEL, R., LEVITT, J., Ramirez, R. 1994; 14 (4): 48-58

View details for Web of Science ID A1994NZ48900009
Rationale and Design of the Hydra Multiprocessor Stanford University Computer Systems Lab Technical Report CSL-TR-94-645 Olukotun, K., Bergmann, J., Chang, K., Nayfeh, B. A. 1994
EXPLORING THE DESIGN SPACE FOR A SHARED-CACHE MULTIPROCESSOR 21st Annual International Symposium on Computer Architecture NAYFEH, B. A., Olukotun, K. I E E E, COMPUTER SOC PRESS. 1994: 166–175

View details for Web of Science ID A1994BA93B00015
ANALYSIS AND DESIGN OF LATCH-CONTROLLED SYNCHRONOUS DIGITAL CIRCUITS IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS Sakallah, K. A., Mudge, T. N., Olukotun, O. A. 1992; 11 (3): 322-333

View details for Web of Science ID A1992HB92700004
THE DESIGN OF A MICROSUPERCOMPUTER COMPUTER Mudge, T. N., Brown, R. B., Birmingham, W. P., DYKSTRA, J. A., Kayssi, A. I., Lomax, R. J., Olukotun, O. A., Sakallah, K. A., MILANO, R. A. 1991; 24 (1): 57-64

View details for Web of Science ID A1991ER66000009
IMPLEMENTING A CACHE FOR A HIGH-PERFORMANCE GAAS MICROPROCESSOR 18TH ANNUAL INTERNATIONAL SYMP ON COMPUTER ARCHITECTURE Olukotun, O. A., Mudge, T. N., Brown, R. B. ASSOC COMPUTING MACHINERY. 1991: 138–147

View details for Web of Science ID A1991BT52Q00014
HIERARCHICAL GATE-ARRAY ROUTING ON A HYPERCUBE MULTIPROCESSOR JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING Olukotun, O. A., Mudge, T. N. 1990; 8 (4): 313-324

View details for Web of Science ID A1990CY87900003
INTERCONNECTING OFF-THE-SHELF MICROPROCESSORS AFIPS CONFERENCE PROCEEDINGS ALSADOUN, H. B., Olukotun, O. A., Mudge, T. N. 1985; 54: 175-?

View details for Web of Science ID A1985ANT7800024
Plasticine: A Reconfigurable Architecture For Parallel Patterns ISCA '17: 44th International Symposium on Computer Architecture, June 2017 Prabhakar, R., Zhang, Y., Koeplinger, D., Feldman, M., Zhao, T., Hadjis, S., Pedram, A., Kozyrakis, C., Olukotun, K. 2017

View details for DOI 10.1145/3079856.3080256
