All Publications


  • CARAVAN: Practical Online Learning of In-Network ML Models with Labeling Agents. Zhang, Q., Imran, A., Bardhi, E., Swamy, T., Zhang, N., Shahbaz, M., Olukotun, K. USENIX Association. 2024: 325-345
  • CacheGen: KV Cache Compression and Streaming for Fast Large Language Model Serving. Liu, Y., Li, H., Cheng, Y., Ray, S., Huang, Y., Zhang, Q., Du, K., Yao, J., Lu, S., Ananthanarayanan, G., Maire, M., Hoffmann, H., Holtzman, A., Jiang, J. Association for Computing Machinery. 2024: 38-56
  • Server-Driven Video Streaming for Deep Learning Inference. Du, K., Pervaiz, A., Yuan, X., Chowdhery, A., Zhang, Q., Hoffmann, H., Jiang, J. Association for Computing Machinery. 2020: 557-570