Professional Education


  • Doctor of Philosophy, University of Iowa (2023)
  • MS, University of Iowa, Computer Science (2018)
  • BE, Harbin Institute of Technology, Bioinformatics (2016)

All Publications


  • Non-Smooth Weakly-Convex Finite-sum Coupled Compositional Optimization. Conference on Neural Information Processing Systems (NeurIPS) Hu, Q., Zhu, D., Yang, T. 2023
  • LibAUC: A deep learning library for X-risk optimization ACM SIGKDD Conference on Knowledge Discovery and Data Mining Yuan, Z., Zhu, D., Qiu, Z., Li, G., Wang, X., Yang, T. 2023
  • Deep unsupervised binary coding networks for multivariate time series retrieval AAAI Conference on Artificial Intelligence Zhu, D., Song, D., Chen, Y., Lumezanu, C., Cheng, W., Zong, B., Ni, J., Mizoguchi, T., Yang, T., Chen, H. 2020
  • deBWT: parallel construction of Burrows-Wheeler Transform for large collection of genomes with de Bruijn-branch encoding Liu, B., Zhu, D., Wang, Y. OXFORD UNIV PRESS. 2016: 174-182

    Abstract

    With the development of high-throughput sequencing, the number of assembled genomes continues to rise. It is critical to well organize and index many assembled genomes to promote future genomics studies. Burrows-Wheeler Transform (BWT) is an important data structure of genome indexing, which has many fundamental applications; however, it is still non-trivial to construct BWT for large collection of genomes, especially for highly similar or repetitive genomes. Moreover, the state-of-the-art approaches cannot well support scalable parallel computing owing to their incremental nature, which is a bottleneck to use modern computers to accelerate BWT construction.

    We propose de Bruijn branch-based BWT constructor (deBWT), a novel parallel BWT construction approach. DeBWT innovatively represents and organizes the suffixes of input sequence with a novel data structure, de Bruijn branch encoding. This data structure takes the advantage of de Bruijn graph to facilitate the comparison between the suffixes with long common prefix, which breaks the bottleneck of the BWT construction of repetitive genomic sequences. Meanwhile, deBWT also uses the structure of de Bruijn graph for reducing unnecessary comparisons between suffixes. The benchmarking suggests that, deBWT is efficient and scalable to construct BWT for large dataset by parallel computing. It is well-suited to index many genomes, such as a collection of individual human genomes, with multiple-core servers or clusters.

    deBWT is implemented in C language, the source code is available at https://github.com/hitbc/deBWT or https://github.com/DixianZhu/deBWT

    Contact: ydwang@hit.edu.cn

    Supplementary data are available at Bioinformatics online.

    View details for DOI 10.1093/bioinformatics/btw266

    View details for Web of Science ID 000379734300020

    View details for PubMedID 27307614

    View details for PubMedCentralID PMC4908350