Xibin Zhou
I am a Ph.D. candidate in Computer Science at Westlake University (joint program with Zhejiang University), based in Hangzhou. My research focuses on AI for Science, especially protein language models (PLMs) and their applications in protein engineering, search, and design.
I develop large-scale models and open platforms that bridge protein sequence, structure, and function, including structure-aware pretraining (SaProt), trimodal contrastive learning (ProTrek), and community tools for training and sharing models (SaprotHub). For the full list of publications, see Google Scholar or the Publications page.
Research Interests
- Structure-aware protein language modeling — integrating 3D structural tokens with sequence for general-purpose PLMs
- Multimodal protein understanding — aligning sequence, structure, and natural-language function for search and annotation
- AI-driven protein engineering — enzyme optimization, base editing, de novo design, and autonomous computational workflows
- Democratizing PLM research — no-code training platforms and open collaboration for biologists
Selected Publications
* denotes equal contribution; † denotes corresponding author.
SaProt: Protein language modeling with structure-aware vocabulary
J Su, C Han, Y Zhou, J Shan, X Zhou, F Yuan — ICLR 2024
[Paper] · [Code]
A trimodal protein language model enables advanced protein searches (ProTrek)
J Su*, Y He*, S You*, S Jiang, X Zhou, X Zhang, Y Wang, et al. — Nature Biotechnology, 2025
[Paper] · [Server] · [Code]
Democratizing protein language model training, sharing and collaboration (SaprotHub)
J Su, Z Li, T Tao, C Han, Y He, F Dai, Q Yuan, Y Gao, T Si, X Zhang, X Zhou, et al. — Nature Biotechnology, 2025
[Paper] · [Code]
Protein language models-assisted optimization of a uracil-N-glycosylase variant enables programmable T-to-G and T-to-C base editing
Y He*, X Zhou*, C Chang, G Chen, W Liu, G Li, X Fan, M Sun, C Miao, et al. — Molecular Cell, 2024
[Paper]
ESM-Ezy: a deep learning strategy for the mining of novel multicopper oxidases with superior properties
H Qian*, Y Wang*, X Zhou*, T Gu, H Wang, H Lyu, Z Li, et al. — Nature Communications, 2025
[Paper] · [Code]
Decoding the molecular language of proteins with Evolla
X Zhou, C Han, Y Zhang, H Du, J Tian, J Su, R Liu, K Zhuang, S Jiang, et al. — bioRxiv, 2025; under review at Nature
[Paper] · [Demo]
Toward De Novo Protein Design from Natural Language (Pinal)
F Dai, S You, Y Zhu, Y Gao, L Fu, X Zhou, J Su, C Wang, Y Fan, et al. — bioRxiv, 2024; under review at Nature
[Paper] · [Demo]
News
- 2026 — Evolla under review at Nature (bioRxiv preprint)
- 2025 — Pinal under review at Nature (bioRxiv preprint)
- Oct 2025 — ProTrek published in Nature Biotechnology
- Oct 2025 — SaprotHub published in Nature Biotechnology
- Apr 2025 — ESM-Ezy published in Nature Communications
- Jan 2024 — SaProt accepted at ICLR 2024
