Hi there. I'm Shuai Zhong (鍾帥, /dʒʊŋ1 ʃuaɪ4/); you can also call me Chris. I'm an M.Phil. student @ HKU MMLab, The University of Hong Kong. My research focuses on developing autonomous agents that can perceive, reason, and act in complex multimodal environments, particularly in GUI control and collaborative multi-agent scenarios. 🤖
My work spans from adversarial robustness in VLMs (exploring novel jailbreaking paradigms) to emergent behaviors in multi-agent reinforcement learning systems. Currently, I'm exploring the intersection of Vision-Language Foundation Models and Agentic AI Systems under the supervision of Dr. Ping Luo and co-supervision of Dr. Lingpeng Kong @ HKU-CDS. I also had the privilege of working with Dr. Xihui Liu as my UG FYP advisor @ HKU-EEE.

Quanfeng Lu*, Zhantao Ma*, Shuai Zhong, Jin Wang, Dahai Yu, Michael K. Ng, Ping Luo (* equal contribution)
arXiv preprint 2025
SWIRL: A multi-agent RL framework with interleaved updates. Key contributions: (1) Theoretical guarantees & convergence proofs, (2) O(1) memory efficiency, (3) SOTA zero-shot GUI control with only 3.5K examples. Transferable across domains (GUI + math reasoning).

Xijia Tao*, Shuai Zhong*, Lei Li*, Qi Liu, Lingpeng Kong (* equal contribution)
North American Chapter of the Association for Computational Linguistics (NAACL) 2025 Oral
ImgTrojan: VLM jailbreaking with ONE poisoned image. Key contributions: (1) Training-time attack via malicious image captions, (2) Comprehensive VLM safety evaluation metrics, (3) Systematic vulnerability analysis. Successfully bypasses safety mechanisms across multiple VLM architectures.