Shuai Zhong (鍾帥)
M.Phil. Student @ HKU-CDS
Agentic AI Explorer

Hi there. I'm Shuai Zhong (鍾帥, /dʒʊŋ1 ʃuaɪ4/); you can also call me Chris. I'm an M.Phil. student @ HKU MMLab, The University of Hong Kong. My research focuses on developing autonomous agents that can perceive, reason, and act in complex multimodal environments, particularly in GUI control and collaborative multi-agent scenarios. 🤖

My work ranges from adversarial robustness in VLMs (exploring novel jailbreaking paradigms) to emergent behaviors in multi-agent reinforcement learning systems. Currently, I'm exploring the intersection of Vision-Language Foundation Models and Agentic AI Systems under the supervision of Dr. Ping Luo and the co-supervision of Dr. Lingpeng Kong @ HKU-CDS. I also had the privilege of working with Dr. Xihui Liu as my UG FYP advisor @ HKU-EEE.


Education
  • The University of Hong Kong
    M.Phil. in Computer Science
    Sep. 2025 - Jul. 2027 (expected)
  • The University of Hong Kong
    B.Eng. in Computer Engineering
    Sep. 2020 - Jul. 2025
Experience
  • TCL Corporate Research (HK) Co., Ltd.
    Research Intern
    May 2025 - Present
  • HKU MMLab
    Research Assistant
    May 2025 - Present
  • HKU NLP Lab
    Research Assistant
    Sep. 2023 - Jan. 2025
  • Hangzhou Raycloud Technology Co., Ltd.
    Research Intern
    Jul. 2022 - Aug. 2022
News
2025
✈️ Heading to ICCV 2025 in Hawaii from Oct 19-26! Excited to connect with the computer vision community and explore the latest research. Looking forward to great discussions!
Oct 15
🎓 Started my M.Phil. journey at HKU MMLab, supervised by Dr. Ping Luo.
Sep 01
📝 Released our preprint: "SWIRL: A Staged Workflow for Interleaved Reinforcement Learning in Mobile GUI Control". Thanks to Quanfeng and Zhantao for their guidance! arXiv
Aug 27
🎬 Joined HKU MMLab as Research Assistant, working on decoupled multi-agent RL for GUI control under Dr. Ping Luo.
May 15
🎯 Successfully completed my Senior Design Project defense and will be graduating from HKU soon! Grateful for the invaluable guidance from Dr. Xihui Liu and Dr. Hongyang Du throughout this journey.
Apr 20
✈️ Excited to present our ImgTrojan work at NAACL 2025 in Albuquerque this May! My first academic conference - can't wait to meet fellow researchers and exchange ideas!
Mar 15
2024
📁 Thrilled that our paper "ImgTrojan: Jailbreaking Vision-Language Models with ONE Image" was accepted to NAACL 2025 as an oral presentation! Congrats to Xijia and Lei! Paper
Dec 15
🎯 Started my Senior Design Project "BATIP: Bias-Aware Text-to-Image Pipeline" supervised by Dr. Xihui Liu.
Sep 01
📝 Released our preprint: "ImgTrojan: Jailbreaking Vision-Language Models with ONE Image". Grateful for the wonderful collaboration with Xijia and Lei! arXiv
Mar 05
2023
🏛️ Joined HKU NLP Lab as Research Assistant, supervised by Dr. Lingpeng Kong.
Sep 15
Selected Publications
SWIRL: A Staged Workflow for Interleaved Reinforcement Learning in Mobile GUI Control

Quanfeng Lu*, Zhantao Ma*, Shuai Zhong, Jin Wang, Dahai Yu, Michael K. Ng, Ping Luo (* equal contribution)

arXiv preprint 2025

SWIRL: A multi-agent RL framework with interleaved updates. Key contributions: (1) Theoretical guarantees & convergence proofs, (2) O(1) memory efficiency, (3) SOTA zero-shot GUI control with only 3.5K examples. Transferable across domains (GUI + math reasoning).

ImgTrojan: Jailbreaking Vision-Language Models with ONE Image

Xijia Tao*, Shuai Zhong*, Lei Li*, Qi Liu, Lingpeng Kong (* equal contribution)

North American Chapter of the Association for Computational Linguistics (NAACL) 2025 Oral

ImgTrojan: VLM jailbreaking with ONE poisoned image. Key contributions: (1) Training-time attack via malicious image captions, (2) Comprehensive VLM safety evaluation metrics, (3) Systematic vulnerability analysis. Successfully bypasses safety mechanisms across multiple VLM architectures.

