Homepage - Shuai Zhong

Probably thinking about how to make AI agents more cooperative (while less likely to take over the world :P)

Shuai Zhong (鍾帥)

M.Phil. Student @ HKU-CDS
Agentic AI Explorer

Hi there. I'm Shuai Zhong (鍾帥, /dʒʊŋ1 ʃuaɪ4/); you can also call me Chris. I'm an M.Phil. student @ HKU MMLab, The University of Hong Kong. My research focuses on developing autonomous agents that can perceive, reason, and act in complex multimodal environments, particularly in GUI control and collaborative multi-agent scenarios. 🤖

My work spans from adversarial robustness in VLMs (exploring novel jailbreaking paradigms) to emergent behaviors in multi-agent reinforcement learning systems. Currently, I'm exploring the intersection of Vision-Language Foundation Models and Agentic AI Systems under the supervision of Dr. Ping Luo and co-supervision of Dr. Lingpeng Kong @ HKU-CDS. I also had the privilege of working with Dr. Xihui Liu as my UG FYP advisor @ HKU-EEE.

chris_zsa(at)connect.hku.hk Google Scholar GitHub

Education

The University of Hong Kong

M.Phil. in Computer Science

Sep. 2025 - Jul. 2027 (expected)

The University of Hong Kong

B.Eng. in Computer Engineering

Sep. 2020 - Jul. 2025

Experience

TCL Corporate Research (HK) Co., Ltd.

Research Intern

May 2025 - Present

HKU MMLab

Research Assistant

May 2025 - Present

HKU NLP Lab

Research Assistant

Sep. 2023 - Jan. 2025

Hangzhou Raycloud Technology Co., Ltd.

Research Intern

Jul. 2022 - Aug. 2022

News

2025

✈️ Heading to ICCV 2025 in Hawaii from Oct 19-26! Excited to connect with the computer vision community and explore the latest research. Looking forward to great discussions!

Oct 15

🎓 Started my M.Phil. journey at HKU MMLab, supervised by Dr. Ping Luo.

Sep 01

📝 Released our preprint: "SWIRL: A Staged Workflow for Interleaved Reinforcement Learning in Mobile GUI Control". Thanks to Quanfeng and Zhantao for their guidance! arXiv

Aug 27

🎬 Joined HKU MMLab as Research Assistant, working on decoupled multi-agent RL for GUI control under Dr. Ping Luo.

May 15

🎯 Successfully completed my Senior Design Project defense and will be graduating from HKU soon! Grateful for the invaluable guidance from Dr. Xihui Liu and Dr. Hongyang Du throughout this journey.

Apr 20

✈️ Excited to present our ImgTrojan work at NAACL 2025 in Albuquerque this May! My first academic conference - can't wait to meet fellow researchers and exchange ideas!

Mar 15

2024

📁 Thrilled that our paper "ImgTrojan: Jailbreaking Vision-Language Models with ONE Image" was accepted to NAACL 2025 as an Oral presentation! Congrats to Xijia and Lei! Paper

Dec 15

🎯 Started my Senior Design Project "BATIP: Bias-Aware Text-to-Image Pipeline" supervised by Dr. Xihui Liu.

Sep 01

📝 Released our preprint: "ImgTrojan: Jailbreaking Vision-Language Models with ONE Image". Grateful for the wonderful collaboration with Xijia and Lei! arXiv

Mar 05

2023

🏛️ Joined HKU NLP Lab as Research Assistant, supervised by Dr. Lingpeng Kong.

Sep 15

Selected Publications (view all)

SWIRL: A Staged Workflow for Interleaved Reinforcement Learning in Mobile GUI Control

Quanfeng Lu*, Zhantao Ma*, Shuai Zhong, Jin Wang, Dahai Yu, Michael K. Ng, Ping Luo (* equal contribution)

arXiv preprint 2025

SWIRL: A multi-agent RL framework with interleaved updates. Key contributions: (1) Theoretical guarantees & convergence proofs, (2) O(1) memory efficiency, (3) SOTA zero-shot GUI control with only 3.5K examples. Transferable across domains (GUI + math reasoning).

[Paper] [Code]

ImgTrojan: Jailbreaking Vision-Language Models with ONE Image

Xijia Tao*, Shuai Zhong*, Lei Li*, Qi Liu, Lingpeng Kong (* equal contribution)

North American Chapter of the Association for Computational Linguistics (NAACL) 2025 Oral

ImgTrojan: VLM jailbreaking with ONE poisoned image. Key contributions: (1) Training-time attack via malicious image captions, (2) Comprehensive VLM safety evaluation metrics, (3) Systematic vulnerability analysis. Successfully bypasses safety mechanisms across multiple VLM architectures.

[Paper] [Code] [ACL Anthology]
