2025

SWIRL: A Staged Workflow for Interleaved Reinforcement Learning in Mobile GUI Control

Quanfeng Lu*, Zhantao Ma*, Shuai Zhong, Jin Wang, Dahai Yu, Michael K. Ng, Ping Luo (* equal contribution)

arXiv preprint 2025

SWIRL: A multi-agent RL framework with interleaved updates. Key contributions: (1) Theoretical guarantees & convergence proofs, (2) O(1) memory efficiency, (3) SOTA zero-shot GUI control with only 3.5K examples. Transferable across domains (GUI + math reasoning).

2024

ImgTrojan: Jailbreaking Vision-Language Models with ONE Image

Xijia Tao*, Shuai Zhong*, Lei Li*, Qi Liu, Lingpeng Kong (* equal contribution)

North American Chapter of the Association for Computational Linguistics (NAACL) 2025 Oral

ImgTrojan: VLM jailbreaking with ONE poisoned image. Key contributions: (1) Training-time attack via malicious image captions, (2) Comprehensive VLM safety evaluation metrics, (3) Systematic vulnerability analysis. Successfully bypasses safety mechanisms across multiple VLM architectures.
