Zonghuan Xu

Fudan University, School of Mathematical Sciences. B.S. in Mathematics and Applied Mathematics, Xianghui Plan, expected 2028.

Contact: 2430XH10002@m.fudan.edu.cn

I am broadly interested in building more capable and trustworthy AI systems by revisiting fundamental assumptions and developing new problem formulations. My recent work spans embodied AI safety, human-grounded evaluation, continual-learning theory, and the Human Model framework for trustworthy AI.

A recurring theme in my work is problem reframing: understanding backdoors in vision-language-action (VLA) models as action-level malicious primitives, disinformation evaluation as a question of risk to human readers, forgetting as a task-distribution phenomenon, and human-related AI methods as parts of a broader Human Model framework.

I see Human Models as a long-term research direction, one I hope to pursue further as suitable data, resources, or collaborations become available. More broadly, I aim to develop rigorous research on trustworthy AI and learning systems, combining theoretical analysis with empirical and system-level perspectives.

publications

DropVLA: An Action-Level Backdoor Attack on Vision-Language-Action Models

Xu, Z., Li, J., Zhao, Y., Zheng, X., Ma, X., and Jiang, Y.-G.

arXiv:2510.10932v4, 2026. Under review at IROS 2026.

AttackVLA: Benchmarking Adversarial and Backdoor Attacks on Vision-Language-Action Models

Li, J., Zhao, Y., Zheng, X., Xu, Z., Li, Y., Ma, X., and Jiang, Y.-G.

arXiv:2511.12149v1, 2025.

Beyond Surface Judgments: Human-Grounded Risk Evaluation of LLM-Generated Disinformation

Xu, Z., Zheng, X., Wu, Y., and Ma, X.

arXiv:2604.06820v1, 2026. Under review at EMNLP 2026.

From Order to Distribution: A Spectral Characterization of Forgetting in Continual Learning

Xu, Z. and Ma, X.

arXiv:2604.13460v1, 2026. Under review at NeurIPS 2026.

Human Model: The Missing Piece Toward Trustworthy AGI

Xu, Z., Ma, X., and Jiang, Y.-G.

OpenReview Archive, 2026. Position paper. Under review at NeurIPS 2026 Position Paper Track.

additional projects

World-Model Inputs for Atari Policies

Tested whether Atari policies benefit from world-model summaries in addition to raw observations, including a 26-game × 2-seed Atari100K experiment.

code

Fudan Sports Reservation Automation

Built a CPU-only reservation system with CAPTCHA data collection, neural recognition, segmentation, ordering algorithms, automated booking, and failure logging.

code

Learned Iterative Refinement for Quadratic-Root Prediction

Explored whether neural networks can act as iterative numerical solvers through one-shot baselines, learned refinement blocks, and an RL variant.

code