About Me
I am a PhD student at the Hong Kong University of Science and Technology (HKUST), advised by Prof. Fangzhen Lin and working in close collaboration with Prof. Wenhu Chen at the University of Waterloo. I am fortunate to be supported by the Hong Kong PhD Fellowship Scheme (HKPFS), which selects only 300 awardees across Hong Kong each year.
My research focuses on Large Language Models (LLMs) and Vision-Language Models (VLMs), with an emphasis on reasoning, reinforcement learning (RL), and agents. My recent work develops RL-based approaches to enhance VLM and LLM reasoning, as seen in projects such as VL-Rethinker, Autocode, ACECoder, and HTL. For a comprehensive list of my publications, please visit my Google Scholar.
Prior to joining HKUST, I worked as a Researcher at INF Technology, advised by Dr. Wei Chu, and as an AI Engineer at Alibaba, under the guidance of Dr. Chao Du. These experiences allowed me to deepen my expertise in AI and machine learning while contributing to impactful industry projects.
Before these industry roles, I was recognized as one of the Outstanding Graduates of Shanghai (top 1% province-wide) while studying at ShanghaiTech, and I was awarded the prestigious National Scholarship (top 0.2% nationwide) at Wuhan University in 2017.
Notice
I am actively seeking research collaborations and research opportunities in VLMs, RL, and agents, preferably remote. Let's make a real impact on both industry and academia!
News
2025.04 We explore how to incentivize deliberate thinking in VLMs and introduce a powerful VL reasoner: VL-Rethinker. It achieves superior results on a diverse collection of multimodal benchmarks.
2025.03 We release a new diffusion quantization method: TR-DQ.
2025.02 We release Autocode on metacognitive tool-use LLMs for math, and ACECoder for large-scale test-case synthesis for coder RL training.
2025.01 Our paper RenderWorld on 3D world models is accepted to ICRA 2025. Congrats to Yihua and Ziyan!
2024.10 Our V-PETL Benchmark is accepted to NeurIPS 2024.
2024.08 Our HTL on tool-integrated reasoning for math is accepted to EMNLP 2024.