Biography
Zefan Cai (Chinese name: 蔡泽凡) is a Ph.D. student at the University of Wisconsin - Madison. During his master's studies, he did broad research on LLM topics. During his Ph.D., he has mainly worked on sparse attention, KV cache compression, and other inference optimization topics. He considers long context the most important problem in LLMs.
Study
- Ph.D: University of Wisconsin - Madison
- Advisor: Junjie Hu
- September, 2024 - June, 2028 (Expected)
- M.S.: Peking University
- Advisor: Baobao Chang
- September, 2022 - June, 2024
- B.S.: Beijing Jiaotong University
- September, 2018 - June, 2022
Intern
- Adobe Research - San Jose, California
- Advisors: Jiuxiang Gu and Hao Tan
- May 2025 - August 2025 (Expected)
- Microsoft Azure
- Advisor: Wen Xiao
- Qwen - Beijing
- Advisors: Tianyu Liu and Keming Lu
News
- June 2025 I joined Adobe Research as a research intern!
- May 2025 One paper (AdaptiveStep) accepted by ICML 2025!
- May 2025 One paper (Math-Minos) accepted by ACL 2025!
- Jan. 2025 Four papers (DnD-Transformer, HeadKV, Omni-MATH, GDPO) accepted by ICLR 2025!
- August 2024 One paper (VeCAF) accepted by ACM Multimedia 2024!
- May 2024 Four papers (PCA-Bench, ZeroED, CENSOR, FairEval) accepted by ACL 2024!
- Mar. 2024 Two papers (DialogVCS, ALSACE) accepted by NAACL 2024!
- Jan. 2024 One paper (MMICL) accepted by ICLR 2024!
- May 2023 One paper (SANTA) accepted by ACL 2023!
- Jan. 2023 One paper (CTR) accepted by ICLR 2023!
Selected Projects
R-KV: Redundancy-aware KV Cache Compression for Training-Free Reasoning Models Acceleration
Zefan Cai, Wen Xiao, Hanshi Sun, Cheng Luo, Yikai Zhang, Ke Wan, Yucheng Li, Yeyang Zhou, Li-Wen Chang, Jiuxiang Gu, Zhen Dong, Anima Anandkumar, Abedelkadir Asi, Junjie Hu
Preprint.
PDF Page Code Twitter Youtube
PyramidKV: Dynamic KV Cache Compression based on Pyramidal Information Funneling
Zefan Cai, Yichi Zhang, Bofei Gao, Yuliang Liu, Yucheng Li, Tianyu Liu, Keming Lu, Wayne Xiong, Yue Dong, Junjie Hu, Wen Xiao
Preprint.
PDF Page Code Twitter Article 1 Article 2 Article 3 Article 4
All Projects
R-KV: Redundancy-aware KV Cache Compression for Training-Free Reasoning Models Acceleration
Zefan Cai, Wen Xiao, Hanshi Sun, Cheng Luo, Yikai Zhang, Ke Wan, Yucheng Li, Yeyang Zhou, Li-Wen Chang, Jiuxiang Gu, Zhen Dong, Anima Anandkumar, Abedelkadir Asi, Junjie Hu
Preprint.
PDF Page Code Twitter
From Preferences to Prejudice: The Role of Alignment Tuning in Shaping Social Bias in Video Diffusion Models
Zefan Cai*, Haoyi Qiu*, Haozhe Zhao*, Ke Wan, Jiachen Li, Jiuxiang Gu, Wen Xiao, Nanyun Peng, Junjie Hu
Preprint.
PDF
MENTOR: Efficient Multimodal-Conditioned Tuning for Autoregressive Vision Generation Models
Haozhe Zhao*, Zefan Cai*, Shuzheng Si, Liang Chen, Jiuxiang Gu, Wen Xiao, Junjie Hu
Preprint.
PDF
VisualToolAgent (VisTA): A Reinforcement Learning Framework for Visual Tool Selection
Zeyi Huang, Yuyang Ji, Anirudh Sundara Rajan, Zefan Cai, Wen Xiao, Junjie Hu, Yong Jae Lee
Preprint.
PDF Page Code Twitter 1 Twitter 2 Twitter 3
AdaptiveStep: Automatically dividing reasoning step through model confidence
Yuliang Liu, Junjie Lu, Zhaoling Chen, Chaofeng Qu, Jason Klein Liu, Chonghan Liu, Zefan Cai, Yunhui Xia, Li Zhao, Jiang Bian, Chuheng Zhang, Wei Shen, Zhouhan Lin
ICML 2025 (Poster)
PDF Code Model Data
HeadInfer: Memory-efficient LLM inference by head-wise offloading
Cheng Luo, Zefan Cai, Hanshi Sun, Jinqi Xiao, Bo Yuan, Wen Xiao, Junjie Hu, Jiawei Zhao, Beidi Chen, Anima Anandkumar
ICML 2025 Workshop on Long Context Foundation Models (LCFM)
PDF Code
No Preference Left Behind: Group Distributional Preference Optimization
Binwei Yao, Zefan Cai, Yun-Shiuan Chuang, Shanglin Yang, Ming Jiang, Diyi Yang, Junjie Hu
ICLR 2025 (Poster)
PDF Code
Next Token Prediction Towards Multimodal Intelligence: A Comprehensive Survey
Liang Chen, Zekun Wang, Shuhuai Ren, Lei Li, Haozhe Zhao, Yunshui Li, Zefan Cai, Hongcheng Guo, Lei Zhang, Yizhe Xiong, Yichi Zhang, Ruoyu Wu, Qingxiu Dong, Ge Zhang, Jian Yang, Lingwei Meng, Shujie Hu, Yulong Chen, Junyang Lin, Shuai Bai, Andreas Vlachos, Xu Tan, Minjia Zhang, Wen Xiao, Aaron Yee, Tianyu Liu, Baobao Chang
Preprint.
PDF Repo
Not all heads matter: A head-level KV cache compression method with integrated retrieval and reasoning
Yu Fu, Zefan Cai, Abedelkadir Asi, Wayne Xiong, Yue Dong, Wen Xiao
ICLR 2025 (Poster)
PDF Code
A spark of vision-language intelligence: 2-dimensional autoregressive transformer for efficient finegrained image generation
Liang Chen, Sinan Tan, Zefan Cai, Weichu Xie, Haozhe Zhao, Yichi Zhang, Junyang Lin, Jinze Bai, Tianyu Liu, Baobao Chang
ICLR 2025 (Poster)
PDF Code Model
COMMA: A Communicative Multimodal Multi-Agent Benchmark
Timothy Ossowski, Jixuan Chen, Danyal Maqbool, Zefan Cai, Tyler Bradshaw, Junjie Hu
Preprint.
PDF
Omni-MATH: A universal olympiad level mathematic benchmark for large language models
Bofei Gao, Feifan Song, Zhe Yang, Zefan Cai, Yibo Miao, Qingxiu Dong, Lei Li, Chenghao Ma, Liang Chen, Runxin Xu, Zhengyang Tang, Benyou Wang, Daoguang Zan, Shanghaoran Quan, Ge Zhang, Lei Sha, Yichang Zhang, Xuancheng Ren, Tianyu Liu, Baobao Chang
ICLR 2025 (Poster)
PDF Page Repo Code Data Model
ML-Bench: Large language models leverage open-source libraries for machine learning tasks
Yuliang Liu*, Xiangru Tang*, Zefan Cai*, Junjie Lu, Yichi Zhang, Yanjun Shao, Zexuan Deng, Helan Hu, Zengxian Yang, Kaikai An, Ruijun Huang, Shuzheng Si, Sheng Chen, Haozhe Zhao, Zhengliang Li, Liang Chen, Yiming Zong, Yan Wang, Tianyu Liu, Zhiwei Jiang, Baobao Chang, Yujia Qin, Wangchunshu Zhou, Yilun Zhao, Arman Cohan, Mark Gerstein
ICLR 2025 Deep Learning for Code (DL4C) Workshop
ICLR 2025 Agentic AI for Scientific Discovery Workshop
PDF Page Code Data
Towards a unified view of preference learning for large language models: A survey
Bofei Gao, Feifan Song, Yibo Miao, Zefan Cai, Zhe Yang, Liang Chen, Helan Hu, Runxin Xu, Qingxiu Dong, Ce Zheng, Shanghaoran Quan, Wen Xiao, Ge Zhang, Daoguang Zan, Keming Lu, Bowen Yu, Dayiheng Liu, Zeyu Cui, Jian Yang, Lei Sha, Houfeng Wang, Zhifang Sui, Peiyi Wang, Tianyu Liu, Baobao Chang
Preprint.
PDF Repo Page
LLM critics help catch bugs in mathematics: Towards a better mathematical verifier with natural language feedback
Bofei Gao, Zefan Cai, Runxin Xu, Peiyi Wang, Ce Zheng, Runji Lin, Keming Lu, Dayiheng Liu, Chang Zhou, Wen Xiao, Junjie Hu, Tianyu Liu, Baobao Chang
ACL 2025 (Findings) - Long Paper
PDF Code
PyramidKV: Dynamic KV Cache Compression based on Pyramidal Information Funneling
Zefan Cai, Yichi Zhang, Bofei Gao, Yuliang Liu, Yucheng Li, Tianyu Liu, Keming Lu, Wayne Xiong, Yue Dong, Junjie Hu, Wen Xiao
Preprint.
PDF Page Code Twitter Article 1 Article 2 Article 3 Article 4
Mitigating Language-Level Performance Disparity in mPLMs via Teacher Language Selection and Cross-lingual Self-Distillation
Haozhe Zhao*, Zefan Cai*, Shuzheng Si, Liang Chen, Yufeng He, Kaikai An, Baobao Chang
NAACL 2024 (Main) - Long Paper
PDF Code
Improving Event Definition Following For Zero-Shot Event Detection
Zefan Cai*, Po-Nien Kung*, Ashima Suvarna, Mingyu Derek Ma, Hritik Bansal, Baobao Chang, P Jeffrey Brantingham, Wei Wang, Nanyun Peng
ACL 2024 (Main) - Long Paper
PDF
PCA-Bench: Evaluating Multimodal Large Language Models in Perception-Cognition-Action Chain
Liang Chen, Yichi Zhang, Shuhuai Ren, Haozhe Zhao, Zefan Cai, Yuchi Wang, Peiyi Wang, Xiangdi Meng, Tianyu Liu, Baobao Chang
ACL 2024 (Main) - Long Paper
PDF Code Data
VeCAF: Vision-language Collaborative Active Finetuning with Training Objective Awareness
Rongyu Zhang*, Zefan Cai*, Huanrui Yang*, Zidong Liu, Denis A Gudovskiy, Tomoyuki Okuno, Yohei Nakata, Kurt Keutzer, Baobao Chang, Yuan Du, Li Du, Shanghang Zhang
ACM Multimedia 2024
PDF
MMICL: Empowering Vision-language Model with Multi-Modal In-Context Learning
Haozhe Zhao*, Zefan Cai*, Shuzheng Si*, Xiaojian Ma, Kaikai An, Liang Chen, Zixuan Liu, Sheng Wang, Wenjuan Han, Baobao Chang
ICLR 2024 (Poster)
PDF Code
Improving the Robustness of Distantly-Supervised Named Entity Recognition via Uncertainty-Aware Teacher Learning and Student-Student Collaborative Learning
Shuzheng Si, Helan Hu, Haozhe Zhao, Shuang Zeng, Kaikai An, Zefan Cai, Baobao Chang
ACL 2024 (Findings) - Long Paper
PDF
Towards end-to-end embodied decision making via multi-modal large language model: Explorations with GPT4-Vision and beyond
Liang Chen, Yichi Zhang, Shuhuai Ren, Haozhe Zhao, Zefan Cai, Yuchi Wang, Peiyi Wang, Tianyu Liu, Baobao Chang
NeurIPS 2023 Foundation Models for Decision Making (FMDM) Workshop
PDF Code Data
Human-in-the-loop through chain-of-thought
Zefan Cai, Baobao Chang, Wenjuan Han
Preprint.
PDF
Large language models are not fair evaluators
Peiyi Wang, Lei Li, Liang Chen, Zefan Cai, Dawei Zhu, Binghuai Lin, Yunbo Cao, Qi Liu, Tianyu Liu, Zhifang Sui
ACL 2024 (Main) - Long Paper
PDF Code
DialogVCS: Robust Natural Language Understanding in Dialogue System Upgrade
Zefan Cai*, Xin Zheng*, Tianyu Liu*, Xu Wang, Haoran Meng, Jiaqi Han, Gang Yuan, Binghuai Lin, Baobao Chang, Yunbo Cao
NAACL 2024 (Main) - Long Paper
PDF
DiffCap: Exploring continuous diffusion on image captioning
Yufeng He*, Zefan Cai*, Xu Gan, Baobao Chang
Preprint.
PDF
SANTA: Separate Strategies for Inaccurate and Incomplete Annotation Noise in Distantly-Supervised Named Entity Recognition
Shuzheng Si*, Zefan Cai*, Shuang Zeng, Guoqiang Feng, Jiaxing Lin, Baobao Chang
ACL 2023 (Findings) - Long Paper
PDF
Compositional task representations for large language models
Nan Shao*, Zefan Cai*, Chonghua Liao, Yanan Zheng, Zhilin Yang
ICLR 2023 (Poster)
PDF Code