I am Peng Lai (赖鹏), a second-year M.Phil. student in the Department of Statistics and Data Science at Southern University of Science and Technology. I am enrolled in an integrated M.Phil.–Ph.D. program (Mathematics → Computer Science) and will formally continue as a Ph.D. student under the supervision of Prof. Guanhua Chen starting in Fall 2026. I am currently a research intern at Alibaba Cloud.
My current research focuses on LLM-as-a-judge, reward models, and reinforcement learning. I aim to improve models' evaluation capabilities so that they can recognize their own limitations and progressively improve, advancing the paradigm of self-reflective, continuously improving language models.
I am interested in popular and exciting research directions and am open to collaborating with outstanding researchers. You are welcome to reach out anytime to discuss related research.
🔥 News
- 2026.01: Two papers (AlignScal and UniRRM) were submitted to ICML 2026 and are currently under review.
- 2026.01: 🎉🎉 Two papers — “BiasScope: Towards Automated Detection of Bias in LLM-as-a-Judge Evaluation” and “Anchored Supervised Fine-Tuning” — were accepted to ICLR 2026 (Poster).
- 2025.09: 🎉🎉 Our paper “Beyond the Surface: Enhancing LLM-as-a-Judge Alignment with Human via Internal Representations” was accepted to NeurIPS 2025 (Poster).
📝 Publications
( * indicates equal contribution.)

BiasScope: Towards Automated Detection of Bias in LLM-as-a-Judge Evaluation
Peng Lai*, Zhihao Ou*, Yong Wang, Longyue Wang, Jian Yang, Yun Chen, Guanhua Chen
- Accepted to ICLR 2026 (Poster).

Anchored Supervised Fine-Tuning
He Zhu*, Junyou Su*, Peng Lai*, Ren Ma, Wenjia Zhang, Linyi Yang, Guanhua Chen
- Accepted to ICLR 2026 (Poster).

Beyond the Surface: Enhancing LLM-as-a-Judge Alignment with Human via Internal Representations
Peng Lai, Jianjie Zheng, Sijie Cheng, Yun Chen, Peng Li, Yang Liu, Guanhua Chen
- Accepted to NeurIPS 2025 (Poster).

AlignScal: Enhancing Preference Alignment via Data Selection Using Model Internal Signals
Peng Lai*, He Zhu*, Zhiwen Ruan, Dongdong Zhang, Yun Chen, Peng Li, Furu Wei, Yang Liu, Guanhua Chen
- Under review at ICML 2026.

UniRRM: Unified Reasoning Reward Models Across Languages and Evaluation Paradigms
Peng Lai, Yichao Du, Junchao Wu, Weibo Gao, Linan Yue, Longyue Wang, Weihua Luo, Derek F. Wong, Guanhua Chen
- Under review at ICML 2026.
🎖 Honors and Awards
- 2025: Graduate Academic Scholarship (Special Class), Southern University of Science and Technology (Top 20%)
- 2022: Sichuan Provincial First Prize, National College Student Market Research and Analysis Competition (Team Leader)
- 2021: Sichuan Provincial First Prize, National College Student Mathematical Modeling Competition (Team Leader)
💻 Internships
- 2026.01 – Present: Research Intern, Alibaba Cloud
- 2025.10 – 2026.01: Research Intern, Alibaba International Digital Commerce
📖 Education
- 2024 – Present: Integrated M.Phil.–Ph.D. program (Mathematics → Computer Science), Department of Statistics and Data Science, Southern University of Science and Technology (Ph.D. student status from Fall 2026).
- 2020 – 2024: B.Sc. in Statistics, School of Mathematical Sciences, Sichuan Normal University
📚 Teaching Experience
I served as a Teaching Assistant at Southern University of Science and Technology for the following courses:
- Advanced Natural Language Processing (Graduate), Fall 2025
- Probability Theory and Mathematical Statistics (Undergraduate), Spring 2025
- Engineering Probability and Statistics (Undergraduate), Fall 2024
🤝 Academic Services
- ICLR 2026 Workshop LLA Reviewer
- ACL ARR 2025 October Reviewer