Yu Li
George Washington University, Washington, D.C., 2025.9–2029.5 (expected)
Wuhan University, Hongyi Honor College, China, 2021.9–2025.5
I am currently a first-year Ph.D. candidate at GWU, supervised by Prof. Tian Lan, and I work with Prof. Zhengling Qi.
Research Topics: LLM Post-Training • Agent Policy Learning • Generative AI
- 📄 CV
- 🧪 GitHub
- 🎓 Google Scholar
News
- [03/2026] InsPO is accepted to AI with Recursive Self-Improvement@ICLR 2026 🎉. See you in Rio!
- [02/2026] CRAFT-LoRA is accepted to CVPR 2026 🎉. See you in Denver.
- [02/2026] ACDZero is accepted to ICCN@INFOCOM 2026 🎉. See you in Tokyo.
- [01/2026] I passed my Ph.D. qualifying exam in my first semester and am now a Ph.D. candidate 🎓.
- [01/2026] KG-SAM is accepted to ICASSP 2026 as an Oral Paper 🎉.
- [11/2025] SoRA is accepted to AAAI 2026 🎉.
Publications
Preprint / Under Review
ARISE: Agent Reasoning with Intrinsic Skill Evolution in Hierarchical Reinforcement Learning
· Code
Hierarchical RL framework with intrinsic skill evolution for scalable agent reasoning.
Hierarchical RL • Agent • Skill Learning
Right Meets Wrong: Bilateral Context Conditioning with Reward-Confidence Correction for GRPO
· Code
Bilateral context conditioning and reward-confidence correction to improve GRPO training.
GRPO • RLHF • Post-training
InsPO: Unlocking Intrinsic Self-Reflection for LLM Preference Optimization
· Paper · Code
Preference optimization that leverages intrinsic self-reflection signals in pairwise data to improve LLM alignment.
DPO • SimPO • Preference Learning
Reason in Chains, Learn in Trees: Self-Rectification and Grafting for Multi-turn Agent Policy Optimization
Self-rectification and tree-based grafting for multi-turn agent policy optimization.
Agent • Multi-turn • Policy Optimization
MultiRefine-V: Multi-Turn Reinforcement Learning for Enhancing Verilog Code Synthesis
Multi-turn RL for enhancing Verilog code synthesis quality.
RLVR • Code Generation • Verilog
Conferences
CRAFT-LoRA: Content-Style Personalization via Rank-Constrained Adaptation and Training-Free Fusion
· Paper
Rank-constrained LoRA adaptation for content–style personalization in image generation.
Generative AI • Personalization • LoRA
ACDZero: MCTS Agent for Mastering Automated Cyber Defense
Graph-embedding-guided MCTS planning for sample-efficient automated cyber defense.
Cyber Defense • MCTS • GNN
KG-SAM: Injecting Anatomical Knowledge into Segment Anything Models via Conditional Random Fields
Knowledge-guided SAM with anatomical priors and CRF refinement for robust medical image segmentation.
Medical Segmentation • SAM • Knowledge Graph • CRF
Calibrating and Rotating: A Unified Framework for Weight Conditioning in PEFT
A unified calibration+rotation weight-conditioning framework that improves PEFT performance and efficiency.
LLMs • PEFT • Weight Conditioning
Journals
Dual branch SAM-Transformer Fusion Network for Accurate Breast Ultrasound Image Segmentation
Dual-branch SAM–Transformer fusion for accurate breast ultrasound image segmentation.
Ultrasound Segmentation • SAM • Transformer
SfMDiffusion: Self-supervised Monocular Depth Estimation in Endoscopy Based on Diffusion Models
Self-supervised monocular depth estimation for endoscopy using diffusion models with teacher-guided distillation.
Depth Estimation • Diffusion Model • Distillation
Experiences
Mobile Intelligence Lab, George Washington University
Artificial General Intelligence Lab, Westlake University
Cyber-Physical Systems Lab, UC Irvine
Honors and Awards
- Innova International Exchange Scholarship, Wuhan University, 2024
- Innova Excellence Scholarship (Top 3%), Wuhan University, 2023, 2024
- Academic Excellence Scholarship (Top 5%), Hongyi Honor College, 2022, 2023, 2024
- First-Class Scholarship (Top 5%), Wuhan University, 2022, 2023, 2024
- Patent: Energy-saving calculation method, CN116085952.
Academic Services
- Conference Reviewer: ICML’26, CVPR’26, ICLR’26, AAAI’26, ICASSP’26
- Journal Reviewer: Neurocomputing, Frontiers in Oncology, IEEE Transactions on Networking, Frontiers in Medicine
This website was stolen from my best friend CD.