RoDAC: A Robust Data-centric Anti-Cheat Framework for Fair Online Competitive Gaming

Minsu Kim, Junwoo Park, Chanho Lee, Gibum Seo, Steven Euijong Whang, Hyuck Lee

Data-centric AI KDD 2026
FlashSAC: Fast and Stable Off-Policy Reinforcement Learning for High-Dimensional Robot Control

Donghu Kim, Youngdo Lee, Minho Park, Kinam Kim, I Made Aswin Nahendra, Takuma Seno, Sehee Min, Daniel Palenicek, Florian Vogt, Danica Kragic, Jan Peters, Jaegul Choo, Hojoon Lee

Reinforcement Learning RSS 2026
Identifiable Token Correspondence for World Models

Youngin Kim, Ray Sun, Inho Kim, Bumsoo Park, Hyun Oh Song

Language Model ICML 2026
Convex Distance Operator Transport: Convex and Geometry-Preserving Formulation

Junhyoung Chung, Euijong Song, Won Hwa Kim, Gunwoong Park

Theoretical ICML 2026
How to Correctly Report LLM-as-a-Judge Evaluations

Chungpa Lee, Thomas Zeng, Jongwon Jeong, Jy-yong Sohn, Kangwook Lee

Language Model ICML 2026
ReJump: A Tree-Jump Representation for Analyzing and Improving LLM Reasoning

Yuchen Zeng, Shuibai Zhang, Wonjun Kang, Shutong Wu, Lynnix Zou, Ying Fan, Heeju Kim, Ziqian Lin, Jungtaek Kim, Hyung Il Koo, Dimitris Papailiopoulos, Kangwook Lee

Language Model ICML 2026
Fine-Tuning Without Forgetting In-Context Learning: A Theoretical Analysis of Linear Attention Models

Chungpa Lee, Jy-yong Sohn, Kangwook Lee

Theoretical ICML 2026
Mitigating Perceptual Judgment Bias in Multimodal LLM-as-a-Judge via Perceptual Perturbation and Reward Modeling

Seojeong Park*, Jiho Choi*, Junyong Kang, Seonho Lee, Jaeyo Shin, Hyunjung Shim

Language Model ICML 2026
Understanding the Performance Gap in Preference Learning: A Dichotomy of RLHF and DPO

Ruizhe Shi*, Minhak Song*, Runlong Zhou, Zihan Zhang, Maryam Fazel, Simon S. Du

Language Model ICML 2026
Coverage Improvement and Fast Convergence of On-policy Preference Learning

Juno Kim, Jihun Yun, Jason D. Lee, Kwang-Sung Jun

Language Model ICML 2026