See and Fix the Flaws: Enabling VLMs and Diffusion Models to Comprehend Visual Artifacts via Agentic Data Synthesis

Jaehyun Park, Minyoung Ahn, Minkyu Kim, Jonghyun Lee, Jae-Gil Lee, Dongmin Park

Multi-modal Learning CVPR 2026

VLM-SubtleBench: How Far Are VLMs from Human-Level Subtle Comparative Reasoning?

Minkyu Kim, Sangheon Lee, Dongmin Park

Multi-modal Learning ICLR 2026

KRETA: A Benchmark for Korean Reading and Reasoning in Text-Rich VQA Attuned to Diverse Visual Contexts

Taebaek Hwang, Minseo Kim, Gisang Lee, Seonuk Kim, Hyunjun Eun

Multi-modal Learning EMNLP 2025

Mini-Batch Optimization of Contrastive Loss

Jaewoong Cho, Kartik Sreenivasan, Keon Lee, Kyunghoo Mun, Soheun Yi, Jeong-Gwan Lee, Anna Lee, Jy-yong Sohn, Dimitris Papailiopoulos, Kangwook Lee

TMLR 2024

S-CLIP: Semi-supervised Vision-Language Pre-training using Few Specialist Captions

Sangwoo Mo, Minkyu Kim, Kyungmin Lee, Jinwoo Shin

Multi-modal Learning NeurIPS 2023
