Arithmetic Transformers Can Length-Generalize in Both Operand Length and Count
👥 Hanseul Cho*, Jaeyoung Cha*, Srinadh Bhojanapalli*, and Chulhee Yun
👥 Hanseul Cho and Chulhee Yun
📰 ICLR 2023 [paper] [arxiv]
🏆 NAVER Outstanding Theory Paper Award at the 7th Joint Conference of the Korean Artificial Intelligence Association (JKAIA 2022)
👥 Jaewook Lee*, Hanseul Cho*, and Chulhee Yun
📰 ICML 2024 (Short version at ICLR 2024 Workshop on Bridging the Gap Between Practice and Theory in Deep Learning (BGPT)) [paper] [arxiv] [code]
🏆 Spotlight @ ICML 2024 (Top 3.5% of all submissions)
👥 Baekrok Shin*, Junsoo Oh*, Hanseul Cho, and Chulhee Yun
📰 NeurIPS 2024 (Short version at ICML 2024 Workshop on Advancing Neural Network Training (WANT)) [paper] [arxiv] [code]
👥 Hanseul Cho*, Jaeyoung Cha*, Pranjal Awasthi, Srinadh Bhojanapalli, Anupam Gupta, and Chulhee Yun
📰 NeurIPS 2024 (Short version at ICML 2024 Workshop on Long-Context Foundation Models (LCFM)) [paper] [arxiv] [code]
👥 Hyunji Jung*, Hanseul Cho*, and Chulhee Yun
📰 Under Review
🏆 Best Paper Award at the 11th Joint Conference of the Korean Artificial Intelligence Association (JKAIA 2024)
👥 Hojoon Lee*, Hanseul Cho*, Hyunseung Kim*, Daehoon Gwak, Joonkee Kim, Jaegul Choo, Se-Young Yun, and Chulhee Yun
📰 NeurIPS 2023 [paper] [arxiv] [code]
👥 Junghyun Lee*, Hanseul Cho*, Se-Young Yun, and Chulhee Yun
📰 NeurIPS 2023 [paper] [arxiv] [code]