About Me
I’m a Ph.D. candidate at the Machine Learning and Intelligence Lab (MLILab) at KAIST, advised by Prof. Eunho Yang.
Efficient Training and Inference for Foundation Models
My research focuses on enhancing the efficiency of foundation models, including large language models and mixed-modal architectures that combine text and images. I aim to improve operational efficiency by optimizing model size, minimizing computational overhead, and accelerating training and inference using techniques such as forward-only optimization, low-precision methods, and speculative decoding. The goal is to develop scalable frameworks that support the broad application of foundation models across diverse modalities.
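As a concrete illustration of one technique mentioned above, here is a minimal, self-contained sketch of the standard speculative decoding acceptance rule: accept a draft token with probability min(1, p_target/p_draft), otherwise resample from the residual distribution. The toy categorical distributions and the `speculative_step` helper are illustrative assumptions, not the LANTERN implementation.

```python
# Minimal sketch of the standard speculative decoding acceptance test.
# Toy categorical distributions stand in for real draft/target models.
import numpy as np

rng = np.random.default_rng(0)

def speculative_step(p_draft, p_target, token):
    """Accept a draft token with prob min(1, p_target/p_draft);
    on rejection, resample from the residual distribution."""
    if rng.random() < min(1.0, p_target[token] / p_draft[token]):
        return token, True
    residual = np.maximum(p_target - p_draft, 0.0)
    residual /= residual.sum()
    return rng.choice(len(p_target), p=residual), False

# Toy vocabulary of size 4: the draft model proposes a token, the
# target model's distribution decides whether to keep it.
p_draft = np.array([0.5, 0.2, 0.2, 0.1])
p_target = np.array([0.3, 0.3, 0.3, 0.1])
draft_token = rng.choice(4, p=p_draft)
token, accepted = speculative_step(p_draft, p_target, draft_token)
print(f"draft={draft_token}, emitted={token}, accepted={accepted}")
```

In practice the draft distribution comes from a cheap model and the target distribution from the large model being accelerated; relaxed variants such as LANTERN loosen this acceptance condition to trade exact distribution matching for higher throughput.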
Generalization and Optimization in Deep Learning
This line of research focuses on understanding the generalization and optimization mechanisms of deep learning models, particularly through the study of loss landscapes and the development of learning algorithms grounded in theoretical insights. I have conducted extensive research on the scale-invariance of sharpness in loss landscapes, a widely recognized proxy for the generalization capability of deep learning models. I have also investigated the effectiveness of sharpness-aware minimization in escaping local optima and converging within asymmetric valleys. Recently, my work has expanded to apply these foundational insights to modern settings, including fine-tuning compressed models, zeroth-order optimization, and low-precision training for foundation models.
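Since zeroth-order optimization recurs throughout this work, here is a minimal sketch of the two-point (SPSA-style) gradient estimate that underlies forward-only methods such as MeZO. The quadratic objective, step sizes, and dimension are illustrative assumptions, not a reproduction of any of the methods above.

```python
# Minimal sketch of a two-point zeroth-order gradient estimate:
# only forward evaluations of the loss are needed, no backprop.
import numpy as np

rng = np.random.default_rng(0)

def loss(theta):
    return float(np.sum(theta ** 2))  # toy quadratic objective

def zo_grad(theta, eps=1e-3):
    """Estimate the gradient from two forward passes along a random
    Gaussian direction z: g ~ (L(theta+eps*z) - L(theta-eps*z)) / (2*eps) * z."""
    z = rng.standard_normal(theta.shape)
    g_scalar = (loss(theta + eps * z) - loss(theta - eps * z)) / (2 * eps)
    return g_scalar * z

theta = rng.standard_normal(8)
for _ in range(200):
    theta -= 0.05 * zo_grad(theta)  # plain ZO-SGD step
print(f"final loss: {loss(theta):.6f}")
```

The appeal for large models is memory: the perturbation direction can be regenerated from a random seed, so optimization needs little more than inference-time memory.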
Published Papers (Last Updated: Mar. 27, 2025)
- LANTERN++: Enhancing Relaxed Speculative Decoding with Static Tree Drafting for Visual Auto-regressive Models
  Sihwan Park†, Doohyuk Jang†, Sung-Yub Kim, Souvik Kundu, Eunho Yang (†: Equal Contribution)
  ICLR 2025 Workshop on Scalable Optimization for Efficient and Adaptive Foundation Models (Oral Presentation)
- LANTERN: Accelerating Visual Autoregressive Models with Relaxed Speculative Decoding
  Doohyuk Jang†, Sihwan Park†, June Yong Yang, Yeonsung Jung, Jihun Yun, Souvik Kundu, Sung-Yub Kim, Eunho Yang (†: Equal Contribution)
  ICLR 2025
- Scale-invariant Bayesian Neural Networks with Connectivity Tangent Kernel
  Sung-Yub Kim, Sihwan Park, Kyungsu Kim, Eunho Yang
  ICLR 2023 (Spotlight)
Preprints
- Bias Decay Matters: Improving Large Batch Optimization with Connectivity Sharpness
  Sung-Yub Kim, Sihwan Park, Yong-Deok Kim, Eunho Yang (2022)
- MeZO-A^3dam: Memory-efficient Zeroth-order Adam with Adaptivity Adjustments for Fine-tuning LLMs
  Sihwan Park†, Jihun Yun†, Sung-Yub Kim, June Yong Yang, Yeonsung Jung, Souvik Kundu, Kyungsu Kim, Eunho Yang (†: Equal Contribution) (2024)
- Unraveling Zeroth-Order Optimization Through the Lens of Low-Dimensional Structured Perturbations
  Sihwan Park†, Jihun Yun†, Sung-Yub Kim, Souvik Kundu, Eunho Yang (†: Equal Contribution) (2025)
Education
- Ph.D. in Graduate School of AI, Korea Advanced Institute of Science and Technology (KAIST)
  Sep. 2022 - Present
- M.S. in Graduate School of AI, Korea Advanced Institute of Science and Technology (KAIST)
  Sep. 2020 - Aug. 2022
- B.S. in Computer Science, Korea Advanced Institute of Science and Technology (KAIST)
  Mar. 2015 - Aug. 2020
- B.S. in Mathematical Science, Korea Advanced Institute of Science and Technology (KAIST)
  Mar. 2015 - Aug. 2020
Experiences
- Research Intern, Computer Architecture and Systems Lab, KAIST, Daejeon, Aug. 2019 - Dec. 2019
  - Advisor: Prof. Jaehyuk Huh
  - Low-level security techniques for Intel SGX and secure containers with KVSSD
- Research Intern, Collaborative Distributed Systems and Networking Lab, KAIST, Daejeon, Jan. 2018 - Oct. 2018
  - Advisor: Prof. Dongman Lee
  - Signal data processing for IoT task recognition and a framework for task segmentation
- Exchange Student, University of California, Santa Cruz, CA, Jun. 2019 - Aug. 2019
  - Software engineering and computer game basics
Projects
- Sub-task generation based point/regional Out-Of-Distribution detection
  Samsung Electronics, Mar. 2021 - Sep. 2025
- Predicting graph properties with few labels using Graph Neural Networks
  Samsung Electronics, Mar. 2021 - Sep. 2025
- A Study on Statistically and Computationally Efficient Parameter Structures for Machine Learning Algorithms
  National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT), Mar. 2021 - Dec. 2022
- A Study on Optimization and Network Interpretation Method for Large-Scale Machine Learning
  National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT), Mar. 2023 - Feb. 2027
- A Study on Conversational Large Language Models for Virtual Physicians in Patient Intake
  AITRICS, Apr. 2024 - May 2024
- Efficient Foundation Models on Intel Systems
  Intel Corporation & NAVER, Sep. 2024 - Aug. 2027