About Me
I’m a Ph.D. candidate at the Machine Learning and Intelligence Lab (MLILab) at KAIST, advised by Prof. Eunho Yang.
Generalization and Optimization in Deep Learning
My research centers on a fundamental understanding of the generalization and optimization of deep learning models through the lens of the loss landscape, and on developing learning algorithms grounded in such theoretical insights. I have studied the scale-invariance of sharpness in the loss landscape, which is widely regarded as a proxy for the generalization ability of deep models, and analyzed how sharpness-aware minimization escapes local optima and converges in asymmetric valleys. Recently, I have been extending these foundational insights to contemporary settings such as LoRA fine-tuning of compressed large language models and mixture-of-experts models.
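For concreteness, here is a minimal NumPy sketch of the two-step sharpness-aware minimization (SAM) update of Foret et al. (2021); the toy objective, hyperparameters, and function names are illustrative assumptions, not code from the papers listed below.

```python
import numpy as np

def sam_step(w, grad_fn, lr=0.1, rho=0.05):
    """One sharpness-aware minimization (SAM) step.

    First ascend to the (approximately) sharpest point within an L2
    ball of radius rho around w, then update w with the gradient taken
    at that perturbed point, biasing training toward flat minima.
    """
    g = grad_fn(w)
    eps = rho * g / (np.linalg.norm(g) + 1e-12)  # worst-case perturbation
    return w - lr * grad_fn(w + eps)             # descend using the sharp-point gradient

# Toy usage: f(w) = w^4 - w^3 has an asymmetric valley around w = 0.75.
grad = lambda w: 4 * w**3 - 3 * w**2
w = np.array([1.5])
for _ in range(200):
    w = sam_step(w, grad)
print(w)  # hovers near the minimizer w = 0.75
```

The key design choice is the inner ascent step: descending with the gradient evaluated at the locally sharpest nearby point steers the trajectory away from sharp minima, which is the mechanism my work on escape efficiency and asymmetric valleys analyzes.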
Large Language Models
My research on large language models follows two main threads. 1) Lightweighting: reducing the model size or computational overhead of large language models to improve their inference efficiency. I am currently developing methods for knowledge distillation, sub-4-bit post-training quantization, and speculative decoding; a minimal quantization sketch follows below. 2) Fine-tuning: refining the fine-tuning process for quantized models. The goal is a quantization algorithm for large language models that adapts dynamically during fine-tuning, moving beyond the limitations of conventional data-free or generic corpus-based quantization.
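To make "sub-4-bit post-training quantization" concrete, below is a minimal round-to-nearest baseline in NumPy. It is a generic illustration under an assumed per-row scaling scheme, not the method under development, whose point is precisely to improve on this kind of naive rounding.

```python
import numpy as np

def quantize_rtn(w, bits=3):
    """Round-to-nearest (RTN) uniform post-training quantization.

    Each row gets one floating-point scale so that a signed sub-4-bit
    integer grid covers its dynamic range; weights are snapped to the
    grid and dequantized. Error-correcting methods start from here.
    """
    qmax = 2 ** (bits - 1) - 1                          # e.g., 3 for signed 3-bit
    scale = np.abs(w).max(axis=1, keepdims=True) / qmax + 1e-12
    q = np.clip(np.round(w / scale), -qmax - 1, qmax)   # integer codes, e.g., [-4, 3]
    return q * scale                                    # dequantized approximation

W = np.random.randn(4, 8).astype(np.float32)
W_hat = quantize_rtn(W, bits=3)
print("mean abs error:", float(np.abs(W - W_hat).mean()))
```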
Published Papers
- Scale-invariant Bayesian Neural Networks with Connectivity Tangent Kernel
  Sung-Yub Kim, Sihwan Park, Kyungsu Kim, Eunho Yang
  ICLR, 2023 (Spotlight)
- On the Understanding of Sharpness-aware Minimization and its Application: A Perspective on Escape Efficiency and Asymmetric Valley
  Sihwan Park (advised by Eunho Yang)
  Master’s Thesis, KAIST, 2022
Preprints
- Bias Decay Matters: Improving Large Batch Optimization with Connectivity Sharpness
  Sung-Yub Kim, Sihwan Park, Yong-Deok Kim, Eunho Yang
Experiences
- Research Intern, Computer Architecture and Systems Lab, KAIST, Daejeon, Aug. 2019 - Dec. 2019
- Advisor: Prof. Jaehyuk Huh
- Low-level security techniques for Intel SGX and secure containers with KVSSD
- Research Intern, Collaborative Distributed Systems and Networking Lab, KAIST, Daejeon, Jan. 2018 - Oct. 2018
- Advisor: Prof. Dongman Lee
- Signal data processing for IoT task recognition and a framework for task segmentation
- Exchange Student, University of California, Santa Cruz, Santa Cruz, CA, Jun. 2019 - Aug. 2019
- Software engineering and computer game basics
Education
- Ph.D. in Graduate School of AI, Korea Advanced Institute of Science and Technology (KAIST), Sep. 2022 - Present
- M.S. in Graduate School of AI, Korea Advanced Institute of Science and Technology (KAIST), Sep. 2020 - Aug. 2022
- B.S. in Computer Science, Korea Advanced Institute of Science and Technology (KAIST), Mar. 2015 - Aug. 2020
- B.S. in Mathematical Science, Korea Advanced Institute of Science and Technology (KAIST), Mar. 2015 - Aug. 2020
Projects
- Sub-task generation based point/regional Out-of-Distribution detection
  Samsung Electronics, Mar. 2021 - Sep. 2025
- Predicting graph properties with few labels using Graph Neural Networks
  Samsung Electronics, Mar. 2021 - Sep. 2025
- A Study on Statistically and Computationally Efficient Parameter Structures for Machine Learning Algorithms
  National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT), Mar. 2021 - Dec. 2022
- A Study on Optimization and Network Interpretation Method for Large-Scale Machine Learning
  National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT), Mar. 2023 - Feb. 2027
- A Study on Conversational Large Language Models for Virtual Physicians in Patient Intake
  AITRICS, Apr. 2024 - May 2024