I’m a Ph.D. candidate at the Machine Learning and Intelligence Lab (MLILab) at KAIST, advised by Prof. Eunho Yang.

Research Interests

My primary research interest lies in improving the computational and memory efficiency of training and inference in foundation models, with a particular focus on large language models (LLMs). Recently, I proposed a relaxed speculative decoding method that accelerates autoregressive generation in multimodal models. Currently, I am investigating whether speculative decoding can accelerate LLM inference in large-batch settings, challenging the prevailing assumption that large-batch inference is inherently compute-bound and therefore cannot benefit from speculative approaches (a minimal sketch of the basic procedure follows below). Moving forward, I plan to expand beyond speculative decoding to techniques such as parallel decoding, KV cache compression, and quantization, and to their algorithmic integration for further efficiency gains.

On the optimization front, my recent work includes zeroth-order optimization methods for memory-efficient fine-tuning of large-scale models. I also intend to develop efficient algorithms for accelerating reinforcement learning pipelines and effective fine-tuning methods tailored specifically to quantized LLMs.
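For readers unfamiliar with the technique, here is a minimal sketch of the basic greedy speculative decoding loop that this line of work builds on. It is an illustration, not my proposed method: `draft_next` and `target_argmax` are hypothetical stand-ins for the draft- and target-model calls, and real implementations verify against full token distributions rather than argmax matches.

```python
# Minimal greedy speculative decoding sketch. `draft_next` and
# `target_argmax` are hypothetical stand-ins for real model calls.
def speculative_step(prefix, draft_next, target_argmax, k=4):
    # Draft phase: the cheap model proposes k tokens autoregressively.
    proposed, ctx = [], list(prefix)
    for _ in range(k):
        tok = draft_next(ctx)
        proposed.append(tok)
        ctx.append(tok)

    # Verification phase: one target-model forward pass scores all k
    # draft positions in parallel -- this batching is the speedup.
    verified = target_argmax(prefix, proposed)

    # Accept draft tokens while they match the target; on the first
    # mismatch, keep the target's token instead and stop.
    out = list(prefix)
    for d, t in zip(proposed, verified):
        if d == t:
            out.append(d)
        else:
            out.append(t)
            break
    return out

# Toy usage: a draft that counts up and a target that only agrees on
# even tokens (purely illustrative).
draft = lambda ctx: len(ctx)
target = lambda prefix, props: [p if p % 2 == 0 else -1 for p in props]
print(speculative_step([0, 1], draft, target))  # -> [0, 1, 2, -1]
```

In the large-batch regime, the open question my current work addresses is whether this verification pass still has spare compute to absorb the extra draft tokens.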

Background and Insights

In the earlier stages of my research, I focused on theoretical aspects of generalization and optimization dynamics in deep neural networks. Specifically, I conducted sharpness-based analyses of loss landscapes, provided theoretical insights into generalization, and proposed improved sharpness-aware minimization (SAM) algorithms (the vanilla SAM update is sketched below). These insights continue to inform my current work, providing a principled basis for designing effective and efficient methods for both training and inference in modern foundation models.
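As a concrete reference point, the sketch below shows one step of the vanilla SAM update, not the improved variants from my papers: one ascent step to an approximate worst-case perturbation, then one descent step using the gradient taken there. `loss_grad` is a hypothetical callback returning dL/dw.

```python
import numpy as np

# One vanilla SAM step: perturb weights toward the (approximate)
# worst-case direction within an L2 ball of radius rho, then apply
# the gradient taken at the perturbed point to the original weights.
# `loss_grad` is a hypothetical callback returning dL/dw at given w.
def sam_step(w, loss_grad, lr=0.5, rho=0.05):
    g = loss_grad(w)
    eps = rho * g / (np.linalg.norm(g) + 1e-12)  # ascent: eps = rho * g / ||g||
    return w - lr * loss_grad(w + eps)           # descent using gradient at w + eps

# Toy usage on L(w) = w^2 / 2, whose gradient is w.
w = np.array([1.0])
for _ in range(25):
    w = sam_step(w, loss_grad=lambda w: w)
print(w)  # hovers within O(rho) of the minimum at 0
```

The intuition is that minima reached this way sit in flatter regions of the loss landscape, which is the connection back to the sharpness-based generalization analyses above.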

Publications (Last Updated: Jul 2025)

Education

Experiences

  • Research Intern, Computer Architecture and Systems Lab, KAIST, Daejeon, Aug. 2019 - Dec. 2019
    • Advisor: Prof. Jaehyuk Huh
    • Low-level security techniques for Intel SGX and secure containers with KVSSD
  • Research Intern, Collaborative Distributed Systems and Networking Lab, KAIST, Daejeon, Jan. 2018 - Oct. 2018
    • Advisor: Prof. Dongman Lee
    • Signal data processing for IoT task recognition and a framework for task segmentation
  • Exchange Student, University of California, Santa Cruz, Santa Cruz, CA, Jun. 2019 - Aug. 2019
    • Coursework in software engineering and computer game fundamentals

Projects

  • Sub-task generation-based point/regional out-of-distribution detection
    Samsung Electronics, Mar. 2021 - Sep. 2025

  • Predicting graph properties with few labels using Graph Neural Networks
    Samsung Electronics, Mar. 2021 - Sep. 2025

  • A Study on Statistically and Computationally Efficient Parameter Structures for Machine Learning Algorithms
    National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT), Mar. 2021 - Dec. 2022

  • A Study on Optimization and Network Interpretation Method for Large-Scale Machine Learning
    National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT), Mar. 2023 - Feb. 2027

  • A Study on Conversational Large Language Models for Virtual Physicians in Patient Intake
    AITRICS, Apr. 2024 - May 2024

  • Efficient Foundation Models on Intel Systems
    Intel Corporation & NAVER, Sep. 2024 - Aug. 2025