About Me

I am a software engineer at Qualcomm AI Research. I received my Ph.D. from Seoul National University, where I worked in the Software Platform Lab (SPLab) under the guidance of Prof. Byung-Gon Chun. I also earned my B.S. in Computer Science and Engineering from Seoul National University. I love building systems, specifically systems for machine learning and deep learning. [cv]

Education

Seoul National University

Mar 2014 - Feb 2020
Ph.D. in Computer Science and Engineering

Seoul National University

Mar 2007 - Feb 2014
B.S. in Computer Science and Engineering

Work Experience

Staff Engineer

Dec 2021 - Present

Senior Engineer

Apr 2020 - Dec 2020
Qualcomm AI Research, Seoul, South Korea

Research Assistant, Software Platform Lab

Mar 2014 - Feb 2020
Seoul National University, Seoul, South Korea

I have studied and built large-scale data processing systems, recently focusing on building Machine Learning (ML) and Deep Learning (DL) inference systems. I have participated in the following projects:

  • Pretzel: A white-box ML inference system for model pipelines. Unlike previous black-box systems, Pretzel exploits knowledge about models obtained during the training phase. This knowledge enables Pretzel to optimize the end-to-end execution plan of each model. Pretzel also accounts for multiple models being served together and schedules CPU and memory resources more efficiently.
  • RnB (Replicate-and-Batch): A multi-GPU DL inference system for DNN pipelines. DNN pipelines often consist of sub-networks with heterogeneous performance characteristics (e.g., latency). To optimize throughput and latency, RnB replicates each sub-network across a different number of GPUs and batches (or splits) the inputs of each sub-network based on their sizes. RnB is built on PyTorch.
  • Splash: A DL system for collaborative inference in a distributed environment with embedded devices and a hub machine. Splash aims to run highly accurate (but heavy) DNN models on computationally weak devices with the assistance of the more powerful hub. Splash applies compression techniques such as Tucker decomposition to reduce inference latency. Splash currently supports TensorFlow and Caffe2.
  • ICCV 2019 AIA (AI Acceleration) Challenge: I participated in the AIA challenge, specifically in the DSP (Digital Signal Processor) track. I ran a DNN model on TFLite (with Android NNAPI) and accelerated the model by utilizing DSPs (e.g., NPU) with optimizations such as quantization.

Project Management Committee (PMC)

Sep 2014 - Present
Apache REEF

I am a member of the Apache REEF Project Management Committee. I served as the release manager for REEF v0.14.

Research Intern, Cloud Information Services Lab (CISL)

Jun 2017 - Sep 2017
Microsoft, Redmond, WA, USA

I investigated how to optimize ML inference systems with white-box approaches. This research was published at ICCD, the NIPS ML Systems workshop, SysML, and OSDI, and in the IEEE Data Engineering Bulletin.

Research Intern, Systems Research Group

Sep 2015 - Feb 2016
Microsoft Research Asia, Beijing, China

I participated in the Pado project, a data processing system for handling transient resources in data centers. This research was published in EuroSys.

Publications

  1. Automating System Configuration of Distributed Machine Learning
    Woo-Yeon Lee, Yunseong Lee, Joo Seong Jeong, Gyeong-In Yu, Joo Yeon Kim, Ho Jin Park, Beomyeol Jeon, Won Wook Song, Gunhee Kim, Markus Weimer, Brian Cho and Byung-Gon Chun
    IEEE ICDCS, Jul. 2019 [pdf] [bib]
  2. From the Edge to the Cloud: Model Serving in ML.NET
    Yunseong Lee, Alberto Scolari, Byung-Gon Chun, Markus Weimer, Matteo Interlandi
    IEEE Bulletin of the Technical Committee on Data Engineering, Dec. 2018 [pdf] [bib]
  3. PRETZEL: Opening the Black Box of Machine Learning Prediction Serving Systems
    Yunseong Lee, Alberto Scolari, Byung-Gon Chun, Marco Domenico Santambrogio, Markus Weimer, Matteo Interlandi
    OSDI, Oct. 2018 [pdf] [bib]
  4. Towards High-Performance Prediction Serving Systems
    Yunseong Lee, Alberto Scolari, Matteo Interlandi, Markus Weimer, Byung-Gon Chun
    NIPS Machine Learning Systems Workshop, Dec. 2017 [pdf] [bib]
  5. Towards Accelerating Generic Machine Learning Prediction Pipelines
    Alberto Scolari, Yunseong Lee, Markus Weimer, Matteo Interlandi
    ICCD, Nov. 2017 [pdf] [bib]
  6. Pado: A Data Processing Engine for Harnessing Transient Resources in Datacenters
    Youngseok Yang, Geon-Woo Kim, Won Wook Song, Yunseong Lee, Andrew Chung, Zhengping Qian, Brian Cho, Byung-Gon Chun
    EuroSys, Apr. 2017 [pdf] [bib]
  7. Apache REEF: Retainable Evaluator Execution Framework
    Byung-Gon Chun, Tyson Condie, Yingda Chen, Brian Cho, Andrew Chung, Carlo Curino, Chris Douglas, Matteo Interlandi, Beomyeol Jeon, Joo Seong Jeong, Gye-Won Lee, Yunseong Lee, Tony Majestro, Dahlia Malkhi, Sergiy Matusevych, Brandon Myers, Mariia Mykhailova, Shravan Narayanamurthy, Joseph Noor, Raghu Ramakrishnan, Sriram Rao, Russell Sears, Beysim Sezgin, Tae-Geon Um, Julia Wang, Markus Weimer, Youngseok Yang
    ACM Transactions on Computer Systems (TOCS) [pdf] [bib]
  8. Dolphin: Runtime Optimization for Distributed Machine Learning
    Byung-Gon Chun, Brian Cho, Beomyeol Jeon, Joo Seong Jeong, Gunhee Kim, Joo Yeon Kim, Woo-Yeon Lee, Yun Seong Lee, Markus Weimer, Gyeong-In Yu
    ICML ML Sys ’16 workshop, Jun. 2016 [pdf] [bib]
  9. REEF: Retainable Evaluator Execution Framework
    Markus Weimer, Yingda Chen, Byung-Gon Chun, Tyson Condie, Carlo Curino, Chris Douglas, Yunseong Lee, Tony Majestro, Dahlia Malkhi, Sergiy Matusevych, Brandon Myers, Shravan Narayanamurthy, Raghu Ramakrishnan, Sriram Rao, Russell Sears, Beysim Sezgin, Julia Wang
    ACM SIGMOD, Jun. 2015 [pdf] [bib]
  10. Elastic Memory: Bring Elasticity Back To In-Memory Big Data Analytics
    Joo Seong Jeong, Woo-Yeon Lee, Yunseong Lee, Youngseok Yang, Brian Cho, Byung-Gon Chun
    HotOS 2015, May 2015 [pdf] [bib]