I am Qiaolin Yu, a graduate student at Cornell University and Cornell Tech. My interests are in machine learning infrastructure, database systems, and storage systems. I am currently a research intern at the Cornell Computer Systems Laboratory, advised by Udit Gupta, and I also work closely with Zhichao Cao. I have three papers under revision: I am the first author of one, a co-first author of another, and the third author of the third.

Alongside my academic work, I have gained industry experience through a series of internships. I will join Databricks as a software engineer intern in summer 2024. Previously, I interned on the compute architecture team at PingCAP, the Azure Machine Learning team at Microsoft, the ByteHouse team at ByteDance, and the graph database team at Kuaishou.

Interests

  • Machine Learning Systems
  • Database Systems
  • Storage Systems

Education

  • MS, Computer Science and Information Systems

    Cornell University (Cornell Tech), 2023.08 - 2025.05

  • BS, Computer Science

    University of Liverpool, 2019.09 - 2023.07

Industry Experience

PingCAP
SDE Intern, Cloud Database System
Aug 2022 – May 2023

Developed the auto-scaling feature for TiFlash.

Microsoft
SDE Intern, AI Infrastructure
May 2022 – Aug 2022

Built infrastructure for the Azure Machine Learning service.

ByteDance
SDE Intern, Cloud Native Data Warehouse
Jan 2022 – Apr 2022

Worked on ByteHouse (known as ByConity in the open-source community), a cloud-native data warehouse.

Kuaishou
SDE Intern, Graph Database System
Jun 2021 – Sep 2021

Designed and developed infrastructure for NebulaGraph clusters.

Research Experience

Computer Systems Laboratory, Cornell University
Research Intern, Machine Learning System
Nov 2023 – Present

Advised by Udit Gupta.

Optimizing data I/O performance in end-to-end recommendation systems.

Intelligent Data Infrastructure Lab, Arizona State University
Research Intern, Storage Engine
Dec 2021 – Present

Advised by Zhichao Cao.

Researching the performance of RocksDB and related database systems, such as MyRocks, NebulaGraph, and Kvrocks.