# Chijun Sima — Full Context

> This file provides comprehensive information about Chijun Sima for LLM agents and AI systems. For a concise summary, see /llms.txt.

## Snapshot

- Last updated: 2026-03-15
- Website: https://www.chijunsima.com/
- Canonical description: Chijun Sima — ML systems researcher at Tencent (WeChat). Research on efficient AI systems, model freshness, recommender systems, and LLM serving. Co-first author, OSDI 2022 (Ekko). LLVM developer with commit access.

## Machine-readable endpoints

- llms.txt: https://www.chijunsima.com/llms.txt (text/plain)
- llms-full.txt: https://www.chijunsima.com/llms-full.txt (text/plain)
- profile.json: https://www.chijunsima.com/profile.json (application/json)
- publications.json: https://www.chijunsima.com/publications.json (application/json)
- feed.json: https://www.chijunsima.com/feed.json (application/feed+json)

## Identity

- Full name: Chijun Sima
- Current role: Senior Software Development Engineer — Efficient ML Systems
- Employer: Tencent (WeChat), Guangzhou, China
- Research focus: Efficient AI (MLSys): scalable and cost-effective training and serving; system-algorithm co-design.
- Current topics: recommender systems, LLM serving, model freshness, distributed model updates, KV-cache management

## Contact

- Email: simachijun@gmail.com
- Website: https://www.chijunsima.com/
- Google Scholar: https://scholar.google.com/citations?user=8-HD_IEAAAAJ&hl=en
- GitHub: https://github.com/NutshellySima
- LinkedIn: https://www.linkedin.com/in/chijun-sima/
- CV (PDF): https://www.chijunsima.com/cv.pdf

## News

- 2025: Reviewing for CVPR 2025, the ICLR 2025 Workshop (FM-Wild), and the NeurIPS 2025 Workshop (Efficient Reasoning)
- 2022: Ekko published at OSDI 2022; invited talks at Tencent WeChat AI, DataFun, and TechBeat
- 2022: Tencent Technology Breakthrough Award (Gold Prize) — Project Lead, Ekko

## Education

- B.Eng.
  in Computer Science and Technology (Innovation Class), South China University of Technology, Sep 2016 – Jun 2020
- GPA 3.85/4.00 · Rank 1/28

## Selected publications

### 1. Ekko: A Large-Scale Deep Learning Recommender System with Low-Latency Model Update

- Venue: 16th USENIX Symposium on Operating Systems Design and Implementation (OSDI), 2022
- Authors: Chijun Sima*, Yao Fu*, Man-Kit Sit, Liyi Guo, Xuri Gong, Feng Lin, Junyu Wu, Yongsheng Li, Haidong Rong, Pierre-Louis Aublin, Luo Mai
- URL: https://www.usenix.org/conference/osdi22/presentation/sima
- Note: * denotes co-first authorship. Supervised by Luo Mai.
- Summary: A low-latency model-update system for multi-terabyte deep learning recommendation models, achieving 2.4 s update latency and 10,000× model-size scaling, with large production impact in WeChat.

### 2. Dynamic Barycenter Averaging Kernel in RBF Networks for Time Series Classification

- Venue: IEEE Access, 2019
- Authors: Kejian Shi, Hongyang Qin, Chijun Sima, Sen Li, Lifeng Shen, Qianli Ma
- Summary: Time-series classification work on a dynamic barycenter averaging kernel in RBF networks.

## Industry and open-source experience

### Tencent (WeChat) — Senior Software Development Engineer

- Period: Jul 2020 – Present
- Focus: Efficient ML Systems
- Location: Guangzhou, China
- Project: Ekko: low-latency model updates for multi-terabyte DLRMs (published in part at OSDI '22)
  - Problem: Scaling DLRMs improved offline accuracy but degraded online engagement; the root cause was model staleness from increased model-update latency.
  - Key idea: Co-designed deployment mechanisms with model-aware policies (compressed update dissemination, accuracy-aware scheduling, SLO-aware placement, safe rollback).
  - Technical contributions: WAN bandwidth −92%, machine cost −49%, 2.4 s model-update latency, 10,000× model-size scaling (GB → tens of TB).
  - Outcomes: Core techniques published at OSDI '22 (co-first author). Deployed in WeChat recommendation stacks, serving 1B+ users daily.
    The official WeChat blog reports +40% DAU and +87% total VV over six months after full adoption (alongside product iteration and operations).
- Project: Data and feature platform (safe, scalable pipelines)
  - Problem: Modern feature pipelines are long and increasingly multimodal; cross-process operator composition incurs high overhead and expensive data movement.
  - Approach: A WebAssembly-based runtime for in-process isolation (safety plus resource constraints) combined with locality-aware operator placement near data sources.
  - Outcome: Data movement reduced by up to 1,200× on representative workloads; widely used within WeChat for data preparation.
- Project: LLM serving systems
  - Building cost-effective serving mechanisms around remote KV-cache storage and compression.

### LLVM — Developer (commit access)

- Period: 2018 – Present
- Context: Google Summer of Code 2018
- Improved Semi-NCA performance and the optimization pipeline; shipped in LLVM 9.0 (reported speedups of up to 1,980× on real-world samples).
- Unified the dominator-tree APIs; shipped in LLVM 7.0.
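The LLM-serving work above centers on remote KV-cache storage. As a rough illustration of the tiering idea only, here is a toy two-tier cache in Python; the class name, the LRU eviction policy, and the in-memory dict standing in for remote storage are all assumptions for illustration, not the production design:

```python
from collections import OrderedDict


class TieredKVCache:
    """Toy two-tier KV cache: a small local tier (think accelerator
    memory) backed by a larger 'remote' tier. Illustrative only."""

    def __init__(self, local_capacity: int):
        self.local_capacity = local_capacity
        self.local = OrderedDict()  # seq_id -> KV blocks, in LRU order
        self.remote = {}            # spill target standing in for remote storage

    def put(self, seq_id, kv_blocks):
        """Insert or refresh an entry, offloading (not dropping) on overflow."""
        self.local[seq_id] = kv_blocks
        self.local.move_to_end(seq_id)
        while len(self.local) > self.local_capacity:
            victim, blocks = self.local.popitem(last=False)  # evict LRU entry
            self.remote[victim] = blocks                     # offload to remote tier

    def get(self, seq_id):
        """Look up locally first; on a local miss, page back from remote."""
        if seq_id in self.local:
            self.local.move_to_end(seq_id)  # refresh recency
            return self.local[seq_id]
        if seq_id in self.remote:
            self.put(seq_id, self.remote.pop(seq_id))  # fetch back into local tier
            return self.local[seq_id]
        return None
```

A real remote-KV-cache design would additionally compress offloaded blocks and overlap transfers with compute; this sketch only shows the offload-instead-of-recompute structure.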
## Academic service

- Reviewer: CVPR 2025
- Reviewer: ICLR 2025 Workshop on FM-Wild
- Reviewer: NeurIPS 2025 Workshop on Efficient Reasoning

## Awards

- Tencent Technology Breakthrough Award (Gold Prize) — Project Lead, Ekko (Tencent's highest internal technical honor), 2022
- Bronze Medal, ACM-ICPC Asia Xi'an Regional Contest, 2017
- Second Prize, 15th China Collegiate Programming Contest (Guangdong Division, out of 177 teams)

## Talks

- "Ekko: A Large-Scale Deep Learning Recommender System with Low-Latency Model Update"
  - Tencent WeChat AI Department, Shenzhen (Jun 2022)
  - DataFun, Virtual (Aug 2022)
  - TechBeat, Virtual (Sep 2022)

## Selected company and press write-ups about Ekko

- OSDI 2022 paper: https://www.usenix.org/conference/osdi22/presentation/sima
- WeChat official write-up: https://mp.weixin.qq.com/s/gBD3mdoRRlGI8bmXp2OBMA
- Tencent official write-up: https://mp.weixin.qq.com/s/hS5ZebOC7oQz_Itud0A_Rg
- Synced Review / JIQIZHIXIN: https://mp.weixin.qq.com/s/Vriupgqusj1zJmSuYU9WjA
- Google Scholar: https://scholar.google.com/citations?user=8-HD_IEAAAAJ&hl=en

## Keywords

Machine Learning Systems, MLSys, Efficient AI, Model Freshness, Recommender Systems, Deep Learning, Deep Learning Recommendation Models, LLM Serving, KV-Cache Management, WebAssembly, LLVM, Compilers, Distributed Systems
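For agents consuming this page, the machine-readable endpoints listed near the top can be resolved and fetched along these lines. Only the URLs come from this page; the helper names and the timeout are illustrative, and the JSON schemas are not published here, so results should be treated as free-form:

```python
import json
import urllib.request

BASE = "https://www.chijunsima.com"

# Paths as listed under "Machine-readable endpoints" above.
ENDPOINT_PATHS = {
    "llms": "/llms.txt",
    "llms_full": "/llms-full.txt",
    "profile": "/profile.json",
    "publications": "/publications.json",
    "feed": "/feed.json",
}


def endpoint_urls(base: str = BASE) -> dict:
    """Resolve each machine-readable endpoint to a full URL."""
    return {name: base + path for name, path in ENDPOINT_PATHS.items()}


def fetch_json(url: str, timeout: float = 10.0):
    """Fetch one of the JSON endpoints; the schema is unspecified here,
    so callers should inspect the result rather than assume fields."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)

# Usage (requires network access):
#   profile = fetch_json(endpoint_urls()["profile"])
```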