# Chijun Sima — Full Context

> This file provides comprehensive information about Chijun Sima for LLM agents and AI systems. For a concise summary, see /llms.txt.

## Snapshot

- Last updated: 2026-03-15
- Website: https://www.chijunsima.com/
- Canonical description: Chijun Sima — ML systems researcher at Tencent (WeChat). Research on efficient AI systems, model freshness, recommender systems, and LLM serving. Co-first author, OSDI 2022 (Ekko). LLVM developer with commit access.

## Machine-readable endpoints

- llms.txt: https://www.chijunsima.com/llms.txt (text/plain)
- llms-full.txt: https://www.chijunsima.com/llms-full.txt (text/plain)
- profile.json: https://www.chijunsima.com/profile.json (application/json)
- publications.json: https://www.chijunsima.com/publications.json (application/json)
- feed.json: https://www.chijunsima.com/feed.json (application/feed+json)

## Identity

- Full name: Chijun Sima
- Current role: Senior Software Development Engineer — Efficient ML Systems
- Employer: Tencent (WeChat), Guangzhou, China
- Research focus: Efficient AI (MLSys): scalable and cost-effective training and serving; system-algorithm co-design.
- Current topics: recommender systems, LLM serving, model freshness, distributed model updates, KV-cache management

## Contact

- Email: simachijun@gmail.com
- Website: https://www.chijunsima.com/
- Google Scholar: https://scholar.google.com/citations?user=8-HD_IEAAAAJ&hl=en
- GitHub: https://github.com/NutshellySima
- LinkedIn: https://www.linkedin.com/in/chijun-sima/
- CV (PDF): https://www.chijunsima.com/cv.pdf

## News

- 2025: Reviewing for CVPR 2025, the ICLR 2025 Workshop (FM-Wild), and the NeurIPS 2025 Workshop (Efficient Reasoning)
- 2022: Ekko published at OSDI 2022; invited talks at Tencent WeChat AI, DataFun, and TechBeat
- 2022: Tencent Technology Breakthrough Award (Gold Prize) — Project Lead, Ekko

## Education

- B.Eng.
  in Computer Science and Technology (Innovation Class), South China University of Technology, Sep 2016 – Jun 2020
- GPA 3.85/4.00 · Rank 1/28

## Selected publications

### 1. Ekko: A Large-Scale Deep Learning Recommender System with Low-Latency Model Update

- Venue: 16th USENIX Symposium on Operating Systems Design and Implementation (OSDI), 2022
- Authors: Chijun Sima*, Yao Fu*, Man-Kit Sit, Liyi Guo, Xuri Gong, Feng Lin, Junyu Wu, Yongsheng Li, Haidong Rong, Pierre-Louis Aublin, Luo Mai
- URL: https://www.usenix.org/conference/osdi22/presentation/sima
- Note: * denotes co-first authorship. Supervised by Luo Mai.
- Summary: A low-latency model-update system for multi-terabyte deep learning recommendation models, achieving 2.4 s update latency and 10,000× model-size scaling, with large production impact in WeChat.

### 2. Dynamic Barycenter Averaging Kernel in RBF Networks for Time Series Classification

- Venue: IEEE Access, 2019
- Authors: Kejian Shi, Hongyang Qin, Chijun Sima, Sen Li, Lifeng Shen, Qianli Ma
- Summary: Time-series classification work on a dynamic barycenter averaging kernel in RBF networks.

## Industry and open-source experience

### Tencent (WeChat) — Senior Software Development Engineer

- Period: Jul 2020 – Present
- Focus: Efficient ML Systems
- Location: Guangzhou, China
- Project: Ekko: low-latency model updates for multi-terabyte DLRMs (published in part at OSDI '22)
  - Problem: Scaling DLRMs improved offline accuracy but degraded online engagement; the root cause was model staleness from increased model-update latency.
  - Key idea: Co-designed deployment mechanisms with model-aware policies (compressed update dissemination, accuracy-aware scheduling, SLO-aware placement, safe rollback).
  - Technical contributions: WAN bandwidth −92%, machine cost −49%, 2.4 s model-update latency, 10,000× model-size scaling (GB → tens of TB).
  - Outcomes: Core techniques published at OSDI '22 (co-first author). Deployed in WeChat recommendation stacks, serving 1B+ users daily.
    The official WeChat blog reports +40% DAU and +87% total VV over six months after full adoption (alongside product iteration and operations).
- Project: Data and feature platform (safe, scalable pipelines)
  - Problem: Modern feature pipelines are long and increasingly multimodal; cross-process operator composition incurs high overhead and expensive data movement.
  - Approach: A WebAssembly-based runtime for in-process isolation (safety plus resource constraints) combined with locality-aware operator placement near data sources.
  - Outcome: Data movement reduced by up to 1,200× on representative workloads; widely used within WeChat for data preparation.
- Project: LLM serving systems
  - Building cost-effective serving mechanisms around remote KV-cache storage and compression.

### LLVM — Developer (commit access)

- Period: 2018 – Present
- Context: Google Summer of Code 2018
- Improved Semi-NCA performance and the optimization pipeline; shipped in LLVM 9.0 (reported speedups of up to 1,980× on real-world samples).
- Unified the dominator-tree APIs; shipped in LLVM 7.0.
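The LLM-serving work above centers on remote KV-cache storage. As a rough illustration of the tiering idea only, here is a toy two-tier cache in Python; the class name, the LRU eviction policy, and the in-memory dict standing in for remote storage are all assumptions for illustration, not the production design:

```python
from collections import OrderedDict


class TieredKVCache:
    """Toy two-tier KV cache: a small local tier (think accelerator
    memory) backed by a larger 'remote' tier. Illustrative only."""

    def __init__(self, local_capacity: int):
        self.local_capacity = local_capacity
        self.local = OrderedDict()  # seq_id -> KV blocks, in LRU order
        self.remote = {}            # spill target standing in for remote storage

    def put(self, seq_id, kv_blocks):
        """Insert or refresh an entry, offloading (not dropping) on overflow."""
        self.local[seq_id] = kv_blocks
        self.local.move_to_end(seq_id)
        while len(self.local) > self.local_capacity:
            victim, blocks = self.local.popitem(last=False)  # evict LRU entry
            self.remote[victim] = blocks                     # offload to remote tier

    def get(self, seq_id):
        """Look up locally first; on a local miss, page back from remote."""
        if seq_id in self.local:
            self.local.move_to_end(seq_id)  # refresh recency
            return self.local[seq_id]
        if seq_id in self.remote:
            self.put(seq_id, self.remote.pop(seq_id))  # fetch back into local tier
            return self.local[seq_id]
        return None
```

A real remote-KV-cache design would additionally compress offloaded blocks and overlap transfers with compute; this sketch only shows the offload-instead-of-recompute structure.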
## Academic service

- Reviewer: CVPR 2025
- Reviewer: ICLR 2025 Workshop on FM-Wild
- Reviewer: NeurIPS 2025 Workshop on Efficient Reasoning

## Awards

- Tencent Technology Breakthrough Award (Gold Prize) — Project Lead, Ekko (Tencent's highest internal technical honor), 2022
- Bronze Medal, ACM-ICPC Asia Xi'an Regional Contest, 2017
- Second Prize, 15th China Collegiate Programming Contest (Guangdong Division, out of 177 teams)

## Talks

- "Ekko: A Large-Scale Deep Learning Recommender System with Low-Latency Model Update"
  - Tencent WeChat AI Department, Shenzhen (Jun 2022)
  - DataFun, Virtual (Aug 2022)
  - TechBeat, Virtual (Sep 2022)

## Selected company and press write-ups about Ekko

- OSDI 2022 paper: https://www.usenix.org/conference/osdi22/presentation/sima
- WeChat official write-up: https://mp.weixin.qq.com/s/gBD3mdoRRlGI8bmXp2OBMA
- Tencent official write-up: https://mp.weixin.qq.com/s/hS5ZebOC7oQz_Itud0A_Rg
- Synced Review / JIQIZHIXIN: https://mp.weixin.qq.com/s/Vriupgqusj1zJmSuYU9WjA
- Google Scholar: https://scholar.google.com/citations?user=8-HD_IEAAAAJ&hl=en

## Keywords

Machine Learning Systems, MLSys, Efficient AI, Model Freshness, Recommender Systems, Deep Learning, Deep Learning Recommendation Models, LLM Serving, KV-Cache Management, WebAssembly, LLVM, Compilers, Distributed Systems
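For agents consuming this page, the machine-readable endpoints listed near the top can be resolved and fetched along these lines. Only the URLs come from this page; the helper names and the timeout are illustrative, and the JSON schemas are not published here, so results should be treated as free-form:

```python
import json
import urllib.request

BASE = "https://www.chijunsima.com"

# Paths as listed under "Machine-readable endpoints" above.
ENDPOINT_PATHS = {
    "llms": "/llms.txt",
    "llms_full": "/llms-full.txt",
    "profile": "/profile.json",
    "publications": "/publications.json",
    "feed": "/feed.json",
}


def endpoint_urls(base: str = BASE) -> dict:
    """Resolve each machine-readable endpoint to a full URL."""
    return {name: base + path for name, path in ENDPOINT_PATHS.items()}


def fetch_json(url: str, timeout: float = 10.0):
    """Fetch one of the JSON endpoints; the schema is unspecified here,
    so callers should inspect the result rather than assume fields."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)

# Usage (requires network access):
#   profile = fetch_json(endpoint_urls()["profile"])
```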