Technical notes
Architecture details for Ekko, plus timeline context, service, talks, awards, and supporting references.
Scaling deep learning recommendation models (DLRMs) improved offline quality but hurt online performance once model updates became stale. Ekko addresses this with model-aware update dissemination, optimization-based shard placement, and safe rollback mechanisms.
Compressed update dissemination plus accuracy-aware scheduling using model and gradient signals.
WAN bandwidth reduced by 92%.
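As a rough illustration of the idea, the sketch below sends only parameters that changed and orders pending updates by a priority score under a bandwidth budget. It is a minimal sketch under stated assumptions, not Ekko's implementation: the names `PendingUpdate`, `sparse_delta`, and `schedule`, and the score `grad_norm * model_priority`, are hypothetical stand-ins for the "model and gradient signals" mentioned above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PendingUpdate:
    key: str               # parameter (e.g. embedding row) identifier
    grad_norm: float       # gradient signal: how much the parameter moved
    model_priority: float  # model signal: importance of the owning model
    size_bytes: int


def sparse_delta(old, new, eps=1e-6):
    """Keep only entries that are new or changed by more than eps (assumed rule)."""
    return {
        k: v
        for k, v in new.items()
        if k not in old or max(abs(a - b) for a, b in zip(v, old[k])) > eps
    }


def priority(update):
    # Hypothetical score; the source does not specify the actual formula.
    return update.grad_norm * update.model_priority


def schedule(updates, budget_bytes):
    """Greedily send the highest-priority updates that fit in the budget."""
    sent, deferred, used = [], [], 0
    for u in sorted(updates, key=priority, reverse=True):
        if used + u.size_bytes <= budget_bytes:
            sent.append(u)
            used += u.size_bytes
        else:
            deferred.append(u)
    return sent, deferred
```

Deferred updates would be retried in a later round; a real system would also need to bound how long a low-priority update can wait.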
Optimization-based shard management to co-locate models while protecting inference engine SLOs.
Machine cost reduced by 49%.
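To make co-location under an SLO constraint concrete, here is a small sketch that packs model shards onto machines while reserving capacity for inference headroom. It swaps in a greedy first-fit-decreasing heuristic rather than the optimization formulation referred to above, and `place_shards`, `reserved_for_slo`, and the integer load units are all hypothetical.

```python
def place_shards(shard_load, machine_capacity, reserved_for_slo):
    """Pack shards onto as few machines as the heuristic finds.

    shard_load: dict mapping shard name to load (integer units, assumed).
    reserved_for_slo: capacity held back on every machine so inference
    latency is not squeezed by co-located shards (assumed policy).
    """
    usable = machine_capacity - reserved_for_slo
    machines, loads = [], []
    # Largest shards first: the classic first-fit-decreasing order.
    for name, load in sorted(shard_load.items(), key=lambda kv: kv[1], reverse=True):
        if load > usable:
            raise ValueError(f"shard {name!r} exceeds usable capacity")
        for i, used in enumerate(loads):
            if used + load <= usable:
                machines[i].append(name)
                loads[i] += load
                break
        else:
            # No existing machine has room: open a new one.
            machines.append([name])
            loads.append(load)
    return machines
```

The reserved capacity is the knob that trades machine count against SLO risk: a larger reservation protects latency but needs more machines.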
Model-state manager to detect harmful updates and roll back in seconds.
Production safety with recovery within seconds.
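The rollback idea can be sketched as a manager that keeps the last known-healthy state and restores it when an online quality metric drops too far. This is a simplified illustration under assumptions, not the production design: `ModelStateManager`, the `max_drop` threshold, and the single scalar metric are hypothetical.

```python
class ModelStateManager:
    def __init__(self, state, baseline_metric, max_drop=0.05):
        self.healthy_state = dict(state)   # last state judged healthy
        self.healthy_metric = baseline_metric
        self.live_state = dict(state)      # state currently serving traffic
        self.max_drop = max_drop           # tolerated metric drop (assumed)

    def apply_update(self, delta):
        """Apply a parameter update to the live state only."""
        self.live_state = {**self.live_state, **delta}

    def observe(self, metric):
        """Check the live state's quality; return True if it was rolled back."""
        if metric < self.healthy_metric - self.max_drop:
            # Harmful update detected: restore the healthy snapshot.
            self.live_state = dict(self.healthy_state)
            return True
        # Live state looks fine: promote it to the new healthy snapshot.
        self.healthy_state = dict(self.live_state)
        self.healthy_metric = metric
        return False
```

Because the healthy snapshot is already held in memory, restoring it is a local copy rather than a reload from training, which is what makes recovery in seconds plausible in a design like this.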
Experience context from CV, including ongoing systems work in recommendation, data pipelines, and LLM serving.
Jul 2020 - Present
2018 - Present
Sep 2016 - Jun 2020
External references related to Ekko and publication records.