Colocate Hadoop YARN with Kubernetes to Save Massive Costs on Big Data

May 01, 2023 43 min Free

Description

This presentation details how Shopee colocated Hadoop YARN with Kubernetes to significantly reduce big data infrastructure costs. The talk explores the challenges of low resource utilization on Kubernetes and the complexities of colocating online services with offline jobs. It then covers the custom extensions to the Linux kernel, container runtime, Kubernetes scheduler, and kubelet that improved resource utilization while keeping online services stable, and explains how restrictions on offline job scheduling were overcome to achieve substantial cost savings.