Building Apache Druid on Kubernetes: How Dailymotion Serves Partner Data

May 01, 2023 32 min Free

Description

At Dailymotion, Apache Druid serves partner data, including views and monetization metrics. Druid, a distributed, column-oriented database optimized for time-series data, is well suited to OLAP workloads. The presentation details the team's experience deploying Druid on Kubernetes with the druid-operator, an approach that eases lifecycle management, rolling upgrades, and scaling, and allows integration with CNCF tools such as cert-manager, external-dns, and ingress-nginx. The team also addressed challenges around availability, metadata, and evolvability. The talk covers Dailymotion's past and present architectures, caching optimizations with Memcached and Caffeine, and a zero-downtime migration process. It also discusses integration with GitOps workflows, common operations such as scaling and monitoring, and future plans including ingestion improvements, multi-tenancy, and running Druid without Apache ZooKeeper.