Disaster Recovery: Bringing Back Production from Scratch in Under 1 Hour Using KOps, ArgoCD and Velero

May 01, 2023 | 37 min | Free

Description

This talk recounts a real incident in which a misconfiguration took down a production Kubernetes cluster and forced a complete rebuild. The presenter explains how prior investment in GitOps, ArgoCD, kOps, and infrastructure-as-code made it possible to bring production back online from scratch in under an hour. The talk covers the challenges faced, the lessons learned, the tools that did not perform as expected, and strategies for improving disaster recovery plans and practices.