Building On-Premises MLOps for ISS Columbus Ground Operations

May 01, 2023 33 min Free

Description

In collaboration with Airbus and two German universities, the speakers supported operations of the International Space Station’s (ISS) Columbus module by developing algorithms for anomaly detection, root cause analysis, and reconfiguration suggestions. Because the ISS’s sensor data streams are not allowed in public clouds for regulatory reasons, the team had to build a bespoke, integrated MLOps platform, deployed on-premises, on which to develop and run these custom algorithms. This talk discusses how AI engineers and data scientists with minimal prior Kubernetes knowledge became full-fledged cloud engineers and built a system around GitLab, MicroK8s, and Kubeflow that can be deployed fully automatically. The speakers cover bootstrapping and automating platform deployment, supporting multiple Linux distributions, managing and provisioning storage, exposing services, providing identity and access management (IAM), and handling TLS issues.