Tech Talks

Streamlining AI Deployments

MLOps World - MLOps World & Generative AI World 2024

ai llm mlops deployment optimization inference compiler pytorch docker gpu api

The State and Future of Cloud-Native Model Serving

KubeCon + CloudNativeCon - KubeCon + CloudNativeCon Europe 2023

Dan Sun, Theofilos Papapanagiotou

mlops cloud-native kubernetes model-serving kserve cncf knative istio serverless scalability observability inference

Efficient Access to Shared GPU Resources: Mechanisms and Use Cases

KubeCon + CloudNativeCon - KubeCon + CloudNativeCon Europe 2023

Diogo Guerra, Diana Gaponcic

gpu-scheduling kubernetes nvidia-mig time-slicing resource-management high-energy-physics machine-learning inference ci-cd benchmarking