Streamlining AI Deployments

December 09, 2024 · 8 min

Description

Vasilis Vagias, Head of AI Solutions at CentML, discusses how to streamline deploying and optimizing large language models (LLMs) and other AI/ML solutions in production. The talk highlights the CentML platform, which applies optimizations at every level from the hardware up to the serving layer and includes features such as automated pipeline parallelism and speculative decoding. It also touches on the app.ml.com platform, free credits for users, serverless model offerings, and integration options for on-premise deployments.
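
The description names speculative decoding without explaining it, so below is a minimal sketch of the general draft-and-verify idea, not CentML's implementation: a cheap draft model proposes several tokens, the large target model verifies them in one pass, and tokens are kept or resampled so the output distribution matches the target model alone. The speedup comes from the target model checking several tokens per forward pass instead of generating one at a time. All names here (draft_model, target_model, speculative_step, VOCAB) and the toy distributions are hypothetical stand-ins.

import numpy as np

VOCAB = 8                       # toy vocabulary size
rng = np.random.default_rng(0)  # sampling randomness

def _toy_dist(prefix, temperature, salt):
    # Deterministic pseudo-random distribution over the next token,
    # standing in for a language model's softmax output at this prefix.
    seed = hash((tuple(prefix), salt)) % (2**32)
    logits = np.random.default_rng(seed).normal(size=VOCAB) / temperature
    probs = np.exp(logits - logits.max())
    return probs / probs.sum()

def draft_model(prefix):
    # Small, cheap draft model (hypothetical stand-in).
    return _toy_dist(prefix, temperature=1.5, salt="draft")

def target_model(prefix):
    # Large target model whose output distribution we want to preserve.
    return _toy_dist(prefix, temperature=1.0, salt="target")

def speculative_step(prefix, k=4):
    # One draft-and-verify round of speculative decoding.
    # 1) The draft model proposes k tokens autoregressively.
    draft_tokens, draft_probs, ctx = [], [], list(prefix)
    for _ in range(k):
        q = draft_model(ctx)
        t = int(rng.choice(VOCAB, p=q))
        draft_tokens.append(t)
        draft_probs.append(q)
        ctx.append(t)

    # 2) The target model scores every proposed position. In a real
    #    serving stack this is a single batched forward pass, which is
    #    where the latency win comes from; here it is called per position.
    accepted, ctx = [], list(prefix)
    for t, q in zip(draft_tokens, draft_probs):
        p = target_model(ctx)
        # 3) Accept the draft token with probability min(1, p/q); on the
        #    first rejection, resample from the residual max(0, p - q).
        if rng.random() < min(1.0, p[t] / q[t]):
            accepted.append(t)
        else:
            residual = np.maximum(p - q, 0.0)
            total = residual.sum()
            corrected = residual / total if total > 0 else p
            accepted.append(int(rng.choice(VOCAB, p=corrected)))
            ctx.append(accepted[-1])
            break
        ctx.append(t)
    else:
        # All k draft tokens accepted: take one bonus token from the target.
        accepted.append(int(rng.choice(VOCAB, p=target_model(ctx))))
    return accepted

print(speculative_step([1, 2, 3]))  # prints a short list of accepted token ids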