Low-latency Model Inference in Finance: A Close Look at Seldon V2

May 15, 2024 31 min Free

Description

Vincent David and Michael Meredith from Capital One discuss the critical role of model inference in the fintech sector, particularly for applications such as credit decisions and fraud detection. They highlight the growing demand for high resilience and low latency, which has driven the adoption of service-oriented architectures. The talk takes an in-depth look at Seldon, an open-source model serving framework built on Kubernetes, and its evolution from V1 to V2. They cover Seldon's capabilities for constructing Directed Acyclic Graphs (DAGs), routing traffic, and simplifying the deployment of real-time models. The presentation also touches on Capital One's internal "MaaS" (Model as a Service) platform, which abstracts Seldon for end users and adheres to strict enterprise controls and policies. Finally, they discuss the challenges and considerations of adopting Seldon V2, including its alpha status, potential refactoring needs, and the complexity that Kafka introduces for DAG orchestration.