Deploying machine learning (ML) models, or simply putting ML models into production, is fundamentally about bridging the gap between the research environment and live systems. A successful deployment makes our models available so that they can be easily accessed by internal and external systems, depending on business requirements. Once our ML models are deployed, other systems can send them input data and receive predictions in return. Only through effective model deployment can we maximise the business value of the models we build.
When we think about data science, we tend to think about how to build machine learning models: which algorithm will be most predictive, how to engineer our features, and which variables to use to make the models more accurate. However, the “last mile”, planning how the models will be used in production, is often neglected despite its critical importance. Machine learning systems have all the usual challenges of software development, combined with additional data science-specific challenges, which means that deployments and system architecture require careful planning. Many individuals and organisations realise this only when it is too late.
In this talk, we will discuss the steps and challenges involved in putting a machine learning model into production. We will cover setting up an effective machine learning pipeline for feature engineering, feature selection and model building. We will describe the architecture of the research and production environments and how they can be connected. We will highlight the challenges of obtaining reproducible models across the two environments and how to ensure reproducibility. Finally, we will present a machine learning pipeline solution that tackles these problems.
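To make the idea of a reproducible pipeline concrete, here is a minimal sketch using only the Python standard library. It is not the solution presented in the talk. Every name in it (`Standardise`, `SelectTopK`, `NearestCentroid`, `save`, `load`, `make_data`) is hypothetical, the data is synthetic, and the JSON artifact is just one assumed way of moving learned parameters from research to production.

```python
import json
import random
from statistics import mean, pstdev


class Standardise:
    """Feature engineering: scale each column using statistics learned in fit()."""

    def fit(self, X, y):
        cols = list(zip(*X))
        self.means = [mean(c) for c in cols]
        self.stds = [pstdev(c) or 1.0 for c in cols]  # guard against constant columns
        return self

    def transform(self, X):
        return [[(v - m) / s for v, m, s in zip(row, self.means, self.stds)] for row in X]


class SelectTopK:
    """Feature selection: keep the k columns whose class means differ the most."""

    def __init__(self, k=2):
        self.k = k

    def fit(self, X, y):
        pos = [r for r, t in zip(X, y) if t == 1]
        neg = [r for r, t in zip(X, y) if t == 0]
        gaps = [abs(mean(p) - mean(n)) for p, n in zip(zip(*pos), zip(*neg))]
        ranked = sorted(range(len(gaps)), key=lambda j: gaps[j], reverse=True)
        self.keep = sorted(ranked[: self.k])
        return self

    def transform(self, X):
        return [[row[j] for j in self.keep] for row in X]


class NearestCentroid:
    """Model: predict the label of the closest class centroid."""

    def fit(self, X, y):
        self.centroids = [
            [label, [mean(c) for c in zip(*[r for r, t in zip(X, y) if t == label])]]
            for label in sorted(set(y))
        ]
        return self

    def predict(self, X):
        def sq_dist(a, b):
            return sum((u - v) ** 2 for u, v in zip(a, b))
        return [min(self.centroids, key=lambda lc: sq_dist(row, lc[1]))[0] for row in X]


def new_pipeline():
    # The same step definitions are used in research and in production.
    return [Standardise(), SelectTopK(k=2), NearestCentroid()]


def fit(steps, X, y):
    for step in steps[:-1]:
        X = step.fit(X, y).transform(X)
    steps[-1].fit(X, y)
    return steps


def predict(steps, X):
    for step in steps[:-1]:
        X = step.transform(X)
    return steps[-1].predict(X)


def save(steps):
    # Export every learned parameter as a JSON string.
    return json.dumps([vars(s) for s in steps])


def load(text):
    steps = new_pipeline()
    for step, params in zip(steps, json.loads(text)):
        vars(step).update(params)
    return steps


def make_data(seed, n=200):
    # Synthetic data: columns 0 and 2 carry signal, column 1 is noise.
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        label = rng.randint(0, 1)
        X.append([rng.gauss(2.0 * label, 1.0), rng.gauss(0.0, 1.0), rng.gauss(-1.5 * label, 1.0)])
        y.append(label)
    return X, y


# "Research": fit on seeded data and export the learned parameters.
X_train, y_train = make_data(seed=0)
research = fit(new_pipeline(), X_train, y_train)
artifact = save(research)

# "Production": rebuild the pipeline from the artifact and score new inputs.
production = load(artifact)
X_new, _ = make_data(seed=1, n=20)
same = predict(research, X_new) == predict(production, X_new)
```

The point of the sketch is that the feature engineering, feature selection and model steps are fitted once, travel together as a single artifact, and are applied in the same order in both environments. Under these assumptions, `same` should be true, because the production pipeline reuses exactly the parameters learned in research rather than recomputing them.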