The main goal of this article is to list and describe system patterns for designing machine learning systems in a production environment. Design patterns that help in the development of machine learning models that achieve certain accuracy metrics are not a priority, although for some of the patterns listed, such usecases will still be specified.
Minimum requirements
All of the machine learning system patterns listed here are intended for deployment in a public cloud environment or on a Kubernetes cluster. For the purpose of this review, we have tried to abstract as much as possible from specific programming languages or platforms, although since Python is the most significant language for machine learning technology, most of the patterns can be implemented using Python.
Patterns
Service systems design patterns
Service systems design patterns are a set of off-the-shelf design solutions that can be used to organize production workflows that involve machine learning models.
Web single pattern
Using
When you need to quickly release a predictor in the simplest architecture.
The architecture
The Web single pattern offers an architecture where all the artifacts and code for the predictor model are enclosed in a single web server. Because in this pattern the REST (or GRPC) interface, preprocessing, and trained model are all in one place (on the server), you can create and deploy them as a simple predictor.
If you need to deploy multiple replicas at once, you will have to implement a load balancer or proxy server to access them. If you use GRPC as your interface, you should seriously consider implementing a client-side load balancer or layer-7 load balancer.
To build your model into a web server, you can use the model-in-image pattern or the model-loading pattern.
Diagram
Pros of
Ability to use a single programming language, such as Python, for web server, preprocessing and logical output.
- Easy to manage because of its simplicity.
- It is easier to troubleshoot.
- Minimally time-consuming to refine the model.
- As a starting production architecture, it is usually recommended to deploy on a single web server with synchronous processing.
Disadvantages
- Since all components are encapsulated in a single server or docker image, even a small patch would require updating the entire image.
- Upgrades will also require a service deployment, which in larger organizations requires doing a serious SDLC.
What you need to think about
- Procedures for upgrades and maintenance for each component.
- Scaling involves changes to web server management.