To be able to cope with high loads, microservices have to scale. Scalability means that a system can process more load when it gets more resources.
There are two different kinds of scalability as represented in Figure 7.16:
- Horizontal scalability: More resources are used, each of which processes part of the load; that is, the number of resources increases.
- Vertical scalability: More powerful resources are employed to handle a higher load. Here, an individual resource processes more load, while the number of resources stays constant.
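The difference between the two kinds of scaling can be sketched with a toy capacity model. This is only an illustration: the `Cluster` class, its field names, and the numbers are hypothetical, and the model assumes that capacity grows linearly with resources, which real systems rarely achieve exactly.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cluster:
    """Hypothetical toy model; assumes perfectly linear scaling."""
    instances: int              # number of resources
    capacity_per_instance: int  # load one resource can process (e.g. requests/s)

    @property
    def total_capacity(self) -> int:
        return self.instances * self.capacity_per_instance

    def scale_horizontally(self, extra_instances: int) -> "Cluster":
        # More resources of the same size: the number of resources increases.
        return Cluster(self.instances + extra_instances, self.capacity_per_instance)

    def scale_vertically(self, factor: int) -> "Cluster":
        # More powerful resources: the number of resources stays constant.
        return Cluster(self.instances, self.capacity_per_instance * factor)


base = Cluster(instances=2, capacity_per_instance=100)
wider = base.scale_horizontally(2)   # 4 instances, 100 each
taller = base.scale_vertically(2)    # 2 instances, 200 each
```

In this model both variants reach the same total capacity; they differ in whether the number of resources or the power of each resource changes.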
Horizontal scalability is often the better choice since the limit for the possible number of resources and therefore the limit for the scalability is very high. Besides, it is cheaper to buy more resources than more powerful ones. One fast computer is often more expensive than many slow ones.
Figure 7.16 Horizontal and Vertical Scaling
Scaling, Microservices, and Load Balancing
Microservices mostly employ horizontal scaling, where the load is distributed across several microservice instances via Load Balancing. For this to work, the microservices themselves have to be stateless. More precisely, they should not hold any state that is specific to an individual user, because then that user's requests could only be routed to the nodes that hold the respective state. The state for a user can instead be stored in a database or in an external storage (for example, an in-memory store) that all microservice instances can access.
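The following minimal sketch illustrates why keeping user state outside the instances lets a load balancer send each request to any instance. All names here (`SharedStore`, `CartService`, `RoundRobinBalancer`) are hypothetical, the shared store is a plain in-process dictionary standing in for a database or in-memory store, and round robin is just one possible balancing strategy chosen for illustration.

```python
import itertools


class SharedStore:
    """Stands in for external storage reachable by all instances (assumption)."""

    def __init__(self):
        self._data = {}

    def get(self, key, default=None):
        return self._data.get(key, default)

    def set(self, key, value):
        self._data[key] = value


class CartService:
    """A stateless instance: it keeps no user-specific data on itself."""

    def __init__(self, name, store):
        self.name = name
        self._store = store

    def add_item(self, user_id, item):
        # Read the user's state from the shared store, update it, write it back.
        cart = list(self._store.get(user_id, []))
        cart.append(item)
        self._store.set(user_id, cart)
        return self.name, cart


class RoundRobinBalancer:
    """Hands each request to the next instance in turn."""

    def __init__(self, instances):
        self._cycle = itertools.cycle(instances)

    def route(self, user_id, item):
        return next(self._cycle).add_item(user_id, item)


store = SharedStore()
balancer = RoundRobinBalancer(
    [CartService(f"instance-{i}", store) for i in range(3)]
)
first = balancer.route("alice", "book")   # handled by instance-0
second = balancer.route("alice", "pen")   # handled by instance-1
```

Although the two requests for the same user are handled by different instances, the second instance still sees the item added by the first, because the cart lives in the shared store rather than on an instance.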