Constructing a Multi-DC Network
As services are deployed across regions, distributed multi-DC architectures are becoming increasingly popular. Service subsystems are deployed across DCs so that the DCs support DR and load sharing and serve users in different locations from the nearest site, improving user experience. This chapter focuses on Huawei's multi-DC solution, which comprises the multi-site and multi-PoD solutions. The multi-site solution applies to long-distance multi-DC construction and to service collaboration between networks managed by multiple controllers. The multi-PoD solution is designed mainly for building multiple DCs within a short distance that are managed in a unified manner by one set of controllers.
The preceding sections cover the design and technical principles of the physical network (underlay network) and logical network (overlay network) on a single DCN. This section covers service scenarios and customer requirements in the multi-DC scenario.
MULTI-DC SERVICE REQUIREMENTS AND SCENARIOS
With the development of the Internet, cloud computing, and big data, virtualization and resource pooling have become mainstream requirements. Resources across regions and DCs need to be integrated to form a unified resource pool. In addition, service systems are deployed across multiple DCs in distributed mode to form a multi-active model, so services can be provided close to users, improving user experience. As a result, the distributed multi-DC solution has become the mainstream solution. Currently, multi-DC service scenarios include cross-DC service deployment, geo-redundancy, network-level DR, and distributed cloudification:
1. Cross-DC service deployment
Typically, an application is implemented by multiple, or even hundreds of, subsystems. Because the scale of a single DC is limited, one DC cannot accommodate all subsystems. Therefore, different subsystems of an application are deployed in different DCs, and the application is deployed across multiple DCs. Additionally, different subsystems provide different functions. Some subsystems need to be deployed in multiple DCs in distributed mode, and some need to be deployed in centralized mode. As such, the entire service system is deployed across DCs. For example, in the following scenario, the web, app, and database subsystems are deployed in DC1, DC2, and DC3, respectively. The web subsystem invokes the app subsystem, and the app subsystem invokes the database subsystem. Different subsystems need to communicate with each other at Layers 2 and 3 across DCs to ensure that applications operate normally. Figure 7.1 shows the cross-DC service deployment scenario.
In this case, the network needs to provide interworking capabilities between DCs to ensure smooth interaction at the service layer.
FIGURE 7.1 Cross-DC service deployment.
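The interworking requirement in this example can be derived mechanically from where each subsystem is placed and which subsystem invokes which. The following Python snippet is a minimal sketch of that reasoning, not part of the Huawei solution. The names `placement`, `calls`, and `required_dc_links` are hypothetical and introduced only for illustration, and the placement and call chain are taken from the example above.

```python
# Subsystem placement from the example: web in DC1, app in DC2, database in DC3.
placement = {"web": "DC1", "app": "DC2", "database": "DC3"}

# Invocation chain from the example: web invokes app, app invokes database.
calls = {"web": ["app"], "app": ["database"]}


def required_dc_links(placement, calls):
    """Return the set of DC pairs that must be able to reach each other.

    A pair is required whenever a caller and its callee sit in different DCs.
    Each pair is stored in sorted order so (DC1, DC2) and (DC2, DC1) collapse.
    """
    links = set()
    for caller, callees in calls.items():
        for callee in callees:
            src, dst = placement[caller], placement[callee]
            if src != dst:
                links.add(tuple(sorted((src, dst))))
    return links


print(sorted(required_dc_links(placement, calls)))
# prints [('DC1', 'DC2'), ('DC2', 'DC3')]
```

In this sketch, DC1 and DC3 need no direct interworking because the web subsystem never invokes the database subsystem directly. A real deployment may still interconnect them for other reasons, such as management or DR traffic.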
2. Geo-redundancy
In most cases, a geo-redundancy solution adds a remote DR DC that synchronizes data with two active-active (primary) DCs in the same city.
Two active-active DCs in the same city: Identical service systems are deployed in two DCs in the same city. Traffic is routed to application servers in different DCs through load balancing. Both service systems run at the same time and provide services for users, doubling service capabilities and performing DR in real time. If the service system in one DC fails, the service system in the other DC continues to provide services, greatly improving the continuity and reliability of services. In this situation, users are unaware of faults. Subsystems in different DCs need to communicate with each other at Layer 2 and Layer 3, and security policies of the same subsystem must be consistent. The DCs provide the same services in active-active mode.
The remote DR DC is the backup of the two active-active DCs in the same city and is used to back up their data, configurations, and services. If both active-active DCs fail, for example due to a natural disaster, the remote DR DC can quickly recover data and applications to ensure normal service operation and reduce loss.
In Figure 7.2, multiple VPCs are deployed in primary DCs A and B to carry the same services. In addition, subsystems in primary DCs A and B communicate with each other at Layer 2 and Layer 3, and the two DCs provide the same services externally. In this case, the network needs to meet requirements of DCI. The DR DC needs to synchronize its status with the primary DCs in real time. Therefore, the network needs to satisfy requirements of interconnection between the DR DC and primary DCs.
FIGURE 7.2 Geo-redundancy scenario.
3. Network-level DR
A large number of applications provide services using cluster software. Multiple servers on a network are associated by the cluster software and appear as a logical server to the rest of the network to provide consistent services. In a cluster, multiple servers operate in load balancing mode to improve the overall service processing capability of the cluster. In addition, these servers support each other to improve reliability in the system. When servers in a cluster are deployed in different DCs, if a DC is faulty, servers in other DCs in the cluster can provide services to perform DR on application systems across DCs.
The cluster software of most vendors requires Layer 2 interconnection between servers; therefore, deployment of server clusters across DCs requires the network to provide large Layer 2 network capabilities across DCs. In addition, the cluster uses a virtual IP address (VIP) to provide external services, and the VIP is externally advertised through the frontend network of the DC. For this reason, the network needs to provide cross-DC gateways for the VIP of the cluster. These cross-DC gateways can work in active/standby or active-active mode. In active/standby mode, the active and standby gateways advertise active and standby routes, respectively. Normally, north-south traffic is transmitted through the active gateway of the active DC based on the active route. However, if the active DC is faulty, services are switched to the standby route and traffic is forwarded through the standby gateway in the backup DC. In active-active mode, the gateways advertise ECMP routes externally, and north-south traffic is load-balanced across the two DCs based on these routes. If a DC is faulty, traffic is switched to the gateway of the other DC. Firewalls must protect the north-south traffic of the cluster and can be deployed in active/standby or active-active mode, as shown in Figure 7.3.
FIGURE 7.3 Network-level DR scenario.
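The difference between the two gateway modes can be illustrated with a toy route-selection model. The Python sketch below is illustrative only and does not represent how any particular device or routing protocol implements this. It assumes each gateway advertises the VIP with a numeric preference (lower is better), that equal preferences form an ECMP group, and that flows are spread over the group by hashing a flow identifier. The function `pick_gateway`, the gateway names, and the preference values are all hypothetical.

```python
import hashlib


def pick_gateway(routes, flow, healthy):
    """Choose the gateway that carries a north-south flow toward the VIP.

    routes:  list of (gateway, preference) tuples; lower preference wins.
    flow:    string identifying the flow, e.g. a 5-tuple rendered as text.
    healthy: set of gateways whose DC is currently reachable.
    Returns the chosen gateway name, or None if no gateway is usable.
    """
    usable = [(gw, pref) for gw, pref in routes if gw in healthy]
    if not usable:
        return None
    best = min(pref for _, pref in usable)
    # Routes sharing the best preference form the ECMP group.
    group = sorted(gw for gw, pref in usable if pref == best)
    # Hash the flow so that packets of one flow always take the same path.
    digest = hashlib.sha256(flow.encode()).digest()
    return group[int.from_bytes(digest[:4], "big") % len(group)]


# Active/standby: the standby route is less preferred (assumed values).
active_standby = [("gw-dc1", 10), ("gw-dc2", 20)]
# Active-active: both routes are equally preferred, forming an ECMP group.
active_active = [("gw-dc1", 10), ("gw-dc2", 10)]

both_up = {"gw-dc1", "gw-dc2"}
dc1_down = {"gw-dc2"}

print(pick_gateway(active_standby, "10.0.0.1:5000->VIP:443", both_up))   # gw-dc1
print(pick_gateway(active_standby, "10.0.0.1:5000->VIP:443", dc1_down))  # gw-dc2
print(pick_gateway(active_active, "10.0.0.1:5000->VIP:443", dc1_down))   # gw-dc2
```

With both DCs healthy, the active-active route set spreads different flows over both gateways, whereas the active/standby set sends every flow through `gw-dc1` until that DC fails.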
4. Distributed cloudification architecture
Distributed cloudification involves services that are deployed in multiple DCs in a distributed manner. Each DC can bear traffic in real time and provide services. Multiple DCs are interconnected through the DCI backbone network to form a unified resource pool in which data is synchronized in real time, and services can be directly switched if any site fails. Multiple sites work in active-active mode, and edge DCs provide services close to users, ensuring low latency and good user experience.
The central DC is the main source of data and pushes the website content to edge DCs through the backbone network. Subsequently, the edge DCs send the content to users. During this process, multiple DCs need to communicate with each other at both Layer 2 and Layer 3, as shown in Figure 7.4.
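The idea of serving users from a nearby site and switching to another site on failure can be sketched as follows. This Python snippet is a simplified illustration under stated assumptions: proximity is approximated by a measured latency per site, and any available site can serve the content because the central DC has already pushed it. The function `select_site`, the site names, and the latency figures are hypothetical.

```python
def select_site(latency_ms, available):
    """Return the lowest-latency site that is currently available.

    latency_ms: mapping of site name to measured latency for one user.
    available:  set of sites that are currently in service.
    Returns None if no listed site is available.
    """
    reachable = {site: ms for site, ms in latency_ms.items() if site in available}
    if not reachable:
        return None
    return min(reachable, key=reachable.get)


# Assumed latencies from one user to each site, in milliseconds.
latency_ms = {"edge-east": 8, "edge-west": 35, "central": 60}

print(select_site(latency_ms, {"edge-east", "edge-west", "central"}))  # edge-east
print(select_site(latency_ms, {"edge-west", "central"}))               # edge-west
```

In practice, site selection is usually handled by mechanisms such as DNS-based global load balancing or routing policy rather than by application code. The sketch only shows the selection logic.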