Best Practices of Cloud DCN Deployment

This chapter describes how best to deploy the cloud DCN solution based on Huawei’s extensive deployment experience. To begin, design partitions for the DCN and carefully plan the physical network, SDN controller, and cloud platform servers for each partition. It is then possible to proceed with deployment, which includes basic network pre-configuration, controller installation, controller interconnection commissioning, and service provisioning.

DEPLOYMENT PLAN

Overall Plan

The deployment plan for the new cloud DCN differs from that of the legacy DCN in the following aspects:

• The SDN controller deployment plan is required for cloud DCN deployment. As the SDN controller is essentially software running on an OS, it is necessary to plan the server parameters and the working network planes of the OS and SDN controller. Different network planes require different network connections. For example, the southbound network plane of the controller must connect to network devices such as switches and firewalls, for which dedicated IP addresses and routes should be planned. The northbound network plane of the controller is used for administrator web logins and for interconnection with systems such as the cloud platform and VMM, for which dedicated IP addresses and routes also need to be planned (a sketch of this address planning follows this list). Legacy DCNs involve neither the SDN controller nor its interconnection with other systems.

• A large DC is usually divided into several zones, and a cloud DC may be deployed in one or more zones. In this scenario, interconnections must be planned between multiple cloud DCs, and between cloud DCs and legacy DCs, none of which are required in a legacy DC plan.
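
As an illustration of the network plane planning above, the following Python sketch lays out southbound and northbound subnets for a small controller cluster and checks that the dedicated planes do not overlap. All subnets, the gateway convention, and the three-node cluster size are illustrative assumptions, not values mandated by the solution.

```python
import ipaddress

# Hypothetical address plan for the SDN controller's working network planes.
# All subnets and node counts below are illustrative assumptions.
PLANES = {
    # Southbound plane: controller <-> network devices (switches, firewalls)
    "southbound": ipaddress.ip_network("10.10.1.0/24"),
    # Northbound plane: admin web logins, cloud platform, VMM interconnection
    "northbound": ipaddress.ip_network("10.10.2.0/24"),
}

CONTROLLER_NODES = 3  # assumed cluster size


def allocate(plane: str, count: int):
    """Reserve one IP per controller node on the given plane,
    using the first usable address as the plane's gateway."""
    hosts = PLANES[plane].hosts()
    gateway = next(hosts)                       # e.g. 10.10.1.1 -> gateway
    nodes = [next(hosts) for _ in range(count)] # next addresses -> nodes
    return gateway, nodes


def check_no_overlap():
    """Dedicated planes must not share address space."""
    planes = list(PLANES.values())
    for i, a in enumerate(planes):
        for b in planes[i + 1:]:
            if a.overlaps(b):
                raise ValueError(f"planes {a} and {b} overlap")


if __name__ == "__main__":
    check_no_overlap()
    for plane in PLANES:
        gw, nodes = allocate(plane, CONTROLLER_NODES)
        print(f"{plane}: gateway {gw}, node IPs {[str(n) for n in nodes]}")
```

Keeping the plan in code makes it easy to re-run the overlap check whenever a plane is added or resized.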

Common User Requirements

Understanding user requirements for DCNs is critical when deploying a cloud DC based on SDN. Table 9.1 lists the common user requirements for deploying an SDN DC.

Network Zone Design

A typical large-scale DCN can be partitioned into multiple zones, each with specified functions. Each zone can then be further designed according to its functions. The following is a typical partition example.

As shown in Figure 9.1, the DC is partitioned into three zones: resource zone (production and non-production intranet zones), non-resource zone (production and non-production Internet access zones, production and non-production extranet access zones, and other network access zones), and O&M management zone.

In the resource zone, the production intranet zone carries core enterprise services, which are the most critical enterprise assets. The non-production intranet zone carries common office systems, non-critical services, and temporary services.

In the non-resource zone, the production extranet access zone is the egress zone that connects to the remote subnets of an enterprise. Devices such as firewalls, IPS/IDP, and egress routers are usually deployed in this zone for remote networking and security control.

In the non-resource zone, the production Internet access zone is the egress zone that connects to the Internet. Similar to the production extranet access zone, devices such as firewalls, IPS/IDP, and egress routers are generally deployed in this zone for remote networking and security control.
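
The zone layout described above lends itself to a simple machine-readable model, which can be useful for documentation or for validating device-to-zone assignments during deployment. The dictionary below mirrors the partitioning of Figure 9.1; the zone names come from the text, while the encoding itself is only one illustrative choice.

```python
# Zone hierarchy from Figure 9.1. Zone names follow the text;
# the data structure itself is only an illustrative encoding.
DC_ZONES = {
    "resource zone": [
        "production intranet zone",
        "non-production intranet zone",
    ],
    "non-resource zone": [
        "production Internet access zone",
        "non-production Internet access zone",
        "production extranet access zone",
        "non-production extranet access zone",
        "other network access zones",
    ],
    "O&M management zone": [],  # hosts the management systems themselves
}


def subzones(top_level: str) -> list:
    """Return the sub-zones of a top-level zone (empty if it has none)."""
    return DC_ZONES[top_level]


# Example: enumerate every zone for a deployment checklist.
for top, subs in DC_ZONES.items():
    print(top)
    for sub in subs:
        print(f"  - {sub}")
```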

TABLE 9.1 Common User Requirements for Functions

Requirement: High volumes of service data traffic

  • Traffic bursts may occur during data writing and replication by upper-layer applications; consequently, a large buffer is required
  • The network must provide an appropriate oversubscription ratio, typically 3:1 between server-facing and uplink bandwidth on TOR switches (a worked example follows this table)
  • The network must provide sufficient access bandwidth, for example, 10GE access and 40GE uplinks
  • Network latency must be less than 1 ms

Requirement: Network service deployment

  • The controller can interconnect with a mainstream cloud platform through a standard API, enabling the cloud platform to orchestrate and auto-provision services
  • The controller can automatically discover network devices and add them to the management zone
  • The underlay network can be deployed automatically
  • A fabric can be divided into logical networks, which are isolated by default and cannot interwork with each other
  • When the VM status changes, network service resources are adjusted on demand based on scheduling by the cloud platform
  • The controller provides GUIs for visualized network service orchestration and provisioning, and needs to support orchestration in different dimensions, such as overlay, tenant, and application
  • The controller provides GUIs to control the interconnection between the fabric and external networks
  • If a fabric has been divided into logical networks, the controller needs to collaborate with Layer 4 to Layer 7 devices, such as firewalls, LBs, and encryptors, by diverting traffic to them. In this way, the controller manages security for east-west traffic within or between tenants
  • A fabric can be partitioned into logical networks whose configurations are managed by multiple homogeneous or heterogeneous cloud platforms

Requirement: Basic network functions

  • EVPN-based VXLAN
  • Distributed gateway
  • IPv6

Requirement: Security

  • Application systems under different levels of protection must communicate with each other through firewalls
  • The production network is isolated from the non-production network through different fabrics
  • The intranet and DMZ are isolated by fabric
  • The controller supports hierarchical rights-based user management
  • The controller, spine nodes, and leaf nodes support authentication, authorization, and accounting (AAA)

Requirement: High availability (HA)

  • The network must offer a high degree of availability and prevent Layer 2 loops and their effects (such as broadcast storms and unicast flooding)
  • Broadcast storms are suppressed
  • Load is balanced among controller cluster members. The system automatically detects a faulty controller and fails over transparently, so the upper-layer service system is unaware of the fault. The controller cluster can be deployed across Layer 3 networks
  • When all controllers are faulty, the forwarding tables of spine and leaf nodes are not affected, and traffic continues to be forwarded normally

Requirement: High scalability

  • The network scales horizontally, maximizing the reuse efficiency of compute and storage resources in both physical and virtualized environments
  • The controller can manage at least 200 leaf nodes
  • The controller supports access and management of at least 50,000 VMs

Requirement: Maintainability

  • The controller automatically deploys software and policies onto all switches without manual intervention
  • The controller logs user operations
  • Various methods are available for fault locating and troubleshooting, enabling faulty devices, components, interfaces, and service communication paths in a fabric to be located quickly
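
To make the oversubscription requirement in Table 9.1 concrete, the sketch below computes the ratio of server-facing to uplink bandwidth for a hypothetical TOR switch with 48 x 10GE access ports and 4 x 40GE uplinks. The port counts are assumptions chosen to match the commonly cited 3:1 figure, not a mandated configuration.

```python
def oversubscription_ratio(downlinks: int, downlink_gbps: float,
                           uplinks: int, uplink_gbps: float) -> float:
    """Ratio of total server-facing bandwidth to total uplink bandwidth
    on a TOR switch; lower values mean less contention under load."""
    return (downlinks * downlink_gbps) / (uplinks * uplink_gbps)


# Hypothetical TOR: 48 x 10GE server-facing ports, 4 x 40GE uplinks.
ratio = oversubscription_ratio(48, 10, 4, 40)
print(f"oversubscription ratio = {ratio:.0f}:1")  # 480G / 160G -> 3:1
```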

In the O&M management zone, a dedicated O&M management system is deployed to manage and monitor the entire DCN and IT applications. The network management platform, SDN controller, cloud platform controller node, and VMM monitoring node are typically deployed in this zone.

 