1. What are we solving here?

One of the new features of NSX-T 3.0 is Federation. NSX-T Federation is the idea of orchestrating objects across multiple NSX-T Manager clusters. It is the evolution of universal objects in NSX for vSphere, where objects are stretched between 2 data centers. These objects can be network objects as well as security objects.

Stretching these objects is not a new concept. Before NSX-T 3.0 there was the concept of Multi-Site, whereby a single NSX-T Manager cluster integrates with more than 1 compute manager (vCenter in this case). This design has limitations, especially in scalability and reliability.

Scalability

NSX-T Federation addresses some of the limitations of a multi-site deployment. One way it does so is by introducing the Remote TEP (RTEP). RTEP is a concept similar to the TEP proxy implemented in other overlay network solutions. With RTEP, we avoid creating a full mesh of tunnels between the hosts in different sites, which would limit the scalability of the overlay.
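To get a feel for the scalability argument, here is a rough back-of-the-envelope sketch in Python. It is a simplified model, not something taken from the NSX-T documentation: it assumes that without RTEP every host would tunnel directly to every host in the other sites, and that with RTEP only the edge nodes peer across sites. The function names are made up for this illustration.

```python
from typing import List


def full_mesh_tunnels(endpoints_per_site: List[int]) -> int:
    """Count cross-site tunnels when every endpoint peers with every
    endpoint in the other sites (same-site pairs are not counted)."""
    total = sum(endpoints_per_site)
    all_pairs = total * (total - 1) // 2
    same_site_pairs = sum(n * (n - 1) // 2 for n in endpoints_per_site)
    return all_pairs - same_site_pairs


def rtep_tunnels(edges_per_site: List[int]) -> int:
    """Count cross-site tunnels when only the edge nodes (RTEPs) peer
    across sites. Assumption: the edge nodes themselves form a full mesh."""
    return full_mesh_tunnels(edges_per_site)


# Two sites with 100 hypervisor hosts each, and 2 edge nodes each.
host_mesh = full_mesh_tunnels([100, 100])
edge_mesh = rtep_tunnels([2, 2])
print(host_mesh, edge_mesh)
```

In this model the host-to-host mesh grows with the product of the host counts per site, while the RTEP mesh only depends on the number of edge nodes.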

The illustration below shows the full-mesh overlay tunnel between hypervisor hosts:

The illustration below shows RTEP on the edge nodes, which proxy the overlay tunnel between sites. This is more efficient, especially in a large environment. Furthermore, jumbo frames of 9000 MTU can be carried over a standard 1500 MTU link, which is useful if the WAN link does not support jumbo frames. Take note that this is not efficient, since each jumbo frame is fragmented and sent as multiple smaller packets, so there is a performance impact attached to this design. It is recommended to keep the WAN link at 9000 MTU.
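The cost of running jumbo frames over a 1500 MTU WAN link can be estimated with a little arithmetic. The sketch below assumes plain IPv4 fragmentation with a 20-byte header and no options, and it ignores the overlay encapsulation overhead. How NSX-T actually splits the traffic is not described here, so treat the numbers as an approximation only.

```python
import math

IPV4_HEADER = 20  # bytes; assumption: IPv4 header without options


def fragment_count(packet_size: int, link_mtu: int) -> int:
    """Number of IPv4 fragments needed to carry one packet over a link."""
    if packet_size <= link_mtu:
        return 1  # fits as-is, no fragmentation
    if link_mtu < IPV4_HEADER + 8:
        raise ValueError("link MTU too small to carry any fragment payload")
    payload = packet_size - IPV4_HEADER
    # Fragment payloads (except the last) must be a multiple of 8 bytes.
    per_fragment = (link_mtu - IPV4_HEADER) // 8 * 8
    return math.ceil(payload / per_fragment)


print(fragment_count(9000, 1500))  # one 9000-byte packet over a 1500 MTU link
print(fragment_count(9000, 9000))  # same packet over a 9000 MTU link
```

Under these assumptions a single 9000-byte packet becomes 7 packets on the WAN link, each with its own header, which is why keeping the WAN link at 9000 MTU is the better option.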

Reliability

There are 2 main use cases for multi-site networking:

  • Disaster Recovery (Active-Standby DC)
  • DC Extension (Active-Active DC)

Both use cases depend on the reliability of the network.

If you’re familiar with NSX-T, you know there are 2 types of logical router: Tier-0 and Tier-1. Both provide routing inside the fabric. Unlike in NSX-V, the edge node is not a router. The edge node is a host (transport node) that hosts the logical routers (Tier-0 or Tier-1). By analogy, the logical router is like a virtual machine and the edge node is its hypervisor.

In NSX-T Federation, Tier-0 and Tier-1 routers can be stretched across sites to provide high availability as well as load sharing. Depending on the design, this provides a better way to protect these logical routers. Egress traffic can be designed to leave from both sites as well. Ingress traffic is influenced by the upstream router.

2. What is NSX-T Global Manager?

To use Federation, NSX-T Global Manager is a must. The NSX-T Global Manager appliance is similar to the NSX-T Manager appliance. The only difference is that the NSX Global Manager role has to be selected when you deploy the appliance.

NSX Global Manager (GM) manages global-level objects. These objects span all the sites. We can also create region-level objects, which are stretched only to the sites in a defined region.

The NSX-T Manager cluster in each site is a Local Manager. The NSX Global Manager is connected to the Local Managers, and the objects created in the Global Manager are synchronized to the Local Managers. Below is the illustration:
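The relationship between global-level objects, region-level objects and Local Managers can be modelled in a few lines of Python. This is purely a conceptual sketch: the class, the function, and the site and region names are all hypothetical and do not correspond to the NSX-T API.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set


@dataclass(frozen=True)
class GlobalObject:
    """An object created on the Global Manager (hypothetical model)."""
    name: str
    region: Optional[str] = None  # None means global-level: all sites


def target_sites(obj: GlobalObject,
                 regions: Dict[str, Set[str]],
                 all_sites: Set[str]) -> List[str]:
    """Return the Local Manager sites the object is synchronized to."""
    if obj.region is None:
        return sorted(all_sites)        # global-level object
    return sorted(regions[obj.region])  # region-level object


# Hypothetical sites and regions, for illustration only.
sites = {"site-a", "site-b", "site-c"}
regions = {"region-1": {"site-a", "site-b"}}

print(target_sites(GlobalObject("web-segment"), regions, sites))
print(target_sites(GlobalObject("db-segment", region="region-1"), regions, sites))
```

The first object lands on every Local Manager, while the second one only reaches the sites that belong to its region.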

3. Installation Requirements

The Federation requirements are documented in the VMware NSX-T Data Center documentation (see the links under Source).

4. Installation Process

The installation is straightforward. There are many blog posts out there that describe the installation process.

Source

https://docs.vmware.com/en/VMware-NSX-T-Data-Center/3.0/installation/GUID-AD369B9D-4ADC-4CE9-B8DC-BB2B47C7BFBF.html
https://docs.vmware.com/en/VMware-NSX-T-Data-Center/3.0/installation/GUID-E6C5AA1E-2C3C-42D1-B386-6C99B92E5B21.html
https://docs.vmware.com/en/VMware-NSX-T-Data-Center/3.0/administration/GUID-D5B6DC79-6733-44A7-8072-50221CF2122A.html