I received a question from Imran Momin: What are the pros and the cons of Spine-Leaf and Typical Layer 3 designs?
Before answering the question I need to provide some background. The figure below shows the traditional Data Center Layer 3 design, with a Top of Rack (ToR) switch connecting to a pair of routers (running VRRP per VLAN). The routers have multiple connections to multiple ToRs. In the Core-Distribution-Access design model, the routers are acting as the Layer 3 Access and the ToRs are acting as the Layer 2 Access.
A limitation of this design is that a Layer 2 domain (think Broadcast domains or VLANs) cannot be extended across multiple ToRs. Layer 2 domains can’t go beyond a router (this is a functional foundation of a router). The same limitation is also an advantage: because no Layer 2 domain spans multiple ToRs, there are no Layer 2 loops between them, so you don’t need to run Spanning Tree Protocol (STP).
Depending on the needs of your company and the Applications that are running, you may have to extend the Layer 2 domain across multiple ToR. A way to achieve this is to make the routers Layer 3 Switches. A Layer 3 Switch is a fancy name for a switch that can also be a router. The Layer 3 Switch will be part of both the Layer 3 Access layer and the Layer 2 Distribution layer. The figure below depicts the design. This design is also called Spine-Leaf (think hub-spoke with multiple hubs). The ToR switches (Layer 2 Access) are the leafs and the Layer 3 Switches (the Layer 2 Distribution part) are the Spines.
With a Layer 3 Switch it is possible to extend Layer 2 domains across multiple ToR switches, which would meet the business and Application requirements. That is good. The downside is that you will have to configure STP.
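To make the STP downside concrete, here is a minimal Python sketch, not a real STP implementation. It assumes a single STP instance, a full mesh between two spines and four leafs, and it approximates STP with a breadth-first tree from a chosen root bridge. The switch names and the functions spine_leaf_links and forwarding_links are hypothetical, made up for this illustration.

```python
from collections import deque


def spine_leaf_links(spines, leaves):
    """Assumed topology: every leaf (ToR) has one uplink to every spine."""
    return [(spine, leaf) for spine in spines for leaf in leaves]


def forwarding_links(links, root):
    """Simplified stand-in for STP: keep one loop-free tree rooted at `root`.

    Real STP elects the root bridge and compares path costs; a breadth-first
    search is used here only to show how many links stay in forwarding state.
    """
    neighbors = {}
    for a, b in links:
        neighbors.setdefault(a, []).append(b)
        neighbors.setdefault(b, []).append(a)

    visited = {root}
    tree = set()
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for peer in neighbors[node]:
            if peer not in visited:
                visited.add(peer)
                tree.add(frozenset((node, peer)))
                queue.append(peer)
    return tree


spines = ["spine1", "spine2"]
leaves = ["leaf1", "leaf2", "leaf3", "leaf4"]

links = spine_leaf_links(spines, leaves)
active = forwarding_links(links, root="spine1")
blocked = [link for link in links if frozenset(link) not in active]

print(f"{len(links)} links, {len(active)} forwarding, {len(blocked)} blocked")
# prints "8 links, 5 forwarding, 3 blocked"
```

Under these assumptions, three of the eight uplinks carry no traffic. That idle capacity is the price of a loop-free Layer 2 domain with STP.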
To answer Imran’s question: The typical Layer 3 design eliminates STP but limits the ability to expand Layer 2 domains. The Spine-Leaf design provides flexibility to expand Layer 2 domains but requires STP.
However, this is not the end of it. STP is a very inefficient way (by today’s Data Center standards) to mitigate Ethernet loops, because it blocks redundant links instead of using them. Many Applications require both Layer 2 domain extensibility and maximum path and bandwidth availability (no STP). To remove the need to run STP while expanding Layer 2 domains across multiple ToRs (the best of both designs), new protocols were created. Transparent Interconnection of Lots of Links (TRILL) was designed to remove the need for STP while making it possible to stretch Layer 2 domains. (As a side note, most Data Center network equipment vendors support TRILL or a variant of it.) The figure below shows a design using TRILL.
Most TRILL implementations leverage the Spine-Leaf design. Many TRILL implementations separate the entity (physically or virtually) that is executing the Layer 3 Access from the entity executing the Layer 2 Distribution.
One of the trade-offs you will make with TRILL (and its multiple vendor variations) is cost. Your company will pay a premium for network hardware that supports TRILL. Also, there is little to no interoperability between different vendors’ implementations of TRILL. Finally, there is a size limitation (per vendor) on how many leafs and spines can be part of the implementation.
Elver’s Opinion regarding Imran’s question and Virtual SDN: Your Virtual Machines will be connected to an Overlay like VXLAN, NVGRE or GENEVE. These Overlays already have mechanisms to extend Layer 2 domains across IP mediums without the need for STP. Therefore, there is limited incentive or benefit for your company to implement a Spine-Leaf solution as the Underlay for your SDN solution.
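To show how an Overlay carries a Layer 2 domain over IP, here is a small Python sketch of VXLAN-style encapsulation. It follows the 8-byte VXLAN header layout as I understand it from RFC 7348 (a flags byte with the "VNI valid" bit, then a 24-bit VXLAN Network Identifier) and leaves the outer UDP/IP headers to whatever sends the packet. The function names, the sample frame and the VNI value 5001 are hypothetical, chosen only for illustration.

```python
import struct

VXLAN_UDP_PORT = 4789   # destination UDP port assigned to VXLAN
VNI_VALID_FLAG = 0x08   # "I" flag: the VNI field carries a valid value


def vxlan_encapsulate(vni, inner_frame):
    """Prefix an Ethernet frame with an 8-byte VXLAN header (sketch only)."""
    if not 0 <= vni < 2 ** 24:
        raise ValueError("VNI must fit in 24 bits")
    # Word 1: flags (8 bits) + 24 reserved bits.
    # Word 2: VNI (24 bits) + 8 reserved bits.
    header = struct.pack("!II", VNI_VALID_FLAG << 24, vni << 8)
    return header + inner_frame


def vxlan_decapsulate(payload):
    """Return (vni, inner_frame) from a VXLAN UDP payload."""
    word1, word2 = struct.unpack("!II", payload[:8])
    if not (word1 >> 24) & VNI_VALID_FLAG:
        raise ValueError("VNI flag not set")
    return word2 >> 8, payload[8:]


# A made-up broadcast frame: destination MAC, source MAC, EtherType, payload.
frame = b"\xff" * 6 + b"\x02\x00\x00\x00\x00\x01" + b"\x08\x06" + b"arp"

packet = vxlan_encapsulate(5001, frame)
vni, inner = vxlan_decapsulate(packet)
print(vni, inner == frame)
```

Because the whole Ethernet frame, broadcasts included, rides inside a routed UDP packet, the Underlay only has to forward IP. Loop prevention in the Underlay is then handled by IP routing rather than STP.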
FYI, I left out other potential pros and cons of both Spine-Leaf and Layer 3 designs (like the handling of Broadcast Storms and Multicast) that are of almost no practical consequence in today’s Data Center networks or that would require a MUCH deeper explanation of network designs.