SMLT and RSMLT Deployment Guide
Version 1.1
April 2006
NORTEL NETWORKS CONFIDENTIAL: The information contained in this document is the property of Nortel Networks. Except as specifically authorized in writing by Nortel Networks, the holder of this document shall not copy or otherwise reproduce, or modify, in whole or in part, this document or the information contained herein. The holder of this document shall keep the information contained herein confidential, protect same from disclosure and dissemination to third parties, and use same solely for the training of authorized individuals. Information is subject to change without notice. Nortel, the Nortel logo and Secure Router are trademarks of Nortel Networks. Copyright © 2004 Nortel Networks. All rights reserved.

Table of Contents

Introduction
Background
    Spanning Tree and 802.1w
    Equal Cost Multi-Path (ECMP)
    Link Aggregation (802.3ad)
Split Multi-Link Trunking (SMLT) with VRRP
    VRRP Engineering with SMLT
    Backup-Master Operation (Dual Active IP Gateways)
    Performance and Engineering
Routed Split Multi-Link Trunking
    Layer 2 Access Configuration
    Layer 3 Configuration
    Deployment Scenarios
        At the network edge
        Between Distribution and Core Layer
        Within the Core
        BGP Peering
    Performance and Engineering
Design Recommendations
    General Engineering
    Layer 2 Resiliency Engineering
    Layer 3 Resiliency Engineering
    Fail-over Performance
Introduction

The ERS8600 Design Guide (document number 31397-D) provides preliminary guidelines and design scenarios for RSMLT and VRRP. The purpose of this guide is to complement the ERS8600 Design Guide by providing an in-depth look at RSMLT and VRRP, including performance, scalability, design examples, and engineering guidelines and recommendations. This guide also covers some of the typical comments and questions regarding SMLT/RSMLT, such as:

• RSMLT performance versus ECMP
• Standardization of SMLT
• 802.1w performance versus SMLT
• Preventing loops with SMLT and RSMLT

This guide assumes an audience with a strong understanding of LAN Switching technologies and protocols as well as a general understanding of Nortel's Split Multi-Link Trunking (SMLT) solution.

Background

Network resiliency is a top consideration when designing an IP network. Many customers demand 5x9s network availability because they have migrated, and/or are in the process of migrating, mission-critical applications (VoIP, financial, surveillance, etc.) to their IP network. Several key protocols and technologies are deployed to design resiliency into Local Area Networks, including Spanning Tree Protocol (STP), Virtual Router Redundancy Protocol (VRRP), Equal Cost Multi-Path (ECMP) routing, and Link Aggregation/Multi-Link Trunking (MLT). To make an informed decision on which protocol(s) to use, it is important to understand how and why each of these protocols was developed, what issue(s) it addresses, and how it has evolved.

Note: The Nortel LAN Switching portfolio supports all of the above-mentioned technologies and protocols.

When it comes to network availability from a business perspective, the critical factor is end-user uptime, not network uptime. Network resiliency plays a big role here, as a single second can make a significant difference to end-user availability. For example, if there is a link failure in the network and it takes 3 seconds for user traffic to reroute to other links, the end-user of a VoIP conversation will see a 3 to 4 second interruption to the conversation. However, if it takes 5 seconds for the traffic to reroute, the VoIP session will very likely be disconnected, the IP phone will have to re-register with the server, and the end-user will have to manually re-establish the phone call. The same applies to a data session, where the user can lose data entry and has to manually re-establish communication with the application/server and re-enter the lost data. Effectively, the 2 extra seconds of network downtime can result in an exponential increase in end-user downtime.
Industry best practice is to design a network with traffic rerouting capabilities of less than 3 to 5 seconds, the typical tolerance level for many IP applications.

Spanning Tree and 802.1w

The 802.1D Spanning Tree Protocol (STP) standard was developed many years ago with one main focus: prevent bridging loops in the LAN. However, there are three fundamental problems with Spanning Tree:

1. Slow Convergence. Depending on the size of the network, if there is a link or switch failure, it can take the network several minutes to converge.

2. Idle bandwidth. To prevent bridging loops, STP forces some links in the network into a blocked/unused state, leaving much of the network bandwidth sitting idle. In some cases, more than 30% of the bandwidth in the network can be idle.

3. Design Complexity. The introduction of multiple Spanning Tree Groups (STGs) allows network engineers to minimize the amount of idle bandwidth by configuring multiple STGs and controlling which links are blocked for each and its associated traffic. This introduces significant complexity, as each STG requires the selection of a root bridge (see the sketch at the end of this section). In addition, link costs must be adjusted, as they are the only means of engineering a deterministic solution. Finally, the most difficult challenge is to ensure traffic is balanced across the network links.

Several years after STP was widely deployed, some vendors implemented proprietary extensions to the STP protocol to achieve faster convergence; an example of this is the FastStart feature on Nortel switching products. More recently, the 802.1w Rapid Spanning Tree Protocol (RSTP) standard was developed to address the slow convergence of legacy STP. 802.1w is based on the same foundation and architecture as the legacy 802.1D protocol but provides faster convergence. RSTP convergence is typically sub-second when a link fails and/or recovers; however, there are several failure scenarios which can result in outages in excess of the 5 second timeout boundary for most IP applications. Two such failures are the failure of the root bridge and a far-end link failure.

From a bandwidth perspective, 802.1w has the same issue as legacy Spanning Tree: many links go into a blocking state in order to prevent bridging loops. To make better use of the bandwidth in the network, the 802.1s (Multiple Spanning Tree Groups) standard can be used; however, this is similar to the legacy STG implementation, and the design gets very complex with challenging administration and operation.

Nortel recommends that Spanning Tree and its associated protocols (802.1w and 802.1s) be avoided within the Distribution and Core layers. However, Nortel does recommend using the Spanning Tree protocol on all end station Access connections to safeguard the network from hubs or other devices that could be inserted into the network at the end station. A modification to the normal learning of Spanning Tree is employed in all Nortel edge switches. This feature, known as Fast Start or Fast Learning, is the recommended setting for all end station Access ports. Never enable Fast Start/Learning on any uplink ports; doing so will cause loops in the network and could therefore have unexpected effects on the entire network.
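To make the root-bridge selection concrete, here is a minimal Python sketch of the 802.1D election rule, using invented bridge values (this illustrates the standard's behavior, not Nortel code): the numerically lowest bridge ID, priority first and then MAC, wins, which is why a deterministic design must explicitly lower the priority of the intended root.

```python
# Minimal sketch of 802.1D root bridge election (illustrative values).
# The bridge ID is the priority concatenated with the MAC address; the
# numerically lowest ID wins. Leaving priorities at the default (32768)
# lets the lowest MAC pick the root, which is rarely the switch you want.

def bridge_id(priority: int, mac: str) -> int:
    """Combine priority (16 bits) and MAC (48 bits) into one comparable ID."""
    return (priority << 48) | int(mac.replace(":", ""), 16)

bridges = {
    "core-1":   bridge_id(4096,  "00:04:38:aa:00:01"),  # priority lowered on purpose
    "core-2":   bridge_id(8192,  "00:04:38:aa:00:02"),  # intended backup root
    "closet-7": bridge_id(32768, "00:04:38:00:00:07"),  # default priority
}

root = min(bridges, key=bridges.get)
print(f"root bridge: {root}")  # core-1, because of its explicit low priority
```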
Equal Cost Multi-Path (ECMP)

Following the implementation of basic bridged/STP networks came the development of routing protocols, including RIP, OSPF, and BGP. This was a big leap in terms of scalability, resiliency, optimization, and simplification; however, these protocols were unable to load-share traffic across multiple links, resulting in inefficient use of bandwidth. All routing protocols have knowledge of all paths in the network, including redundant paths, but the original selection criteria were such that only the best route would be populated in the routing table. In addition, although routing convergence may be faster and less disruptive than Spanning Tree, it still does not meet the sub 5 second requirement for mission-critical applications.

Equal Cost Multi-Path (ECMP) addresses many of these issues and has wide support across most products and vendors. ECMP allows multiple routes of the same cost, but with different next-hops, to exist simultaneously in the routing table. Traffic is load-shared across the multiple paths automatically via a hashing algorithm. This inherently provides ECMP with improved failover capabilities (sub-second), as multiple active routes already exist in the event of a failure.

Note: Additional extensions to some ECMP implementations provide non-symmetrical load-balancing capabilities.

ECMP is widely deployed and does provide effective load-balancing and resiliency; however, there are several issues that must be considered:

1. Inability to extend L2 subnets. In many enterprise customer networks, especially those with non-IP applications such as DECnet, SNA, or proprietary applications, it is essential to be able to extend L2 subnets across certain parts of the network, or even across the entire network. It is anticipated that these non-IP applications will exist for quite some time. Implementing ECMP forces the use of Layer 3 routing between devices and therefore does not allow extension of L2 subnets.

2. Scalability. ECMP impacts the size of the routing table significantly. For example, if there are 4 links in the ECMP link-group, every destination will have 4 entries in the routing table, quadrupling the size of the routing database and requiring more memory and processing resources (see the sketch at the end of this section).

3. Long convergence for specific failure scenarios. Although ECMP convergence is typically sub-second, if there is a far-end link failure (common in LAN extension services offered by service providers, or found in optical connections where Far End Fault Indication (FEFI) and/or Remote Fault Identification (RFI) is not available), rerouting of the traffic to other links in the ECMP group will depend on the convergence of the unicast routing protocol. Typical reconvergence times for RIP and OSPF are 180 and 60 seconds respectively.

4. No switch redundancy. For network environments requiring 5x9s availability, switch redundancy is a must, especially in the core of the network. Unfortunately, the ECMP standard does not allow the links in an ECMP link-group to span multiple switches; i.e., all the links in an ECMP link-group must terminate on the same switch.
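To illustrate the scalability point above, the following Python sketch (addresses and hash function are invented for illustration, not any vendor's implementation) shows how a 4-link ECMP group multiplies routing-table entries, and how a per-flow hash keeps the packets of one flow on a single path.

```python
# Illustrative sketch of how ECMP multiplies routing table entries: with a
# 4-link ECMP group, every destination prefix carries 4 next-hop entries,
# and a per-flow hash picks one so packets of a flow stay in order.

import ipaddress

next_hops = ["10.0.0.1", "10.0.0.2", "10.0.0.3", "10.0.0.4"]  # 4-link ECMP group

# One entry per (prefix, next-hop) pair: 1000 prefixes become 4000 entries.
prefixes = [f"172.16.{i // 4}.{(i % 4) * 64}/26" for i in range(1000)]
table = [(p, nh) for p in prefixes for nh in next_hops]
print(len(table))  # 4000 -- four times the memory of a single-path table

def pick_next_hop(src: str, dst: str) -> str:
    """Per-flow selection: hash src/dst so one flow always takes one path."""
    key = int(ipaddress.ip_address(src)) ^ int(ipaddress.ip_address(dst))
    return next_hops[key % len(next_hops)]

print(pick_next_hop("192.168.1.10", "172.16.5.20"))
```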
Link Aggregation (802.3ad)

The 802.3ad Link Aggregation standard provides a mechanism to aggregate multiple links to form one logical link. This Multi-Link Trunking (MLT) function provides both improved bandwidth beyond that of one link and inherent resiliency. Traffic is load-balanced automatically across the aggregated links using a deterministic hash. Any failure of a member link within the aggregate trunk is immediately detected, and the link is removed from the hash, resulting in sub-second failover. Link aggregation sits between Layer 1 and Layer 2, making it transparent to Layer 2 and Layer 3 traffic.

As the 802.3ad standard was being developed, many vendors were implementing their own proprietary MLT protocols in parallel. 802.3ad does not really specify a "static mode" implementation; however, many of the proprietary MLT implementations can interoperate in "static mode", whereby there is no end-to-end handshaking between the far ends of the MLT links. As long as the MLT is treated as a single logical link, it is largely irrelevant how the hashing is performed; the two endpoints of the MLT can implement different hashing algorithms. 802.3ad itself uses the Link Aggregation Control Protocol (LACP), which is responsible for end-to-end handshaking between the two ends of the trunk. LACP performs many functions, including link ordering, assigning active and standby links, and detecting far-end link failures.

From a traffic load-sharing perspective, vendors deploy different hashing algorithms. The most common is based on an XOR/MOD of certain fields in the source/destination IP and/or MAC addresses (see the sketch at the end of this section). Some vendors deploy round-robin hashing algorithms, whereby traffic load-sharing is done on a packet-by-packet basis. The advantage of such algorithms is a much better balance of traffic load across the links in a link-group; however, they can cause out-of-sequence packets, leading to serious connectivity problems at the session layer.

802.3ad has gained momentum and is currently widely deployed and supported. However, 802.3ad has one major shortfall: it has no provisions to allow the links in a Link Aggregation Group to span more than one end-device (switch/server). As a result, designing networks for 5x9s availability, where switch redundancy is required, becomes a challenge that likely requires a combination of 802.3ad and Spanning Tree or ECMP. To address this gap and provide the ultimate resiliency solution in terms of simplicity, scalability, and convergence, Nortel developed an extension to the Link Aggregation standard. This extension, called Split-MLT (SMLT), is described in subsequent sections. SMLT is based on the 802.3ad standard and interfaces transparently with any vendor switch or server, regardless of whether it supports the 802.3ad standard or its own proprietary static MLT implementation. SMLT provides unparalleled resiliency by allowing the links in the 802.3ad/MLT trunk to span more than one switch. SMLT has been generally available for several years and is widely deployed in very large enterprise networks in the financial, education, transportation, and healthcare industries.
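As a rough illustration of the XOR/MOD style of hash described above, here is a minimal Python sketch. Real vendor algorithms differ in which fields they use and how they reduce them; the point is only that selection is deterministic per flow, and that a failed member is simply dropped from the active set so surviving links pick up its flows.

```python
# Hedged sketch of an XOR/MOD MLT hash (not any vendor's actual algorithm).
# Source and destination MACs are XORed and reduced modulo the number of
# active links; removing a failed link from the active set instantly
# re-maps its flows to the survivors.

active_links = [1, 2, 3, 4]  # MLT member ports (illustrative)

def mlt_pick_link(src_mac: str, dst_mac: str) -> int:
    s = int(src_mac.replace(":", ""), 16)
    d = int(dst_mac.replace(":", ""), 16)
    return active_links[(s ^ d) % len(active_links)]

flow = ("00:1b:25:00:00:0a", "00:1b:25:00:00:1f")
print(mlt_pick_link(*flow))              # same flow always hashes to one link

active_links.remove(mlt_pick_link(*flow))  # simulate that member failing
print(mlt_pick_link(*flow))                # flow re-hashes to a surviving link
```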
Split Multi-Link Trunking (SMLT) with VRRP

SMLT is currently supported on the Nortel Ethernet Routing Switch 8600 as well as the ERS 1600 and 5500 series, and will be supported on other Ethernet Routing Switches, such as the ERS 8300, in future releases.

SMLT logically aggregates two Ethernet Routing Switch nodes to form one logical entity known as a Switch Cluster (SC). The two peer nodes in a SC are connected via an Inter-Switch Trunk (IST), which is used to exchange forwarding and routing databases between the two peer nodes. SMLT is self-contained, with SMLT handshaking occurring within the Switch Cluster only. SMLT is completely transparent to access switches, which connect to the Switch Cluster using 802.3ad standard trunking or their own proprietary static MLT implementation. The diagram below illustrates both the physical and logical connectivity in a typical SMLT implementation.

Figure 1: Split Multi-Link Trunking (SMLT) Concept

SMLT provides a simple, scalable, active-active design whereby all links and all switches are actively forwarding traffic. Traffic is automatically balanced across the links and nodes using a standard hashing algorithm, eliminating any packet duplication. SMLT uses a lightweight protocol requiring very little CPU resource, and in the event of a link failure or a complete switch failure it provides sub-second recovery. Since SMLT load-balances all resilient paths using a deterministic hash, it eliminates any packet duplication and hence eliminates the need for Spanning Tree. SMLT uses a Loop-Detect feature and a CP-Limit feature to detect and contain bridged loops in an SMLT network; refer to the ERS8600 Design Guide for details.

VRRP Engineering with SMLT

From a Layer 2 perspective, if any of the links in the SMLT trunk or IST fails, or if one of the peer nodes in the Switch Cluster fails, traffic is switched to the other links/switches in under a second. While this provides Layer 2 resiliency, from a Layer 3 perspective each 8600 must have an associated IP address for routing (assuming a Layer 3 design) in each active VLAN configured with SMLT. Nodes on a VLAN using SMLT will typically use a single default gateway pointing to only one of the two 8600s in the SC; as such, VRRP is used to provide a common default gateway address. The diagram below illustrates how traffic is forwarded with SMLT/VRRP.

Figure 2: SMLT Traffic Forwarding with VRRP

In the above example, the top Access switch relies on the MLT hash to select the link towards the Switch Cluster. Since the selected link leads to the VRRP Backup, the packets must be switched by the Backup to the Master via the IST. Further, if the bottom Access switch had only a single link connecting to the left Distribution switch, the packets would need to traverse the IST again. This results in unnecessary bouncing of traffic, causing extra bandwidth consumption and delay. To optimize traffic switching/routing, Nortel uses an extension to the VRRP standard called VRRP Backup-Master.

Backup-Master Operation (Dual Active IP Gateways)

The IST allows the two devices in the SC to exchange their complete MAC address tables. When VRRP Backup-Master is enabled on the previously "Backup only" peer node, that switch can forward traffic to the destination directly on behalf of its peer, even though it is operating as a VRRP Backup. The diagram below illustrates how packets flow when VRRP Backup-Master is enabled.
Figure 3: SMLT Traffic Forwarding with VRRP Backup-Master

Given the same example described earlier, if VRRP Backup-Master is enabled on the left Distribution switch, then when a packet is hashed to it by the Access switch MLT, the Backup-Master forwards the packet directly to the destination. Since VRRP Backup-Master is simply an extension to the standard VRRP protocol, control traffic is exchanged using multicast; hence the virtual IP address can span the entire L2 network, and consequently the VRRP Backup-Master feature will work anywhere along the entire subnet. For example, in an SMLT-square design (refer to the ERS8600 Design Guide for details on SMLT-square designs), the virtual IP address can span all 4 nodes in the SMLT-square core, whereby one of the nodes is the VRRP Master and the other 3 nodes are Backup-Masters. This provides optimal switching/routing paths within the SMLT-square and eliminates the need for multiple VRRP instances, significantly simplifying network design, implementation, and operations.

By default, VRRP control information is exchanged every 3 seconds; however, with Backup-Master this timer is irrelevant, because the VRRP Backup can forward traffic when Backup-Master is enabled. It is recommended that VRRP Backup-Master be used with SMLT configurations to optimize traffic flow and fail-over time. For customers not wishing to use SMLT and VRRP Backup-Master, Nortel has introduced VRRP-Fast, whereby the user can adjust the VRRP timers to provide sub-second convergence for VRRP.

Performance and Engineering

VRRP control traffic typically consumes about 4 kbps of bandwidth for every 50 VRRP instances configured. As such, even at the maximum scalability of 255 VRRP instances on the ERS8600, there is minimal control traffic (approximately 20 kbps). Enabling VRRP-Fast with its default timer of 200 ms effectively increases this control traffic by 5 times, which is expected, as the default VRRP timer with VRRP-Fast disabled is 1 second.

As noted in the Spanning Tree section above, avoid Spanning Tree and its associated protocols (802.1w and 802.1s) within the Distribution and Core layers, use Spanning Tree with Fast Start/Fast Learning on all end station Access ports, and never enable Fast Start/Learning on uplink ports. In addition, when using SMLT to connect the Access to the Distribution/Core, always disable Spanning Tree on the uplink ports/MLT of the edge switch.
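As a quick check of the figures above, this small Python sketch reproduces the arithmetic, using the guide's numbers (roughly 4 kbps per 50 instances at the 1 second default advertisement interval, scaling linearly as the timer shrinks):

```python
# Back-of-envelope check of the VRRP control-traffic figures in this guide:
# ~4 kbps per 50 instances at the 1 s default timer, 200 ms with VRRP-Fast.

KBPS_PER_50_INSTANCES = 4.0

def vrrp_control_kbps(instances: int, interval_s: float = 1.0) -> float:
    base = (instances / 50) * KBPS_PER_50_INSTANCES  # at the 1 s default timer
    return base * (1.0 / interval_s)                 # faster timers scale linearly

print(vrrp_control_kbps(255))        # ~20.4 kbps at maximum scale
print(vrrp_control_kbps(255, 0.2))   # ~102 kbps with VRRP-Fast (5x)
```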
Routed Split Multi-Link Trunking

Routed SMLT (RSMLT) is an enhancement to SMLT enabling the exchange of Layer 3 information between peer nodes in a Switch Cluster, for unparalleled resiliency and simplicity at both L3 and L2.

Layer 2 Access Configuration

In network designs where Layer 2 access/closet switches connect to a Switch Cluster, VRRP is typically used with Backup-Master, as previously described. In this type of L2 access design, RSMLT can be used to provide the same function as VRRP with Backup-Master. This is done simply by enabling RSMLT on the VLAN/subnet on both peer nodes and using the IP address of either peer node as the default gateway. The diagram below illustrates how RSMLT works in a Layer 2 access configuration.

Figure 4: RSMLT for Layer 2 Access

In the above example, hosts are configured with a default gateway pointing to 10.1.1.1. The MLT hash at the Access switch, however, forwards the packets towards the left Distribution switch (10.1.1.2). Since both nodes are configured as RSMLT peers, 10.1.1.2 will forward the packets directly to the destination on behalf of its peer.

Layer 3 Configuration

The primary reason for developing RSMLT was to address Layer 3 implementations. The figure below shows a typical example of this type of configuration, whereby the Distribution switch connects to the Switch Cluster via SMLT, with OSPF enabled between the Distribution switch and the Core switches. From the Distribution switch perspective there will be 2 OSPF adjacencies: one with 10.1.1.1 and a second with 10.1.1.2. OSPF will select one of the 2 Core switches as the next-hop for upstream traffic (e.g. 10.1.1.1), yet the packets may be hashed by the MLT to the left Core switch (10.1.1.2). Without RSMLT, packets would have to be sent across the IST trunk to the second Core switch (10.1.1.1). As with SMLT, it is possible that the destination switch resides off only the left Core switch, in which case the packets would need to traverse the IST a second time. This example shows the worst-case scenario of what would happen without RSMLT.

Figure 5: Generic Layer 3 Connectivity

The diagram below illustrates how RSMLT addresses the issue described above. In this case, with RSMLT enabled on the VLAN/subnet 10.1.1.x, when packets are hashed by the MLT at the Distribution switch towards the left Core switch, the packets are forwarded directly.

Figure 6: RSMLT for Layer 3 Connectivity

Note: RSMLT provides full resiliency, eliminating the need for ECMP and/or any reliability traditionally provided by a routing protocol. The decision to deploy dynamic routing should therefore be based solely on the desire to provide autonomous route table population.
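The forwarding difference shown in Figures 5 and 6 can be modeled in a few lines. The following Python sketch is purely illustrative (the function and names are invented); it captures the rule: without RSMLT, a packet hashed to the peer that is not the routing next-hop bounces across the IST, while with RSMLT enabled on the subnet, either peer routes it directly.

```python
# Toy model of the RSMLT forwarding rule from Figures 5 and 6: the MLT hash
# may deliver an upstream packet to the peer that is NOT the OSPF next-hop.
# Without RSMLT the peer relays it over the IST; with RSMLT either peer
# routes it directly. Function and names are invented for illustration.

def hops_to_route(hashed_to: str, ospf_next_hop: str, rsmlt: bool) -> list[str]:
    if rsmlt or hashed_to == ospf_next_hop:
        return [hashed_to]                    # routed where it lands
    return [hashed_to, "IST", ospf_next_hop]  # bounced across the IST first

# The MLT hash lands the packet on 10.1.1.2, but OSPF chose 10.1.1.1:
print(hops_to_route("10.1.1.2", "10.1.1.1", rsmlt=False))  # ['10.1.1.2', 'IST', '10.1.1.1']
print(hops_to_route("10.1.1.2", "10.1.1.1", rsmlt=True))   # ['10.1.1.2']
```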
Deployment Scenarios

RSMLT provides the ultimate resiliency solution in terms of flexibility, simplicity, and performance. This section highlights a few common networking examples of where RSMLT can be deployed:

• At the network edge (regardless of whether the edge/closet switches are L2 or L3 switches)
• Between the Distribution layer and the Core layer
• Within the Core of the network
• BGP peering with other networks

At the network edge

In general, Layer 3 should be deployed at the edge in a large campus environment with many end-user devices, with Layer 2 in smaller network environments. There is no clear-cut boundary in terms of how many end-devices would require L2 versus L3 switching in the closet; however, the suggested number is about 3000 users. In large campus environments, L3 may be more expensive, but it will scale much better and will provide an architecture that is easy to manage and operate. If the number of users is less than 3000 and there is no compelling reason to implement L3, Layer 2 switches in the Access will minimize the network cost and simplify the network design. The diagram below illustrates where RSMLT would be deployed at the edge of the network, regardless of whether the Access layer is L2 or L3.

Figure 7: General SMLT/RSMLT Architecture

The Access switches connect to a Distribution layer which consists of an SMLT Switch Cluster. If the Access switches are Layer 2 switches, the Switch Cluster in the aggregation layer can use either VRRP with Backup-Master or RSMLT; RSMLT provides a simpler, more scalable solution than VRRP. The Access VLANs/subnets terminate on the Distribution Switch Cluster(s), with each subnet assigned an IP address on each SMLT peer node. RSMLT is enabled on the peer nodes for all the user subnets. The gateway for the end-user devices can be the IP address of either Distribution switch for that respective subnet.

Note: The default RSMLT hold-up timer is 180 seconds, which is designed for interconnecting to Layer 3 switches. When connecting Layer 2 switches to an RSMLT Switch Cluster, the hold-up timer must be configured to 9999, which allows a node in the Switch Cluster to forward traffic indefinitely on behalf of its peer.

If the Access switches are Layer 3 switches, there are two ways to use RSMLT. The first relies on dynamic routing between the Access switches and the Distribution Switch Cluster. For such configurations, the default hold-up timer (180 seconds) does not have to be changed, because in the event that one of the Switch Cluster nodes in the Distribution layer fails, the routing protocol will remove that neighbor from the routing tables within 180 seconds (the worst case, with RIP). An alternative approach relies on statically defined routes. In this scenario, RSMLT must be configured the same as it would be for a L2 access configuration, i.e. RSMLT enabled on both Distribution peer nodes for the routing subnet, with the hold-up timer configured to 9999 (infinity). A sketch of this timer rule follows this section.

Regardless of which L3 solution is used, with RSMLT any VLAN/subnet can be extended at L2 throughout the entire campus infrastructure. Also, only a single entry per path is present in the routing table, unlike ECMP, which relies on multiple routing entries for every destination.
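The hold-up timer guidance above distills to a simple rule. The following Python function is hypothetical, not a Nortel API; the values are the ones this guide recommends.

```python
# Illustrative encoding of the RSMLT hold-up timer guidance in this guide
# (the function and its name are hypothetical, not a real switch API).

def rsmlt_holdup_timer(access: str) -> int:
    """Return the hold-up timer (seconds) this guide recommends.

    access: 'l2', 'l3-static', or 'l3-dynamic'
    """
    if access in ("l2", "l3-static"):
        return 9999   # forward on behalf of the peer indefinitely
    if access == "l3-dynamic":
        return 180    # default; the routing protocol re-converges within this
    raise ValueError(access)

print(rsmlt_holdup_timer("l2"))  # 9999
```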
Between Distribution and Core Layer

For very large campus designs (i.e. those consisting of 5000+ users) it is highly recommended to design a 3-tier architecture, as it enables better scalability and overall management of the network. In a typical 3-tier design, consisting of a Core layer, Distribution layer, and Access layer, the connection between the Core and Distribution is almost always routed (Layer 3). The use of RSMLT between these layers provides the ultimate resiliency solution. From an implementation perspective, the Distribution layer typically consists of a series of SMLT Switch Clusters which connect to an SMLT Switch Cluster in the Core, essentially forming an SMLT-square configuration. In such a configuration, a routing subnet (typically OSPF) is configured between the layers, forming fully-meshed OSPF neighbors in the SMLT-square. RSMLT is enabled on the Distribution Switch Clusters as well as on the Core Switch Cluster for the routing subnet. The default RSMLT hold-up timer (180 seconds) does not have to be changed, as OSPF will remove any failed interfaces from the routing tables well before 180 seconds.

Within the Core

A typical campus Core consists of an SMLT-square with one Core Switch Cluster connecting to the user Distribution layer and the other Switch Cluster connecting to the server Distribution layer. In some cases, the Switch Clusters in the Core may serve two different campuses or two different Data Centers. In such a configuration, the users and/or servers are typically connected to different subnets, and it is not necessary to extend the subnets across the entire SMLT-square; rather, it is recommended to keep the subnets isolated for security and management reasons. In such network designs, RSMLT presents an ideal solution. From an implementation and operations perspective, this is very similar to the RSMLT configuration between the Distribution and Core layers previously described, whereby a routing subnet (usually OSPF) is configured between the two Switch Clusters in the Core, forming a fully-meshed OSPF design in the SMLT-square. RSMLT is then enabled on the peer nodes in both Switch Clusters to provide optimum routing with sub-second resiliency.

BGP Peering

RSMLT can also be implemented very effectively when doing BGP peering between two networks. BGP convergence can be extremely demanding on CPU resources and may take significant time, especially when exchanging full Internet routing tables. The diagram below illustrates this type of configuration.

Figure 8: Resilient BGP Peering with RSMLT

From an implementation perspective, the ISP provides a standard 802.3ad trunk to the enterprise network, which terminates on an SMLT Switch Cluster. A BGP subnet is configured between the ISP and the enterprise customer, with RSMLT enabled on the peer nodes in the Switch Cluster. The benefit of this configuration is that the 802.3ad trunk provides a logical connection between the ISP and the enterprise network, and the RSMLT peering provides a logical aggregation of the Core nodes. Since RSMLT provides all the required resiliency, loss of a BGP peer or changes in BGP state due to local link and switch failures are virtually eliminated, resulting in sub-second failover and avoiding the long failover delays associated with BGP convergence.

Performance and Engineering

RSMLT uses a very lightweight protocol to exchange database information between the peer nodes; hence the amount of RSMLT control traffic is negligible, at around 200 bytes/second regardless of the number of VLANs configured. There are no architectural or configuration limits defined for RSMLT: a user can configure as many RSMLT instances on the ERS8600 as there are IP addresses. The default limit is 500 IP instances; with the MAC address extension kit, the number of IP instances goes up to 1980. Since RSMLT is a lightweight protocol, it is much less intensive on CPU resources than VRRP and is therefore much more scalable.
Design Recommendations

Many of the generic design guidelines for SMLT, VRRP, and RSMLT, such as the number of MLT groups, number of links per group, timers, the CP-Limit option, STP interaction, and SLPP operation, are captured in the ERS8600 Design Guide. Please refer to it directly for this type of information.

General Engineering

The following are some general engineering considerations and recommendations for SMLT and RSMLT deployments:

• The uplinks out of a wiring closet must utilize DMLT whenever possible, terminating each of the separate physical links on different switches within the stack or different modules within the chassis.

• Nortel recommends disabling the Spanning Tree protocol on MLT/DMLT groups and ports on both Access and Distribution/Core switches. This is absolutely required when using Split Multi-Link Trunking (SMLT) in the Distribution and Core layers.

• Enable Loop Detect on SMLT ports for added protection, but do not enable Loop Detect on the IST ports. Be sure to enable Loop Detect with the action of port down, as opposed to the action of VLAN down. This feature disables the port where a MAC address incorrectly shows up (the looping port) due to MAC flapping between the correct port and the looping port. Note: Simple Loop Prevention Protocol (SLPP), in release 4.1, will provide this functionality.

Layer 2 Resiliency Engineering

Layer 2 is typically used between the Access and Distribution layers in a 3-tier network design, or between the Access and Core layers in a two-tier network design. Both SMLT with VRRP and RSMLT are options. The following are some guidelines to consider when deciding between them (a sketch distilling these guidelines into one decision follows this list):

• The maximum number of supported VRRP instances per node is 255. Scaling VLANs/subnets beyond this requires RSMLT.

• RSMLT does not allow the configuration of a virtual IP address. From an operational and technical perspective, RSMLT provides the same functionality as VRRP, and either of the RSMLT peer node addresses provides the same function as the VRRP virtual IP address.

• Customers using static IP addresses for servers and hosts who wish to extend the virtual IP address (essentially the default gateway of the end-user devices) across more than just the peer nodes in a Switch Cluster (for example, across an SMLT-square configuration) must use VRRP with Backup-Master. RSMLT only works across the IST within the same Switch Cluster and cannot be extended.

• When using VRRP with Backup-Master, it is highly recommended to load-balance the VRRP Master and Backup-Master instances across the peer nodes in a Switch Cluster, as the Backup-Master end of the VRRP requires more processing power.

• If there are fewer than 255 subnets at the edge of the network, either VRRP with Backup-Master or RSMLT can be used:
    o From an administrative/configuration perspective, RSMLT is much easier to configure. VRRP requires a VRRP instance on each of the switches in the Switch Cluster, with the appropriate priorities assigned to define the Master and Backup ends of the VRRP.
    o If the SMLT nodes are running services and protocols which demand a high amount of CPU resources (e.g. large OSPF routing tables, STP), it is recommended to use RSMLT. RSMLT is also recommended if the number of subnets is more than 150.

• When using VRRP with an SMLT configuration, Nortel highly recommends the use of VRRP Backup-Master. However, if the customer environment does not allow any non-standard implementations or extensions, such as SMLT and VRRP Backup-Master, the customer can enable VRRP-Fast to expedite the fail-over time from 3 seconds to sub-second. Note that VRRP-Fast demands much more CPU resource and is not recommended for more than 100 subnets per switch if that switch is running other services (OSPF, RIP, STP).

• For Layer 2 access configurations using RSMLT, the hold-up timer should be configured to 9999 to allow a node in the Switch Cluster to forward traffic indefinitely on behalf of its peer.
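The guidelines above can be summarized as a single decision. The following Python helper is purely illustrative (the function is invented; the thresholds are the guide's):

```python
# Hypothetical helper distilling the Layer 2 resiliency bullets above into
# one decision (thresholds come from this guide; the function is invented).

def l2_gateway_choice(subnets: int, span_beyond_cluster: bool,
                      cpu_heavy_services: bool) -> str:
    if span_beyond_cluster:
        return "VRRP with Backup-Master"  # virtual IP must span an SMLT-square
    if subnets > 255:
        return "RSMLT"                    # beyond the per-node VRRP limit
    if subnets > 150 or cpu_heavy_services:
        return "RSMLT"                    # lighter on CPU at this scale
    return "VRRP with Backup-Master or RSMLT (either works)"

print(l2_gateway_choice(subnets=300, span_beyond_cluster=False,
                        cpu_heavy_services=False))  # RSMLT
```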
Layer 3 Resiliency Engineering

In a Layer 3 environment, either ECMP or RSMLT can be deployed for resiliency. RSMLT requires only a single entry per route and is much less demanding on CPU and memory resources for calculating and storing routing tables. Also, since the RSMLT trunk behaves as one logical link and the nodes in the RSMLT Switch Cluster behave as one logical switch, routing convergence never takes place in the event of a link failure or even a switch failure. Finally, RSMLT allows Layer 2 subnets to be transparently extended over the RSMLT trunk for applications and environments where this is required.

• If dynamic routing is enabled across the RSMLT trunk, the default hold-up timer of 180 seconds should be sufficient to cover the worst-case routing protocol convergence (RIP).

• If static default routes are used over the RSMLT trunk, the hold-up timer must be configured to 9999 on the RSMLT peers.

• When running RSMLT, ECMP should be disabled, as it will result in additional unnecessary entries in the routing table.

Fail-over Performance

There is no difference in fail-over capability between SMLT and RSMLT. In the event of any link failure in the IST or RSMLT trunks, or the failure of any switch in the SMLT Switch Cluster, traffic reroutes to the other links/switches in less than 1 second. Even in very large configurations consisting of hundreds of RSMLT subnets, the fail-over time is still well within the 3 to 5 second boundary before applications and sessions start timing out.

Contact Us

For product support and sales information, visit the Nortel Networks website at http://www.nortel.com. In North America, dial toll-free 1-800-4Nortel; outside North America, dial 987-288-3700.