SMLT and RSMLT Deployment Guide V1.1

SMLT and RSMLT Deployment Guide
Version 1.1
April 2006
Copyright © 2004 Nortel Networks. All rights reserved.
NORTEL NETWORKS CONFIDENTIAL: The information contained in this document is the property
of Nortel Networks. Except as specifically authorized in writing by Nortel Networks, the holder of this
document shall not copy or otherwise reproduce, or modify, in whole or in part, this document or the
information contained herein. The holder of this document shall keep the information contained
herein confidential and protect same from disclosure and dissemination to third parties and use same
solely for the training of authorized individuals.
Information subject to change without notice.
Nortel, the Nortel logo and Secure Router are trademarks of Nortel Networks.
© Nortel
External Distribution
2
SMLT and RSMLT Deployment Guide
Version 1.1
April 2006
Table of Contents
INTRODUCTION
BACKGROUND
    Spanning Tree and 802.1w
    Equal Cost Multi-Path (ECMP)
    Link Aggregation (802.3ad)
SPLIT MULTI-LINK TRUNKING (SMLT) WITH VRRP
    VRRP Engineering with SMLT
    Backup-Master Operation (Dual Active IP Gateways)
    Performance and Engineering
ROUTED SPLIT MULTI-LINK TRUNKING
    Layer 2 Access Configuration
    Layer 3 Configuration
    Deployment Scenarios
        At the network edge
        Between Distribution and Core Layer
        Within the Core
        BGP Peering
    Performance and Engineering
DESIGN RECOMMENDATIONS
    General Engineering
    Layer 2 Resiliency Engineering
    Layer 3 Resiliency Engineering
    Fail-over Performance
Introduction
The ERS8600 Design Guide (document number 31397-D) provides preliminary guidelines and
design scenarios for RSMLT and VRRP. The purpose of this guide is to complement the
information in the ERS8600 Design Guide with an in-depth look at RSMLT and VRRP,
including performance, scalability, design examples, and engineering guidelines and
recommendations.
This guide also addresses some of the typical comments and questions regarding SMLT/RSMLT,
such as:

• RSMLT performance versus ECMP
• Standardization of SMLT
• 802.1w performance versus SMLT
• Preventing loops with SMLT and RSMLT
This guide assumes an audience with a strong understanding of LAN Switching technologies and
protocols as well as a general understanding of Nortel’s Split-Multi-Link Trunk (SMLT) solution.
Background
Network resiliency is a top consideration when designing an IP network. Many customers demand
5x9s network availability because they have migrated, or are in the process of migrating,
mission-critical applications (VoIP, financial, surveillance, and so on) to their IP networks.
Several key protocols and technologies exist which are deployed to design network resiliency in
Local Area Networks including Spanning Tree Protocol (STP), Virtual Router Redundancy Protocol
(VRRP), Equal Cost Multi-Path (ECMP) routing, and Link Aggregation/Multi-Link Trunking (MLT).
To make an informed decision on what protocol(s) to use, it is important to have an understanding of
how and why each of these protocols were developed, what issue(s) they address, and how they
have evolved.
Note: The Nortel LAN Switching portfolio supports all of the above mentioned technologies and
protocols.
When it comes to network availability from a business perspective, the critical factor is end-user
uptime and not network uptime. Network resiliency plays a big role, as 1 second can make a
significant difference when it comes to end-user availability. For example, if there is a link failure in
the network and it takes 3 seconds for user traffic to reroute to other links, the end-user of a VoIP
conversation will see a 3 to 4 second interruption to the conversation.
However if it takes 5 seconds for the traffic to reroute to other links in the network, then the VoIP
session will very likely get disconnected, the IP phone will have to re-register with the server, and the
end-user will have to manually re-establish the phone call. The same could apply to a data session
whereby the user can lose data entry and has to manually re-establish communication with the
application/server and re-enter lost data. Effectively, the 2 extra seconds of network downtime can
result in an exponential increase in end-user downtime.
Industry best practices are such that a network should be designed with traffic rerouting capabilities
of less than 3 to 5 seconds. This is the typical tolerance level for many IP applications.
Spanning Tree and 802.1w
The 802.1D Spanning Tree Protocol (STP) standard was developed many years ago with one main
focus: preventing bridging loops in the LAN. However, there are three fundamental problems with
Spanning Tree:
1. Slow Convergence
Depending on the size of the network, if there is a link or a switch failure, it can take the
network several minutes to converge.
2. Idle bandwidth
To prevent bridging loops, STP forces some links in the network to go into a blocked/unused
state; hence resulting in much of the network bandwidth sitting idle. In some cases, more
than 30% of the bandwidth in the network can be idle.
3. Design Complexity
The introduction of multiple Spanning Tree Groups (STG) allows network engineers to
minimize the amount of idle bandwidth in the network by configuring multiple STGs and
controlling which links are blocked for each group and its associated traffic. This introduces significant
complexity, as each STG requires the selection of a root bridge. In addition, link costs must
be adjusted as it is the only means of engineering a deterministic solution. Finally, the most
difficult challenge is to ensure traffic is balanced across the network links.
Several years after STP was widely deployed, some vendors implemented proprietary extensions to
the STP protocol to achieve faster convergence; an example of this is the FastStart feature on the
Nortel switching products. More recently, however, the 802.1w Rapid Spanning Tree (RSTP)
standard was developed to address the slow convergence of legacy STP.
802.1w is based on the same foundation and architecture as the legacy 802.1D STP protocol, but it
provides faster convergence. RSTP convergence is typically sub-second when a link fails and/or
recovers; however, several failure scenarios can result in outages exceeding the 5-second
timeout boundary of most IP applications. Two such failures are the failure of the root bridge
and a far-end link failure.
From a bandwidth perspective, 802.1w has the same issue as legacy Spanning Tree whereby many
of the links go into a blocking state in order to prevent bridging loops. To make better use of the
bandwidth in the network, the 802.1s (Multiple Spanning Tree Groups) standard can be used;
however this is similar to the legacy STG implementation whereby the design gets very complex with
challenging administration and operation.
Nortel recommends that Spanning Tree and its associated protocols (802.1w and 802.1s) be avoided
within the Distribution and Core layers. However, Nortel does recommend using the Spanning Tree
protocol on all end-station Access connections to safeguard the network from hubs or other devices
that could be inserted into the network at the end station. A modification to the normal Spanning
Tree learning process is employed in all Nortel edge switches. This feature, known as Fast Start or
Fast Learning, is the recommended setting for all end-station Access ports. Never enable Fast
Start/Learning on any uplink ports; doing so will cause loops in the network and could have
unexpected effects on the entire network.
Equal Cost Multi-Path (ECMP)
Following the implementation of basic bridged/STP networks came the development of the routing
protocols, including RIP, OSPF, and BGP. This was a big leap in terms of scalability, resiliency,
optimization, and simplification; however, these protocols were unable to load-share traffic across
multiple links, resulting in inefficient use of bandwidth. All routing protocols have knowledge of all
paths in the network, including redundant paths, but the original selection criterion was that
only the best route would be populated in the routing table.
In addition, although routing convergence may be faster than Spanning Tree and less disruptive, it
still does not meet the sub-5-second requirement for mission-critical applications. Equal Cost
Multi-Path (ECMP) addresses many of these issues and has wide support across most products and
vendors.
ECMP effectively allows multiple routes of same cost but with different next-hops to exist
simultaneously in the routing table. Traffic is load-shared across the multiple paths automatically via
a hashing algorithm. This inherently provides ECMP with improved failover capabilities (sub second)
as multiple active routes already exist in the event of a failure.
Note: Additional extensions to some ECMP implementations provide non-symmetrical load-balancing
capabilities.
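The flow-hashing behaviour described above can be sketched in Python. The field selection and hash function here are illustrative assumptions; real switches implement this in hardware with vendor-specific algorithms:

```python
import hashlib

def ecmp_next_hop(src_ip: str, dst_ip: str, next_hops: list[str]) -> str:
    """Pick one equal-cost next-hop with a deterministic flow hash.

    The same src/dst pair always maps to the same next-hop, keeping the
    packets of a flow in order, while different flows spread across all
    paths. The hash function and fields are illustrative only.
    """
    key = f"{src_ip}->{dst_ip}".encode()
    digest = int.from_bytes(hashlib.sha256(key).digest()[:4], "big")
    return next_hops[digest % len(next_hops)]

paths = ["10.0.0.1", "10.0.0.2", "10.0.0.3", "10.0.0.4"]

# Deterministic: a given flow always hashes to the same path.
first = ecmp_next_hop("192.168.1.10", "172.16.5.20", paths)
assert first == ecmp_next_hop("192.168.1.10", "172.16.5.20", paths)

# On a path failure the surviving routes are already installed, so the
# flow simply rehashes across the remaining next-hops (sub-second).
paths.remove(first)
assert ecmp_next_hop("192.168.1.10", "172.16.5.20", paths) in paths
```

Because multiple routes are pre-installed, failover is a local rehash rather than a routing-protocol reconvergence.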
ECMP is widely deployed and does provide effective load-balancing and resiliency; however, there
are several issues that must be considered, including:
1. Inability to extend L2 Subnets
In many enterprise customer networks, especially those which have some non-IP
applications such as DECnet, SNA, and proprietary applications, it is essential to be able to
extend L2 subnets across certain parts of the network, or even across the entire network. It
is anticipated that these non-IP based applications will exist for quite some time.
Implementing ECMP forces the use of layer 3 routing between devices; hence it will not
allow extension of L2 subnets.
2. Scalability
ECMP impacts the size of the routing table significantly. For example, if ECMP is used and
there are 4 links in the ECMP link-group then every destination will have 4 entries in the
routing table, quadrupling the size of the routing database; hence, requiring more memory
and more processing resources.
3. Long Convergence for Specific Failure Scenarios
Although ECMP convergence is typically sub-second, if there is a far-end link failure
(common in LAN extension services offered by service providers, or in optical
connections where Far End Fault Indication (FEFI) and/or Remote Fault Identification (RFI) is not
supported), rerouting of the traffic to other links in the ECMP group depends on the convergence of
the unicast routing protocol. Typical reconvergence times for RIP and OSPF are 180 and 60
seconds respectively.
4. No Switch Redundancy
For network environments requiring 5x9s availability, switch redundancy is a must, especially
in the core of the network. Unfortunately, the ECMP standard does not allow the links in an
ECMP link-group to span multiple switches; that is, all the links in an ECMP link-group must
terminate on the same switch.
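The scaling impact described in point 2 above is simple arithmetic; a quick sketch (the route counts are hypothetical):

```python
def routing_table_entries(num_destinations: int, ecmp_width: int) -> int:
    """Each destination is installed once per equal-cost next-hop, so the
    routing table grows linearly with the width of the ECMP link-group."""
    return num_destinations * ecmp_width

# A hypothetical table of 10,000 routes quadruples with a 4-link ECMP group:
assert routing_table_entries(10_000, 1) == 10_000
assert routing_table_entries(10_000, 4) == 40_000
```

The extra entries cost both memory and processing resources on every route update.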
Link Aggregation (802.3ad)
The 802.3ad Link Aggregation Protocol standard provides a mechanism to aggregate multiple links
to form one logical link. This Multi-Link Trunking function (MLT) provides both improved bandwidth
beyond that of one link and inherent resiliency. Traffic is load-balanced automatically across the
aggregated links using a deterministic hash. Any failure of a member link within the aggregate trunk
is immediately detected, and the failed link is removed from the hash, resulting in sub-second
failover. Link aggregation sits between Layers 1 and 2, making it transparent to Layer 2 and
Layer 3 traffic.
As the 802.3ad standard was being developed many vendors were implementing their own
proprietary forms of MLT in parallel. However, many of these proprietary MLT implementations
can interoperate in "static mode," whereby there is no end-to-end handshaking between the far ends
of the MLT links. As long as the MLT is treated as a single logical link, it is largely irrelevant how the
hashing is performed; the two endpoints of the MLT can even implement different hashing algorithms.
802.3ad does not really specify a “static mode” implementation. 802.3ad uses Link Aggregation
Control Protocol (LACP) which is responsible for end-to-end handshaking between the two ends of
the trunk. LACP is responsible for many functions; including link ordering, assigning active and
standby links, detecting far-end link failures, and so on. From a traffic load-sharing perspective,
vendors deploy different hashing algorithms. The most common is based on an XOR/MOD of
certain fields in the source/destination IP and/or MAC addresses. Some vendors deploy round-robin
hashing algorithms, whereby traffic load-sharing is done on a packet-by-packet basis. The advantage
of such algorithms is that they result in a much better balance of traffic load across the links in a
link-group; however, they can cause out-of-sequence packets, leading to serious connectivity
problems at the session layer.
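An XOR/MOD hash of the kind described above can be sketched as follows. The exact fields and bit widths vary by vendor, so this is an illustration rather than the ERS8600 algorithm, and the port names are hypothetical:

```python
def mlt_member_link(src_mac: int, dst_mac: int, active_links: list[str]) -> str:
    """Select an aggregate member link with an XOR/MOD-style hash.

    XOR the source and destination MAC addresses and take the result
    modulo the number of active links. A given src/dst pair always maps
    to the same link, preserving packet order (unlike per-packet
    round-robin schemes, which can reorder packets).
    """
    return active_links[(src_mac ^ dst_mac) % len(active_links)]

links = ["1/1", "1/2", "2/1", "2/2"]   # hypothetical member ports
chosen = mlt_member_link(0x0000A1B2C3D4, 0x00001122E3F4, links)
assert chosen in links

# A failed member is removed from the hash; affected flows remap to the
# surviving links, which is what gives sub-second recovery.
links.remove(chosen)
assert mlt_member_link(0x0000A1B2C3D4, 0x00001122E3F4, links) in links
```

Because the hash is purely local to each endpoint, the two ends of the trunk need not agree on an algorithm, which is why static-mode interoperability works.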
802.3ad has gained momentum and is currently widely deployed and supported. However, 802.3ad
has one major shortfall: it does not have any provisions to allow the links in a Link Aggregation
Group to span more than one end-device (switch/server). As a result, designing networks for 5x9s
availability, where switch redundancy is required, becomes a challenge that likely requires a
combination of 802.3ad and Spanning Tree or ECMP.
To address this gap and provide an ultimate resiliency solution in terms of simplicity, scalability,
and convergence, Nortel developed an extension to the Link Aggregation standard. This extension,
called Split-MLT (SMLT), is described in subsequent sections. SMLT is based on the 802.3ad
standard and interfaces transparently to any vendor's switch or server, regardless of whether
it supports the 802.3ad standard or its own proprietary static MLT implementation. SMLT
provides unparalleled resiliency by allowing the links in the 802.3ad/MLT trunk to span more than
one switch. SMLT has been generally available for several years and is widely deployed in very
large enterprise networks in the financial, education, transportation, and healthcare industries.
Split Multi-Link Trunking (SMLT) with VRRP
SMLT is currently supported on the Nortel Ethernet Routing Switch 8600 as well as ERS 1600 &
5500 series and will be supported on other Ethernet Routing Switches such as ERS 8300 in future
releases. SMLT logically aggregates two Ethernet Routing Switch nodes to form one logical entity
known as a Switch Cluster (SC). The two peer nodes in a SC are connected via an Inter-Switch
Trunk (IST). The IST is used to exchange forwarding and routing databases between the two peer
nodes in the Switch Cluster. SMLT is self-contained with SMLT handshaking occurring within the
Switch Cluster only. SMLT is completely transparent to access switches which connect to the Switch
Cluster using 802.3ad standard trunking, or using their own proprietary static MLT implementation.
The diagram below illustrates both the physical and logical connectivity in a typical SMLT
implementation.
Figure 1: Split Multi-Link Trunking (SMLT) Concept
SMLT provides a simple, scalable, active-active design whereby all links and all switches are
actively forwarding traffic. Traffic is automatically balanced across the links and nodes using a
standard hashing algorithm, eliminating any packet duplication. SMLT uses a lightweight protocol
requiring very little CPU; in the event of a link failure or a complete switch failure, it provides
sub-second recovery.
Since SMLT load-balances across all resilient paths using a deterministic hash, it avoids packet
duplication and thus eliminates the need for Spanning Tree.
SMLT uses a Loop-Detect feature and a CP-Limit feature to detect and contain bridged loops in an
SMLT network; refer to the ERS8600 Design Guide for details.
VRRP Engineering with SMLT
From a Layer 2 perspective, if any of the links in the SMLT trunk or IST fails, or if one of the peer
nodes in the Switch Cluster fails, traffic is switched to the other links/switches in under a second.
While this provides Layer 2 resiliency, from a Layer 3 perspective each 8600 must have an
associated IP address for routing (assuming a Layer 3 design) in each active VLAN configured with
SMLT. Nodes on a VLAN using SMLT will typically use a single default gateway pointing to only one
of the two 8600s in the SC. As such, VRRP is used to provide a common default gateway address.
The diagram below
provides an illustration of how traffic is forwarded with SMLT/VRRP.
Figure 2: SMLT Traffic Forwarding with VRRP
In the above example, the top Access switch relies on the MLT hash to select the link towards the
Switch Cluster. Since the selected link is the VRRP Backup, the packets must be switched by the
Backup to the Master via the IST. Further, if the bottom Access switch had only a single link
connecting to the left Distribution switch, the packets would need to traverse the IST again. This
results in unnecessary bouncing of traffic, causing extra bandwidth consumption and delay. To
optimize traffic switching/routing, Nortel uses an extension to the VRRP standard, called VRRP
Backup-Master.
Backup-Master Operation (Dual Active IP Gateways)
The IST allows the two devices in the SC to exchange all MAC address tables. When VRRP
Backup-Master is enabled on the previously "Backup only" peer node, that switch can forward
traffic to the destination directly on behalf of its peer, even though it is operating as a VRRP
Backup. The
is enabled.
Figure 3: SMLT Traffic Forwarding with VRRP Backup-Master
Given the same example described earlier, if VRRP Backup-Master is enabled on the left distribution
switch, when the packet is hashed through the Access switch MLT, the Backup-Master will forward
the packets directly to the destination.
Since VRRP Backup-Master is simply an extension to the standard VRRP protocol, control traffic is
exchanged using multicast traffic; hence the virtual IP address can span the entire L2 network and
consequently the VRRP Backup-Master feature will work anywhere along the entire subnet. For
example, in an SMLT square design (refer to the ERS8600 Design Guide for details on SMLT-square
designs), the virtual IP address can span all 4 nodes in the SMLT-square core whereby one of the
nodes is the VRRP Master and the other three nodes are Backup-Masters. This provides optimal
switching/routing paths within the SMLT-square and eliminates the need for multiple VRRP
instances, significantly simplifying network design, implementation, and operations.
By default, VRRP control information is exchanged every 3 seconds; however with the
implementation of Backup-Master, this timer is irrelevant because the VRRP backup can forward
traffic when Backup-Master is enabled. It is recommended that VRRP Backup-Master be used with
SMLT configurations to optimize traffic flow and fail-over time. For customers not wishing to
use SMLT and VRRP Backup-Master, Nortel has introduced the concept of VRRP-Fast, whereby the
user can adjust the VRRP timers to provide sub-second convergence for VRRP.
Performance and Engineering
VRRP control traffic typically consumes about 4kbps of bandwidth for every 50 VRRP instances
configured. As such, even at the maximum scalability of 255 VRRP instances on the ERS8600,
control traffic is minimal (approximately 20 kbps).
Enabling VRRP-Fast with default timers of 200ms effectively increases this control traffic by 5 times.
This is expected as the default timers for VRRP with VRRP-Fast disabled are 1 second.
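These figures reduce to simple arithmetic. A hedged sketch, assuming the control traffic scales linearly with both the instance count and the advertisement rate (an assumption based on the numbers quoted here):

```python
import math

def vrrp_control_kbps(instances: int, advert_interval_s: float = 1.0) -> float:
    """Estimate VRRP control bandwidth from the ~4 kbps per 50 instances
    figure quoted above for 1-second advertisements, assuming linear
    scaling with instance count and advertisement rate."""
    base_kbps = instances / 50 * 4.0      # at 1 s advertisements
    return base_kbps / advert_interval_s

# 255 instances at default 1 s timers: roughly 20 kbps of control traffic.
assert round(vrrp_control_kbps(255)) == 20

# VRRP-Fast at 200 ms advertisements sends 5x the control traffic.
assert math.isclose(vrrp_control_kbps(255, 0.2), 5 * vrrp_control_kbps(255))
```

Even the VRRP-Fast worst case (around 100 kbps at full scale) is negligible on gigabit links.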
Nortel recommends that Spanning Tree and its associated protocols (802.1w and 802.1s) be avoided
within the Distribution and Core layers. However, Nortel does recommend using the Spanning Tree
protocol on all end-station Access connections to safeguard the network from hubs or other bridging
devices that could be inserted into the network at the end station. A modification to the normal
Spanning Tree learning process is employed in all Nortel edge switches. This feature, known as Fast
Start or Fast Learning, is the recommended setting for all end-station Access ports. Never enable
Fast Start/Learning on any uplink ports; doing so will cause loops in the network and could have
unexpected effects on the entire network.
When using SMLT to connect the Access to the Distribution/Core, always disable Spanning Tree on
the uplink ports/MLT of the edge switch.
Routed Split Multi-Link Trunking
Routed-SMLT (RSMLT) is an enhancement to SMLT enabling the exchange of Layer 3 information
between peer nodes in a Switch Cluster for unparalleled resiliency and simplicity for both L3 and L2.
Layer 2 Access Configuration
In network designs where Layer-2 access/closet switches connect to a Switch Cluster, VRRP is
typically used with Backup-Master, as previously described. In this type of L2 access design, RSMLT
can be used to provide the same function as VRRP with Backup-Master. This is done simply by
enabling RSMLT on the VLAN/subnet on both peer nodes and using one of the IP addresses of
either peer node as the default gateway. The diagram below provides an illustration of how RSMLT
works in a Layer-2 access configuration.
Figure 4: RSMLT for Layer 2 Access
In the above example, hosts are configured with a default gateway pointing to 10.1.1.1. The MLT
hash at the Access switch however forwards the packets towards the left Distribution switch
(10.1.1.2). Since both nodes are configured as RSMLT peers, 10.1.1.2 will forward the packets
directly to the destination on behalf of its peer.
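The forwarding decision illustrated in Figure 4 can be summarized as a simplified sketch. In practice the decision is made on router MAC addresses learned over the IST, not on IP addresses; this Python version is a conceptual simplification:

```python
def rsmlt_should_route(dst_router_ip: str, my_ip: str, peer_ip: str,
                       rsmlt_enabled: bool) -> bool:
    """Decide whether this Switch Cluster node routes a received packet.

    Without RSMLT, a node only routes packets addressed to its own
    gateway IP; packets hashed to the "wrong" peer must cross the IST.
    With RSMLT enabled on the VLAN, each node also routes on behalf of
    its peer, so traffic is forwarded directly regardless of which
    uplink the access switch's MLT hash selected.
    """
    if dst_router_ip == my_ip:
        return True
    return rsmlt_enabled and dst_router_ip == peer_ip

# Hosts use 10.1.1.1 as gateway, but the MLT hashes the frame to 10.1.1.2:
assert rsmlt_should_route("10.1.1.1", my_ip="10.1.1.2",
                          peer_ip="10.1.1.1", rsmlt_enabled=True)
assert not rsmlt_should_route("10.1.1.1", my_ip="10.1.1.2",
                              peer_ip="10.1.1.1", rsmlt_enabled=False)
```

This is why RSMLT can replace VRRP with Backup-Master in L2 access designs: either peer's real IP address serves as the default gateway.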
Layer 3 Configuration
The primary reason for developing RSMLT was to address Layer 3 implementations. The figure
below shows a typical example of this type of configuration, whereby the Distribution switch connects
to the Switch Cluster via SMLT with OSPF enabled between the Distribution switch and the Core
switches. From the Distribution switch's perspective, two OSPF adjacencies are formed: one with
10.1.1.1 and a second with 10.1.1.2. OSPF selects one of the two Core switches as the next-hop for
upstream traffic (e.g. 10.1.1.1); however, the packets may be hashed by the MLT to the left Core
switch (10.1.1.2). Without RSMLT, those packets would have to be sent across the IST trunk to the
second Core switch (10.1.1.1). As with SMLT, it is possible that the destination switch resides off
only the left Core switch, in which case packets would need to traverse the IST a second time. This
example shows the worst-case scenario without RSMLT.
Figure 5: Generic Layer 3 Connectivity
The diagram below illustrates how RSMLT addresses the issue described above. In this case, with
RSMLT enabled on VLAN/subnet 10.1.1.x, packets hashed by the MLT at the Distribution switch
towards the left Core switch are forwarded directly to their destination.
Figure 6: RSMLT for Layer 3 Connectivity
Note: RSMLT provides full resiliency eliminating the need for ECMP and/or any reliability provided
traditionally with a routing protocol. The decision to deploy dynamic routing should therefore be
based solely on the desire to provide autonomous route table population.
Deployment Scenarios
RSMLT provides the ultimate resiliency solution in terms of flexibility, simplicity, and performance.
This section highlights a few common networking examples of where RSMLT can be deployed,
including:

• At the network edge (regardless of whether the edge/closet switches are L2 or L3 switches)
• Between the Distribution layer and the Core layer
• Within the core of the network
• BGP peering with other networks
At the network edge
In general, Layer 3 should be deployed at the edge in large campus environments with many
end-user devices, and Layer 2 in smaller network environments. There is no clear-cut boundary in
terms of how many end-devices would require L2 versus L3 switching in the closet; however, the
suggested threshold is about 3000 users. In large campus environments, L3 may be more expensive,
but it scales much better and provides an architecture that is easy to manage and operate. If the
number of users is less than 3000, and there is no compelling reason to implement L3, Layer 2
switches in the Access layer will minimize network cost and simplify the network design.
The diagram below
provides an illustration of where RSMLT would be deployed at the edge of the network regardless of
whether the access layer is L2 or L3.
Figure 7: General SMLT/RSMLT Architecture
The Access switches connect to a Distribution layer which consists of an SMLT Switch Cluster. If the
Access switches are Layer 2 switches, then the Switch Cluster in the aggregation layer can either
use VRRP with Backup-Master or RSMLT. RSMLT provides a simple, more scalable solution than
VRRP.
The Access VLANs/subnets terminate on the Distribution Switch Cluster(s); with each subnet
assigned an IP address on the SMLT peer nodes. RSMLT is enabled on the peer nodes for all the
user subnets. The gateway for the end-user devices can be the IP address of either Distribution
switch for that respective subnet.
Note: The default RSMLT hold timer is 180 seconds, which is designed for interconnecting to Layer 3
switches; however when connecting Layer 2 switches to an RSMLT Switch Cluster, the hold-up timer
must be configured to 9999, which will allow a node in the Switch Cluster to forward traffic indefinitely
on behalf of its peer.
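The hold-up timer semantics in the note above can be sketched as follows. The treatment of 9999 as "forward indefinitely" follows the note; the function itself is a conceptual simplification, not actual switch code:

```python
def peer_forwarding_active(seconds_since_peer_failure: float,
                           hold_up_timer: int) -> bool:
    """Does a Switch Cluster node still forward on behalf of a failed peer?

    9999 is treated as "forward indefinitely", as required when Layer 2
    switches (which cannot reroute via a routing protocol) connect to
    the cluster. With the 180 s default, a routing protocol is expected
    to converge around the failed peer before the timer expires.
    """
    if hold_up_timer == 9999:      # configured value meaning infinity
        return True
    return seconds_since_peer_failure < hold_up_timer

# L2 access: the surviving peer keeps forwarding indefinitely.
assert peer_forwarding_active(100_000, 9999)
# L3 access with the default timer: forwarding stops after 180 s, by
# which time RIP/OSPF should have removed the failed neighbor.
assert peer_forwarding_active(60, 180)
assert not peer_forwarding_active(200, 180)
```

The choice of timer therefore follows directly from whether the attached devices can reroute on their own.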
If the Access switches are Layer 3 switches, there are two ways to use RSMLT. The first relies on
dynamic routing between the Access switches and the Distribution Switch Cluster. For such
configurations, the default hold-up timer (180 seconds) does not have to be changed because in the
event that one of the Switch Cluster nodes in the Distribution layer fails, the routing protocol will
remove that neighbor from the routing tables within 180 seconds (worst-case with RIP).
An alternative approach relies on statically defined routes. In this scenario, RSMLT must be
configured the same as it would be for a L2 access configuration (i.e. RSMLT enabled on both
Distribution peer nodes for the routing subnet with hold timer configured to 9999 (infinity)).
Regardless of which L3 solution is used, with RSMLT any VLAN/subnet can be extended at Layer 2
throughout the entire campus infrastructure. Also, only a single entry per path is present in the
routing table, unlike ECMP, which installs multiple routing entries for every destination.
Between Distribution and Core Layer
For very large campus designs (i.e. those consisting of 5000+ users) it is highly recommended to
design a 3-tier architecture as it will enable better scalability and overall management of the network.
In a typical 3-tier design which consists of a Core layer, Distribution layer, and an Access layer, the
connection between the Core and Distribution is almost always routed (Layer 3). The use of RSMLT
between these layers provides the ultimate resiliency solution.
From an implementation perspective, typically the Distribution layer consists of a series of SMLT
Switch Clusters, which connect to an SMLT Switch Cluster in the Core, essentially forming an
SMLT-square configuration. In such a configuration, a routing subnet (typically OSPF) is configured
between the layers, forming fully-meshed OSPF neighbors in the SMLT-square. RSMLT is enabled
on the Distribution Switch Clusters as well as on the Core Switch Cluster for the routing subnet. The
default RSMLT hold-up timer (180 seconds) does not have to be changed, as OSPF will remove
any failed interfaces from the routing tables well before 180 seconds.
Within the Core
A typical campus Core consists of an SMLT-square with one Core Switch Cluster connecting to the
user Distribution layer and the other Switch Cluster connecting to the server Distribution layer. In
some cases, the Switch Clusters in the Core may be used to serve two different campuses or two
different Data Centers. In such a configuration, the users and/or servers are typically connected to
different subnets, and it is not necessary to extend the subnets across the entire SMLT-square.
Rather, it is recommended to keep the subnets isolated for security and management reasons. In
such network designs, RSMLT presents an ideal solution.
From an implementation and operations perspective, this is very similar to the RSMLT configuration between the Distribution and Core layers described previously: a routed subnet (usually running OSPF) is configured between the two Switch Clusters in the Core, forming a fully-meshed OSPF design in the SMLT-square. RSMLT is then enabled on the peer nodes in both Switch Clusters to provide optimum routing with sub-second resiliency.
BGP Peering
RSMLT can also be implemented very effectively when doing BGP peering between two networks.
BGP convergence can be extremely demanding on CPU resources and may take significant time, especially when exchanging full Internet routing tables. The diagram below illustrates this type of configuration.
Figure 8: Resilient BGP Peering with RSMLT
From an implementation perspective, the ISP provides a standard 802.3ad trunk to the enterprise network, which terminates on an SMLT Switch Cluster. A BGP subnet is configured between the ISP and the enterprise customer, with RSMLT enabled on the peer nodes in the Switch Cluster. The benefit of this configuration is that the 802.3ad trunk provides a logical connection between the ISP and the enterprise network, while the RSMLT peering provides a logical aggregation of the Core nodes. Since RSMLT provides all the required resiliency, the loss of a BGP peer or changes in BGP state due to local link and switch failures are virtually eliminated, resulting in sub-second failover and avoiding the long delays associated with BGP convergence.
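In sketch form, the enterprise side of such a peering could be configured as follows on each node of the Switch Cluster. The addressing, VLAN ID, and exact CLI syntax are assumptions for illustration only:

```
# VLAN carrying the BGP peering subnet toward the ISP's 802.3ad trunk
# (illustrative values)
config vlan 200 ip create 192.0.2.2/255.255.255.248
config vlan 200 ip rsmlt enable
# BGP peers across this subnet as usual; with RSMLT enabled, local link
# or switch failures are hidden from the BGP session, avoiding a full
# reconvergence of the Internet routing table.
```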
Performance and Engineering
RSMLT uses a very lightweight protocol to exchange database information between the peer nodes; hence, the amount of RSMLT control traffic is negligible, at around 200 bytes per second, regardless of the number of VLANs configured.
There are no architectural or configuration limits defined for RSMLT. A user can configure as many RSMLT instances on the ERS8600 as there are IP addresses. The default limit is 500 IP instances; however, with the MAC address extension kit, the number of IP instances goes up to 1980. Since RSMLT is a lightweight protocol, it is much less CPU-intensive than VRRP and therefore much more scalable.
Design Recommendations
Many of the generic design guidelines for SMLT, VRRP, and RSMLT, such as the number of MLT groups, the number of links per group, timers, the CP-limit option, STP interaction, SLPP operation, etc., are captured in the ERS8600 design guide. Please refer to that document directly for this type of information.
General Engineering
The following are some general engineering considerations and recommendations for SMLT and
RSMLT deployments:
• The uplinks out of a wiring closet must utilize DMLT whenever possible by terminating each of the separate physical links on different switches within the stack or different modules within the chassis.
• Nortel recommends disabling Spanning Tree Protocol on MLT/DMLT groups and ports on both Access and Distribution/Core switches. This is absolutely required when using Split Multi-Link Trunking (SMLT) in the Distribution and Core layers.
• Enable Loop Detect on SMLT ports for added protection, but do not enable Loop Detect on the IST ports. Be sure to enable Loop Detect with the action of port down, as opposed to the action of VLAN down. This feature disables the port where a MAC address incorrectly shows up (the looping port) due to MAC flapping between the correct port and the looping port. Note that Simple Loop Prevention Protocol (SLPP), in release 4.1, will provide this functionality.
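As a rough sketch of the Loop Detect recommendation above (the port ranges are examples, and the exact CLI keywords for the action are an assumption that should be checked against the ERS8600 documentation for the running release):

```
# Enable Loop Detect on SMLT access ports only -- never on the IST ports
config ethernet 1/1-1/16 loop-detect enable
# Use the port-down action so only the looping port is disabled,
# rather than taking down the entire VLAN
config ethernet 1/1-1/16 loop-detect action port-down
```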
Layer 2 Resiliency Engineering
Layer 2 is typically used between the Access and Distribution layers in the case of a 3-tier network design, or between the Access and Core layers in the case of a two-tier network design. Both options, SMLT with VRRP and RSMLT, are available. The following are some guidelines to consider when deciding whether to use RSMLT or SMLT with VRRP.
• The maximum number of supported VRRP instances per node is 255. Scaling VLANs/subnets beyond this requires RSMLT.
• RSMLT does not allow the configuration of a virtual IP address. From an operational and technical perspective, RSMLT provides the same functionality as VRRP, and either of the RSMLT peer node addresses provides the same function as the VRRP virtual IP address.
• Customers using static IP addresses for servers and hosts who wish to extend the virtual IP address (essentially the default gateway of the end-user devices) across more than just the peer nodes in a Switch Cluster (for example, across an SMLT-square configuration) must use VRRP with Backup-Master. RSMLT only works across the IST within the same Switch Cluster and cannot be extended.
• When using VRRP with Backup-Master, it is highly recommended to load-balance the VRRP Master and Backup-Master instances across the peer nodes in a Switch Cluster, as the Backup-Master end of the VRRP requires more processing power.
• If there are fewer than 255 subnets at the edge of the network, either VRRP with Backup-Master or RSMLT can be used:
o From an administrative/configuration perspective, RSMLT is much easier to configure. VRRP requires a VRRP instance on each of the switches in the Switch Cluster, with the appropriate priorities assigned to define the master and backup ends of the VRRP.
o If the SMLT nodes are running services and protocols which demand a high amount of CPU resources (e.g., large OSPF routing tables, STP, etc.), it is recommended to use RSMLT. RSMLT is also recommended if the number of subnets is more than 150.
• When using VRRP with an SMLT configuration, Nortel highly recommends the use of VRRP Backup-Master. However, if the customer environment does not allow any non-standard implementations or extensions to the standard, such as SMLT and VRRP Backup-Master, the customer can enable VRRP-Fast to expedite the fail-over time from 3 seconds to sub-second. Note that VRRP-Fast demands much more CPU resources and is not recommended for more than 100 subnets per switch if that switch is running other services (OSPF, RIP, STP).
• For Layer 2 access configurations using RSMLT, the hold-up timer should be configured to 9999 to allow the node in the Switch Cluster to forward traffic indefinitely on behalf of its peer.
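The VRRP guidelines above can be sketched as follows for a two-subnet example, alternating Master and Backup-Master ownership across the peer nodes. All VLAN IDs, addresses, and priority values are illustrative, and the CLI syntax should be verified against the ERS8600 documentation:

```
# Peer node A: Master for VLAN 10, backup for VLAN 20 (illustrative)
config vlan 10 ip vrrp 1 address 10.1.10.1
config vlan 10 ip vrrp 1 priority 200
config vlan 10 ip vrrp 1 backup-master enable
config vlan 10 ip vrrp 1 enable
config vlan 20 ip vrrp 2 address 10.1.20.1
config vlan 20 ip vrrp 2 priority 100
config vlan 20 ip vrrp 2 backup-master enable
config vlan 20 ip vrrp 2 enable
# Peer node B mirrors the priorities (100 for VLAN 10, 200 for VLAN 20)
# so that Master duty is load-balanced across the Switch Cluster.
```

This alternation spreads the heavier Backup-Master processing evenly, matching the load-balancing recommendation above.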
Layer 3 Resiliency Engineering
In a Layer 3 environment either ECMP or RSMLT can be deployed for resiliency. RSMLT requires
only a single entry per route and is much less demanding on CPU and memory resources for
calculating and storing routing tables. Also, since the RSMLT trunk behaves as one logical link, and
the nodes in the RSMLT Switch Cluster behave as one logical switch, routing convergence will never
take place in the event of a link failure or even a switch failure. Finally, RSMLT allows Layer 2
subnets to be transparently extended over the RSMLT trunk for applications and environments
where this is required.
• If dynamic routing is enabled across the RSMLT trunk, the default hold-up timer of 180 seconds should be sufficient to allow even the slowest routing protocol (RIP) to converge.
• If static default routes are used over the RSMLT trunk, the hold-up timer must be configured to 9999 on the RSMLT peers.
• When running RSMLT, ECMP should be disabled, as it will result in additional unnecessary entries in the routing table.
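For the static-route case above, a sketch of the RSMLT peer configuration might look like this (the VLAN ID is invented for the example, and the command syntax should be verified against the ERS8600 documentation):

```
# Layer 3 RSMLT VLAN relying on static default routes (illustrative)
config vlan 30 ip rsmlt enable
config vlan 30 ip rsmlt holdup-timer 9999   # 9999 = hold up indefinitely
# With no routing protocol to detect the peer's failure, the 9999 setting
# lets each node forward on behalf of its peer indefinitely.
```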
Fail-over Performance
There is no difference in fail-over capability between SMLT and RSMLT. In the event of any link failure in the IST or RSMLT trunks, or the failure of any switch in the SMLT Switch Cluster, traffic will reroute to the remaining links/switches in less than 1 second. Even in very large configurations consisting of hundreds of RSMLT subnets, the fail-over time is still well within the 3 to 5 second window before applications and sessions start timing out.
Contact Us:
For product support and sales information, visit the Nortel Networks website at:
http://www.nortel.com
In North America, dial toll-free 1-800-4Nortel; outside North America, dial 987-288-3700.