Sample Routers and Switches High Capacity Router Routers in a Network Router Design
Router Design
• Overview of Generic Router Architecture
• Input-Queued Switches (Routers)
• IP Address Look-up Algorithms
• Packet Classification Algorithms

BISS 2010: FAN

Routers in a Network
[Figure: routers in a network]

Sample Routers and Switches
• Cisco 12416 Router
  – up to 160 Gb/s throughput
  – up to 10 Gb/s ports
• Juniper Networks T640 Router
  – up to 160 Gb/s throughput
  – up to 10 Gb/s ports
• 3Com 4950
  – 24 port gigabit Ethernet switch

High Capacity Router
• Cisco CRS-1
  – up to 46 Tb/s throughput
• two rack types
  – 640 Gb/s throughput
  – up to 16 line cards
    • up to 40 Gb/s each
  – up to 72 racks

Generic Router Architecture
[Figure: ports 1 to N, each with header processing (IP address lookup in an address table, header update) followed by packet queueing in buffer memory]

Components of a Basic Router
[Figure: II, IPP, CP, OPP, OI, output queue, routing table]
• Input/Output Interfaces (II, OI)
  – convert between optical signals and electronic signals
  – extract timing from received signals
  – encode (decode) data for transmission
• Input Port Processor (IPP)
  – synchronize signals
  – determine required OI or OIs from routing table
• Output Port Processor (OPP)
  – queue outgoing cells
• shared bus interconnects IPPs and OPPs
• Control Processor (CP)
  » configures routing tables
  » coordinates end-to-end channel setup together with neighboring routers

Point-to-Point Switch (3rd Generation)
[Figure: switched backplane connecting line cards (local buffer memory, forwarding table, MAC) and a CPU card (routing table)]
• Typically < 50 Gb/s aggregate capacity

Buffer Placement: Output Port Queuing
• Buffering when the aggregate arrival rate exceeds the output line speed
• Memory must operate at very high speed

Switching Speed-up Needed
[Figure: generic router architecture with ports 1 to N; the output buffer memory runs at N times line rate]

Simple model of output queued switch
[Figure: links 1 to 4, ingress and egress, each at link rate R]

Characteristics of an output queued (OQ) switch
• arriving packets immediately written into output queue, without intermediate buffering
• flow of packets to one output does not affect flow to another output
• OQ switch is work conserving: output line always busy when there is a packet in switch for it
• OQ switch has highest throughput, lowest average delay

Buffer Placement: Input Port Queuing
• Fabric slower than input ports combined
  – So, queuing may occur at input queues
• Head-of-the-Line (HOL) blocking
  – Queued packet at the front of the queue prevents others in queue from moving forward

Simple model of input queued switch
[Figure: links 1 to 4, ingress and egress, each at link rate R]

Head-of-line Blocking
• Packet at the head of an input queue cannot be transferred, thus blocking the following packets (or cells, i.e. packets of fixed size)
[Figure: inputs 1 to 3 and outputs 1 to 3; one packet cannot be transferred because it is blocked by the red packet, another cannot be transferred because the output buffer is full]

Characteristics of an input queued (IQ) switch
• arriving packets written into input queue
• only one packet can be sent to output link at a time
• head-of-line blocking
• IQ switch cannot keep output links fully utilized

Buffer Placement: Design Trade-offs
• Output queues
  – Pro: work-conserving, so maximizes throughput
  – Con: memory must operate at speed N*R
• Input queues
  – Pro: memory can operate at speed R
  – Con: head-of-line blocking for access to output
• Work-conserving: output line is always busy when there is a packet in the switch for it
• Head-of-line blocking: head packet in a FIFO cannot be transmitted, forcing others to wait

What is capacity of IQ: Model [optional: Karol et al Globecom'86]
• Large input-queued switch with
  – single FIFO at each input
  – packet destinations i.i.d. (independently, identically distributed), uniform across outputs
  – HoL blocked packets not flushed
• throughput analysis
  – saturated switch (i.e., there is always an arrival at each input queue)
  – focus on one output port O
  – $X_t$: number of packets destined for O that did not get to O by the end of slot t
  – $D_t$: number of packets (balls) removed from the input ports at the end of slot t
  – $D_t$ is the switch throughput

Model (cont'd)
• $A_{t+1}$: no. of new HOL packets in input ports with destination O
• $X_{t+1} = (X_t - 1)^+ + A_{t+1}$
• where $P(A_{t+1} = k) = \binom{D_t}{k} (1/N)^k (1 - 1/N)^{D_t - k}$
• $E(D_t) = \rho N$ where $\rho$ is output throughput
• for large N, binomial distribution can be approximated by Poisson distribution, $P(A_t = k) \approx \frac{\rho^k}{k!} e^{-\rho}$

Model (cont'd)
• $EX = \frac{E(A^2) + EA - 2(EA)^2}{2(1 - EA)}$
• where $EA = \rho$, $E(A^2) = \rho + \rho^2$, therefore $EX = \frac{2\rho - \rho^2}{2(1 - \rho)}$
• $EX = 1$ (in saturation every input holds one HOL packet, so there is on average one per output), therefore $1 = \frac{2\rho - \rho^2}{2(1 - \rho)}$, i.e. $\rho^2 - 4\rho + 2 = 0$
• and $\rho = 2 - \sqrt{2} \approx 58.6\%$

A Router with Input Queues: Head of Line Blocking
[Plot: delay versus load, 0% to 100%, with the limit $2 - \sqrt{2} \approx 58\%$ marked]

Solution to Avoid Head-of-line Blocking
• How to improve capacity without increasing switching fabric speed?
• Maintain at each input N virtual queues, i.e., one per output
  – use non-FIFO scheduler, matching input/output
[Figure: inputs 1 to 3 with per-output queues, outputs 1 to 3]

Virtual Output Queueing
• assume fixed length packets
• each input manages a separate queue per output
• in each time slot, the matching scheduler finds the best possible set of packets to send from inputs to outputs
• maximum-weight matching
[Figure: inputs 1 to N, matching scheduler, outputs 1 to N]

Matching
• $L_{ij}(t)$: no. of packets at input i for output j at t
• bipartite graph $(V_1 \cup V_2, E)$, $E \subseteq V_1 \times V_2$
  – $V_1$, $V_2$: inputs, outputs
  – $(i,j) \in E$ iff $L_{ij}(t) > 0$
• matching: subset of E such that no two edges are adjacent
[Figure: bipartite graph from inputs to outputs with edge weights 19, 3, 4, 1, 21, 18, 7]

Scheduling Algorithms
[Figure: three matchings on a graph with edge weights 19, 18, 7, 1]
• Practical maximal matchings: not stable
• Max size matching: not stable
• Max weight matching: stable

Switch Algorithms
[Figure: matching algorithms arranged between "easier to implement" and "better performance"]
• Maximal matching: not stable
• Max size matching: not stable
• Max weight matching: stable, low backlogs

Better Matching Algorithms
• Need simple algorithms that perform well
  – efficient packet processing (packets at line speeds)
  – high throughput
  – low latencies/backlogs
• Randomized algorithms with linear complexity available
  – Tassiulas' Randomized Algorithm
  – LAURA
  – SERENA
• Use randomization, history, problem structure and arrival information

Combined Input-Output Queued (CIOQ) Routers
[Figure: input interface, backplane, output interface]
• Both input and output interfaces store packets
• Advantages
  – Easy to build
    • Utilization 1 can be achieved with limited input/output speedup (<= 2)
• Disadvantages
  – Harder to design algorithms
    • Two congestion points
    • Need to design flow control

OQ Emulation
• Each input and output maintains a preference list
• Input preference list: list of cells at that input ordered in the inverse order of their arrival
• Output preference list: list of all input cells to be forwarded to that output, ordered by the times they would be served in an Output Queueing schedule
• Use the Gale-Shapley Algorithm (GSA) to match inputs to outputs
  – Outputs initiate the matching
• Can emulate all work-conserving schedulers

Output Queue Emulation using CIOQ (with Speed-up): Example
[Figure: example]

Stable Matching: Gale-Shapley Algorithm (GSA)
  While there is an unmatched output that has not been rejected by all inputs do
    Each unmatched output requests its most preferred packet from an input that has not rejected it yet
    Each input grants the request to the output with the most preferred cell
• A stable matching exists for every set of preference lists
• Complexity: worst-case $O(N^2)$
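
The request/grant loop of the Gale-Shapley algorithm can be sketched in a few lines of Python. This is a minimal illustration, not the slides' implementation: it assumes preferences are given per port rather than per cell (each output ranks inputs, each input ranks outputs), and the names `gale_shapley`, `is_stable`, `output_prefs` and `input_prefs` are hypothetical. Outputs initiate the matching, as in the OQ emulation scheme.

```python
from collections import deque


def gale_shapley(output_prefs, input_prefs):
    """Output-initiated stable matching (sketch, port-level preferences).

    output_prefs[o]: inputs in decreasing order of preference for output o.
    input_prefs[i]:  outputs in decreasing order of preference for input i.
    Returns a dict mapping each matched output to its input.
    """
    # rank[i][o]: position of output o in input i's list (lower is better)
    rank = {i: {o: r for r, o in enumerate(prefs)}
            for i, prefs in input_prefs.items()}
    next_choice = {o: 0 for o in output_prefs}  # next input each output will ask
    granted = {}                                # input -> output it currently grants
    free = deque(output_prefs)                  # unmatched outputs

    while free:
        o = free.popleft()
        prefs = output_prefs[o]
        if next_choice[o] >= len(prefs):
            continue  # rejected by every input on its list: stays unmatched
        i = prefs[next_choice[o]]
        next_choice[o] += 1
        if o not in rank.get(i, {}):
            free.append(o)  # input i has nothing for o: counts as a rejection
            continue
        current = granted.get(i)
        if current is None:
            granted[i] = o
        elif rank[i][o] < rank[i][current]:
            granted[i] = o
            free.append(current)  # displaced output must request again
        else:
            free.append(o)  # request rejected
    return {o: i for i, o in granted.items()}


def is_stable(matching, output_prefs, input_prefs):
    """True if no output and input both prefer each other to their partners."""
    held_by = {i: o for o, i in matching.items()}
    for o, prefs in output_prefs.items():
        for i in prefs:
            if matching.get(o) == i:
                break  # remaining inputs are less preferred by o
            ranking = input_prefs.get(i, [])
            if o not in ranking:
                continue
            held = held_by.get(i)
            if held is None or ranking.index(o) < ranking.index(held):
                return False
    return True


# Hypothetical 3x3 example: outputs "a", "b", "c"; inputs 0, 1, 2.
output_prefs = {"a": [0, 1, 2], "b": [0, 2, 1], "c": [1, 0, 2]}
input_prefs = {0: ["b", "a", "c"], 1: ["a", "c", "b"], 2: ["a", "b", "c"]}
matching = gale_shapley(output_prefs, input_prefs)
print(sorted(matching.items()))  # [('a', 1), ('b', 0), ('c', 2)]
print(is_stable(matching, output_prefs, input_prefs))  # True
```

Each output asks each input at most once, so the loop runs at most N^2 times, matching the worst-case bound quoted above.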
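
The saturation throughput of a FIFO input-queued switch, 2 - √2 ≈ 58.6% in the head-of-line blocking model, can also be checked numerically. The sketch below is an assumption-laden illustration rather than the analysis from the slides: it simulates a saturated switch with uniform i.i.d. destinations, lets each contended output pick one head-of-line packet uniformly at random, and uses a hypothetical function name `saturated_iq_throughput`.

```python
import random


def saturated_iq_throughput(n_ports, n_slots, seed=0):
    """Estimate per-port throughput of a saturated FIFO input-queued switch."""
    rng = random.Random(seed)
    # hol[i]: destination output of the head-of-line packet at input i.
    # Saturation: a new packet is always waiting behind the one just served.
    hol = [rng.randrange(n_ports) for _ in range(n_ports)]
    delivered = 0
    for _ in range(n_slots):
        # Group inputs by the output their head-of-line packet wants.
        contenders = {}
        for i, dest in enumerate(hol):
            contenders.setdefault(dest, []).append(i)
        # Each requested output serves exactly one contender per slot;
        # the losers stay blocked at the head of their queues.
        for inputs in contenders.values():
            winner = rng.choice(inputs)
            hol[winner] = rng.randrange(n_ports)  # next packet reaches the head
            delivered += 1
    return delivered / (n_ports * n_slots)


estimate = saturated_iq_throughput(n_ports=64, n_slots=2000)
print(f"estimated throughput: {estimate:.3f}, large-N limit: {2 - 2 ** 0.5:.3f}")
```

For small switches the estimate is noticeably higher (0.75 for two ports) and it should approach the limit as the port count grows.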