MAINFRAME
Technical Brief: Cascaded FICON in a Brocade Environment
Cascaded FICON introduces the open systems SAN concept of Inter-Switch
Links (ISLs). IBM now supports the flow of traffic from the processor through
two FICON directors connected via an ISL and on to the peripheral devices
such as disk and tape. This paper discusses the benefits and some technical
aspects of cascaded FICON in a Brocade environment.
CONTENTS
Introduction
The Evolution from ESCON to FICON Cascading
    What is Cascaded FICON?
    High Availability (HA), Disaster Recovery (DR), and Business Continuity (BC)
Benefits of FICON Cascading
    Optimizing Use of Storage Resources
    Cascaded FICON Performance
Buffer-to-Buffer Credit Management
    About BB Credits
        Packet Flow and Credits
        Buffer-to-Buffer Flow Control
    Implications of Asset Deployment
    Configuring BB Credit Allocations on FICON Directors
    BB Credit Exhaustion and Frame Pacing Delay
        What is the difference between frame pacing and frame latency?
        What can you do to eliminate or circumvent frame pacing delay?
        How can you make improvements?
        Dynamic Allocation of BB Credits
Technical Discussion of FICON Cascading
    Fabric Addressing Support
    High Integrity Enterprise Fabrics
    Managing Cascaded FICON Environments and ISLs: Link Balancing and Aggregation
Best Practices for FICON Cascaded Link Management
    Terms and Definitions
    Frame-level Trunking Implementation
    Brocade M-Series Director Open Trunking
        Use of Data Rate Statistics by Open Trunking
        Rerouting Decision Making
        Checks on the Cost Function
        Periodic Rerouting
        Algorithms to Gather Data
        Summary of Open Trunking Parameters
        Fabric Tuning Using Open Trunking
        Open Trunking Enhancements
        Open Trunking Summary
    Controlling FICON Cascaded Links in More Demanding Environments
        Preferred Path on M-Series FICON Switches
        Prohibit Paths
    Traffic Isolation Zones on B-Series FICON Switches
        TI Zones Best Practices
Summary
Appendix: Fibre Channel Class 4 Class of Service (CoS)
INTRODUCTION
Prior to the introduction of support for cascaded FICON director connectivity on IBM zSeries mainframes in
January 2003, only a single level of FICON directors was supported for connectivity between a processor
and peripheral devices. Cascaded FICON introduced the open systems Storage Area Network (SAN) concept
of Inter-Switch Links (ISLs). IBM now supports the flow of traffic from the processor through two FICON
directors connected via an ISL to the peripheral devices, such as disk and tape.
This paper starts with a brief discussion of cascaded FICON, its applications, and the benefits of a cascaded
FICON architecture. The next section provides a technical discussion of buffer-to-buffer credits (BB credits),
open exchanges, and performance. The final section describes management of a cascaded FICON
architecture, including ISL trunking and the Traffic Isolation capabilities unique to Brocade®.
FICON, like most technological advancements, evolved from the limitations of its predecessor—the IBM
Enterprise System Connection (ESCON) protocol—a successful storage network protocol for mainframe
systems considered the parent of the modern SAN. IBM Fiber Connection (FICON) was initially developed to
address the limitations of the ESCON protocol. In particular, FICON addresses ESCON's addressing, bandwidth, and distance limitations. FICON has evolved rapidly since the initial FICON bridge mode (FCV) implementations came to the data center, from FCV to single-director FICON Native (FC) implementations, to configurations that intermix FICON (FC) and open systems Fibre Channel Protocol (FCP) traffic, and now to cascaded fabrics of FICON directors. FICON support for cascaded directors has been available on the IBM zSeries since 2003 and is supported on System z processors as well.
Cascaded FICON allows a FICON Native (FC) channel or a FICON CTC channel to connect a zSeries/System z
server to another similar server or peripheral device such as disk, tape library, or printer via two Brocade
FICON directors or switches. A FICON channel in FICON Native mode connects one or more processor
images to an FC link, which connects to the first FICON director, then dynamically through the first director
to one or more ports, and from there to a second cascaded FICON director. From the second director there
are Fibre Channel links to FICON Control Unit (CU) ports on attached devices. These FICON directors can be
geographically separate, providing greater flexibility and fiber cost savings. All FICON directors connected
together in a cascaded FICON architecture must be from the same vendor (such as Brocade). Initial support
by IBM is limited to a single hop between cascaded FICON directors; however, the directors can be
configured in a hub-star architecture with up to 24 directors in the fabric.
NOTE: In this paper the term “switch” is used to reference a Brocade hardware platform (switch, director, or
backbone) unless otherwise indicated.
Cascaded FICON allows Brocade customers tremendous flexibility and the potential for fabric cost savings
in their FICON architectures. It is extremely important for business continuity/disaster recovery
implementations. Customers looking at these types of implementations can realize significant potential savings in fiber infrastructure and channel adapter costs by reducing the number of channels needed to connect two geographically separate sites with high-availability FICON connectivity at increased distances.
Brocade (via the acquisitions of CNT/Inrange and McDATA) has a long and distinguished history of working
closely with IBM in the mainframe environment. This history includes manufacture of IBM’s 9032 line of
ESCON directors, the CD/9000 ESCON directors, the FICON bridge cards, and the first FICON Native (FC)
directors with the McDATA ED-5000 and Inrange FC/9000. Brocade’s second generation of FICON
directors, the legacy McDATA Intrepid 6064, Brocade M6140, Brocade 24000, and Brocade Mi10K, are the
foundation of many FICON storage networks. Brocade continues to lead the way in cascaded FICON with the
Brocade 48000 Director and the Brocade DCX Backbone.
THE EVOLUTION FROM ESCON TO FICON CASCADING
In 1990 the ESCON channel architecture was introduced as the way to address the limitations of parallel
(bus and tag) architectures. ESCON provided noticeable, measurable improvements in distance capabilities,
switching topologies and, most importantly, response time and service time performance. By the end of the
1990s, ESCON’s strengths over parallel channels had become its weaknesses. FICON evolved in the late
1990s to address the technical limitations of ESCON in bandwidth, distances, and channel/device
addressing with the following features:
• Increased number of concurrent connections
• Increased distance
• Increased channel device addressing support
• Increased link bandwidth
• Increased distance before the onset of the data droop effect
• Greater exploitation of priority I/O queuing
Initially, the FICON (FC-SB-2) architecture did not allow the connection of multiple FICON directors. (Neither did ESCON, except when static connections of "chained" ESCON directors were used to extend ESCON distances.) Both ESCON and FICON defined a single byte for the link address, the link address being the port attached to "this" director. This changed in January 2003. Now it is possible to have two-director configurations spanning separate geographic sites. This is done by adding the domain field of the Fibre Channel destination ID to the link address, specifying both the exit director and the link address on that director.
What is Cascaded FICON?
Cascaded FICON refers to an implementation of FICON that allows one or more FICON channel paths to be defined over two FICON switches connected to each other by an Inter-Switch Link (ISL). The processor
interface is connected to one switch, while the storage interface is connected to the other. This
configuration is supported for both disk and tape, with multiple processors, disk subsystems, and tape
subsystems sharing the ISLs between the directors. Multiple ISLs between the directors are also supported.
Cascading between a director and a switch, for example from a Brocade 48000 director to a Brocade 5000, is also supported.
There are hardware and software requirements specific to cascaded FICON:
• The FICON directors themselves must be from the same vendor (that is, both should be from Brocade).
• The mainframes must be zSeries machines or System z processors: z800, z890, z900, z990, z9 BC, or z9 EC. Cascaded FICON requires 64-bit architecture to support the 2-byte addressing scheme. Cascaded FICON is not supported on 9672 G5/G6 mainframes.
• z/OS version 1.4 or greater, and/or z/OS version 1.3 with the required PTFs/MCLs to support 2-byte link addressing (DRV3g and MCL (J11206) or later).
• The high integrity fabric feature for the FICON switch must be installed on all switches involved in the cascaded architecture. For Brocade M-Series directors or switches, this is known as SANtegrity Binding, and it requires M-EOS firmware version 4.0 or later. For the Brocade 5000 Switch and the 24000 and 48000 Directors, this requires Secure Fabric OS® (SFOS).
High Availability (HA), Disaster Recovery (DR), and Business Continuity (BC)
The greater bandwidth and distance capabilities of FICON over ESCON are making it an essential and cost-effective component in HA/DR/BC solutions, which is the primary reason mainframe installations are adopting cascaded FICON architectures. Since September 11, 2001, more and more companies have been bringing DR/BC in-house ("insourcing") and building the mainframe component of their new DR/BC
data centers using FICON rather than ESCON. Until IBM released cascaded FICON, the FICON architecture
was limited to a single domain due to the single-byte addressing limitations inherited from ESCON. FICON
cascading allows the end user to have a greater maximum distance between sites (up to an unrepeated
distance of 36 km at 2 Gbit/sec bandwidth). For details, see Tables 1 and 2.
Following September 11, 2001, industry participants met with government agencies, including the United
States Securities and Exchange Commission (SEC), the Federal Reserve, the New York State Banking
Department, and the Office of the Comptroller of the Currency. These meetings were held specifically to
formulate and analyze the lessons learned from the events of September 11, 2001. These agencies
released an interagency white paper, and the SEC released its own paper, on best practices to strengthen
the IT resilience of the US financial system. These events underlined how critical it is for an enterprise to be
prepared for disaster—even more for large enterprise mainframe customers. Disaster recovery is no longer
limited to problems such as fires or a small flood. Companies now need to consider and plan for the
possibility of the destruction of their entire data center and the people that work in it. A great many articles,
books and other publications have discussed the IT lessons learned from September 11, 2001:
• To manage business continuity, it is critical to maintain geographical separation of facilities and resources. Any resource that cannot be replaced from external sources within the Recovery Time Objective (RTO) should be available within the enterprise. It is also preferable to have these resources (buildings, hardware, software, data, and staff) in multiple locations. Cascaded FICON gives the geographical separation required; ESCON does not.
• The most successful DR/BC implementations are often based on as much automation as possible, since key staff and skills may no longer be present after a disaster strikes.
• Financial, government, military, and other enterprises now have critical RTOs measured in seconds and minutes, not days and hours. For these end users it has become increasingly necessary to implement an insourced DR solution. This means that the facilities and equipment needed for the HA/DR/BC solution are owned by the enterprise itself. In addition, cascaded FICON allows for considerable cost savings compared with ESCON.
• A regional disaster could cause multiple organizations to declare disasters and initiate recovery actions simultaneously. This is highly likely to severely stress the capacity of business recovery services (outsourced) in the vicinity of the regional disaster. Business continuity service companies typically work on a "first come, first served" basis. So when a regional disaster occurs, these outsourcing facilities can fill up quickly and be overwhelmed. Also, a company's contract with the BC/DR outsourcer may stipulate that the customer has the use of the facility only for a limited time (for example, 45 days). This may spur companies with BC/DR outsourcing contracts to a) consider changing outsourcing firms, b) re-negotiate an existing contract, or c) study the requirements and feasibility for insourcing their BC/DR and creating their own DR site. Depending on an organization's RTO and Recovery Point Objective (RPO), option c) may be the best alternative.
• The recovery site must have adequate hardware, and the hardware at the recovery site must be compatible with the hardware at the primary site. Organizations must plan for their recovery site to have a) sufficient server processing capacity, b) sufficient storage capacity, and c) sufficient networking and storage networking capacity to enable all business critical applications to be run from the recovery site. The installed server capacity at the recovery site may be used to meet day-to-day needs (assuming BC/DR is insourced). Fallback capacity may be provided via several means, including workload prioritization (test, development, production, and data warehouse).
Fallback capacity may also be provided via a capacity upgrade scheme based on changing a license
agreement versus installing additional capacity. IBM System z and zSeries servers have the Capacity
Backup Option (CBU). Unfortunately in the open systems world, this feature is not common. Many
organizations will take a calculated risk with open systems and not purchase two duplicate servers (one
for production at the primary data center and a second for the DR data center). Therefore, open
systems DR planning must account for this possibility and pose the question "What can I lose?"
• A robust BC/DR solution must be based on as much automation as possible. It is too risky to assume that key personnel with critical skills will be available to restore IT services. Regional disasters impact personal lives as well. Personal crises and the need to take care of families, friends, and loved ones will take priority for IT workers. Also, key personnel may not be able to travel and will be unable to get to the recovery site. Mainframe installations are increasingly looking to automate switching resources from one site to another. One way to do this in a mainframe environment is with a cascaded FICON Geographically Dispersed Parallel Sysplex (GDPS).
• If an organization is to maintain business continuity, it is critical to maintain sufficient geographical separation of facilities, resources, and personnel. If a resource cannot be replaced from external sources within the RTO, it needs to be available internally and in multiple locations. This statement holds true for hardware resources, employees, data, and even buildings. An organization also needs to have a secondary disaster recovery plan. Companies that successfully recover to their designated secondary site after losing their entire primary data center quickly come to the realization that all of their data is now in one location. If disaster events continue, or if there is not sufficient geographic separation and a recovery site is also incapacitated, there is no further recourse (no secondary plan) for most organizations.
What about the companies that initially recover at a third-party site with contractual agreements calling for them to vacate the facility within a specified time period? What happens when you do not have a primary site to go back to? The prospect of further regional disasters necessitates asking the question "What is our secondary disaster recovery plan?"
This has led many companies to seriously consider implementing a three-site BC/DR strategy. What this strategy entails is two sites within the same geographic vicinity to facilitate high availability and a third, remote site for disaster recovery. The major objection to a three-site strategy is telecommunication costs, but as with any major decision, a proper risk vs. cost analysis should be performed.
• Asynchronous remote mirroring becomes a more attractive option for organizations insourcing BC/DR and/or increasing the distance between sites. While synchronous remote mirroring is popular, many organizations are starting to give serious consideration to greater distances between sites and to a strategy of asynchronous remote mirroring to allow further separation between their primary and secondary sites.
HA/DR/BC implementations including GDPS, remote Direct Access Storage Device (DASD) mirroring,
electronic tape/virtual tape vaulting, and remote DR sites are all facilitated by cascaded FICON.
BENEFITS OF FICON CASCADING
Cascaded FICON delivers to the mainframe space many of the same benefits of open systems SANs. It
allows for simpler infrastructure management, decreased infrastructure cost of ownership, and higher data
availability. This higher data availability is important in delivering a more robust enterprise DR strategy.
Further benefits are realized when the ISLs connect switches in two or more locations and/or are extended
over long distances. Figure 1 shows a non-cascaded two-site environment.
Figure 1. Two sites in a non-cascaded FICON environment
In Figure 1, all hosts have access to all of the disk and tape subsystems at both locations. The host
channels at one location are extended to the Brocade 48000 or Brocade DCX (FICON) platforms at the
other location to allow for cross-site storage access. If each line represents two FICON channels, then this
configuration would need a total of 16 extended links; and these links would be utilized only to the extent
that the host has activity to the remote devices.
The most obvious benefit of cascaded versus non-cascaded is the reduction in the number of links across
the Wide Area Network (WAN). Figure 2 shows a cascaded, two-site FICON environment.
In this configuration, if each line represents two channels, only 4 extended links are required. Since FICON
is a packet-switched protocol (versus the circuit-switched ESCON protocol), multiple devices can share the
ISLs, and multiple I/Os can be processed across the ISLs at the same time. This allows for a reduction in the number of links between sites and more efficient utilization of the links in place. In addition, ISLs
can be added as the environment grows and traffic patterns dictate.
This is the key way in which a cascaded FICON implementation can reduce the cost of the enterprise
architecture. In Figure 2, the cabling schema for both intersite and intrasite has been simplified. Fewer
intrasite cables translate into decreased cabling hardware and management costs. It also reduces the
number of FICON adapters, director ports, and host channel card ports required, thus decreasing the
connectivity cost for mainframes and storage devices as well. In Figure 2, the sharing of links between the
two sites reduces the number of physical channels between sites, thereby lowering the cost by
consolidating channels and the number of director ports. The faster the channel speeds between sites, the
better the intersite cost savings from this consolidation. With 4 Gbit/sec and 10 Gbit/sec FICON available, this option becomes even more attractive.
Another benefit to this approach, especially over long distances, is that the Brocade FICON director typically
has many more buffer credits per port than do the processor and the disk or tape subsystem cards. More
buffer credits allow for a link to be extended to greater distances without significantly impacting response
times to the host.
Figure 2. Two sites in a cascaded FICON environment
Optimizing Use of Storage Resources
ESCON limits the amount of capacity in terabytes (TB) that a customer can realistically have in a single DASD array because of device addressing limitations. Rather than filling a frame to capacity, customers must purchase additional frames, wasting capacity. For example, running Mod 3 volumes in an ESCON environment typically leads to running out of available addresses between 3.3 and 3.5 TB. This is significant because it requires more disk array footprints at each site, and:
• The technology of DASD arrays places a limit on the number of CU ports inside, and there is a limit of 8 links per LCU. These 8 links can only perform so fast.
• This also limits the I/O density (I/Os per GB per second) into and out of the frame, placing a cap on the amount of disk space the frame can support and still supply reasonable I/O response times.
Cascaded FICON lets customers fully utilize their old disk arrays, preventing them from having to "throttle back" I/O loads and letting them make the most efficient use of technologies such as Parallel Access Volumes (PAVs).
Additionally, a cascaded FICON environment requires fewer fiber adapters on storage devices and
mainframes.
Cascaded FICON allows for Total Cost of Ownership (TCO) savings in an installation’s mainframe
tape/virtual tape environment. FICON platforms such as the Brocade 48000 and DCX are “5 nines”
devices. The typical enterprise-class tape drive is only 2 or 3 nines at best due to all of the moving
mechanical parts. A FICON port on a Brocade DCX (or any FICON enterprise-class platform) typically costs
twice as much as a FICON port on a Brocade 5000 FICON switch. (The FICON switch is not a “5 nines”
device, while the FICON director is.) However, it may not make sense to connect “3 nines” tape drives to “5
nines” directors, when the best reliability achieved is that of the lowest common denominator (the tape
drive). Depending on your exact configuration, it can make more financial sense to connect tape drives to
Brocade 5000 FICON switches cascaded to a Brocade DCX (FICON), thus saving the more expensive
director ports for host and/or DASD connectivity.
Cascaded FICON Performance
Seven main factors affect the performance of a cascaded FICON director configuration (IBM white paper on
Cascaded FICON director performance considerations, Cronin and Bassener):
1. The number of ISLs between the two cascaded FICON directors and the routing of traffic across ISLs
2. The number of FICON/FICON Express channels whose traffic is being routed across the ISLs
3. The ISL link speed
4. Contention for director ports associated with the ISLs
5. The nature of the I/O workload (I/O rates, block sizes, use of data chaining, and read/write ratio)
6. The distances of the paths between the components of the configuration (the FICON channel links from processor(s) to the first director, the ISLs between directors, and the links from the second director to the storage control unit ports)
7. The number of switch port buffer-to-buffer credits
The last factor—the number of buffer-to-buffer credits and their management—is typically the one examined most carefully, and the one that is most often misunderstood.
BUFFER-TO-BUFFER CREDIT MANAGEMENT
The introduction of the FICON I/O protocol to the mainframe I/O subsystem provided the ability to process
data rapidly and efficiently. As a result of two main changes that FICON made to the mainframe channel I/O infrastructure, the requirement for a new Resource Measurement Facility (RMF) record came into being. The first change was that unlike ESCON, FICON uses buffer credits to account for packet delivery. The second change was the introduction of FICON cascading, which was not possible with ESCON.
Buffer-to-buffer credits (BB credits) and their management in a FICON environment are an often misunderstood concept. Buffer-to-buffer credit management does have an impact on performance over
distances in cascaded FICON environments. At present, there is no good way to track BB credits being used.
At initial configuration, BB credits are allocated but not managed. As a result, the typical FICON shop
assigns a large number of BB credits for long-distance traffic. Just as assigning too many aliases to a base
address in managing dynamic PAVs can lead to configuration issues due to addressing constraints,
assigning too many BB credits can lead to director configuration issues, which can require outages to
resolve. Mechanisms for detecting BB credit starvation in a FICON environment are extremely limited.
This section reviews the concept of BB credits, including current schema for allocating them. It then
discusses the only way to detect BB credit starvation on FICON directors, including the concept of frame
pacing delay. Finally a mechanism to count BB credits used is outlined, and then another theoretical
“drawing board” concept is described: dynamic allocation of BB credits on an individual I/O basis similar to
the new HyperPAVs concept for DASD.
About BB Credits
This section is an overview of BB credits; for a more detailed discussion, consult Robert Kembel’s
“Fibre Channel Consultant” series.
Packet Flow and Credits
The fundamental objective of flow control is to prevent a transmitter from overrunning a receiver by allowing
the receiver to pace the transmitter, managing each I/O as a unique instance. At extended distances,
pacing signal delays can result in degraded performance. Buffer-to-buffer credit flow control is used to
transmit frames from the transmitter to the receiver and pacing signals back from the receiver to the
transmitter. The basic information carrier in the FC protocol is the frame. Other than ordered sets, which are
used for communication of low-level link conditions, all information is contained in the frames. A good
analogy to a frame is an envelope: When you send a letter via the United States Postal Service (USPS), the
letter is “encapsulated” in an envelope. When sending data via a FICON network, the data is encapsulated
in a frame (although service times in a FICON network are better than those of the USPS).
To prevent a target device (either host or storage) from being sent more frames than it has buffer
memory to store (overrun), the FC architecture provides a flow control mechanism based on a system
of credits. Each credit represents the ability of the receiver to accept a frame. Simply stated, a
transmitter cannot send more frames to a receiver than the receiver can store in its buffer memory.
Once the transmitter exhausts the frame count of the receiver, it must wait for the receiver to credit
back frames to the transmitter. A good analogy is a pre-paid calling card: there are a certain number
of minutes, and you can talk until there is no more time on the card.
Flow control exists at both the physical and logical level. The physical level is called “buffer-to-buffer
flow control” and manages the flow of frames between transmitters and receivers. The logical level
is called “end-to-end flow control” and it manages the flow of a logical operation between two end
nodes. It is important to note that a single end-to-end operation may have made multiple transmitterto-receiver pair hops (end-to-end frame transmissions) to reach its destination. However, the presence
of intervening directors and/or ISLs is transparent to end-to-end flow control. Buffer-to-buffer flow
control is the more crucial subject in a cascaded FICON environment.
Buffer-to-Buffer Flow Control
Buffer-to-buffer flow control is flow control between two optically adjacent ports in the I/O path (that is,
transmission control over individual network links). Each FC port has dedicated sets of hardware buffers
for send and receive operations. These buffers are more commonly known as “BB credits.”
The number of available BB credits defines the maximum amount of data that can be transmitted prior
to an acknowledgment from the receiver. BB credits are physical memory resources incorporated in the
Application Specific Integrated Circuit (ASIC) that manages the port. It is important to note that these
memory resources are limited. Moreover, the cost of the ASICs increases as a function of the size of the
memory resource. One important aspect of Fibre Channel is that adjacent nodes do not have to have the
same number of credits. Rather, adjacent ports communicate with each other during Fabric LOGIn (FLOGI)
and Port LOGIn (PLOGI) to determine the number of credits available for the send and receive ports on each
node.
A BB credit can transport a 2,112-byte frame of data. The FICON FC-SB-2 and FC-SB-3 ULPs use 64 bytes
of this frame for addressing and control, leaving 2 K available for z/OS data. In the event that a 2 Gbit/sec
transmitter is sending full 2,112-byte frames, 1 credit is required for every 1 km of fiber between the sender
and receiver. Unfortunately, z/OS disk workloads rarely produce full frames. For a 4 K transfer, the average frame size is 819 bytes. Therefore, 5 credits would be required per km of distance as a result of the decreased average frame size. It is important to note that increasing the fiber speed increases the number of credits required to support a given distance. In other words, every time the link speed doubles, the number of BB credits required to avoid transmission delays over a specified distance also doubles.
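To make these rules of thumb concrete, the short Python sketch below estimates the BB credits needed to keep a link streaming over a given distance. It is a back-of-the-envelope illustration only: the one-way fiber delay of roughly 5 microseconds per kilometer, the nominal data rates, and the helper name are assumptions introduced here, and the simplification ignores protocol overhead, so planning tables (such as Table 1 later in this paper) can show somewhat different values.

import math

# Rough nominal data rates in MB/s for common FC link speeds (illustrative only).
LINK_MBPS = {"1G": 100, "2G": 200, "4G": 400, "8G": 800, "10G": 1200}

FIBER_DELAY_US_PER_KM = 5.0   # approximate one-way propagation delay in fiber

def bb_credits_needed(distance_km, link, avg_frame_bytes=2112):
    """Estimate BB credits required to keep a link of 'distance_km' streaming.

    A credit is not returned until the R_RDY makes the round trip, so the link
    holds roughly round_trip_time / frame_transmit_time frames at any instant.
    """
    frame_tx_us = avg_frame_bytes / LINK_MBPS[link]          # time to serialize one frame
    round_trip_us = 2 * distance_km * FIBER_DELAY_US_PER_KM  # frame out, R_RDY back
    return math.ceil(round_trip_us / frame_tx_us)

# Full 2,112-byte frames at 2 Gbit/sec work out to roughly 1 credit per km...
print(bb_credits_needed(50, "2G"))                    # about 48 credits for 50 km
# ...while the small frames typical of z/OS disk I/O need several times more.
print(bb_credits_needed(50, "2G", avg_frame_bytes=819))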
BB credits are used by Class 2 and Class 3 service and rely on the receiver sending back receiver-readies
(R_RDY) to the transmitter. As was previously discussed, node pairs communicate their number of credits
available during FLOGI/PLOGI. This value is used by the transmitter to track the consumption of receive
buffers and pace transmissions if necessary. FICON directors track the available BB credits in the following
manner:
• Before any data frames are sent, the transmitter sets a counter equal to the BB credit value communicated by its receiver during FLOGI.
• For each data frame sent by the transmitter, the counter is decremented by one.
• Upon receipt of a data frame, the receiver sends a status frame (R_RDY) to the transmitter, indicating that the data frame was received and that the buffer is ready to receive another data frame.
• For each R_RDY received by the transmitter, the counter is incremented by one.
As long as the transmitter count is a non-zero value, the transmitter is free to continue sending data. This mechanism allows the transmitter to have a maximum number of data frames in transit equal to the BB credit value; inspecting the transmitter counter indicates the number of receive buffers still available.
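As a purely illustrative model of the counting scheme just described, the following Python sketch mimics a transmit port that starts at the BB credit value advertised by its partner at FLOGI/PLOGI, decrements on each frame sent, and credits back on each R_RDY. The class and the values are hypothetical teaching constructs, not switch firmware.

class TransmitPort:
    """Toy model of the sending side of buffer-to-buffer flow control."""

    def __init__(self, advertised_bb_credit):
        # Counter starts at the BB credit value the receiver advertised during login.
        self.credits = advertised_bb_credit

    def can_send(self):
        # Frames may be sent only while the counter is non-zero.
        return self.credits > 0

    def send_frame(self):
        if not self.can_send():
            raise RuntimeError("BB credits exhausted: a frame pacing delay would occur")
        self.credits -= 1          # one receive buffer is now presumed occupied

    def receive_r_rdy(self):
        self.credits += 1          # the receiver freed a buffer and returned the credit

port = TransmitPort(advertised_bb_credit=2)
port.send_frame()
port.send_frame()
print(port.can_send())   # False: both advertised buffers are in flight
port.receive_r_rdy()     # an R_RDY arrives and restores one credit
print(port.can_send())   # True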
The flow of frame transmission between adjacent ports is regulated by the receiving port's presentation of R_RDYs; in other words, BB credit flow control has no end-to-end component. The sender regains one BB credit for each R_RDY received. The initial value of the BB credit count must be non-zero. The rate of frame transmission is regulated by the receiving port based on the availability of buffers to hold received frames. It should be noted that the FC-FS specification allows the transmitter's counter to be initialized either at zero or at the BB credit value, counting up or down on frame transmission; different switch vendors can handle this using either method, and the counting is handled accordingly.
Implications of Asset Deployment
There are four implications of asset deployment to consider when planning BB-credit allocations:
• For write-intensive applications across an ISL (tape and disk replication), the BB credit value advertised by the E_Port on the target gates performance. In other words, the number of BB credits on the target cascaded FICON director is the major factor.
• For read-intensive applications across an ISL (regular transactions), the BB credit value advertised by the E_Port on the host gates performance. In other words, the number of BB credits at the local location is the major factor.
• Two ports do not negotiate BB credits down to the lowest common value. A receiver simply "advertises" BB credits to a linked transmitter.
• The depletion of BB credits at any point between an initiator and a target will gate overall throughput.
Configuring BB Credit Allocations on FICON Directors
There have been two FICON switch architectures for BB credit allocation. The first, which was prevalent on
early FICON directors such as the Inrange/CNT FC9000 and McDATA 6064, had a range of BB credits that
could be assigned to each individual port. Each port on a port card had a range of BB credits (for example 4
through 120) that could be assigned to it during the switch configuration process. Simple rules of thumb on
a table/matrix were used to determine the number of BB credits to use. Unfortunately, these tables did not
consider workload characteristics or z/OS particulars. Since changing the BB credit allocation was an offline
operation, most installations would calculate what they needed, set the allocation, and (assuming it was
correct) not look at it again. Best practice was typically to maximize BB credits used on ports being used for
distance traffic, since each port could theoretically be set to the maximum available BB credits without
penalizing other ports on the port card. Some installations would even maximize the BB credit allocation
on short-distance ports, so they would not have to worry about it. However, this could cause other kinds of
problems in recovery scenarios.
The second FICON switch architecture, on the market today in products from Brocade and Cisco, has a pool
of available BB credits for each port card in the director. Each port on the port card has a maximum setting.
However, since there is a large pool of BB credits that must be shared among all ports on a port card, there
must be better allocation planning. It is no longer enough to simply use distance rules of thumb. Workload
characteristics of traffic need to be better understood. Also, as 4 Gbit/sec FICON Express4 becomes
prevalent and 8 Gbit/sec FICON Express8 follows, intra-data-center distances become something to
consider when deciding how to allocate the pool of available BB credits. It no longer is enough to say that
a port is internal to the data center or campus and assign it the minimum number of credits. This pooled
architecture, and the careful capacity planning it necessitates, make it more critical than ever to have a way to
track actual BB credit usage in a cascaded FICON environment.
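As a simple illustration of why the pooled architecture demands planning, the hypothetical Python helper below checks a per-port BB credit plan against a shared port-card pool. The pool size and the per-port figures are invented for the example and are not Brocade defaults.

# Hypothetical port-card pool check; the pool size and requests are made-up planning numbers.
CARD_POOL_CREDITS = 1024

def validate_allocation(requested):
    """requested: dict mapping port number to the BB credits planned for that port."""
    total = sum(requested.values())
    remaining = CARD_POOL_CREDITS - total
    if remaining < 0:
        raise ValueError(f"Plan exceeds the card's shared pool by {-remaining} credits")
    return remaining

plan = {0: 64, 1: 64, 2: 400, 3: 400}    # e.g. two local ports and two long-distance ISLs
print(validate_allocation(plan))          # credits left over for the card's other ports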
What follows is a discussion of what happens when you exhaust available BB credits and the concept of
frame pacing delay.
BB Credit Exhaustion and Frame Pacing Delay
Similar to the ESCON directors that preceded them, FICON switches have a feature called “Control Unit Port
(CUP)”. Among the many functions of the CUP feature is an ability to provide host control functions such as
blocking and unblocking ports, safe switching, and in-band host communication functions such as port
monitoring and error reporting. Enabling CUP on FICON switches, together with RMF 74 subtype 7 (RMF 74-7) records on the z/OS system, yields a new RMF report called the "FICON Director Activity Report." Data is collected for each RMF interval if FCD is specified in the ERBRMFnn parmlib member. RMF formats one of these reports per interval for each FICON switch that has CUP enabled and the parmlib option specified. This RMF report contains meaningful data on FICON I/O performance—in particular, frame pacing delay, which is the only available indicator of a BB credit starvation issue on a given port.
Frame pacing delay has been around since FC SAN was first implemented in the late 1990s by our open
systems friends. But until the increased use of cascaded FICON, its relevance in the mainframe space has
been completely overlooked. If frame pacing delay is occurring, then the buffer credits have reached zero
on a port for an interval of 2.5 microseconds and no more data can be transmitted until a credit has been
added back to the buffer credit pool for that port. Frame pacing delay causes unpredictable performance
delays. These delays generally result in longer FICON connect time and/or longer PEND times that show up
on the volumes attached to these links. Note that RMF can provide frame pacing delay information only when switched FICON is in use and CUP is enabled on the FICON switching device(s). Only the RMF 74-7 FICON Director Activity Report provides FICON frame pacing delay information; you cannot get it from any other source today.
Figure 3. Sample FICON Director Activity report (RMF 74-7)
The fourth column from the left in Figure 3 is the column where frame pacing delay is reported. Any number
other than 0 (zero) in this column is an indication of frame pacing delay occurring. A non-zero number reflects the number of times that I/O was delayed for 2.5 microseconds or longer because buffer credits fell to zero. Figure 3 shows an optimal situation: zeros down the entire column, indicating that enough buffer credits are always available to transfer FICON frames.
Figure 4. Frame pacing delay indications in RMF 74-7 record
But in Figure 4, you can see that on the FICON Director Activity Report for switch ID 6E, an M6140 director,
there were at least three instances when port 4, a cascaded link, suffered frame pacing delays during this
RMF reporting interval. This would have resulted in unpredictable performance across this cascaded link
during this period of time. The next few sections provide answers to questions that arise in this discussion.
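As an illustration of how such a report might be screened programmatically, the Python sketch below assumes the frame pacing delay counts for one RMF interval have already been extracted (parsing real SMF/RMF 74-7 records is outside its scope), and it simply flags the ports that need attention. The port numbers and counts are example values echoing Figure 4.

# Hypothetical per-port frame pacing delay counts for one RMF interval.
frame_pacing_by_port = {"04": 3, "05": 0, "06": 0, "2C": 0}

def ports_with_pacing_delay(counts):
    """Return ports whose BB credits hit zero for 2.5 microseconds or longer during the interval."""
    return {port: n for port, n in counts.items() if n > 0}

print(ports_with_pacing_delay(frame_pacing_by_port))   # {'04': 3} -> investigate this ISL port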
What is the difference between frame pacing and frame latency?
Frame pacing is an FC4 application data exchange measurement and/or throttling mechanism. It uses
buffer credits to provide a flow control mechanism for FICON to assure delivery of data across the FICON
fabric. When all buffer credits for a port are exhausted, a frame pacing delay can occur. Frame latency, on
the other hand, is a frame delivery measurement, similar to measuring frame friction. Each element that
handles the frame contributes to this latency measurement (CHPID port, switch/director, storage port
adapter, link distance, and so on). Frame latency is the average amount of time it takes to deliver a frame
from the source port to the destination port.
What can you do to eliminate or circumvent frame pacing delay?
If a long-distance link is running out of buffer credits, then it might be possible to enable additional buffer
credits for that link in an attempt to provide an adequate pool of buffer credits for the frames being
delivered over that link. But the number of buffer credits required to handle specific workloads across
distance is surprising, as shown in Table 1.
Table 1. Frame size, link speed, and distance determine buffer credit requirements

Frame                       Buffer Credits Required to 50 km
Payload %   Payload Bytes   1 Gbit/sec   2 Gbit/sec   4 Gbit/sec   8 Gbit/sec   10 Gbit/sec
100%        2112                    25           49           98          196           290
75%         1584                    33           65          130          259           383
50%         1056                    48           96          191          381           563
25%          528                    91          181          362          723          1069
10%          211                   197          393          785         1569          2318
5%           106                   321          641         1281         2561          3784
1%            21                   656         1312         2624         5248          7755
Keep in mind that tape workloads generally have larger payloads in a FICON frame, while DASD workloads
might have much smaller frame payloads. Some say the average payload size for DASD is often about 800
to 1500 bytes. By using the FICON Director Activity reports for your enterprise, you can gain an
understanding of your own average read and write frame sizes on a port-by-port basis.
To help you, columns five and six of the FICON Director Activity report in Figure 3 show the average read
frame size and the average write frame size for the frame traffic on each port. These columns are useful
when you are trying to figure out how many buffer credits will be needed for a long-distance link or possibly
to solve a local frame pacing delay issue.
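Assuming you have pulled the average read and write frame sizes for an ISL from those columns, a rough planning estimate might look like the Python sketch below. It reuses the same approximation as the earlier sketch (about 5 microseconds per kilometer one way, nominal link data rates) and sizes for the direction with the smaller average frame, since more small frames are in flight at once; the figures are illustrative and are not a substitute for vendor planning guidance.

import math

LINK_MBPS = {"1G": 100, "2G": 200, "4G": 400, "8G": 800, "10G": 1200}  # nominal, illustrative

def credits_for_isl(distance_km, link, avg_read_bytes, avg_write_bytes):
    """Size BB credits for a long-distance ISL from RMF-reported average frame sizes."""
    round_trip_us = 2 * distance_km * 5.0          # approximate fiber round trip
    worst_frame = min(avg_read_bytes, avg_write_bytes)
    frame_tx_us = worst_frame / LINK_MBPS[link]    # serialization time of the smaller frame
    return math.ceil(round_trip_us / frame_tx_us)

# A 50 km 4 Gbit/sec ISL carrying smallish DASD frames needs far more credits
# than the full-frame rule of thumb would suggest.
print(credits_for_isl(50, "4G", avg_read_bytes=900, avg_write_bytes=1400))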
How can you make improvements?
Even with the new FICON directors and the ability to assign BB credits to each port from a pool of available credits on each port card, getting the allocation right is still not easy. The best hope for end users is to make a "correct" allocation and then monitor the RMF 74-7 report for frame pacing delay to indicate that they are out of BB credits. They can then make the necessary adjustments to the BB credit allocations for crucial ports, such as the ISL ports on either end of a cascaded link. However, any adjustments made will merely be a "guesstimate," since the exact number being used is not indicated. A helpful analogy is a car without a fuel gauge in which
you have to rely on EPA MPG estimates to calculate how many miles you could drive on a full tank of gas.
This estimate would not reflect driving characteristics, and in the end, the only accurate indication that the
gas tank is empty is a coughing engine that stops running.
Individual ports already track BB credit availability, and the mechanism by which this occurs was described earlier; what remains is to create a reporting mechanism. This is similar to the situation with
monitoring open exchanges, discussed in a paper by Dr. H. Pat Artis, who made a sound case for why open
exchange management is crucial in a FICON environment. He proved the correlation between
response/service time skyrocketing and open exchange saturation, demonstrated how channel busy
and bus busy metrics are not correlated to response/service time, and recommended a range of open
exchanges to use for managing a FICON environment. Since RMF does not report open exchange counts,
he derived a formula using z/OS response time metrics to calculate open exchanges. Commercial software
such as MXG and RMF Magic use this to help users better manage their FICON environments.
Similar to open exchanges, the data needed to calculate BB credit usage is currently available in RMF, and
all that is needed are some mathematical calculations. As an area of future exploration, the RMF 74-7 record (FICON Director Activity report) could be updated with two additional fields for BB credit usage, and the appropriate interfaces added between the FICON switches and the CUP code. Switch management software could also be enhanced to include these two valuable metrics.
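One possible back-of-the-envelope calculation along these lines is sketched below in Python: it estimates the credits a port keeps occupied from its throughput and average frame size (both reported in the FICON Director Activity report) plus the known link distance. It illustrates the idea only; it is not the formula Dr. Artis derived, and the traffic figures are invented.

def estimated_credits_in_use(mb_per_sec, avg_frame_bytes, distance_km):
    """Rough estimate of BB credits occupied on a port under steady traffic.

    Data in flight = throughput x round-trip time; dividing by the average
    frame size gives the number of frames (credits) outstanding at once.
    Assumes roughly 5 microseconds per km of one-way fiber delay.
    """
    round_trip_s = 2 * distance_km * 5e-6
    bytes_in_flight = mb_per_sec * 1_000_000 * round_trip_s
    return bytes_in_flight / avg_frame_bytes

# A 50 km ISL moving 150 MB/s of ~1 KB frames keeps roughly 73 credits busy.
print(round(estimated_credits_in_use(150, 1024, 50)))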
Dynamic Allocation of BB Credits
The technique used in BB credit allocation is very similar to the early technique used in managing PAV
aliases. The simple approach used was called “static assignment.” With static assignment, the storage
subsystem utility was used to statically assign alias addresses to base addresses. While a generous static
assignment policy could help to ensure sufficient performance for a base address, it resulted in ineffective
utilization of the alias addresses (since nobody knew what the optimal number of aliases was for a given
base), which put pressure on the 64 K device address limit. Users would tend to assign an equal number of
addresses to each base, often taking a very conservative approach, resulting in PAV alias overallocation.
An effort to address this was undertaken by IBM with WorkLoad Manager (WLM) support for dynamic alias
assignment. WLM was allowed to dynamically reassign aliases from a pool to base addresses to meet
workload goals. Since this can be somewhat “lethargic,” users of dynamic PAVs still tend to overconfigure
aliases and are pushing the 64 K device address limitation. Users face what you could call a “PAV
performance paradox”: they need the performance of PAVs, tend to overconfigure alias addresses, and are
close to exhausting the z/OS device addressing limit.
Perhaps a similar dynamic allocation of BB credits, in particular for new FICON switch architectures having
pools of assignable credits on each port card, would be a very beneficial enhancement for end users.
Perhaps an interface between the FICON directors and WLM could be developed to allow WLM to
dynamically assign BB credits. At the same time, since Quality of Service (QoS) is an emerging topic
for FICON, an interface could be developed between the FICON switches and WLM for functionality with
dynamic channel path management and priority I/O queuing to enable true end-to-end QoS.
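To show the flavor of what such a dynamic scheme might do, here is a purely hypothetical Python sketch that shifts credits within a port card's pool from quiet ports to ports reporting frame pacing delay. No WLM-to-director interface of this kind exists today; the port IDs, step size, and floor are invented.

def rebalance(allocations, pacing_counts, step=8, floor=8):
    """Move 'step' credits from ports showing no pacing delay to ports that do."""
    starved = [p for p, n in pacing_counts.items() if n > 0]
    donors = [p for p, n in pacing_counts.items()
              if n == 0 and allocations[p] - step >= floor]
    for hungry, donor in zip(starved, donors):
        allocations[donor] -= step
        allocations[hungry] += step
    return allocations

alloc = {"04": 64, "05": 64, "06": 64}
pacing = {"04": 3, "05": 0, "06": 0}      # port 04 reported frame pacing delay
print(rebalance(alloc, pacing))           # {'04': 72, '05': 56, '06': 64}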
In October 2006, IBM announced HyperPAVs for the DS8000 storage subsystem family to address the PAV
performance paradox. HyperPAVs increase the agility of the alias assignment algorithm. The primary
difference from traditional PAV alias management is that aliases are dynamically assigned to
individual I/Os by the z/OS I/O Supervisor (IOS) rather than being statically or dynamically assigned to
a base address by WLM. The RMF 78-3 (I/O queuing) record has also been expanded. A similar
feature/functionality and interface between FICON switches and the z/OS IOS would be the ultimate in
BB credit allocation: true dynamic allocation of BB credits on an individual I/O basis.
This section has reviewed flow control, the basics of BB credit theory, frame pacing delay, and current BB credit allocation methods, and it has presented some proposals for a) counting BB credit usage and b) enhancing how BB credits are allocated and managed.
TECHNICAL DISCUSSION OF FICON CASCADING
As stated earlier, cascaded FICON is limited to zSeries and System z processors only with the hardware and
software requirements outlined earlier.
In Figure 2, note that a cascaded FICON switch configuration involves at least three FC links:
• Between the FICON channel card on the mainframe (known as an N_Port) and the FICON director's FC adapter card (which is considered an F_Port)
• Between the two FICON directors via E_Ports (the link between E_Ports on the switches is an inter-switch link)
• From an F_Port on the second director to a FICON adapter card in the control unit port (N_Port) of the storage device
The physical paths are the actual FC links connected by the FICON switches providing the physical
transmission path between a channel and a control unit. Note that the links between the cascaded FICON
switches may be multiple ISLs, both for redundancy and to ensure adequate I/O bandwidth.
Fabric Addressing Support
Single-byte addressing refers to the link address definition in the Input-Output Configuration Program
(IOCP). Two-byte addressing (cascading) allows IOCP to specify link addresses for any number of domains by
including the domain address with the link address. This allows the FICON configuration to create
definitions in IOCP that span more than one switch.
Figure 5 shows that the FC-FS 24-bit FC port address identifier is divided into three fields:
• Domain
• Area
• AL_Port
In a cascaded FICON environment, 16 bits of the 24-bit address must be defined for the zSeries server to
access a FICON CU. The FICON switches provide the remaining byte used to make up the full 3-byte FC port
address of the CU being accessed. The AL_Port (arbitrated loop) value is not used in FICON and is set to a
constant value. The zSeries domain and area fields are referred to as the F_Port’s port address field.
It is a 2-byte value, and when defining access to a CU attached to this port using the zSeries Hardware
Configuration Definition (HCD) or IOCP, the port address is referred to as the link address. Figure 5 further
illustrates this, and Figure 6 is an example of a cascaded FICON IOCP gen.
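The bit layout is easy to see in a short sketch: the Python snippet below splits a 24-bit FC port address identifier into its Domain, Area, and AL_Port fields and rebuilds the 2-byte link address coded in HCD/IOCP. The example address (switch 0x6E, port 0x04) is illustrative.

def split_fc_port_id(port_id):
    """Split a 24-bit FC port address identifier into its three fields."""
    domain = (port_id >> 16) & 0xFF   # identifies the switch (director)
    area   = (port_id >> 8) & 0xFF    # identifies the port on that switch
    al_pa  = port_id & 0xFF           # arbitrated loop field; a constant value in FICON
    return domain, area, al_pa

def two_byte_link_address(domain, port_address):
    """Build the 2-byte link address used for cascaded FICON definitions."""
    return (domain << 8) | port_address

domain, area, al_pa = split_fc_port_id(0x6E0400)
print(hex(two_byte_link_address(domain, area)))   # 0x6e04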
Figure 5. Fabric addressing support (a)
Figure 6. Fabric addressing support (b)
The connections between the two directors are established through the Exchange of Link Parameters (ELP).
The switches pause for a FLOGI, and assuming that the device is another switch, they initiate an ELP
exchange. This results in the formation of the ISL connection(s).
In a cascaded FICON configuration, three additional steps occur beyond the normal FICON switched point-to-point communication initialization. A much more detailed discussion of the entire FICON initialization
procedure can be found in Chapter 3 of the IBM Redbook, “FICON Native Implementation and Reference
Guide,” pp 23-43.
The three basic steps are:
1. If a 2-byte link address is found in the CU macro in the IOCDS, a Query Security Attribute (QSA) command is sent by the host to ask the fabric controller on the directors whether the directors have the high integrity fabric features installed.
2. The director responds to the QSA.
3. If the response is affirmative, indicating that a high integrity fabric is present (fabric binding and insistent Domain IDs), the login continues. If not, login stops and the ISLs are treated as invalid (not a good thing).
Figure 7. Sample IOCP coding for FICON cascaded switch configuration
High Integrity Enterprise Fabrics
Data integrity is paramount in a mainframe or any data center environment. End-to-end data integrity must
be maintained throughout a cascaded FICON environment to ensure that any changes made to the data
stream are always detected and that the data is always delivered to the correct end point. Brocade M-Series
FICON directors in a cascaded environment use a software feature known as SANtegrity to achieve this. The
SANtegrity feature key must be installed and operational in the Brocade Enterprise Fabric Connectivity
Manager (EFCM). Brocade 24000 and 48000 FICON directors and the Brocade 5000 FICON switch use
Secure Fabric OS.
What does high integrity fabric architecture and support entail?
• Support of Insistent Domain IDs. This means that a FICON switch will not be allowed to automatically change its address when a duplicate switch address is added to the enterprise fabric. Intentional manual operator action is required to change a FICON director's address. Insistent Domain IDs prohibit the use of dynamic Domain IDs, ensuring that predictable Domain IDs are enforced in the fabric. For example, suppose a FICON director has this feature enabled, and a new FICON director is connected to it via an ISL in an effort to build a cascaded FICON fabric. If this new FICON director attempts to join the fabric with a domain ID that is already in use, the new director is segmented into a separate fabric. This also makes certain that duplicate Domain IDs are not used in the same fabric.
• Fabric Binding. Fabric binding enables companies to allow only FICON switches that are configured to support high-integrity fabrics to be added to the FICON SAN. For example, a Brocade M-Series FICON director without an activated SANtegrity feature key cannot connect to an M-Series FICON fabric/director with an activated SANtegrity feature key. The FICON directors that you wish to connect to the fabric must be added to the fabric membership list of the directors already in the fabric. This membership list is composed of the "acceptable" FICON directors' World Wide Names (WWNs) and Domain IDs. Using the Domain ID ensures that there will be no address conflicts, that is, duplicate Domain IDs when the fabrics are merged. The two connected FICON directors then exchange their membership lists. The membership list exchange is a Switch Fabric Internal Link Service (SW_ILS) function, which ensures consistent and unified behavior across all potential fabric access points.
Managing Cascaded FICON Environments and ISLs: Link Balancing and Aggregation
Even in over-provisioned storage networks, there may be “hot spots” of congestion, with some paths
running at their limit while others go relatively unused. In other words, the storage network may be a
performance bottleneck even if it has sufficient capacity to deliver all I/O without constraint. This typically
happens when a network does not have the intelligence to load balance across all available paths. The
unused paths may still be of some value for redundancy, but not for performance. Brocade has several
options for supporting more evenly balanced cascaded FICON networks.
NOTE: The FICON and SAN FC protocol (the FC-SW standard) utilizes path routing services that are based on
the industry-standard Fabric Shortest Path First (FSPF) algorithm of that FC protocol. This is not the CHPID
path; it is the connections between FICON switching devices (which cause a network to be created) that will
utilize FSPF.
FSPF allows a fabric (created when CHPIDs and storage ports are connected through one or more FICON
switching devices) composed of more than one switching device (also called a storage network) to
automatically determine the shortest route from each switch to any other switch. FSPF selects what it
considers to be the most efficient path to follow when moving frames through a FICON fabric. FSPF
identifies all the possible routes (multiple path connections) through the fabric and then manages initial
route selection as well as sub-second path rerouting in the event of a link or node failure.
The Brocade 5000 (FICON), Brocade 24000 and 48000 (FICON) Directors, and the Brocade DCX (FICON)
Backbone support source-port route balancing via FSPF. This is known as Dynamic Load Sharing (DLS) and
is part of the base FOS as long as fabric and E_Port functions are present. FSPF makes calculations based
on the topology of a FICON network and determines the cost between end points. In many cascaded FICON
topologies, there is more than one equal-cost path across ISLs. Which path to use can be controlled on a
per-port basis from the source switch. By default, FSPF attempts to spread connections from different ports
across available paths at the source-port level. FSPF can re-allocate routes whenever in-order delivery can
still be assured (DLS). This may happen when a fabric rebuild occurs, when device cables are moved, or
when ports are brought online after being disabled. DLS does a “best effort” job of distributing I/O by
balancing source port routes.
However, some ports may still carry more traffic than others, and DLS cannot predict which ISLs will be
“hot” when it sets up routes since they must be allocated before I/O begins. Also, since traffic patterns tend
to change over time, no matter how routes were distributed initially, it would still be possible for hot spots to
appear later. Changing the route allocation randomly at runtime could cause out-of-order delivery, which is
undesirable in mainframe environments. Balancing the number of routes allocated to a given path is not
the same as balancing I/O, and so DLS does not do a perfect job of balancing traffic. DLS is useful, and
since it is free and works automatically, it is frequently used. However, DLS does not solve or prevent most
performance problems, so there is a need for more evenly balanced methods, such as trunking.
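The following short Python sketch (illustrative only; the CHPID names and traffic figures are hypothetical) shows why balancing the number of routes is not the same as balancing I/O: routes are fixed at login time, before traffic patterns are known.

from itertools import cycle

isls = ["ISL1", "ISL2"]
source_ports = ["CHPID_20", "CHPID_21", "CHPID_22", "CHPID_23"]

# DLS-style allocation: spread source ports round-robin across equal-cost ISLs
# before any I/O flows, so the allocator cannot know which ports will be "hot".
assignment = dict(zip(source_ports, cycle(isls)))

# Hypothetical runtime traffic (MB/sec) per source port.
traffic = {"CHPID_20": 380, "CHPID_21": 10, "CHPID_22": 390, "CHPID_23": 20}

load = {isl: 0 for isl in isls}
for port, isl in assignment.items():
    load[isl] += traffic[port]

print(assignment)   # two routes per ISL: route counts are perfectly balanced
print(load)         # {'ISL1': 770, 'ISL2': 30}: the I/O itself is not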
On Brocade M-Series FICON switches, FSPF works automatically by maintaining a link state database that
keeps track of the links on all switches in the FICON fabric and also associates a cost with each link in the
fabric. Although the link state database is kept on all FICON switches in the fabric, it is maintained and
synchronized on a fabric-wide basis. Therefore, every switch knows what every other switch knows about
connections of host, storage, and switch ports in the fabric. Then FSPF associates a cost with each ISL
between switching devices in the FICON fabric and ultimately chooses the lowest-cost path from a host
source port, between switches, to a destination storage port. And it does this in both directions, so it would
also choose the lowest-cost path from a storage source port, between switches, to a destination host port.
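As a hedged sketch of the cost idea (the cost values below are illustrative, not the actual FOS or M-EOS figures), the following Python fragment assigns each parallel ISL a cost derived from its speed and keeps every ISL that lies on a lowest-cost path, since FSPF allows any of the equal-cost shortest paths to be used.

def link_cost(speed_gbps: float) -> int:
    # Illustrative cost model only: faster links get a lower cost.
    return int(1000 / speed_gbps)

# Parallel ISLs from the local switch toward destination domain 0x62:
# (exit port, speed in Gbit/sec) -- hypothetical values.
isls_to_domain = [(0, 4.0), (1, 4.0), (2, 2.0)]

costs = {port: link_cost(speed) for port, speed in isls_to_domain}
lowest = min(costs.values())
shortest_path_ports = [port for port, cost in costs.items() if cost == lowest]

print(costs)                 # {0: 250, 1: 250, 2: 500}
print(shortest_path_ports)   # [0, 1]: both 4 Gbit/sec ISLs are eligible routes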
The process works as follows. FSPF is invoked at PLOGI. At initial power on of the mainframe complex and
after the fabric build and security processes have been fulfilled, individual ports supported by the fabric
begin their initial PLOGI process. As each port (CHPID and storage port) logs into a cascaded FICON fabric,
FSPF assigns that port (whether it will ever need to use a cascaded link or not) to route I/O over a specific
cascaded link. Once all ports have logged in to the fabric, I/O processing can begin. If any port is taken
offline and then put back online, it will go through PLOGI again and the same or a different cascaded link
might be assigned to it.
There is one problem with FSPF routing—it is static. FSPF decisions are made in the absence of data
workflow that may prove to be inappropriate for the real-world patterns of data access between mainframe
and storage ports. Since FSPF cannot know what I/O activity will occur across any specific link, it is only
concerned about providing network connectivity. It has only a very shallow concern about performance—
number of hops (which for FICON is 1 so that metric is always equal) and speed of each cascaded link
(which can be different and can result in costing each cascaded link as a lower-to-higher cost link).
FSPF static routing can result in some cascaded links being over-congested (due to a shortage of buffer
credits and/or high utilization of bandwidth) and other cascaded links being under-utilized. FSPF does not
take this into account as its only real function is to ensure that connectivity has been established. Although
mainframe end users have long exploited the MVS and z/OS ability to provide automatic CHPID I/O load-balancing mechanisms, there is not an automatic load-balancing mechanism built into the FC-SW-2 or FC-SW-3 protocol when cascaded links are used.
So on one hand a FICON cascaded fabric allows you to have tremendous flexibility and ultra high availability
in the I/O architecture. You can typically enjoy decreased storage and infrastructure costs, expanded
infrastructure consolidation options, ease of total infrastructure management, thousands more device
addresses, access to additional storage control units per path, optimized use of all of your storage
resources, and higher data availability. Also, higher data availability in a cascaded FICON zSeries or z9
environment implies better, more robust DR/BC solutions for your enterprise. So from that point of view,
FICON cascading has many positive benefits.
But on the other hand a plain, FSPF-governed, unmanaged FICON cascaded environment injects
unpredictability into enterprise mainframe operations where predictability has always ruled. So you must
take back control of your FICON cascaded environment to restore predictability to mainframe operations
and stable, reliable, and predictable I/O performance to applications.
All vendors provide the following:
• Some form of cascaded link "trunking"
• A choice of link speeds for the deployment of cascaded links
• A means of influencing FSPF by configuring a preferred path (cascaded link) between the FICON switches on a port-by-port basis
• A means to prevent a frame in a FICON switching device from transferring from a source port to a blocked destination port—including cascaded link ports.
But what do these mechanisms mean to you and how do you decide what to use to control your
environment to obtain the results you want? First you have to know what you want to accomplish. For some enterprises, the goal is to have the system "automatically" take care of itself and adjust to changing conditions, for management simplicity and for the elasticity to respond to situational workloads and unusual events. For other enterprises, the goal might be rigid control over the environment, even if that means more work in managing it and less elasticity in meeting shifts in I/O workload from hour to hour and day to day. So choosing the correct management strategy means that you must have a general understanding of
each of the cascaded link control mechanisms, so that you can wisely plan your environment.
The next section presents best practices in FICON cascaded link management.
BEST PRACTICES FOR FICON CASCADED LINK MANAGEMENT
The best recommendation to start with is to avoid managing FICON cascaded links manually! By doing so
you will circumvent much tedious work—work that is prone to error and is always static in nature. Instead,
implement FICON cascaded path management, which automatically responds to changing I/O workloads
and provides a simple, labor-free but elegant solution to a complex management problem. This simplified
management scheme can be deployed through a combination of using the free, automatic FSPF process
and enabling a form of ISL trunking on each switching device in the FICON fabric.
This section explores ISL trunking in greater detail. Brocade offers several trunking options for the Brocade
5000, 24000, 48000, and DCX platforms; Brocade M-Series FICON directors offer a software-based
trunking feature known as “Open Trunking.”
Terms and Definitions
• Backpressure. A condition in which a frame is ready to be sent out of a port but no transmit BB credit is available for it, as a result of flow control from the receiving device.
• Bandwidth. The maximum transfer bit rate that a link is capable of sustaining; also referenced in this document as "capacity."
• Domain. A unique FC identifier assigned to each switch in a fabric; a common part of the FC addresses assigned to devices attached to a given switch.
• Fabric Shortest Path First (FSPF). A standard protocol executed by each switch in a fabric, by which the shortest paths to every destination domain are computed and output to a table that gives the transmit ISLs allowed when sending to each domain. Each such transmit ISL is on a shortest path to the domain, and FSPF allows any one of them to be used.
• Flow. FC frame traffic arriving in a switch on a specific receive port that is destined for a device in a specific destination FC domain elsewhere in the fabric. All frames for the same domain arriving on the receive port are said to be in the same flow.
• Oversubscription. A condition that occurs when an attempt is made to use more resources than are available; for example, when two devices can each source data at 1 Gbit/sec and their traffic is routed through one 1 Gbit/sec ISL, the ISL is oversubscribed.
Frame-level Trunking Implementation
Trunking allows traffic to be evenly balanced across ISLs while preserving in-order delivery. Brocade offers
hardware (ASIC)-based, frame-level trunking and exchange-level trunking on the Brocade 5000, 24000,
48000, and DCX platforms. The frame-level method balances I/O such that each successive frame may go
down a different physical ISL, and the receiving switch ensures that the frames are forwarded onward in
their original order. Figure 8 shows a frame-level trunk between two FICON switches. For this to work there
must be high intelligence in both the transmitting and receiving switches.
At the software level, switches must be able to auto-detect that forming a trunk group is possible, program
the group into hardware, display and manage the group of links as a single logical entity, calculate the
optimal link costs, and manage low-level parameters such as buffer-to-buffer credits and Virtual Channels
optimally. Management software must represent the trunk group properly. For the trunking feature to have
broad appeal, this must be as user-transparent as possible.
At the hardware level, the switches on both sides of the trunk must be able to handle the division and
reassembly of several multi-gigabit I/O streams at wire speed, without dropping a single frame or delivering
even one frame out of order. To add to the challenge, there are often differences in cable length between
different ISLs. Within a trunk group, this creates a skew between the amounts of time each link takes to
deliver frames. This means that the receiving ASIC will almost always receive frames out of order and must
be able to calculate and compensate for the skew to re-order the stream properly.
There are limitations to the amount of skew that an ASIC can tolerate, but these limits are high enough that
they do not generally apply. The real-world applicability of the limitation is that it is not possible to configure
one link in a trunk to go clockwise around a large dark-fiber ring, while another link goes counterclockwise.
As long as the differences in cable length are measured in a few tens of meters or less, there will not be an
issue. If the differences are larger than this, a trunk group cannot form. Instead, the switch creates two
separate ISLs and uses either DLS or DPS to balance them.
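The sketch below (hypothetical latencies and timings, Python used purely for illustration) models the re-ordering problem: frames striped across two ISLs with slightly different fibre lengths arrive skewed, and the receiving side must buffer and release them in the original sequence.

import heapq

# One-way latency per ISL in microseconds; ISL_B has roughly 60 m more fibre.
link_latency = {"ISL_A": 50.0, "ISL_B": 50.3}
frames = [(seq, "ISL_A" if seq % 2 == 0 else "ISL_B") for seq in range(6)]

# Arrival order is driven by send time plus link latency, not by sequence number.
arrivals = sorted(frames, key=lambda f: f[0] * 0.1 + link_latency[f[1]])
print([seq for seq, _ in arrivals])   # e.g. [0, 2, 1, 4, 3, 5]: out of order on receive

# The receiving side restores order by holding frames until the next expected
# sequence number has arrived.
buffered, expected, delivered = [], 0, []
for seq, _ in arrivals:
    heapq.heappush(buffered, seq)
    while buffered and buffered[0] == expected:
        delivered.append(heapq.heappop(buffered))
        expected += 1
print(delivered)   # [0, 1, 2, 3, 4, 5]: original order restored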
Figure 8. Frame-level trunking concept
The main advantage of Brocade frame-level trunking is that it provides optimal performance: a trunk group
using this method truly aggregates the bandwidth of its members. The feature also increases availability by
allowing non-disruptive addition of members to a trunk group, and minimizing the impact of failures.
However, frame-level trunking does have some limitations. On the Brocade 5000, Brocade 24000 (with 16-port, 4 Gbit/sec blades) and 48000 Directors, and the DCX, it is possible to configure multiple groups of up to eight 4 Gbit/sec links each. The effect is the creation of balanced 32 Gbit/sec pipes (64 Gbit/sec full-duplex). When connecting a Brocade 48000 or other 4 Gbit/sec switch to a 2 Gbit/sec switch, a "lowest common denominator" approach is used, meaning that the trunk group is limited to 4x 2 Gbit/sec instead of 8x 4 Gbit/sec.
Frame-level trunking requires that all ports in a given trunk must reside within an ASIC port-group on each
end of the link. While a frame-level trunk group outperforms either DLS or DPS solutions, using links only
within port groups limits configuration options. The solution is to combine frame-level trunking with one of
the other methods, as illustrated in Figure 8, which shows frame-level trunking operating within port groups,
and DLS operating between trunks. On the Brocade 48000 and DCX, trunking port groups are built on
contiguous 8-port groups called “octets.” There are four octets: ports 0 – 7, 8 – 15, 16 – 23, and 24 – 31.
The Brocade 5000, 48000, and DCX have flexible support for trunking over distance. Buffers are shared
across 16-port groups, not limited by octets. For example, it is possible to configure up to 8-port 4 Gbit/sec
trunks at 40 km (32 Gbit/sec trunk group) or 4-port 4 Gbit/sec trunks at 80 km (16 Gbit/sec trunk group).
In some cases it may even be more desirable to configure trunks using 2 Gbit/sec links. For example, the
trunk group may cross a DWDM that does not have 4 Gbit/sec support. In this case, an 8-port 2 Gbit/sec
trunk can span up to 80 km. The above example is per 16-port blade or per 16 ports on the 32-port blade in
the Brocade 48000.
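The amount of buffering such long-distance trunks consume follows from the usual rule of thumb that enough BB credits must be outstanding to cover the round-trip time of the link at full frame rate; the rough Python arithmetic below is a hedged approximation under stated assumptions, not a Brocade sizing formula.

def credits_needed(distance_km: float, speed_gbps: float,
                   frame_bytes: int = 2148, fibre_us_per_km: float = 5.0) -> int:
    """Approximate BB credits needed to keep a link of this length streaming full-size frames."""
    frame_time_us = (frame_bytes * 10) / (speed_gbps * 1000)   # 8b/10b encoding assumed
    round_trip_us = 2 * distance_km * fibre_us_per_km
    return max(1, round(round_trip_us / frame_time_us) + 1)

for speed in (2, 4):
    for dist in (40, 80):
        print(f"{speed} Gbit/sec at {dist} km: roughly {credits_needed(dist, speed)} credits per link")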
Brocade M-Series Director Open Trunking
Open Trunking is an optionally licensed software feature that provides automatic, dynamic, statistical traffic
load balancing across ISLs in a fabric environment. This feature can be enabled on a per-switch basis and it
operates transparently to the existing FSPF algorithms for path selection in a fabric. It employs Template
Registers in the port hardware and measures flow data rates and ISL loading—and then it uses these
numbers to optimize use of ISL bandwidth. The feature controls FC traffic at a flow level rather than at a
per-frame level (as is implemented in some hardware trunking deployments), in order to achieve optimal
throughput. It does not require any special cooperation from (or configuration of) the adjacent switch. This
feature complies with current Fibre Channel ANSI standards and can be used on McDATA switches in
homogeneous as well as heterogeneous fabrics.
Configuration and management of Open Trunking is provided via management interfaces (EFCM, CLI and
SANpilot/EFCM Basic) through the following mechanisms:
• McDATA feature key support. A unique feature key is required for each switch that will have Open Trunking enabled.
• Open Trunking enable/disable. A user-configurable parameter that allows Open Trunking to be supported on all ISLs for a switch; the default is "disabled."
• Per-port offloading thresholds. When the bandwidth consumption of outbound traffic on an ISL exceeds the configured threshold, an attempt may be made to move flows to other equal-cost, but less heavily loaded, ISLs.
• Per-switch low BB credit threshold. When the percentage of time that a port spends with 0 (zero) BB credit exceeds this threshold, an attempt may be made to move flows to other equal-cost, but less heavily loaded, ISLs.
• Event generation enable/disable for "Low BB Credit Threshold Exceeded" and "Bandwidth Consumption Threshold Exceeded." If enabled, these events appear in the Event Log, as well as events that indicate when the condition has ended.
• Open Trunking Reroute Log. This log contains entries that indicate a flow reroute.
The objective of the Open Trunking feature is to make the most efficient use of redundant ISLs. Consider
the fabric configuration in Figure 9 with five HBA N_Ports (on the right), six storage N_Ports (on the left), and four ISLs—and assume that all N_Ports and ISLs are 2 Gbit/sec. SW1 and SW2 are two Brocade M-Series FICON directors that support Open Trunking.
Figure 9. Fabric configuration with Open Trunking
Without Open Trunking, M-EOS software makes only a simple attempt to balance the loads on the four ISLs
by allocating receive N_Ports round-robin to transmit ISLs. This results in each of SW2’s transmit ISLs
carrying data from no less than one and no more than two HBAs, and each of SW1’s transmit ISLs carrying
data from no less than one and no more than two disks. While this sort of load balancing is better than
nothing, it has a major shortcoming: Actual ISL bandwidth oversubscription is not taken into account. If
HBA1 and HBA5 are trying to send data at 2 Gbit/sec each while HBA2, HBA3, and HBA4 are sending little
or no data, it is possible that HBA1 and HBA5 nevertheless find themselves transmitting their data on the
same ISL. If each ISL has 2 Gbit/sec capacity, the result is that both HBA1 and HBA5 see their effective
data rate cut in half, even though 75 percent of the total bandwidth between SW1 and SW2 is unused.
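A tiny numeric sketch of that scenario (all figures hypothetical, Python used only to carry the arithmetic): with HBA1 and HBA5 round-robined onto the same 2 Gbit/sec ISL, each achieves only half its offered rate while the other three ISLs sit idle.

isl_capacity_gbps = 2.0
offered_gbps = {"HBA1": 2.0, "HBA2": 0.0, "HBA3": 0.0, "HBA4": 0.0, "HBA5": 2.0}
isl_of = {"HBA1": "ISL1", "HBA2": "ISL2", "HBA3": "ISL3", "HBA4": "ISL4", "HBA5": "ISL1"}

# Offered load per ISL, then the fair share each HBA actually achieves.
per_isl = {}
for hba, isl in isl_of.items():
    per_isl[isl] = per_isl.get(isl, 0.0) + offered_gbps[hba]

for hba, isl in isl_of.items():
    if offered_gbps[hba] > 0.0:
        share = offered_gbps[hba] / per_isl[isl]
        achieved = min(offered_gbps[hba], isl_capacity_gbps * share)
        print(f"{hba} achieves {achieved:.1f} Gbit/sec of {offered_gbps[hba]:.1f} offered")

used = sum(min(load, isl_capacity_gbps) for load in per_isl.values())
print(f"fraction of ISL bandwidth unused: {1 - used / (4 * isl_capacity_gbps):.0%}")   # 75%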
Open Trunking periodically examines traffic statistics and reroutes traffic as needed from heavily loaded
ISLs to less-loaded ISLs. It does this rerouting by modifying switch hardware forwarding tables. Traffic may
be rerouted from an ISL of one capacity to an ISL of another capacity if that would improve the overall
balance of traffic. Open Trunking is performed using the FSPF shortest-path routing database. In M-Series
switches, all ISLs are assigned equal FSPF cost so that all paths with the minimum number of ISL hops can
be used. (This FSPF link cost is independent of the Open Trunking cost functions discussed later.) The
result is that the shortest paths from a given switch to a given destination domain often use transmit ISLs
that have different speeds or go to different adjacent switches. When rerouting for load balancing, Open
Trunking may reroute traffic among all such ISLs.
Open Trunking is not restricted to rerouting among ISLs of the same bandwidth. Special care is taken when
balancing loads among ISLs of different speed for two reasons: First, the user-perceived latency from a
high-bandwidth ISL versus a low-bandwidth ISL at the same loading level is not normally the same; it can be
expected to be higher for the low-bandwidth ISL even though both have the same percentage loading. So
simply equalizing the percentage loading on the two does not work. Second, it is very easy to inadvertently
swamp a low-bandwidth ISL by offloading traffic from a high-bandwidth ISL if the statistics for that traffic are
underestimated, as is frequently the case when traffic is ramping up. Much of the complexity in the
algorithms used is due to the problem of rerouting safely among ISLs having differing bandwidths.
Use of Data Rate Statistics by Open Trunking
Open Trunking measures as accurately as possible these three statistics:
• The long-term (about a minute or so) statistical rates of data transmission between each ingress port (ISL or N_Port) and each destination domain
• The long-term statistical loading of each ISL, measured over the same time span as the above
• The long-term average percentage of time spent with zero transmit BB credits for each ISL.
In the initial release of Open Trunking, a combination of ingress port and destination domain is called a
“flow.” So the first item in the list above simply states that the statistical data rate of each flow is measured.
Open Trunking uses these statistics to reroute flows as needed so as to minimize overall perceived
overloading. For example, in Figure 9, if ISL1 is 99 percent loaded and has traffic from HBA1 and HBA2,
while ISL2 is 10 percent loaded with traffic from HBA3, it might reroute either the flow from HBA1 or HBA2
onto ISL2. The choice is determined by flow statistics: If the flow from HBA1 to SW1 is 1.9 Gbit/sec, it does
not reroute that flow, because doing so would overload ISL2. In that case only the flow from HBA2 to SW1 is
rerouted.
Unfortunately, Open Trunking cannot help ISLs that spend a lot of time unable to transmit due to lack of BB
credits. This is a condition that is normally caused by overloaded ISLs or poor-performing N_Ports
elsewhere in the fabric, not at the local switch. The 0 (zero) BB credit statistic is primarily used to ensure
that Open Trunking does not make things worse by rerouting traffic onto ISLs that are lightly used but have
little or no excess bandwidth due to credit starvation.
It should be noted that the 0 (zero) BB credit statistic is not just the portion of time spent unable to transmit
due to credit starvation. It also includes the portion of time spent transmitting with no more transmit
credits. Since a credit is consumed at the start of a frame and not at the end of a frame, an ISL that is
transmitting may have no transmit BB credits. It is common for an ISL to be 100 percent loaded and still
have a 0 (zero) transmit BB credit statistic of close to 100 percent.
Rerouting Decision Making
At the core of Open Trunking is a cost function that computes a theoretical cost of routing data on an ISL.
It is this cost function that makes it possible to compare loading levels of links with different bandwidth,
1 Gbit/sec versus 2 Gbit/sec: a 1 Gbit/sec ISL with 0.9 Gbit of traffic is not equally as loaded as a
2 Gbit/sec ISL with 0.9 Gbit of traffic. The cost function is based on the ISL loading and the link bandwidth.
As a function of the ISL loading, it is steadily increasing with increasing slope.
All rerouting decisions are made so as to minimize the cost function. This means that a flow is rerouted
from ISL x to ISL y only if the expected decrease in the cost function for ISL x, computed by subtracting the
flow’s data rate from ISL x’s data rate, is greater than the expected increase in the cost function for ISL y.
In fact, to enhance stability of the system, the expected increase in the cost function for ISL y must be at
least 10 percent less than the expected decrease in the cost function for ISL x.
The cost functions are kept in pre-compiled tables, one for each variety of ISL (currently 2 Gbit/sec and
1 Gbit/sec). The 10 percent differential mentioned above is hard-coded in the tables.
The cost function is needed mainly because of the difficulty of making rerouting decisions among ISLs of
different bandwidths; without this requirement Open Trunking could reroute in such a way as to minimize
the maximum ISL loading.
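As a hedged illustration of that decision rule (the cost curve and traffic figures below are invented for the example; the real M-EOS tables are pre-compiled and not reproduced here), a flow is moved only when the expected drop in the source ISL's cost exceeds the expected rise in the target ISL's cost by at least the 10 percent margin:

def cost(load_gbps: float, capacity_gbps: float) -> float:
    # Any steadily increasing, convex function of utilization serves for the sketch.
    u = min(load_gbps / capacity_gbps, 0.999)
    return u / (1.0 - u)

def should_reroute(flow_gbps, x_load, x_cap, y_load, y_cap, margin=0.10):
    decrease = cost(x_load, x_cap) - cost(x_load - flow_gbps, x_cap)
    increase = cost(y_load + flow_gbps, y_cap) - cost(y_load, y_cap)
    return increase <= (1.0 - margin) * decrease

# A 0.4 Gbit/sec flow on a nearly full 2 Gbit/sec ISL, with a lightly loaded ISL available:
print(should_reroute(0.4, x_load=1.9, x_cap=2.0, y_load=0.3, y_cap=2.0))   # True
# The same flow between two lightly loaded ISLs is left where it is:
print(should_reroute(0.4, x_load=0.6, x_cap=2.0, y_load=0.3, y_cap=2.0))   # False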
Checks on the Cost Function
Making improvement of the cost function the sole condition for rerouting would create an unacceptable and
unnecessary risk of instability in routing for these reasons:
• Statistics cannot be measured with 100 percent accuracy.
• Statistics, even when measured accurately, may be in a state of flux when measured.
• The cost function can be improved by offloading traffic from a lightly loaded ISL onto an even more lightly loaded ISL, but the minimal improvement in latency would be imperceptible to the user.
To put it simply, too many flows would be rerouted too often if flows were rerouted every time the cost
function could be improved. Therefore multiple checks have been implemented on the rerouting selection
process. These all prevent flows from being rerouted, even in cases in which the cost function would be
improved. Some of these can be adjusted by Brocade EFCM, CLI, or SANpilot configuration as follows:
• Two versions of the ISL statistical data rate are kept, one designed to underestimate the actual data rate and the other designed to overestimate it. When making a rerouting decision, the statistics are used in such a way as to result in the most conservative (least likely to reroute) decision.
• No flow is rerouted from an ISL unless the ISL utilization is above a minimum threshold, called the "offloading bandwidth consumption threshold," or unless it spends more than the "low BB credit threshold" portion of its time unable to transmit due to lack of BB credits. If neither of these conditions is present, there is no condition that justifies the cost of rerouting. Both of these parameters are user configurable.
• No flow is rerouted to an ISL unless the ISL expected utilization, computed by adding the flow data rate to the ISL current data rate, is less than an "onloading bandwidth consumption threshold." There is an onloading bandwidth consumption threshold for each ISL capacity. This threshold is not user configurable.
• No flow can be rerouted if it has been rerouted recently. A period of "flow reroute latency" must expire between successive reroutes of the same flow. This latency is not user configurable.
Periodic Rerouting
Periodically, every “load-balancing period,” a rerouting task runs that scans all flows and decides which
ones to reroute using the criteria discussed above. The load-balancing period is not user configurable.
Algorithms to Gather Data
Exponential Smoothing
Averages for all statistics measured are kept by means of an exponential smoothing algorithm. The
algorithm is partly controlled by a parameter called the “basic averaging time.” This number is 0.093 times
the statistical half-life, the time over which a statistic loses half its influence on the smoothed average. The
longer the basic averaging time, the slower the system is in reacting to momentary spikes in statistics. This
parameter is not user configurable.
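A minimal sketch of that smoothing behavior, assuming a one-second sampling interval (the interval and traffic values are hypothetical; only the 5575 ms default and the 0.093 relationship come from the text above):

basic_averaging_time_ms = 5575.0
half_life_ms = basic_averaging_time_ms / 0.093     # roughly 60 seconds
sample_interval_ms = 1000.0

# Choose the smoothing weight so that a sample's influence halves after one half-life.
alpha = 1.0 - 0.5 ** (sample_interval_ms / half_life_ms)

avg = 0.0
samples = [0.0] * 10 + [1.8] * 120                 # traffic (Gbit/sec) jumps after 10 seconds
for s in samples:
    avg = alpha * s + (1.0 - alpha) * avg

print(f"alpha per sample: {alpha:.4f}")
print(f"smoothed average after {len(samples)} seconds: {avg:.2f} Gbit/sec")   # ~1.35, still rising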
Use of Template Registers for Flow Statistics
The architecture of the McDATA switch products that support Open Trunking provides no way to measure the data rates of all flows simultaneously. Flow data rates have to be sampled using a small number of
Template Registers. Each Template Register can be set to measure a single flow at a time. Template
Registers examine each flow in turn, counting data for any one flow for a period of “sample tree sample
time.” The numbers gathered are statistically weighted and exponentially smoothed to provide a flow data
rate statistic for use in rerouting decision-making.
Frame Size Estimation
The Template Registers on non-4XXX series switches measure frame counts, not word or byte counts. But
trunking requires word counts, because flow data rates are compared to ISL capacities measured in words
per second. On 4XXX-series products, the Template Registers count actual word rates.
On non-4XXX series switches, under normal circumstances, this problem is resolved by multiplying the
statistical frame rates by the size of a maximum-size frame plus the minimum inter-frame gap inserted by a
non-4XXX switch (which should be 40 to 50 words). This overestimates the data rate, but it is safer (less
likely to result in unnecessary reroutes) to overestimate it than to underestimate it. Besides, most frames
tend to be close to maximum size in applications having a high data rate.
However, if it is impossible to relieve bandwidth oversubscription on an ISL using this overestimate, a frame
size estimation algorithm is activated. This algorithm computes the average transmit frame size for the
flow’s transmit ISL and computes a weighted average between it and the maximum frame size. This
weighted average is then multiplied by the flow's frame rate to approximate the flow's data rate. The weighting is adjusted to favor the average frame size over the maximum frame size as long as flows cannot be rerouted from a heavily loaded ISL, and it is adjusted back the other way when they can be rerouted or when the ISL is not heavily loaded. The effect of this is that an overloaded ISL that stays overloaded tends to use an average frame size that is close to the average transmit frame size.
The speed at which this happens is controlled by the “frame size estimation-weighting factor” (not user
configurable). The default of 64 is chosen so that it takes significantly longer than the half-life of the
exponential averaging algorithm to switch to the smaller estimated frame size. Decreasing the frame size
estimation factor makes this convergence occur proportionately faster and may result in an unstable
system; increasing it increases stability but may slow down rerouting if there are a lot of very small frames
in the traffic mix.
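The sketch below captures the gist of that blending (the word counts, the starting weight, and the update rule are illustrative assumptions, not M-EOS internals): the estimate starts at the safe maximum-frame overestimate and drifts toward the ISL's observed average transmit frame size while the ISL stays overloaded.

MAX_FRAME_WORDS = 537 + 45   # roughly a maximum-size frame plus an assumed inter-frame gap
WEIGHTING_FACTOR = 64        # larger values mean a slower drift toward the average size

def estimated_frame_words(avg_xmt_words: float, weight: float) -> float:
    """Weighted blend between the observed average and the maximum frame size."""
    w = min(max(weight, 0.0), 1.0)
    return w * avg_xmt_words + (1.0 - w) * MAX_FRAME_WORDS

weight = 0.0
avg_xmt_words = 150.0        # the ISL is actually carrying small frames
for cycle in range(5):
    # While the ISL stays overloaded and nothing can be offloaded, shift the
    # weighting a little further toward the observed average each cycle.
    weight += (1.0 - weight) / WEIGHTING_FACTOR
    print(f"cycle {cycle}: estimated frame size {estimated_frame_words(avg_xmt_words, weight):.0f} words")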
Summary of Open Trunking Parameters
The following table summarizes the parameters used for Open Trunking. Note that only parameters
with a “Yes” in the “User Set” column can be changed by an EFCM, CLI or SANpilot/EFCM Basic user.
Table 2. Open Trunking parameters

Name | User Set | Default | What it Affects | Comments
Basic averaging time | No | 5575 ms | Speed at which statistics reflect changes | Internally, load-balancing period should be adjusted with it.
Flow reroute latency | No | 60 sec | Minimum time between reroutes for a flow | Every reroute has a chance of misordering frames.
Sample tree sample time | No | 100 ms | Flow data rate accuracy; CTP processor loading due to flow statistics (the most CPU-intensive operation Open Trunking performs) | Only increase (internally) if CTP processor loading is too high.
Load-balancing period | No | 45 sec | Rate at which flows are checked for rerouting | Consider (internally) adjusting basic averaging time with it.
Failover disable time | No | 60 sec | Time rerouting is disabled after failover | Internally, adjust if rerouting instability is seen on failover.
Offloading bandwidth consumption threshold | Yes | Default offloading bandwidth consumption threshold for the ISL capacity | Loading level above which rerouting is considered for the ISL; it may be set individually for each ISL, or the user may select use of defaults per ISL | Adjust down for very latency-intensive applications. Adjust together with the onloading bandwidth consumption threshold.
Default offloading bandwidth consumption threshold (1 G and 2 G) | No | 66% (1 G), 75% (2 G), 75% (4 G) | The values that 1 Gbit/sec, 2 Gbit/sec, and 4 Gbit/sec ISLs respectively use as defaults | (Internal) See above.
Onloading bandwidth consumption threshold (1 G and 2 G) | No | 66% (1 G), 75% (2 G), 75% (4 G) | Loading level below which an ISL is eligible to have traffic rerouted to it | Internally, adjust along with the offloading bandwidth consumption threshold.
Frame size estimation weighting | No | 64 | Speed at which an extremely oversubscribed ISL switches from using maximum frame size to using average transmit frame size in the flow data rate computation |
Low BB credit threshold | Yes | 50% | Threshold on the percentage of sample time during which the ISL has experienced a 0 (zero) BB credit condition | Reroutes occur from the ISL if overall load balance can be improved by doing so, and/or reroutes to this ISL are prevented when the threshold is exceeded.
Fabric Tuning Using Open Trunking
The default configuration for Open Trunking event generation is “disabled.” When the feature is enabled, it
is recommended that these events be left disabled unless the user is explicitly monitoring Open Trunking
behavior with an eye to tuning or optimizing the fabric. When enabled, these events will indicate detected
conditions that can be improved or alleviated by examining the traffic patterns through the entire fabric.
The interpretations given to the two sets of events related to Open Trunking are as follows:
• Bandwidth consumption threshold exceeded on an ISL
Explanation: Open Trunking firmware has detected that there is an ISL with Fibre Channel traffic that exceeds the configured offload threshold.
Action: Review the fabric topology using the switch topology guidelines. This condition can be relieved by adding parallel ISLs, increasing the link speed of the ISL, or moving devices to different locations in the fabric.
• Low BB credit threshold exceeded on an ISL
Explanation: Open Trunking has detected a transmit ISL that has no credits for data transmission for a portion of time greater than the low BB credit threshold. This is a possible indication of heavy loading or oversubscription in the fabric downstream from the exit port if the available bandwidth usage on the ISL is not close to 100 percent.
Action: Review the fabric topology using the switch topology guidelines. This condition can be relieved downstream by adding parallel ISLs, increasing the link speed of the ISL, or moving devices to different locations in the fabric. If this condition is brief and rare, or if the reporting ISL has close to 100 percent throughput, it may be ignored. Manually increasing this configured threshold toward 100 percent when close to 100 percent of the bandwidth is being utilized will reduce the frequency of these events. Slow-draining downstream non-ISL devices may also be a cause of this event, and adding ISLs will not alleviate the occurrence of this event in those situations.
Open Trunking Enhancements
Rerouting has impacts: Whenever traffic is rerouted as a result of Open Trunking or other infrequent fabric
situations such as the loss of an ISL, there is a possibility of out-of-order frame delivery. Therefore the
algorithms used by Open Trunking are extremely cautious and are based on long-term stable usage
statistics. A significant change in traffic patterns must last for about a minute or longer, depending on the
situation, before Open Trunking can be expected to react to it.
Significant improvements to Open Trunking were implemented in M-EOS 6.0 to reduce the likelihood of a
reroute causing frames to arrive out of order at N_Ports. Some devices react adversely when they receive
an Out-Of-Order Frame (OOOF), sometimes triggering retry processing of the FCP Exchange that can take as
long as a minute.
These out-of-order frames caused by Open Trunking reroutes can trigger occasional BF2D, xx1F, and AB3E
errors on EMC Symmetrix FC adapters. Some Open Systems hosts will log temporary disk access types of
events, and Windows hosts attached to EMC CLARiiON® arrays might see Event 11s. In FICON
environments, an InterFace Control Check (IFCC) error can result from an out-of-order frame. Note,
however, that discarded frames could also trigger many of these same problems. Open Trunking is
specifically designed to alleviate congestion conditions that often cause discarded frames.
In order to reduce the likelihood of these types of host or array problems, significant resources were
invested in reducing the occurrence of OOOFs when Open Trunking determines a reroute is necessary.
M-EOS 6.0 and later includes optimizations to allow the original path to drain remaining queued frames
prior to starting transmission on the new rerouted path. In addition to a small delay to allow frames to drain
from the current egress port, a small additional delay is included to allow the downstream switch some time
to deliver the frames it has already received. This prevents OOOFs resulting from unknown downstream
congestion when the re-route occurs.
Even with the improvements to reduce OOOF delivery by temporarily delaying transmission of frames on
the new path, it is strongly recommended that Open Trunking be used in single-hop configurations to avoid any incremental increase in the possibility of an OOOF. Fabric configurations with more than one hop are
acceptable as long as the hop count between data paths (N_Port to N_Port) is limited to one.
Through extensive testing, it has been determined that the delay imposed by allowing the original path to
drain does not significantly impede performance. In fact, the net delay introduced with these enhancements
is typically less than 10 ms. In most situations, the congestion resolved by the reroute typically would have
caused much longer frame delivery delays than the new Open Trunking behavior introduces. Product testing
also shows that the new enhancements have virtually eliminated the occurrence of OOOFs, even in
extremely congested and dynamic fabric conditions.
Open Trunking Summary
The Brocade M-Series Open Trunking feature automatically balances performance throughout a FICON
cascaded storage network, while minimizing storage administrator involvement in that management.
Brocade Open Trunking could be characterized as “load balancing” in that it detects conditions where
FICON is experiencing congestion on a single cascaded link and checks to see if there are other
uncongested cascaded links available. If it can relieve the congestion, it permanently shifts some of the
traffic, essentially "balancing" the loads across all available cascaded links over time. The term "dynamic
load-balancing" is often used to reflect the fact that it continuously monitors for cascaded link congestion
and can automatically rebalance the flows across these fabric links at any time, adjusting as traffic patterns
change and continuously balancing loads as conditions dictate.
Although Brocade calls this intelligent fabric management scheme “Open Trunking,” it is dissimilar to the
traditional “hardware trunking,” because cascaded links are not grouped into "trunks" and data flows are
not interleaved across those trunks of cascaded links. There are benefits to hardware trunking for certain
aspects of fabric behavior, but there are also drawbacks. There are restrictions for which ports can be
trunked and simultaneous over-congestion on these hardware trunked ports must be constantly monitored.
Open Trunking is much more flexible in this regard, because it sets no limit to the number of ports that can
be used for an “open trunk” group. Regarding the interleaving capability of hardware trunks, unless you
experience cascaded link congestion, you do not want to "balance" an ISL. No benefit is derived from frame
interleaving over cascaded links that are not suffering from congestion.
Open Trunking is invisible to all mainframe applications and requires no user interaction; it is truly
“automatic”. So FSPF can and should do your initial cascaded link routing automatically, and then Open
Trunking immediately and automatically solves cascaded link congestion problems that occur, when they
occur and without your involvement in the Brocade M-Series FICON environment.
Controlling FICON Cascaded Links in More Demanding Environments
Sometimes customers have requirements to control how FICON cascaded links are used explicitly.
For example, deploying an intermixed FICON and FC infrastructure might create one of these more rigid
environments. “Fat Pipe” high-speed cascaded links might also need to be managed to service one specific
environment and not another. So you need to understand what mechanisms are available to you under
these circumstances and other situations that you might encounter specific to your enterprise.
FSPF and some variation of trunking can still be used to automate as much cascaded link decongestion as
possible. But they must be influenced by other tools to give you greater manual control over the complete
infrastructure. Two additional tools are available to influence the allocation and decongestion of cascaded
links—Preferred Path and Prohibit Path.
Preferred Path on M-Series FICON Switches
Preferred Path is an optional feature for Brocade M-Series switches that allows you to influence the route
of data traffic when it traverses multiple switches in a fabric. Using Preferred Path and your in-depth
knowledge of your own I/O environment, you can define routes across a fabric and specify your preference
regarding the assignment of CHPIDs and storage ports to specific cascaded links.
If more than one cascaded link (ISL) connects switches in a fabric, you can specify a cascaded link
preference for a particular flow. The data path consists of the source port of the switch being configured, the
exit port of that switch, and the domain ID of the destination switch, as shown in Figure 10.
Each switch must be configured for its part of the desired path to achieve optimal performance. You may
need to configure Preferred Paths for all switches along the desired path for proper multi-hop Preferred
Path operation. Preferred Path can be configured using either CLI or Brocade EFCM. Preferred Path allows
you to control which cascaded links certain applications use based on the ports to which the channels and
devices are connected, while allowing failover to alternate cascaded links should the preferred path fail.
Figure 10. Specifying a Preferred Path using Brocade EFCM
If a Preferred Path fails, FSPF assigns a functional cascaded link to the F_Port flows on the failed
connection. When the Preferred Path is returned to service, its original F_Port flows are re-instated. Any
F_Ports that are not assigned to a cascaded link via Preferred Path are assigned to a cascaded link using
the FSPF process.
NOTE: If FSPF becomes involved in allocating cascaded links, then these flows will co-exist with the flows
created using Preferred Path on the same cascaded links. This could create congestion on one or more
cascaded links if it is not taken into consideration.
The limitations of Preferred Path are as follows:
• Open Trunking does not manage Preferred Path data flows, so Open Trunking cannot do any automatic decongestion of Preferred Path links.
• Preferred Path cannot be used to ensure a FICON cascaded 1-hop environment. This is because Preferred Path will fail over to another random cascaded link path if its primary Preferred Path fails, which could lead to a multi-hop FICON fabric. Use port blocking to ensure that FICON pathing contains only a single hop.
• The Brocade 48000 and DCX do not currently support "preferred pathing" or "blocked E_Ports" for cascaded FICON directors. Brocade has a command called uRouteConfig, which allows you to set up static routes for specific ports on the switch. But uRouteConfig requires aptpolicy=1 (port-based routing). (Note that IBM requires port-based routing for Brocade 48000 and DCX FICON Cascade Mode.) However, uRouteConfig is not supported when the chassis configuration is chassisconfig 5, which is the chassis configuration for the Brocade 48000 and DCX.
Here are some best practices for using Preferred Path:
• If you are going to use Preferred Path, then assign every F_Port in the fabric to a cascaded link using Preferred Path and do not let FSPF do any cascaded link assignments.
• Disk and tape should not share the same cascaded links. FICON and FCP traffic should also be separated across the cascaded links. Use Preferred Path in these cases to direct these frame traffic flows to different cascaded links.
Prohibit Paths
A FICON frame and an FCP frame are essentially the same with the exception of the data payload the frame
is carrying. So, technically speaking, a cascaded link (ISL) can carry both FICON and FCP frames
consecutively over their links with no problems. But many customers want to separate FICON network traffic
from SAN network traffic (and maybe even from disk replication traffic), and then make sure that these
different flows of network traffic remain independent but uncongested as well. Keep in mind that creating
these separate network data flows across a fabric is a business decision and not a technical decision. But if
it is a business requirement, there are technical means to do it.
If you run systems automation and have implemented the function known as “I/O operations (IO-Ops),” then
from the MVS or z/OS console you can use in-band FICON commands that allow you to block or unblock
ports on any FICON switching devices. If you do not use IO-Ops, then by utilizing EFCM you can configure the
PDCMs by using the address configuration matrix to block or unblock ports. In that case, you use EFCM
management routines rather than MVS or z/OS procedures to do this blocking and unblocking.
First, block all SAN ports on a switching device in the fabric from transferring frames to all of the FICON
ports on the switching device. Then, block all FICON ports on a switching device in the fabric from
transferring frames to all of the SAN ports on the switching device. This completely stops the flow of frame
traffic from the blocked ports as both source and destination ports. It is done at the hardware level, so
regardless of how you zone a port or prefer a path, a frame will NEVER pass to the blocked port from the
source port unless and until you revise the PDCM blocking configuration. And you can do exactly this same
blocking for the cascaded link ports.
For example, for 10 cascaded links, you can use 4 links for SAN FC traffic only and the other 6 for FICON
traffic only. Choose 6 of the cascaded links for FICON only and block all SAN ports from using those cascaded
links. Block those 6 cascaded link ports from connecting to all of the SAN ports. Then perform the same
procedure, but this time block the 4 remaining cascaded links away from the FICON ports. This creates two
separate flows of network traffic that will never intermingle. At this point, you should consider implementing
some form of trunking to manage both of these now physically separate network data flows independently;
decongesting cascaded links but only in the set of cascaded links assigned to that specific flow.
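Conceptually, the PDCM is just a prohibit matrix at the port level; the sketch below (hypothetical port numbers, Python used purely as pseudocode, not the EFCM/CUP data structure) shows the two blocking passes described above.

ficon_ports = {0, 1, 2, 3}               # FICON channel and CU ports on this director
san_ports = {8, 9, 10, 11}               # open-systems SAN ports
ficon_isls = {16, 17, 18, 19, 20, 21}    # the 6 cascaded links reserved for FICON
san_isls = {24, 25, 26, 27}              # the 4 cascaded links reserved for SAN FC

prohibited = set()

def prohibit(group_a, group_b):
    """Block frame flow in both directions between every port pair across the two groups."""
    for a in group_a:
        for b in group_b:
            prohibited.add((a, b))
            prohibited.add((b, a))

prohibit(san_ports, ficon_isls)    # SAN ports may never use the FICON cascaded links
prohibit(ficon_ports, san_isls)    # FICON ports may never use the SAN cascaded links

def may_route(src: int, dst: int) -> bool:
    return (src, dst) not in prohibited

print(may_route(0, 16))    # True:  FICON port to a FICON cascaded link
print(may_route(8, 16))    # False: SAN port blocked from the FICON cascaded links
print(may_route(8, 24))    # True:  SAN port to a SAN cascaded link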
PDCM port blocking is the strongest method you can use to control frame flow through a switching device.
For that reason you should be very careful when you use it, since it affects FSPF, Preferred Path, and
Trunking algorithms. FSPF cannot assign a blocked cascaded link as a route across the network for ports
that are blocked from using it. You can configure a blocked cascaded link as a Preferred Path, but no
frames will ever be sent to it from the ports that are blocked.
Open Trunking cannot move work from a congested cascaded link to an uncongested cascaded link if that
uncongested link is blocked from connecting to the port with the work flow that will be moved.
NOTE: If you are experiencing a problem getting a switching port to connect to another switching port, it
might be a PDCM, hardware-blocked port. A blocked port can be very difficult to diagnose and quickly
troubleshoot. The only way you will know is to check the PDCM addressing matrix using FICON CUP or EFCM.
Figure 11. Prohibit path and the PDCM addressing matrix
Traffic Isolation Zones on B-Series FICON Switches
Using Traffic Isolation (TI) Zones, you can provision certain E_Ports to carry only traffic flowing from a
specific set of source ports. This allows you to control the flow of interswitch traffic, for example, to dedicate
an ISL to high-priority, host-to-target traffic. Or it might be used to force high-volume (but lower-priority)
traffic onto a given ISL to limit the effect of this high traffic pattern on the fabric at large. In either case a TI
zone can be created that contains the set of N_Ports and the set of E_Ports to use for specific traffic flows.
When a TI zone has been set up, the traffic entering a switch from one of the given set of ports (E_Ports or
N_Ports) uses only those E_Ports defined within that zone for traffic to another domain. But if there is no
other way to reach a destination other than by using an E_Port that is not part of that zone, that E_Port is
still used to carry traffic from and to a device in its group. This is the default behavior of TI zones, unless it is
overridden by creating a TI zone with failover disabled. In a TI zone with failover disabled, when any of the
E_Ports comprising the TI zone go down, an E_Port that does not belong to the TI zone will not be used to carry traffic, and the traffic isolation path is deemed broken. Similarly, an E_Port belonging to a particular traffic
isolation zone does not carry any other traffic belonging to devices outside the zone unless that E_Port is
the only way to reach a given domain.
The TI zones appear in the defined zone configuration only and not in the effective zone configuration.
A TI zone is only used for providing traffic isolation, and zone enforcement is based on the regular user-configured zones.
Consider the following when you are thinking about using TI zones:
• TI zones are supported on Condor and Condor 2 (ASIC) Brocade FICON switches running in Brocade native mode. TI zones cannot be used in FICON environments running in interop or McDATA fabric mode.
• TI zones are not defined in an FC standard and are unique to Brocade. However, their design conforms to all underlying FC standards, in the same way as base Fabric OS.
• TI zones are not backward compatible, so traffic isolation is not supported in FICON environments with switches running firmware versions earlier than FOS 6.0.0. However, TI zones in such a fabric do not disrupt fabric operation in switches running older firmware versions. You must create a TI zone with members belonging to FICON switches that run firmware version 6.0.0 or later.
When a zone is marked as a TI zone, the fabric attempts to isolate all inter-switch traffic entering a switch
from a member of that zone to only those E_Ports that have been included in the zone. In other words, the
domain routes for any of the members (N_Port or E_Port) to the domains of other N_Port members of the
zone are set to use an E_Port included in the zone, if it exists. Such domain routes are used only if they are
on a lowest-cost path to the target domain (that is, the FSPF routing rules will continue to be obeyed). The
fabric will also attempt to exclude traffic from other TI zones from using E_Ports in a different TI zone. This
traffic shaping is a “best effort” facility that will do its work only as long as doing so does not violate the
FSPF “lowest cost route” rules. This means that traffic from one TI zone may have to share E_Ports with
other TI zones and devices when no equal-cost routes can be found using a “preferred” E_Port. And if a
"preferred" E_Port fails, traffic fails over to a "non-preferred" E_Port if no preferred E_Ports offer a lowest-cost route to the target domain. Similarly, a non-TI device's traffic uses an E_Port from a TI zone if no equal-cost alternatives exist.
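The selection logic can be summarized with a short hedged sketch (port numbers, costs, and the function shape are illustrative; this is not the fspfd implementation): prefer the zone's own E_Ports when they are on a lowest-cost path, and fall back to other E_Ports only when failover is enabled.

def pick_eports(candidates, ti_eports, failover=True):
    """candidates: list of (eport, cost) pairs on paths to the target domain."""
    if not candidates:
        return []
    lowest = min(cost for _, cost in candidates)
    shortest = [port for port, cost in candidates if cost == lowest]
    preferred = [port for port in shortest if port in ti_eports]
    if preferred:
        return preferred                       # dedicated path honoured
    return shortest if failover else []        # fail over to non-preferred, or path is broken

ti_zone_eports = {1}
print(pick_eports([(1, 500), (2, 500), (3, 500)], ti_zone_eports))          # [1]
print(pick_eports([(2, 500), (3, 500)], ti_zone_eports))                    # [2, 3]: failover
print(pick_eports([(2, 500), (3, 500)], ti_zone_eports, failover=False))    # []: path broken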
As mentioned earlier, TI zones do not appear in the effective zone set for a number of reasons. First, the
members are defined using D,I notation. Doing so allows the routing controls to be determined at the time
the zones are put into effect, eliminating the significant overhead that would be required if WWNs were
used and the routing controls were discovered incrementally, as devices come online. But the use of D,I in
TI zones would cause issues on switches running versions of FOS earlier than 6.0.0 if included in the
effective set, setting all zones referencing devices included in a TI zone to Session mode based on mixed
mode zoning. Additionally, the intent of a TI zone is to control routing of frames among the members and
is not intended to “zone them all together.” The Zone daemon (zoned) extracts all TI zones from the defined
zone database whenever a change is made to the defined database and pushes them to the nsd for
application.
When a TI zone is being activated, the nsd in each switch determines if any of the routing preferences
for that zone apply to the local switch. This determination must include the appropriate screening for
Administrative Domain (AD) membership if ADs are being used. If any valid TI zones are found that apply
to members on this switch, the nsd in turn pushes the TI zone to fspfd. The FSPF daemon (fspfd) is
responsible for applying the routing controls specified by TI zones.
The fspfd applies those preferences using a new set of APIs (that is, ioctls) provided by the kernel routing
functions.
Figure 12. An example of the use of TI zones
Consider the following TI zones created for the fabric shown in Figure 12:
zone --create -t ti "redzone" -e "1,1; 2,2; 2,4; 2,6; 3,8; 4,5" -n "1,8; 1,9; 3,6; 3,7; 4,8; 4,9; 4,7"
zone --create -t ti "bluezone" -e "1,10; 2,10; 2,20; 3,12; 4,20" -n "1,20; 3,22; 4,10"
The TI zone redzone creates dedicated paths from Domains 1, 3, and 4 through the core switch Domain 2.
All traffic entering Domain 1 from device ports 8 and 9 is routed through port 1, regardless of which domain it is going to. And no traffic coming from other ports in Domain 1 uses port 1, again regardless
of which domain it is going to. Similarly, any traffic entering Domain 2 from port 2 is routed only to port 8 or
6 when going to Domains 3 or 4 respectively. And port 2 is used solely for traffic coming from ports 4 or 6
(the other redzone E_Ports in Domain 2).
Each TI zone is interpreted by each switch and each switch considers only the routing required for its local
ports. No consideration is given to the overall topology and to whether the TI zones accurately provide
dedicated paths through the whole fabric. For example, the TI zone called “bluezone” creates a dedicated
path between the two blue devices on Domains 1 and 3 (port 20, 22). However, a misconfiguration of
Domain 4 will result in port 20 being used only for traffic coming from the device on port 10 (that is, a
dedicated E_Port for outbound traffic), but that traffic uses only the “black” E_Ports to go to Domains 3 or 1.
Similarly, all blue traffic coming into Domain 2 goes to Domain 4 through one of ports 44–50, since no
blue E_Port has been configured in Domain 2 that connects to Domain 4. Nothing fatal will occur, but the
results may not meet expectations. The correct configuration would have included 2,44 in the E_Port list.
TI Zones Best Practices
A few general rules for Traffic Isolation zones:
• An N_Port can be a member of only a single TI zone, because a port can have only one route to any specific domain. This “non-duplication” rule is enforced during zone creation and modification. If ADs are configured, this checking is done only against the current AD’s zone database. The zone --validate command checks against the defined database of all ADs.
• An E_Port can be a member of only a single TI zone. Since an E_Port can be a source port (that is, for incoming frames) as well as a destination, the same “one route to a specific domain” rule applies to E_Ports and forces this limitation. The same checking is done as described for N_Ports.
• If multiple E_Ports are configured that are on the lowest-cost route to a domain, the various source ports for that zone are load balanced across the specified E_Ports.
• A TI zone provides exclusive access to E_Ports (for outbound traffic) as long as other equal-cost, non-dedicated E_Ports exist. Only source ports included in the zone are routed to zone E_Ports as long as other paths exist. If no other paths exist, the dedicated E_Ports are used for other traffic. Note that when this occurs, all traffic routed to the “dedicated” E_Port uses the dedicated path through switches, regardless of which ports are the source.
• No port can appear in both a TI zone and an ISL Binding zone. (A sketch of these membership checks follows the AD rules below.)
A few more rules if ADs are in effect:
• If used within an AD, the E_Ports specified in a TI zone must be in that AD’s device list, enforced during zone creation and modification.
• Since TI zones must use D,I notation, the AD’s device list must be declared using D,I for ports that are to be used in such zones, enforced during zone creation and modification.
• Take care if you are using TI zones for shared ports (E_Ports or N_Ports) because of the limitation that a given port can appear in only one TI zone. Conflicting members across ADs can be detected by the use of zone --validate, and best practice dictates that such situations not be allowed to persist. (It might be best not to allow ISL Bind or TI zones to reference a shared N_Port or E_Port, since one AD administrator can then interfere with the actions of another AD administrator. But this may be hard to do.)
The following is an example of implementing FICON and FCP (SAN) intermix on the same fabric(s) to more rigidly control FICON and cascaded links in this type of environment.
The challenge for mixing FCP and FICON comes from the management differences between the two
protocols, primarily the mechanism for controlling device communication. Because FICON and FCP are FC-4 protocols, they do not affect the actual switching of frames; therefore, the differences are not relevant until
the user wants to control the scope of the switching through zoning or connectivity control. Name Server
zoning used by FCP devices, for example, provides fabric-wide connection control. By contrast, PDCM
connectivity control typically used by FICON devices provides switch-wide connection control.
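The difference in scope can be shown with a small Python sketch. The data structures are hypothetical and not any vendor’s implementation; they simply illustrate why Name Server zoning is evaluated fabric-wide while a PDCM-style allow/prohibit matrix is evaluated per switch.

# Hypothetical sketch contrasting the two connection-control scopes above.
def zoned_together(zones, wwn_a, wwn_b):
    """Fabric-wide check: two devices may communicate if any zone holds both."""
    return any(wwn_a in members and wwn_b in members for members in zones.values())

def pdcm_allows(prohibit_pairs, port_a, port_b):
    """Switch-wide check: a PDCM-style matrix prohibits specific port pairs on
    one switch; any pair not explicitly prohibited is allowed."""
    return (port_a, port_b) not in prohibit_pairs and (port_b, port_a) not in prohibit_pairs

zones = {"ficon_prod": {"wwn_cu", "wwn_chpid"}}          # fabric-wide zoning
prohibits = {(4, 17)}                                     # switch-local PDCM entry
print(zoned_together(zones, "wwn_chpid", "wwn_cu"))       # True
print(pdcm_allows(prohibits, 17, 4))                      # False on this switch only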
Mainframe and storage vendors strongly recommend that if you are implementing intermix, you block the transfer of any and all frames from a FICON switch port to all SAN-connected ports. You then need to do the reverse as well, blocking the transfer of any and all frames from a SAN switch port to all FICON-connected ports. But what about the cascaded links (called ISLs in the SAN world)? Can they be
shared by both FICON and FCP?
SUMMARY
For the mainframe customer, FICON cascading offers new capabilities to help meet the requirements of
today’s data center. Your challenge is to ensure performance across the FICON fabric’s cascaded links to achieve the highest possible level of data availability and application performance at the lowest possible cost.
APPENDIX: FIBRE CHANNEL CLASS 4 CLASS OF SERVICE (COS)
Some initial QoS efforts were made in the T11 Standards group to develop a QoS standard for FC. It was
written as a Class of Service, and it was very complex. Consultants worked with the major switch vendors
to develop a set of proposals that impacted several different standards. A summary of Class 4 follows. It
was never formally adopted or implemented. The discussion of Class 4 is included to reinforce the point
that QoS is a complex topic, and not just a marketing buzzword.
A Fibre Channel class of service can be defined as a frame delivery scheme exhibiting a specified set of
delivery characteristics and attributes. ESCON and FICON are both part of the FC standard and class of
service specifications.
• Class 1. A class of service providing a dedicated connection between two ports with confirmed delivery or notification of non-delivery.
• Class 2. A class of service providing a frame switching service between two ports with confirmed delivery or notification of non-deliverability.
• Class 3. A class of service providing a frame switching datagram service between two ports or a multicast service between a multicast originator and one or more multicast recipients.
• Class 4. A class of service providing a fractional bandwidth virtual circuit between two ports with confirmed delivery or notification of non-deliverability.
Class 4 is frequently referred to as a “virtual circuit” class of service. It works to provide better quality of
service guarantees for bandwidth and latency than Class 2 or Class 3 allow, while providing more flexibility
than Class 1 allows. Similar to Class 1, it is a type of dedicated connection service. Class 4 is a connection-oriented class of service with confirmation of delivery (acknowledgement) or notification that a frame could
not be processed (reject). Class 4 provides for the allocation of a fraction of the bandwidth on a path
between two node ports and guarantees latency within negotiated QoS bounds. It provides a virtual circuit
between a pair of node ports with guaranteed bandwidth and latency in addition to the confirmation of
delivery or notification of non-deliverability of frames. For the duration of the Class 4 virtual circuit, all
resources necessary to provide that bandwidth are reserved for that virtual circuit, so it is frequently
referred to as a “virtual circuit class of service.”
Unlike Class 1, which reserves the entire bandwidth of the path, Class 4 supports the allocation of a
requested amount of bandwidth. The bandwidth in each direction is divided up among up to 254 Virtual
Circuit (VC) connections to other N_Ports on the fabric. When the virtual circuits are established, resources
are reserved for the subsequent delivery of Class 4 frames. Like Class 1, Class 4 provides in-order delivery
of frames. A Class 4 circuit includes at least one VC in each direction with a set of QoS parameters for each
VC. These QoS parameters include guaranteed transmission and reception bandwidths and/or guaranteed
maximum latencies in each direction across the fabric. When the request is made to establish the virtual
circuit, the request specifies the bandwidth requested, as well as the amount of latency or frame jitter
acceptable.
Bandwidth and latency guarantees for Class 4 virtual circuits are managed by the QoS Facilitator (QoSF), a
server within the fabric. The QoSF is at the well-known address x’FF FFF9’ and is used to negotiate,
manage, and maintain the QoS for each VC and assure consistency among all the VCs set up across the full
fabric to all ports. The QoSF is an optional service defined by the Fibre Channel Standards to specifically
support Class 4 service. Because the QoSF manages bandwidth through the fabric, it must be provided by a
Class 4-capable switch.
At the time the virtual circuit is established, the route is chosen and a circuit created. All frames associated
with the Class 4 virtual circuit are routed via that circuit, ensuring in-order frame delivery within a Class 4
virtual circuit. In addition, because the route is fixed for the duration of the circuit, the delivery latency is
deterministic. Class 4 includes the concept that a VC can be in a “dormant” state, with the VC set up at the N_Ports and through the fabric but with no data flowing, or a “live” state, where data is actively flowing.
To set up a Class 4 virtual circuit, the CircuiT Initiator (CTI) sends a QoS Request (QoSR) extended link
service command to the QoSF. The QoSF verifies that the fabric has the available transmission resources to
satisfy the requested QoS parameters, and then forwards the request to the CircuiT Recipient (CTR). If the
fabric and the recipient can both provide the requested QoS, the request is accepted and the transmission
can start in both directions. If the requested QoS parameters cannot be met, the request is rejected.
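Because Class 4 was never implemented, the following Python sketch is purely conceptual: it models the negotiation just described, with the QoSF accepting a circuit only when both the fabric and the recipient can honor the requested bandwidth and latency, and rejecting it otherwise. The class and field names are invented for illustration.

# Conceptual model of Class 4 circuit setup; Class 4 was never implemented.
from dataclasses import dataclass

@dataclass
class QoSRequest:               # the QoSR sent by the circuit initiator (CTI)
    bandwidth_mbps: float       # requested fractional bandwidth
    max_latency_us: float       # acceptable latency/jitter bound

class QoSFacilitator:           # stand-in for the QoSF at x'FF FFF9'
    def __init__(self, fabric_mbps, recipient_mbps, fabric_latency_us):
        self.fabric_mbps = fabric_mbps
        self.recipient_mbps = recipient_mbps
        self.fabric_latency_us = fabric_latency_us

    def setup_circuit(self, req):
        if req.max_latency_us < self.fabric_latency_us:
            return "rejected: latency bound cannot be met"
        if req.bandwidth_mbps > min(self.fabric_mbps, self.recipient_mbps):
            return "rejected: insufficient bandwidth"
        # Reserve resources for the life of the virtual circuit.
        self.fabric_mbps -= req.bandwidth_mbps
        self.recipient_mbps -= req.bandwidth_mbps
        return "accepted"

qosf = QoSFacilitator(fabric_mbps=200, recipient_mbps=100, fabric_latency_us=50)
print(qosf.setup_circuit(QoSRequest(bandwidth_mbps=50, max_latency_us=500)))   # accepted
print(qosf.setup_circuit(QoSRequest(bandwidth_mbps=80, max_latency_us=500)))   # rejected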
In Class 4, the fabric manages the flow of frames between node ports and the fabric by using the virtual-circuit flow control mechanism. This is a buffer-to-buffer flow control mechanism similar to the R_RDY FC flow control mechanism. Virtual-circuit flow control uses the VC ready (VC_RDY) ordered set. VC_RDY resembles FC R_RDY, but it contains a virtual circuit identifier byte in the primitive signal, indicating which VC is being given the buffer-to-buffer credit. ISLs must also support virtual-circuit flow control in order to manage the flow of Class 4 frames between switches.
Each VC_RDY indicates to the N_Port that a single Class 4 frame is needed from the N_Port if it wishes to
maintain the requested bandwidth. Each VC_RDY also identifies which virtual circuit is given credit to send
another frame. The fabric controls the bandwidth available to each virtual circuit via the frequency of
VC_RDY transmission for that circuit. One VC_RDY per second is permission to send 1 frame per second
(2 kilobytes per second if 2 K frame payloads are used). One thousand VC_RDYs per second is permission
to send 1,000 frames per second (2 megabytes per second if 2 K frame payloads are used). The fabric is
expected to make any unused bandwidth available for other live Class 4 circuits and for Class 2 or 3
frames, so the VC_RDY does allow other frames to be sent from the N_Port.
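The pacing arithmetic above can be captured in a few lines of Python; this is illustrative only, and assumes the 2 KB frame payload used in the example.

# Reproduces the arithmetic above: the fabric paces each Class 4 VC by the
# rate at which it issues VC_RDYs, and each VC_RDY permits one frame.
def permitted_bytes_per_second(vc_rdy_per_second, payload_bytes=2048):
    return vc_rdy_per_second * payload_bytes

print(permitted_bytes_per_second(1))      # 1 VC_RDY/s    -> 2,048 B/s     (~2 KB/s)
print(permitted_bytes_per_second(1000))   # 1000 VC_RDY/s -> 2,048,000 B/s (~2 MB/s)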
There are potential scalability difficulties associated with Class 4 service, since the fabric must negotiate
resource allocation across each of the 254 possible VCs on each N_Port. Also, Fabric Busy (F_BSY) is not
allowed in Class 4. Resources for delivery of Class 4 frames are reserved when the VC is established, and
therefore the fabric must be able to deliver the frames.
Class 4 is a very complex issue. For more detailed information, refer to Kembel’s Fibre Channel Consultant
series of textbooks. In addition, because of the complexity, Class 4 was never fully adopted as a standard.
Further work on it was stopped, and much of the language has been removed from the FC standard.
FC-FS-2 letter ballot comment Editor-Late-002 reflected the results of surveying the community for interest
in using and maintaining the specification for Class 4 service. Almost no interest was discovered. It was
agreed to resolve the comment by obsoleting all specifications for Class 4 service except the VC_RDY
primitive, which is used by the FC-SW-x standard in a way that is unrelated to Class 4.
Therefore, other mechanisms/models for QoS in FICON (FC) were considered, such as the method used by
InfiniBand.
© 2008 Brocade Communications Systems, Inc. All Rights Reserved. 07/08 GA-TB-017-01
Brocade, Fabric OS, File Lifecycle Manager, MyView, and StorageX are registered trademarks and the Brocade B-wing symbol,
DCX, and SAN Health are trademarks of Brocade Communications Systems, Inc., in the United States and/or in other countries.
All other brands, products, or service names are or may be trademarks or service marks of, and are used to identify, products or
services of their respective owners.
Notice: This document is for informational purposes only and does not set forth any warranty, expressed or implied, concerning
any equipment, equipment feature, or service offered or to be offered by Brocade. Brocade reserves the right to make changes
to this document at any time, without notice, and assumes no responsibility for its use. This informational document describes
features that may not be currently available. Contact a Brocade sales office for information on feature and product availability.
Export of technical data contained in this document may require an export license from the United States government.