Survivability in IP Centrex and Distributed IP-PBX
Survivability in IP Centrex and Distributed IP-PBX Environments
August 08

Table of Contents
Introduction
The Survivability Challenge – Overcoming the “Single Point of Failure” trap
IP-Centrex and Distributed IP-PBX Survivability – How does it work?
Internal office VoIP communication survivability
External office communication survivability
Recovery from network failure
Delving into technical details
Implementing an Active SAS mechanism
Implementing a Passive SAS mechanism
Active mode vs. Passive mode – Pros and Cons
The Five Star Approach
Standards Based SAS Implementation
Scalable Solution - Growing with your Enterprise
Redundancy Mechanism for Survivability Applications
Advanced Support for Emergency Calls (e.g. 911)
Maintain Maximum Functionality Level in Emergency Mode
About AudioCodes Mediant 1000 Multi Service Business Gateway (MSBG)
AudioCodes Stand Alone Survivability powered by the Mediant 1000 - MSBG
About AudioCodes

Introduction
With the evolution of the VoIP market, IP Centrex services and IP-PBX applications have become two of the predominant offerings for enterprises of all sizes. This market trend is driven by the well-known cost-effectiveness of IP-based solutions, which steadily provide a richer set of features and functionality to end users, while traditional PBXs and legacy telephony lag far behind.

However, this innovative technology creates new challenges which end users may perceive as significant drawbacks. This document discusses one of the main challenges that must be meticulously addressed: the survivability problem of the IP Centrex and distributed IP-PBX architectures, which is an outcome of the inherent dependency of the entire enterprise telephony system on a single central server.
This document also presents the “Five Star Approach”: five fundamental guidelines and capabilities that enterprise IT managers should require from their IP Centrex / IP-PBX service provider to ensure survivability and to create a reliable IP-based telephony network whose survivability features are comparable or even superior to those of the legacy telephony network (PSTN).

To better understand these challenges, let us start with a brief definition of the services themselves.

•IP Centrex (Central Exchange) – Like the traditional Centrex, IP Centrex is a solution hosted by telecommunication service providers which includes call control and service logic functionalities. Unlike the IP-PBX solution, the customer does not own the IP Centrex servers, which are located on the service provider’s premises. The IP Centrex solution is best suited for small to medium enterprises that typically lack the internal staff and expertise for maintaining the features and capabilities offered by this service.

Figure 1 - Typical IP Centrex Topology

•IP-PBX – The IP-PBX is a customer premises-based VoIP solution where all the solution’s logic and components are owned by the customer. IP-PBX is a suitable solution for the telecommunication requirements of medium to large enterprises. Distributed IP-PBX is an architecture in which the enterprise locates a central IP-PBX server in its headquarters and provides telephony services to its distributed branches.

Figure 2 - Typical Distributed IP-PBX Topology

•Stand Alone Survivability (SAS) – An application agent which is installed in the enterprise headquarters and/or branch office premises to ensure continuous telephony service during network outage scenarios, such as a WAN failure that prevents communication with the central server. To ensure full coverage of telephony services during a network failure, a SAS agent should be installed in each of the enterprise offices.
The Survivability Challenge – Overcoming the “Single Point of Failure” trap
This section describes the motivation behind the enterprise survivability requirements for IP Centrex and IP-PBX services.

The traditional telephony service, based on the legacy PSTN network, has set high standards of stability, survivability and sustainability over more than 100 years of operation. The same level of service survivability is typically expected from any IP-based solution. One of the major barriers to achieving this goal is the unique deployment architecture of IP Centrex and Distributed IP-PBX, which inherently includes components that might become a single point of failure. In simple terms, a failure in any one of these components can bring the entire telephony service to a halt.

Both IP Centrex and IP-PBX topologies suffer from a number of such potential single points of failure. The IP Centrex and IP-PBX servers are typically the only components in the network that maintain the entire telephony system topology. These servers register all the IP-Phones and Media Gateways in the network, maintain a complete and updated snapshot of the telephony network, and manage call control by routing calls to and from enterprise phones based on an internal routing database.

If the IP Centrex or IP-PBX servers fail, the enterprise loses its telephony service within every branch, between its offices, and with the external environment. In effect, the enterprise loses its main communication tool, and with it all contact with the external environment.

A similar crisis may occur in the case of a failure in the corporate WAN or a problem with an access device such as the DSL modem. The enterprise will lose its connection with the IP Centrex or the IP-PBX and, as a result, enterprise telephony capabilities will become unavailable.
Additionally, one of the worst consequences of a breakdown in telephony services is the inability to place emergency calls (e.g. 911), since this type of dialling requires a PSTN breakout which is located in the central office where the IP-PBX or IP Centrex server is located. During a failure scenario, this connection is lost. This issue must be carefully resolved before replacing the traditional telephony network with IP-based telephony technology.

Figure 3 - Single Point of Failure in IP Centrex Environment

IP Centrex and Distributed IP-PBX Survivability – How Does It Work?
This section gives a high-level view of the technical aspects of a typical SAS application.

Internal Office VoIP Communication Survivability
As previously described in this paper, the IP Centrex or IP-PBX application keeps an updated snapshot of the network structure and solely controls the telephony in the enterprise. To break this monopoly and protect against the single point of failure, SAS agents are installed in each of the enterprise offices served by the remote IP Centrex or IP-PBX. The SAS agent acts as an outbound Proxy and Registrar which registers all IP-Phones and Media Gateways in the network and simultaneously forwards the registration transactions to the main IP Centrex or IP-PBX server. This behaviour ensures automatic redundant registration for all telephony devices in the office network, since the SAS agent maintains a complete and updated copy of the telephony network structure. Thanks to this capability, if the central server fails, the SAS agent takes the lead and handles all call control within the protected enterprise branch.

External Office Communications Survivability
Once the intra-branch telephony failure is overcome, we can analyze how the SAS agent handles inter-branch connectivity and connectivity with the external world during an IP Centrex or centralized IP-PBX failure.
To answer this challenge, the SAS agent should be equipped with an integrated fallback mechanism to the local legacy PSTN network. Using this feature, once the SAS agent observes a connectivity problem with the central IP Centrex / IP-PBX server (indicated by a loss of connection with this server), it automatically redirects all outbound calls towards the PSTN network and begins forwarding inbound calls from PSTN lines to telephone devices in the office. Utilizing its integrated fallback mechanism, the SAS agent maintains complete call control and media connectivity to and from the external telephony network.

Recovery from Network Failure
During the “emergency” operation mode, the SAS agent should constantly monitor the network connectivity. Once the connection with the IP Centrex / IP-PBX server is restored, it switches back to the normal mode, in which control is provided by the IP Centrex / IP-PBX servers.

Figure 4 - IP Centrex Application

Figure 5 - Distributed IP-PBX Application

In case of failure in one of the network elements, the SAS agent takes the lead and handles all internal call control as well as the external (PSTN) call control and media connectivity via a built-in fallback mechanism.

Delving into Technical Details
This section describes in more detail the technical aspects of the solutions suggested previously. It assumes that the solution’s implementation is based on SIP (Session Initiation Protocol), so a basic understanding of this control protocol is recommended. Readers who do not want to dive into the technical aspects of this implementation may skip to the next section: The “Five Star Approach”.

There are two main alternative implementations for the SAS application: the active and passive models.
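Before looking at the two models, the fallback and recovery behaviour described in the preceding sections can be sketched as a small state machine. This is only an illustrative sketch, not AudioCodes' implementation: the names SasAgent, on_probe and route are hypothetical, and the boolean passed to on_probe stands in for whatever connectivity check a real agent uses towards the central server.

```python
from dataclasses import dataclass
from enum import Enum


class Mode(Enum):
    NORMAL = "normal"        # central IP Centrex / IP-PBX server is in control
    EMERGENCY = "emergency"  # SAS agent has taken over call control


@dataclass
class SasAgent:
    """Hypothetical SAS agent that tracks its mode and picks a call route."""
    local_extensions: set    # extensions registered in this office
    mode: Mode = Mode.NORMAL

    def on_probe(self, server_reachable: bool) -> None:
        # Assumption: a periodic connectivity check reports whether the
        # central server answered. Loss of connection triggers emergency
        # mode; a restored connection switches back to normal mode.
        self.mode = Mode.NORMAL if server_reachable else Mode.EMERGENCY

    def route(self, dialed: str) -> str:
        if self.mode is Mode.NORMAL:
            return "central-server"  # call control stays with the central server
        if dialed in self.local_extensions:
            return "local"           # intra-office call handled by the SAS agent
        return "pstn"                # everything else falls back to the PSTN
```

For example, an agent created with SasAgent(local_extensions={"101", "102"}) routes every call to the central server while in normal mode. After on_probe(False) it routes "102" locally and "911" to the PSTN, and after on_probe(True) it hands control back to the central server.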
Implementing an Active SAS Mechanism

IP-Phone Configuration
In the Active SAS implementation, the primary proxy server (S1) for the IP-Phone is the SAS agent, while the IP Centrex or IP-PBX is defined as the secondary proxy server (S2). IP-Phones should be able to switch automatically between the primary server (SAS agent) and the secondary server (IP Centrex / IP-PBX server) in the case of failure. They are also expected to switch back to the original configuration when connection and service are restored.

Active SAS Mechanism during Normal Operation Mode
The installed SAS agent acts as an outbound Proxy and Registrar which registers all IP-Phones and Media Gateways in the network, and simultaneously forwards these registrations to the central IP Centrex / IP-PBX server. In other words, all IP-Phones are first registered with the SAS agent, and only then does the SAS agent register them with the IP Centrex / IP-PBX server. This process ensures automatic redundant registration for all the telephony devices in the enterprise network, since the SAS agent maintains a full and updated copy of the telephony network structure.

The Call Setup and Call Termination messages are sent first to the SAS agent, which forwards them automatically to the IP Centrex / IP-PBX, and vice versa [1]. The media session, on the other hand, is established directly between the IP Centrex / IP-PBX server and the IP-Phones, bypassing the SAS application and avoiding a traffic bottleneck.

During normal operation mode, when the IP Centrex / IP-PBX is available and provides telephony services, the SAS application should constantly monitor the IP Centrex / IP-PBX availability using standard SIP messages. Simultaneously, IP-Phones should monitor the availability of the SAS agent (S1) and be able to switch to their secondary server (S2), which has been predefined as the IP Centrex / IP-PBX server.
This double monitoring ensures that the SAS agent will not become a single point of failure, since the IP-Phone will detect its failure and switch automatically to the IP Centrex / IP-PBX servers when needed.

[1] This is the most common architecture; however, other implementations are valid as well. For example, in some SAS implementations all the call control messages are transmitted directly between the IP Centrex / IP-PBX and the IP-Phone, bypassing the SAS server in non-emergency operation mode.

Active SAS Mechanism during Emergency Operation Mode
Once the SAS agent detects a communications failure with the IP Centrex / IP-PBX server, it switches into emergency mode. During this emergency mode the SAS agent takes the lead and handles all call setup and termination inside the enterprise, as well as the connectivity with the external environment via its integrated PSTN fallback mechanism (analog or digital). Usually, the level of telephony services provided by the SAS agent is limited and mostly includes basic call flows. During the emergency mode, the SAS agent constantly monitors the network connectivity with the IP Centrex / IP-PBX, and once the connection is restored, it switches back to the normal mode in which control is provided by the IP Centrex / IP-PBX servers.

Figure 6 - Active SAS Implementation Model

Implementing a Passive SAS Mechanism

IP-Phone Configuration
Like the Active implementation model, the Passive model assumes that IP-Phones in the enterprise support automatic switching between alternative proxy servers in the case of a failure in one of these servers. However, in the Passive model, the primary proxy server (S1) for the IP-Phone is the IP Centrex / IP-PBX, while the SAS agent is defined as the secondary proxy server (S2). This means that IP-Phones should be able to switch between the primary server (IP Centrex / IP-PBX) and the secondary server (SAS) in the case of a failure.
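The difference between the two IP-Phone configurations can be illustrated with a short sketch. It is a simplified assumption about how a phone might choose its proxy, not the behaviour of any particular IP-Phone: PhoneProxyConfig, select_proxy and the server labels are hypothetical names, and the set of reachable servers stands in for the phone's own availability monitoring.

```python
from dataclasses import dataclass
from typing import Optional, Set


@dataclass(frozen=True)
class PhoneProxyConfig:
    """Hypothetical IP-Phone proxy settings."""
    primary: str    # S1
    secondary: str  # S2


# Active model: the SAS agent is S1 and the central server is S2.
ACTIVE = PhoneProxyConfig(primary="sas-agent", secondary="central-server")
# Passive model: the central server is S1 and the SAS agent is S2.
PASSIVE = PhoneProxyConfig(primary="central-server", secondary="sas-agent")


def select_proxy(config: PhoneProxyConfig, reachable: Set[str]) -> Optional[str]:
    """Prefer S1, fall back to S2, and return None if neither responds.

    Because the choice is re-evaluated on every call, the phone returns
    to S1 as soon as S1 is reachable again.
    """
    for proxy in (config.primary, config.secondary):
        if proxy in reachable:
            return proxy
    return None
```

With the PASSIVE configuration, select_proxy returns "sas-agent" only while the central server is unreachable. With the ACTIVE configuration, it returns "central-server" only if the SAS agent itself has failed.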
As in the Active model, the IP-Phones are also expected to switch back to the original configuration when the service is restored.

Passive SAS Mechanism during Normal Operation Mode
In the Passive implementation model, the SAS agent is not involved in call control during the normal operation mode. Registrations as well as other call control messages are sent directly from IP-Phones to the IP Centrex / IP-PBX and vice versa. Media sessions are also established directly between the IP Centrex / IP-PBX server and the IP-Phone, bypassing the SAS agent. During normal operation mode, IP-Phones should monitor the availability of the IP Centrex / IP-PBX server and be able to switch to their secondary server, which is predefined as the SAS agent.

Passive SAS Mechanism during Emergency Operation Mode
Once the IP-Phone observes a communications failure with the IP Centrex / IP-PBX server, it switches to the SAS agent as a secondary proxy server and registers with it. This registration process is necessary since the IP-Phones were not previously registered with the SAS agent. From this stage on, the SAS agent functions much as it does in the Active SAS model. The SAS agent takes the lead and handles all call control inside the enterprise, as well as connectivity to the external environment, by utilizing the integrated PSTN fallback capabilities (analog or digital). Once again, the level of telephony services provided by the SAS agent is limited and mostly includes only basic call flows.

Once the IP Centrex / IP-PBX becomes available again, the SAS agent automatically detects this and, by using advanced SIP message manipulations, “forces” the IP-Phones to switch back to their primary server (S1), bypassing the SAS agent (S2).

Figure 7 - Passive SAS Implementation Model

Active Mode vs. Passive Mode – Pros and Cons
The main advantage of the Passive mechanism model is its relatively high level of security compared to the Active mechanism model.
In the Passive mechanism model, during normal operation mode the SAS agent is not involved in the message exchange; therefore fewer components are exposed to the network structure and call control messages, and there is generally a reduced risk of information snooping and eavesdropping. This is a critical feature for enterprises and institutions which are sensitive to these types of security threats and demand a maximum level of protection for their telephony network.

The main drawback of this implementation model is the relatively long switch time between the normal and emergency operation modes. All IP-Phones must register with the SAS agent upon entering the emergency mode, so it may take some time until the SAS agent builds a complete picture of the telephony network in the office. Conversely, a quick switch time is the main advantage of the Active SAS model.

The “Five Star Approach”: Guidelines for Survivable IP Centrex
This section details the AudioCodes “Five Star Approach” for survivable IP Centrex / IP-PBX services.

Standards Based SAS Implementation
Why shouldn’t you redesign your network to support survivability?
The primary requirement of any survivability mechanism is that it be fully based on standard protocols, such as the commonly used SIP, to ensure a basic level of integration within the customer’s existing VoIP network. Implementing a SIP-based survivability mechanism guarantees interworking with many IP Centrex / IP-PBX servers as well as IP-Phones and VoIP Gateways. This gives the customer the freedom to choose its VoIP implementation, components and service provider.

Scalable Solution - Growing with your Enterprise
Can your survivability application scale up with the rest of your telephony network?
Looking ahead, enterprises should consider a possible expansion of their telephony network, which will entail an extension of the survivability services.
In order to protect their initial investment, the originally deployed survivability application should be designed as a scalable solution that allows its services to be expanded in the future. Without such scalability, the customer will be forced to replace the original survivability equipment and re-invest in a new server with a higher performance level. An efficient, scalable solution should enable system upgrades either by means of software updates or by the installation of an additional survivability application alongside the original one, thereby extending the number of protected users in an enterprise.

Redundancy Mechanism for Survivability Applications
What happens if Murphy’s Law strikes twice? What happens if the survivability application fails while in “emergency” mode?
This question can be addressed by deploying a redundant survivability application on the customer premises. This redundancy mechanism should maintain telephony services in the event that one of the original survivability applications halts during an emergency scenario.

Advanced Support for Emergency Calls (e.g. 911)
Safety is paramount!
The most elementary expectation of any survivability application is that it enables emergency calls from the enterprise headquarters or its remote branches. In addition, the survivability mechanism should be able to detect such calls and prioritize them accordingly, even if that means tearing down other “regular” calls. Prioritizing emergency calls is essential during “emergency” mode, since survivability applications do not always maintain the same voice session throughput as the original IP-PBX or IP Centrex servers.

Maintain Maximum Functionality Level in Emergency Mode
Should we be content with basic call connectivity during emergency operation mode?
Sometimes it is not enough to have just basic calls during emergency scenarios, as other functionalities are required and may even be considered mandatory.
For example, during normal operation mode, telephone numbers within the enterprise can be reached by dialling only their extension suffix instead of the complete number. Such services are critical for the enterprise’s normal telephony operation and should be supported by the survivability application during network failure.

About AudioCodes Mediant™ 1000 Multi-Service Business Gateway (MSBG)
The Mediant™ 1000 MSBG is an all-in-one multi-service access solution for service providers offering managed services and for distributed enterprises. This multi-service business gateway is designed to provide converged voice and data services for business customers at wire speed while maintaining SLA parameters and superior voice quality. The Mediant 1000 MSBG is a direct evolution of the “field-proven” and highly interoperable Mediant 1000 VoIP media gateway. This new and innovative product offers a rich variety of benefits for Service Providers, Business Customers, Original Equipment Manufacturers (OEMs), and Value Added Services (VAS) developers.

Along with its best-of-breed Media Gateway functionality, the Mediant 1000 MSBG provides Data Routing, WAN Access, and Session Border Controller (SBC) functionalities in a single cost-effective hardware platform. Utilizing its built-in Intel processor-based server (the OSN), the Mediant 1000 MSBG can accommodate a variety of third-party applications such as IP-PBX, Call Center, Conferencing Server, Fax Termination and more.

Mediant 1000 MSBG

AudioCodes Stand Alone Survivability powered by the Mediant 1000 MSBG
In addition to the above features, the Mediant 1000 MSBG provides advanced survivability services as part of the Stand Alone Survivability (SAS) feature, designed and developed particularly for IP Centrex and IP-PBX environments.
SAS is an application agent that is installed in an enterprise’s headquarters and/or branch office premises to ensure continuous telephony service during network outage scenarios, such as a WAN failure that prevents communication with the central server. AudioCodes SAS adheres to the “Five Star Approach” guidelines, and provides reliable and continuous telephony services.

For more information about the Mediant 1000 MSBG and the SAS feature, please visit: www.audiocodes.com/msbg

About AudioCodes
AudioCodes Ltd. (NasdaqGS: AUDC) provides innovative, reliable and cost-effective Voice over IP (VoIP) technology, Voice Network Products, and Value Added Applications to Service Providers, Enterprises, OEMs, Network Equipment Providers and System Integrators worldwide. AudioCodes provides a diverse range of flexible, comprehensive media gateway and media processing enabling technologies based on VoIPerfect™, AudioCodes’ underlying, best-of-breed core media architecture. The company is a market leader in VoIP equipment, focused on VoIP Media Gateway, Media Server, Session Border Controllers (SBC), Security Gateways and Value Added Application network products. AudioCodes has deployed tens of millions of media gateway and media server channels globally over the past ten years and is a key player in the emerging best-of-breed, IMS-based VoIP market. The Company is a VoIP technology leader focused on quality and interoperability, with a proven track record in product and network interoperability with industry leaders in the Service Provider and Enterprise space. AudioCodes Voice Network Products feature media gateway and media server platforms for packet-based applications in the converged, wireline, wireless, broadband access, cable, enhanced voice services, video, and Enterprise IP Telephony markets. AudioCodes’ headquarters and R&D are located in Israel with an additional R&D facility in the U.S.
Other AudioCodes offices are located in Europe, India, the Far East, and Latin America.

International Headquarters
1 Hayarden Street, Airport City
Lod 70151, Israel
Tel: +972-3-976-4000
Fax: +972-3-976-4040

AudioCodes Inc.
27 World’s Fair Drive, Somerset, NJ 08873
Tel: +1-732-469-0880
Fax: +1-732-496-2298

Contact us: www.audiocodes.com/info
Website: www.audiocodes.com

©2008 AudioCodes Ltd. All rights reserved. AudioCodes, AC, Ardito, AudioCoded, NetCoder, TrunkPack, VoicePacketizer, MediaPack, Stretto, Mediant, VoIPerfect and IPmedia, OSN, Open Solutions Network, What’s Inside Matters, Your Gateway To VoIP, 3GX, InTouch, CTI2, CTI Squared, Nuera and Netrake are trademarks or registered trademarks of AudioCodes Limited. All other products or trademarks are property of their respective owners.

Ref # LTRM-80027 08/08 V.1