Call Data Record Generation: Options and Considerations
An Industry Whitepaper
Version 2.0

Contents

Executive Summary
Introduction to Call Data Record Generation
  Standards-based vs. Industry-Standard
Standards-based Record Generation
  sFlow, NetFlow and IPFIX
    sFlow
    NetFlow
    IPFIX
    Resource Utilization Monitoring
  Interface, Deployment and Performance Considerations
    Deployment Options
  IPDR
Freeform Record Generation
How Use Case affects Transaction Rates
  Mediation for IPFIX+DPI vs. Freeform+DPI
    IPFIX
    Freeform
Conclusion
  Summary of Record Generation Techniques
  Related Resources

Executive Summary

Network record generation offers information to help with understanding network usage and security risks, as well as metrics that can be used to optimize network performance, business systems, and quality of service. This paper explores the various methods used to generate call data records, both standards-based and proprietary.

A record generation solution focuses on extracting network traffic records by targeting IP flow information elements defined in open standards and/or proprietary user documentation. A collector or mediating element receives the stream of IP flow records, where they can be processed into the desired output format and, for some use cases, forwarded on to another system.

Whether standards-based or not, the choice of which method and deployment type to use when extracting data from the network depends on the specific use case and approach to record generation.

In the end it all comes down to mediation: how the solution platform manages and processes an overwhelming landscape of data into a subset of targeted information for use by a downstream application.

Introduction to Call Data Record Generation

When communications service providers (CSPs) look at the tremendous amount of data that flows through their networks on a daily basis and think about extracting relevant records, the issue is clearly one of "big data". CSPs extract basic Layer-3 information from their networks as call data records (CDRs). In modern networks, the term call data record (CDR) is interchangeable with charging data record (CDR) or usage data record (UDR), especially when describing Layer-7 use cases affecting Internet applications.

Big data can mean different things in different contexts, but for Internet service delivery the issue of record generation is about what to extract, how to extract it, and why the information is needed in the first place.

The ability to generate call data records offers information to help with understanding subscriber usage and security risks, as well as metrics that can be used to optimize network performance, business systems, and quality of service. Depending on the implementation, these records can also be used for billing and auditing purposes in support of charging applications.
At the highest level, a CSP extracts data records about network traffic for export to other systems in support of the following use cases:

• Generating business intelligence reports for insight into network optimization
• Providing records in support of auditing (i.e., bill verification)
• Communicating usage updates to an offline charging system (OFCS) for post-paid charging
• Maintaining a parallel stream of usage records in case prepaid billing systems fail
• Network security monitoring and mitigation

Whether standards-based or proprietary [1], a record generation solution focuses on extracting network traffic records by targeting IP flow information elements defined in open standards and/or proprietary user documentation. [2] A collector or mediating element receives the stream of IP flow records, where they can be processed into the desired output format and, for some use cases, forwarded on to another system.

The type of flow information contained in network records depends on the use case, and the following are some typical examples:

• Subscriber ID
• Session ID
• Source and destination IP address
• Application type
• Device type
• Service type
• Rating group
• Vendor ID
• Total upstream bytes
• Total downstream bytes

This paper focuses on the standards-based and freeform methods currently available for network record generation, with an examination of the characteristics of each method. The choice of which method and deployment type to use depends on the specific use case and the impact its implementation will have on network performance and dimensioning.

[1] Currently, the only solution with fully-integrated record generation uses a freeform policy model.
[2] An information element can be thought of as "a fact about a particular IP service flow at a particular point in time".

Standards-based vs. Industry-Standard

"Standards-based" is defined in this paper as a protocol openly described in RFCs and IETF-endorsed documents. [3] Records generated from the network are often exported to external systems [4] and from there can take many forms, both standards-based and proprietary. [5]

[3] Within this paper's context, a standards-based protocol is an openly-described method of obtaining network records and does not necessarily have to be an industry standard. For example, NetFlow version 9 is not an industry standard protocol but is based on the open informational document RFC 3954, which fully describes the parameters, configuration and limitations of the protocol for use by third parties, making NetFlow a "standards-based" protocol.
[4] Records can also be used for real-time traffic management use cases such as QoS control and network security.
[5] For example, as a standardized output format such as CSV (comma-separated values) or as graphs in a proprietary GUI that displays network business intelligence reports.

Standards-based Record Generation

A standards-based system of record generation employs a method of record extraction described in official documents often hosted or endorsed by industry standards bodies such as the IETF. A standards-based system offers a predetermined set of configuration points to generate records for supported use cases.

sFlow, NetFlow and IPFIX

These three are the most common record generation protocols that are not specific to a particular access type (as IPDR is to cable) and that leverage a described standard or RFC to configure both end points of the solution.

sFlow

Short for "sampled flow", sFlow is a protocol for extracting packet records at Layer-2 of the OSI model, and is mainly used to sample information for basic monitoring use cases. The sFlow protocol is not an industry standard; it was originally developed by InMon Corporation [6] and is now sold as a feature in network transport equipment from many different manufacturers. [7] Unlike NetFlow and IPFIX, sFlow has no notion of service flows and only offers periodic sampling of flows (packets at Layer-2) and samples of counters (periodic time-based measurement).

[6] See opening comments from RFC 3176.
[7] Wikipedia offers a comprehensive list of sFlow vendors.

NetFlow

NetFlow is a protocol for generating flow records, originally developed by Cisco Systems as a caching system, and is now widely used to collect statistics on IP traffic information. NetFlow has never been an official industry standard [8]; however, many would agree that NetFlow has become the unofficial industry standard due to its widespread use. NetFlow version 9, described in RFC 3954 [9], forms the basis for the IPFIX protocol, which is described in RFCs as an IETF standard. There are also many NetFlow equivalents sold as proprietary features in network transport equipment from several vendors. [10]

In the most common implementation, when enabled and configured on a switch or router, NetFlow collects statistics on the IP traffic passing through that device. The flow data can then be exported to a mediation/collection system.

[8] See opening comments from RFC 3954.
[9] An RFC that describes a standard that isn't endorsed is often called an "informational" document.
[10] Wikipedia offers a comprehensive list of what it calls NetFlow equivalents. These are essentially proprietary implementations of the NetFlow standard.

NetFlow Record Information

As the name suggests, NetFlow aggregates packet statistics to report specifically on IP flow information such as source and destination IP addresses, IP protocol, source and destination ports, and Type of Service. Additional information can be tracked per flow, including inbound interface and up to 79 other field types for information elements described in RFC 3954. [11] In the purely standards-based implementation, a NetFlow record reports a wealth of information about traffic in a given flow up to Layer-3. [12]

[11] See section 5 of RFC 3954.
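The aggregation of packet statistics into flow records described under NetFlow Record Information can be illustrated with a short sketch. This is not how any particular router implements NetFlow; the `Packet` type, the `aggregate_flows` function and the sample packets are all assumptions made for illustration. The sketch only shows the idea of keying counters on source and destination address, protocol, ports and Type of Service.

```python
from collections import defaultdict
from typing import NamedTuple


class Packet(NamedTuple):
    """Hypothetical, simplified view of an observed IP packet."""
    src_ip: str
    dst_ip: str
    protocol: int   # IP protocol number, e.g. 6 = TCP, 17 = UDP
    src_port: int
    dst_port: int
    tos: int        # Type of Service byte
    length: int     # packet size in bytes


def aggregate_flows(packets):
    """Group packets by flow key and accumulate packet and byte counters."""
    flows = defaultdict(lambda: {"packets": 0, "bytes": 0})
    for p in packets:
        key = (p.src_ip, p.dst_ip, p.protocol, p.src_port, p.dst_port, p.tos)
        flows[key]["packets"] += 1
        flows[key]["bytes"] += p.length
    return dict(flows)


# Illustrative packets only; the values are made up for this sketch.
packets = [
    Packet("192.168.1.27", "10.5.12.23", 6, 40000, 80, 0, 500),
    Packet("192.168.1.27", "10.5.12.23", 6, 40000, 80, 0, 1500),
    Packet("192.168.1.56", "10.5.12.65", 17, 5353, 53, 0, 120),
]
flows = aggregate_flows(packets)
for key, counters in flows.items():
    print(key, counters)
```

Each dictionary entry corresponds to one flow record that an exporter could send to a mediation/collection system; a real exporter would also handle flow expiry and export timers, which are omitted here.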
NetFlow version 9 has extensibility to include fields not described in the RFC. This provides an opportunity to support use cases not covered by the official protocol description. However, a custom solution requires a configuration and interoperability effort comparable to a freeform solution, since the "custom" aspect is often a proprietary method developed in-house or by an outside vendor. [13]

Wikipedia offers the following example of a NetFlow record showing three flows:

Src IP addr. | Dst IP addr. | Next Hop addr. | Packet Number | Bytes Number
198.168.1.12 | 10.5.12.254  | 192.168.1.1    | 5009          | 5344385
192.168.1.27 | 10.5.12.23   | 192.168.1.1    | 748           | 388934
192.168.1.56 | 10.5.12.65   | 192.168.1.1    | 5             | 6534

Figure 1 – Example NetFlow record

[12] See Wikipedia for an overview of the Internet OSI model.
[13] Detailed information about Internet traffic is configured using the fields described in RFC 3954.

IPFIX

The IP Flow Information eXport protocol (IPFIX) is the first common and universal industry standard of export for Internet flow information from routers, probes and other transport devices, and is defined by several RFCs. [14]

IPFIX Record Information

Based on the NetFlow protocol, the field export format described in RFC 3954 has evolved with IPFIX into the 238 Information Element field types defined in the RFC standards documents. Many of these field types are defined as "reserved" to maintain compatibility with NetFlow version 9 and must be referenced in RFC 3954. [15] IPFIX has the extensibility of NetFlow version 9, with a vendor ID information element that specifies a custom application of the IPFIX protocol.

The following example record is provided by Wikipedia:

Source          Destination     Packets
------------------------------------------
192.168.0.201   192.168.0.1     235
192.168.0.202   192.168.0.1     42

Figure 2 – Example IPFIX record

[14] RFCs 3917, 5101, 5102, 5103, 5472 and 7011-7015.
[15] IPFIX can be thought of as backwards compatible to NetFlow version 9.

Resource Utilization Monitoring

NetFlow and IPFIX are accurate enough to perform resource utilization monitoring, but they cannot be solely relied upon in charging applications for end user billing and, by extension, bill verification and auditing. [16] The standards do not offer sufficient resiliency and safeguards to ensure reliable data export to meet the billing accuracy requirements described in RFC 3917. [17]

[16] See section 4.2 of RFC 5472.
[17] See sections 5 and 6 of RFC 3917.

Interface, Deployment and Performance Considerations

Enabling purely standards-based record generation is usually a simple configuration change on the network transport device, requiring very little effort. Record generation with sFlow, NetFlow and IPFIX is typically undertaken by routers and switches as part of the production network.

The sFlow protocol is designed by nature to have a minimal impact on network transport equipment because it only samples IP flows periodically in support of basic monitoring use cases.

NetFlow and IPFIX do have an impact on the devices where they are enabled, with the severity depending on the specific use case(s). The processor and memory load can cause severe service degradation, normally measured as an increase in the device CPU and memory utilization needed to track and report on specific metrics. [18]

NetFlow and IPFIX can be enabled on a per-interface basis to limit load on transport elements. IP filters can also limit which packet types can be observed by NetFlow to further reduce the strain.

To further reduce the performance impact, Cisco introduced a sampling feature for NetFlow on certain transport equipment products. [19] IPFIX also allows sampling and adds the ability to specify variable length fields.
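The trade-off behind sampling can be sketched in a few lines. The example below is a generic illustration, not Cisco's Sampled NetFlow or the IPFIX sampling mechanism: it assumes a simple deterministic scheme that inspects every N-th packet and scales the counters back up by N. The function name `sample_one_in_n` and its return keys are hypothetical.

```python
def sample_one_in_n(packet_lengths, n):
    """Inspect every n-th packet and scale the counters up by n.

    packet_lengths: iterable of packet sizes in bytes.
    n: sampling interval (1 means every packet is inspected).
    """
    if n < 1:
        raise ValueError("sampling interval must be at least 1")
    sampled_packets = 0
    sampled_bytes = 0
    for index, length in enumerate(packet_lengths):
        if index % n == 0:          # only this packet costs CPU on the device
            sampled_packets += 1
            sampled_bytes += length
    return {
        "sampled_packets": sampled_packets,
        "estimated_packets": sampled_packets * n,
        "estimated_bytes": sampled_bytes * n,
    }


# 1,000 packets of 100 bytes each, sampled 1-in-10.
estimate = sample_one_in_n([100] * 1000, n=10)
print(estimate)
```

The device touches only a tenth of the packets, at the cost of reporting estimates rather than exact counts. That loss of precision is one reason sampled records suit monitoring better than billing.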
Deployment Options

When a CSP wants to generate records from the network, the most common implementation is to use pre-existing Internet data transport devices such as routers and switches to gather and forward metrics to one or more monitoring stations for offline processing, as shown by Figure 3. [20]

Figure 3 – Standards-based record generation using switches and routers

However, as shown by Figure 4, there is another interface approach where an offline device observes or taps the network data flow and then generates records using the NetFlow or IPFIX protocol standards. This has the obvious advantage of avoiding a performance hit to network transport elements, although it requires a separate offline element that only generates or displays records, with no ability to directly manage traffic. [21] The offline element may also serve as the collector device that processes the records into a desired reporting format.

Figure 4 – Standards-based record generation using network tap and offline element

[18] See the Wikipedia entry for NetFlow, and in particular the talk page for the NetFlow entry.
[19] See Sampled NetFlow in the Wikipedia entry.
[20] The Mediator/Collector element can be integrated with operational support and billing systems.
[21] All of the things you might want to do with a PCEF or TDF element for policy control, for example.

Figure 5 shows a third implementation where NetFlow records are fed from transport elements to an inline data plane device directly, or to a control plane device that can signal the inline device to perform traffic management. [22] Inline devices offer the ability to directly manage traffic based on real-time NetFlow record information for such use cases as high-level QoS control and network security monitoring and mitigation.
Figure 5 – Standards-based record generation for inline element

[22] This includes Policy and Charging Control (PCC) implementations. In a PCC implementation the mediator element could be a DPI-based PCEF or TDF and the collector element could be a PCRF. The inline element cannot typically act as the collector for practical performance reasons (i.e., the CPU is needed elsewhere).

Record Mediator/Collector

In all three deployments shown above there must be a mediator/collector element that processes records into one or more final formats.

IPDR

An IP Detail Record (IPDR) provides information about IP-based service usage and other activities that can be used by Operational Support Systems (OSS) and Business Support Systems (BSS). IPDR is overseen by the TM Forum, a non-profit industry standards organization primarily for service providers working with cable networks. The IPDR specifications include requirements for record collection, encoding, and the transport protocols used to exchange IPDR records. According to the specifications, IPDR can be used for business intelligence reporting, network configuration, health monitoring, service assurance and bill resolution.

Figure 6 – IPDR Record Generation in Cable Networks

Freeform Record Generation

The most obvious example of a freeform record generation solution is one that leverages existing DPI-based elements that intersect network traffic and uses a proprietary model completely separate from the typical standards-based description. These elements, such as a PCEF or TDF, meter traffic according to strictly defined PCC standards. They offer a direct link to deep information, including application data, to create and export network records up to Layer-7. This approach moves concerns about performance and dimensioning from transport elements to the DPI element.
As shown by Figure 7, the mediation function is subsumed into the intersecting device, which generates records directly from the data stream that can be exported for various use cases. As with the inline deployment shown in the previous section, the same information that is used to generate records can also be used to perform real-time traffic management functions such as QoS control and network security. However, because the records are generated from the PCC standards-defined metering that ensures accurate charging, they can also be used in support of bill verification use cases.

Figure 7 – Inline, DPI-based Record Generation with PCEF & PCRF elements

Freeform Record Information

Given the right policy model, a freeform solution offers the ability to freely configure records and leverage the pre-existing exposure of detailed Layer-7 application information for record generation use cases.

The usage data records (UDRs) for freeform record generation do not follow a binary format as seen with the NetFlow and IPFIX examples. Instead, a compressed, human-readable CSV format is used, which is much simpler for IT systems to manipulate and consume.

Because the solution is not based on any written standard, some effort is required to configure its operation by referencing proprietary documentation. However, since record extraction and output are highly configurable, what would be considered "custom records" in a NetFlow environment require no special effort when using freeform policy with fully-integrated data record generation.

Figure 8 shows an example of custom records to collect subscriber volume usage (total, sent and received) on a per-session basis for bill dispute resolution related to postpaid charging. The freeform script language specifies Information Elements for extraction in flow records that, in this example, are efficiently grouped by subscriber.
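To illustrate why a CSV layout is simple for IT systems to consume, the sketch below parses a few records of the kind shown in Figure 8 and totals usage per subscriber. It assumes the records have already been decompressed into text lines; the function names are hypothetical. The sample rows in Figure 8 also appear to carry one more value than the header names, so the sketch keeps any surplus values under an `extra` key rather than guessing their meaning.

```python
import csv
from collections import defaultdict

# Header names as printed in Figure 8.
HEADER = (
    "RecordType,RecordStatus,RecordNumber,StartTime,EndTime,AcctSessionId,"
    "SubscriberId,FramedIp,ServiceId,TotalBytes,TransmittedBytes,ReceivedBytes"
).split(",")

# Three rows copied from Figure 8.
SAMPLE = [
    "usage_start,0,0,2011:3:25:16:44:15,2011:3:25:16:44:15,"
    "1208786019~130108585,001311B8A12E,72.12.156.99,[30],88,0,88,0",
    "usage_int,0,24445,2011:3:25:16:44:34,2011:3:25:16:44:34,"
    "1208782237~130108143,001596260DCC,72.12.141.157,[5],604,132,472,0",
    "usage_int,0,24445,2011:3:25:16:44:34,2011:3:25:16:44:34,"
    "1208782237~130108143,001596260DCC,72.12.141.157,[5],66,0,66,0",
]


def parse_record(line):
    """Map one CSV line onto the header names; keep surplus values as 'extra'."""
    values = next(csv.reader([line]))
    record = dict(zip(HEADER, values))
    record["extra"] = values[len(HEADER):]   # assumption: meaning not documented here
    return record


def total_bytes_by_subscriber(lines):
    """Sum the TotalBytes column per SubscriberId."""
    totals = defaultdict(int)
    for line in lines:
        record = parse_record(line)
        totals[record["SubscriberId"]] += int(record["TotalBytes"])
    return dict(totals)


totals = total_bytes_by_subscriber(SAMPLE)
print(totals)
```

Nothing beyond the standard library is needed, which is the practical point behind choosing a human-readable format over a binary one.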
RecordType,RecordStatus,RecordNumber,StartTime,EndTime,AcctSessionId,SubscriberId,FramedIp,ServiceId,TotalBytes,TransmittedBytes,ReceivedBytes

[For IP 72.12.156.99]:
session_start,0,0,2011:3:25:16:44:14,,1208786019~130108585,001311B8A12E,72.12.156.99,[0],0,0,0,0
usage_start,0,0,2011:3:25:16:44:15,2011:3:25:16:44:15,1208786019~130108585,001311B8A12E,72.12.156.99,[30],88,0,88,0
usage_stop,4,*,2011:3:25:17:42:54,2011:3:25:17:42:54,1208786019~130108585,001311B8A12E,72.12.156.99,[30],0,0,0,0
session_stop,2,*,2011:3:25:16:44:14,2011:3:25:17:42:54,1208786019~130108585,001311B8A12E,72.12.156.99,[0],0,0,0,0

[For IP 72.12.141.157]:
usage_int,0,24445,2011:3:25:16:44:34,2011:3:25:16:44:34,1208782237~130108143,001596260DCC,72.12.141.157,[5],604,132,472,0
usage_int,0,24445,2011:3:25:16:44:34,2011:3:25:16:44:34,1208782237~130108143,001596260DCC,72.12.141.157,[5],66,0,66,0
usage_int,0,24445,2011:3:25:16:44:34,2011:3:25:16:44:34,1208782237~130108143,001596260DCC,72.12.141.157,[5],76,76,0,0
usage_int,0,24445,2011:3:25:16:44:34,2011:3:25:16:44:34,1208782237~130108143,001596260DCC,72.12.141.157,[5],76,76,0,0
usage_int,0,24445,2011:3:25:16:44:34,2011:3:25:16:44:34,1208782237~130108143,001596260DCC,72.12.141.157,[5],66,66,0,0
usage_stop,4,*,2011:3:25:17:43:15,2011:3:25:17:43:15,1208782237~130108143,001596260DCC,72.12.141.157,[9],0,0,0,0
usage_stop,4,*,2011:3:25:17:43:15,2011:3:25:17:43:15,1208782237~130108143,001596260DCC,72.12.141.157,[5],0,0,0,0
usage_stop,4,*,2011:3:25:17:43:15,2011:3:25:17:43:15,1208782237~130108143,001596260DCC,72.12.141.157,[30],0,0,0,0
session_stop,2,*,2011:3:25:15:30:39,2011:3:25:17:43:15,1208782237~130108143,001596260DCC,72.12.141.157,[0],0,0,0,0

Figure 8 – Example records created through custom generation solution

How Use Case affects Transaction Rates

As noted at the beginning of this paper, a CSP's ability to extract desired data from the network will depend on the specific use case and approach to record generation.
In the end it all comes down to mediation: how the solution platform manages and processes an overwhelming landscape of data into a subset of targeted information for use downstream. [23]

Use cases become much more interesting when a DPI device is integrated to support data records for Layer-7. In the case of NetFlow and IPFIX, the DPI solution uses the extensibility option in the standards to create a custom set of Layer-7 records. With a proprietary, freeform solution, the DPI solution supports fully-integrated data records built directly out of the product framework. [24]

[23] For a full exploration of this issue, see the Vanilla Plus article Style, substance and big data.
[24] In other words there is no interoperability or custom standards work to make records work with the existing set of Layer-7 records; such features are built into the product.

Mediation for IPFIX+DPI vs. Freeform+DPI

Consider the following scenario: A mobile operator wants to generate subscriber-based Layer-7 records with mobile device information where the average packet core traffic is 2Gbps. Raw records are processed by a mediation system into a final output format for a customer experience management (CEM) solution.

Let's examine the transaction rates and raw record output for a solution that uses a DPI element following the IPFIX approach versus a proprietary solution where the DPI element uses freeform policy to group all flows by subscriber in its state engine.

IPFIX

Since IPFIX is essentially a flow-based record reporting standard, one would expect one record to be generated per flow (5-tuple, Layer-3). The concept is that IPFIX reports information on all active flows within a pre-defined interval; in this case let's make the interval 60 seconds. The IMEI data point, which identifies a specific subscriber device, would be populated using the standard's extensibility feature to create the desired record.
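As a rough sketch of what that extensibility looks like on the wire, RFC 7011 describes a template field specifier in which a leading "enterprise bit" marks a vendor-defined Information Element and is followed by a 32-bit enterprise number. The element ID and field length chosen for the IMEI below are hypothetical, and the enterprise number 32473 is the value reserved for documentation examples (RFC 5612), not a real vendor's number.

```python
import struct

# Hypothetical values, for illustration only.
IMEI_ELEMENT_ID = 1                 # vendor-chosen Information Element ID
IMEI_FIELD_LENGTH = 15              # assume the IMEI is carried as 15 ASCII digits
EXAMPLE_ENTERPRISE_NUMBER = 32473   # reserved for documentation use


def enterprise_field_specifier(element_id, field_length, enterprise_number):
    """Encode an enterprise-specific IPFIX field specifier.

    Layout (network byte order): 1-bit enterprise flag plus 15-bit element ID,
    16-bit field length, 32-bit enterprise number.
    """
    if not 0 <= element_id < 0x8000:
        raise ValueError("element ID must fit in 15 bits")
    return struct.pack(
        "!HHI",
        0x8000 | element_id,    # set the enterprise bit
        field_length,
        enterprise_number,
    )


spec = enterprise_field_specifier(
    IMEI_ELEMENT_ID, IMEI_FIELD_LENGTH, EXAMPLE_ENTERPRISE_NUMBER
)
print(spec.hex())
```

Both the exporter and the collector have to agree on what this vendor-defined element means, which is where the custom interoperability effort discussed earlier comes from.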
To set a baseline, assume 1Gbps of mobile network data carries about 7,000 flows per second. At the 2Gbps in this scenario, that is about 14,000 flows per second, so the solution would generate about 840,000 records per minute, or 12,600,000 records every 15 minutes, with each record providing 5-tuple, Layer-3 information with the addition of the IMEI custom field. Over a 24-hour period the solution would generate 1.2 billion raw records for processing by the mediation system.

Freeform

A freeform record generation solution that can generate one record per subscriber and device does not need to generate a record for every IP flow, because it generates records for flows grouped by subscriber, not simply by IP source and destination pairs. In this case, 1Gbps of network throughput would typically indicate about 500,000 subscribers. Since the records generated are tied to individual subscribers rather than every single IP flow, over a 24-hour period about 48,000,000 raw records are generated for processing by the mediation system.

Conclusion

At the highest level, a CSP extracts data records about network traffic for export to other systems in support of one or more of the following use cases:

• Generating business intelligence reports for insight into network optimization
• Providing records in support of auditing (i.e., bill verification)
• Communicating usage updates to an OFCS for post-paid charging
• Maintaining a parallel stream of usage records in case prepaid billing systems fail
• Network security monitoring and mitigation

When choosing a record generation solution, CSPs must weigh the desired quality and quantity of information against the cost of implementation in terms of performance and dimensioning.
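The record volumes quoted in the IPFIX and Freeform comparison above can be reproduced with a few lines of arithmetic. The inputs are the figures stated in the scenario (2Gbps of traffic, about 7,000 flows per second per Gbps, a 60-second export interval, and about 48,000,000 freeform records per day); the variable names are only illustrative.

```python
# Inputs taken from the scenario in this paper.
FLOWS_PER_SECOND_PER_GBPS = 7_000
TRAFFIC_GBPS = 2
EXPORT_INTERVAL_SECONDS = 60
FREEFORM_RECORDS_PER_DAY = 48_000_000

MINUTES_PER_DAY = 24 * 60

# IPFIX: one record per flow, reported every export interval.
flows_per_second = FLOWS_PER_SECOND_PER_GBPS * TRAFFIC_GBPS
ipfix_records_per_minute = flows_per_second * EXPORT_INTERVAL_SECONDS
ipfix_records_per_15_minutes = ipfix_records_per_minute * 15
ipfix_records_per_day = ipfix_records_per_minute * MINUTES_PER_DAY

# How many times more raw records the mediation system receives with IPFIX.
ratio = ipfix_records_per_day / FREEFORM_RECORDS_PER_DAY

print(f"IPFIX records per minute:     {ipfix_records_per_minute:,}")
print(f"IPFIX records per 15 minutes: {ipfix_records_per_15_minutes:,}")
print(f"IPFIX records per day:        {ipfix_records_per_day:,}")
print(f"Freeform records per day:     {FREEFORM_RECORDS_PER_DAY:,}")
print(f"IPFIX / freeform ratio:       {ratio:.1f}")
```

Under these assumptions the mediation system receives roughly 25 times as many raw records per day with the per-flow approach as with records grouped by subscriber.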
Summary of Record Generation Techniques

The following table summarizes the details of the record generation techniques presented by this paper:

Technique: Standards-based, transport equipment record generation
Description:
-Easy to enable and configure
-Transport element dimensioning considerations
-Transport element performance impacted
Use Cases:
-Reporting up to Layer-3

Technique: Standards-based, offline record generation
Description:
-Generally easy to enable and configure
-Some custom work, proprietary documentation
-Requires additional element
-Offline element dimensioning considerations
-No transport element performance impact
Use Cases:
-Reporting up to Layer-3

Technique: Standards-based, inline, DPI-based record generation
Description:
-Large effort to enable and configure
-A lot of custom work, proprietary feature configuration
-DPI and transport element dimensioning consideration
-Requires inline element
-Can leverage existing DPI solution or PCC setup
-Transport element performance impacted
Use Cases:
-Reporting and use cases up to Layer-7
-Network security
-QoS control

Technique: Standards-based, IPDR record generation
Description:
-Specific to cable networks
-Transport element dimensioning consideration
-Transport element performance impacted
Use Cases:
-Reporting up to Layer-?
-Service assurance
-Configuration & monitoring
-Bill auditing/verification

Technique: Proprietary, inline, DPI-based record generation
Description:
-Proprietary feature configuration
-Requires inline element
-Inline element dimensioning consideration
-Leverages existing DPI solution or PCC setup
-No transport element performance impact
Use Cases:
-Reporting and use cases up to Layer-7
-Network security
-QoS control
-Charging support
-Bill auditing/verification

Related Resources

See the Sandvine technology showcase Meaningful Data Records with Minimal Overhead. Please also see the Sandvine technology showcase SandScript - The Advantage of Freeform Policy.