Accelerating the bit-split string matching algorithm using Bloom filters

Computer Communications 33 (2010) 1785–1794
Kun Huang a, Dafang Zhang b,c,*, Zheng Qin c
a Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, PR China
b School of Computer and Communication, Hunan University, Changsha, Hunan Province 410082, PR China
c School of Software, Hunan University, Changsha, Hunan Province 410082, PR China
Article history:
Received 19 July 2008
Received in revised form 26 April 2010
Accepted 28 April 2010
Available online 5 May 2010
Keywords:
Intrusion detection
String matching
Bloom filter
Bit-split
Byte filter
Abstract
Deep Packet Inspection (DPI) is essential for network security: it scans both packet header and payload to search for predefined signatures. As link rates and traffic volumes of the Internet are constantly growing, string matching using a Deterministic Finite Automaton (DFA) will be the performance bottleneck of DPI. The recently proposed bit-split string matching algorithm suffers from the unnecessary state transitions problem. The root cause lies in the fact that the bit-split algorithm performs pattern matching in a "not seeing the forest for the trees" approach, where each tiny DFA only processes a b-bit substring of each input character, but cannot check whether the entire character belongs to the alphabet of the original DFA. This paper presents a byte-filtered string matching algorithm, where Bloom filters are used to preprocess each character of every incoming packet to check whether the character belongs to the original alphabet, before performing bit-split string matching. If not, each tiny DFA either makes a transition to its initial state or stops any state transition. Experimental results demonstrate that compared with the bit-split algorithm, our byte-filtered algorithm greatly reduces the string matching time as well as the number of state transitions of tiny DFAs on both synthetic and real signature rule sets.
© 2010 Elsevier B.V. All rights reserved.
1. Introduction
Computer networks are vulnerable to enormous break-in attacks, such as worms, spyware, and denial-of-service attacks
[1,2]. These attacks intrude publicly accessible computer systems
to eavesdrop and destroy sensitive data, and even hijack a large
number of victim computers to launch new attacks. Hence, there
has been widespread research interest in combating such attacks
at every level, from end hosts to network taps to edge and core
routers.
Network Intrusion Detection and Prevention Systems (NIDS/
NIPS) have emerged as one of the most promising security components to safeguard computer networks. The worldwide NIDS/NIPS
market is expected to grow from $932 million in 2007 to $2.1
billion in the next five years [3]. NIDS/NIPS, such as Snort [4] and
Bro [5], perform real-time traffic monitoring to identify suspicious
activities. Deep Packet Inspection (DPI) is the core of NIDS/NIPS,
which scans both packet header and payload to search for predefined attack signatures. To define suspicious activities, DPI typically uses a set of rules which are applied to matching packets. A
rule consists at least of a type of packet, a string of content (called
signature string) to match, a location where that string should be
searched for, and an associated action to take if all the conditions
of the rule are met. DPI has been also widely used in a variety of
other packet processing applications, such as Linux Layer-7 filter
[6], P2P traffic identification [7], context-based billing, etc.
As link rates and traffic volumes of Internet are constantly
growing, DPI will be faced with the high-performance challenges.
In high-speed routers, DPI is required to achieve packet processing
in line speed with limited memory space. First, string matching is
computationally intensive. DPI adopts string matching to compare
every byte of every incoming packet for a potential match against
hundreds of thousands of rules. Thus, string matching becomes
the performance bottleneck of DPI. For example, the routines of
Aho–Corasick string matching algorithm [8] account for about
70% of total execution time and 80% of instructions [9]. Since
software-based string matching algorithms cannot keep up with
10–40 Gbps packet processing, hardware-based string
matching algorithms [10–25] have been recently proposed to improve the throughput of DPI. These hardware-based algorithms
exploit modern embedded devices, such as Application Specific
Integrated Circuit (ASIC), Field Programmable Gate Array (FPGA),
Network Processor (NP), and Ternary Content Addressable Memory
(TCAM), to perform high-speed packet processing, using their powerful parallelism and fast on-chip memory access.
Second, a large number of emerging threats result in an expanding rule set, which in turn leads to an explosion of memory
requirements. String matching typically uses the Deterministic
Finite Automaton (DFA) to represent multiple signature strings for
all possible matches at one time. However, DFA requires large
memory space to encode ever increasing signatures, which cannot
fit in on-chip embedded memory with the size of only 10 Mb.
Hence, it is critical to achieve high-throughput string matching in
hardware, with small memory requirements.
To improve the throughput and reduce memory requirements
of DPI, Tan et al. [19] have proposed a bit-split string matching
algorithm, suitable for hardware implementation. In the bit-split
algorithm, an original DFA of Aho–Corasick algorithm is split apart
into a set of parallel tiny DFAs, each of which is responsible for a
portion of each signature string. However, the bit-split algorithm
suffers from the unnecessary state transitions problem. The root
cause lies in the fact that the bit-split algorithm performs pattern matching in a "not seeing the forest for the trees" approach. This means that each tiny DFA only processes a b-bit substring of each input character, but cannot check whether the entire character belongs to the alphabet of the original DFA. This approach can result in unnecessary state transitions of tiny DFAs, ultimately consuming extra expensive CPU cycles and memory access bandwidth in hardware. Moreover, since the alphabet contains only a subset of the possible characters, malicious attackers can exploit this weakness to overload the bit-split algorithm and evade DPI.
To address above issues, this paper presents a byte-filtered
string matching algorithm to improve time efficiency of DPI. The
byte-filtered algorithm is based on the bit-split algorithm, where
Bloom filters are used to preprocess each character of every incoming packet to check whether the character belongs to the original
alphabet, before performing bit-split string matching. We theoretically and experimentally show that compared with the bit-split
algorithm, the byte-filtered algorithm significantly reduces the
unnecessary state transitions to improve the string matching time
by several times. Our contributions are as follows:
To the best of our knowledge, we are the first to point out that the bit-split string matching algorithm suffers from the unnecessary state transitions problem.
We propose a byte-filtered string matching algorithm, which uses Bloom filters to filter out all characters that are not in the original alphabet, before performing bit-split string matching.
Experimental results on the Snort rule set show that compared with the bit-split algorithm, the byte-filtered algorithm reduces the string matching time by up to 3.6 times, and reduces the number of state transitions by up to 54%.
The rest of the paper is organized as follows. Section 2 briefly introduces the bit-split algorithm, and its unnecessary state transitions problem is described in Section 3. In Section 4, we present the byte-filtered string matching algorithm in detail. Section 5 gives experimental results, and related work is introduced in Section 6. Finally, Section 7 concludes the paper.

2. Bit-split string matching overview

In essence, the bit-split string matching algorithm [19] is based on the Aho–Corasick algorithm [8] and constructs a set of tiny alphabets and tiny DFAs which process every input character in parallel. For a given set of rules, the Aho–Corasick algorithm constructs an original alphabet and an original DFA. To clarify this paper and distinguish items, a state of the original DFA is called an original state, whereas a state of a tiny DFA is called a tiny state.

The Aho–Corasick algorithm [8] is a classic exact string matching algorithm. Its key idea is to construct an original DFA that encodes all signature strings to be searched for. The original DFA is represented with a trie, which starts with an empty root node that is the initial state.
The construction process of an original DFA is as follows. First,
each signature string adds states to the trie, starting at the root
and going to the end of the string. The trie has a branching factor
that is equal to the number of unique characters in an original
alphabet. Second, the trie is traversed and failure edges are added
to shortcut from a partial suffix match of one string to a partial prefix match of another. The string matching of the original DFA is
easy: given a current state and an input character, the DFA makes
a normal transition on the character to a next state, or makes a failure transition.
Fig. 1 depicts an example original DFA for the set of strings {he, she, his, hers}. In the DFA, the initial state is 0, and the accepting states are 2, 5, 7 and 9. As seen in Fig. 1, a solid line denotes a normal transition, while a dashed line denotes a failure transition to the initial state.
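To make the construction above concrete, the following minimal C++ sketch (our own illustration under simplified assumptions, not the authors' implementation) builds the goto trie for a small pattern set, adds failure edges by a breadth-first traversal, and resolves every missing transition so that each (state, character) pair has a next state; the class name AhoCorasick and the driver in main are hypothetical.

```cpp
#include <array>
#include <iostream>
#include <queue>
#include <string>
#include <vector>

// Minimal Aho-Corasick automaton: a trie plus failure links, converted to a
// full DFA so that every (state, character) pair has a next state.
struct AhoCorasick {
    static constexpr int SIGMA = 256;                    // 8-bit alphabet
    std::vector<std::array<int, SIGMA>> go;              // transition table
    std::vector<int> fail;                               // failure links
    std::vector<std::vector<int>> out;                   // matched pattern ids

    AhoCorasick() { addState(); }

    int addState() {
        go.emplace_back();
        go.back().fill(-1);
        fail.push_back(0);
        out.emplace_back();
        return static_cast<int>(go.size()) - 1;
    }

    void addPattern(const std::string& p, int id) {      // build the trie
        int s = 0;
        for (unsigned char c : p) {
            if (go[s][c] == -1) go[s][c] = addState();
            s = go[s][c];
        }
        out[s].push_back(id);
    }

    void build() {                                       // add failure edges (BFS)
        std::queue<int> q;
        for (int c = 0; c < SIGMA; ++c) {
            if (go[0][c] == -1) go[0][c] = 0;
            else { fail[go[0][c]] = 0; q.push(go[0][c]); }
        }
        while (!q.empty()) {
            int s = q.front(); q.pop();
            for (int c = 0; c < SIGMA; ++c) {
                int t = go[s][c];
                if (t == -1) { go[s][c] = go[fail[s]][c]; continue; }
                fail[t] = go[fail[s]][c];
                out[t].insert(out[t].end(), out[fail[t]].begin(), out[fail[t]].end());
                q.push(t);
            }
        }
    }
};

int main() {
    AhoCorasick ac;
    const std::vector<std::string> patterns = {"he", "she", "his", "hers"};
    for (int i = 0; i < (int)patterns.size(); ++i) ac.addPattern(patterns[i], i);
    ac.build();

    int s = 0;
    for (unsigned char c : std::string("ushers")) {
        s = ac.go[s][c];
        for (int id : ac.out[s]) std::cout << patterns[id] << '\n';
    }
}
```

Run on the text "ushers", the sketch reports she, he, and hers, mirroring the normal and failure transitions sketched in Fig. 1.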
The bit-split algorithm has two stages to construct a set of tiny alphabets and tiny DFAs. The first stage splits the original alphabet of the Aho–Corasick algorithm apart into a set of tiny alphabets. A character consists of m bytes (m ≥ 1). For example, an English character has one byte, while a Chinese character has two bytes. Each character is divided into a number of equal-sized portions called substrings, each with b bits. Thus, each substring of every unique character in the original alphabet is extracted to construct a tiny alphabet. The second stage splits the original DFA of the Aho–Corasick algorithm apart into a set of 8m/b tiny DFAs. For a given tiny state and a tuple of the tiny alphabet, we search for the subset of possible next original states, labeled as a next tiny state, to construct a tiny DFA. Note that a tiny state may contain a subset of original states, and an accepting tiny state is set when it contains any accepting original state.
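As a small illustration of the first, alphabet-splitting stage, the sketch below (our own code, assuming b = 2 and hard-coding the alphabet of Fig. 2) extracts each 2-bit slice of every character and groups the characters of the original alphabet by slice value; for the two low-order slices this reproduces the tiny alphabets Σ_B01 and Σ_B23 of Fig. 2(b).

```cpp
#include <iostream>
#include <map>
#include <set>

// Extract the j-th b-bit substring of an 8-bit character (j = 0 is the least
// significant slice).
static unsigned slice(unsigned char c, int j, int b) {
    return (c >> (j * b)) & ((1u << b) - 1u);
}

int main() {
    const std::set<char> sigma = {'h', 's', 'e', 'i', 'r'};   // original alphabet
    const int b = 2;                                          // split width
    for (int j = 0; j < 8 / b; ++j) {
        std::map<unsigned, std::set<char>> tiny;              // j-th tiny alphabet
        for (char c : sigma)
            tiny[slice(static_cast<unsigned char>(c), j, b)].insert(c);
        std::cout << "tiny alphabet for bits " << j * b + b - 1 << ".." << j * b << ":\n";
        for (const auto& [sub, chars] : tiny) {
            std::cout << "  " << sub << " -> {";
            for (char c : chars) std::cout << ' ' << c;
            std::cout << " }\n";
        }
    }
}
```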
Fig. 2 depicts the original alphabet and two tiny alphabets for the set of strings {he, she, his, hers}. In Fig. 2(a), the original alphabet is Σ = {h, s, e, i, r}, and each unique character has one 8-bit byte. In Fig. 2(b), the first and second bits are extracted to construct the left tiny alphabet Σ_B01, and the third and fourth bits are extracted to construct the right tiny alphabet Σ_B23. In a tiny alphabet, each tuple (row) consists of a 2-bit substring and a subset of unique characters. For instance, the second tuple in Σ_B01 contains the substring 01 and the subset {e, i}.
Fig. 3 depicts two tiny DFAs for the set of strings {he, she, his, hers}. The tiny DFA B01DFA corresponds to the tiny alphabet Σ_B01, while B23DFA corresponds to Σ_B23. B01DFA has the initial tiny state 0′ and accepting tiny states 3′, 6′, 7′ and 8′, while B23DFA has the initial tiny state 0′ and accepting tiny states 4′, 6′, 8′ and 9′. Note that the tiny state 6′ in B23DFA has a subset of original states {2, 5}.
The matching process of the bit-split algorithm is as follows. First, an input character is divided into a set of b-bit substrings. Second, each b-bit substring is fed to the corresponding tiny DFA, and then all tiny DFAs make transitions to their next tiny states in parallel. Finally, the actual output is the intersection of all the subsets of original states contained by the next tiny states, and then the corresponding matched strings are reported.

For example, we assume that an input string {rhe} is read by the tiny DFAs in Fig. 3. B01DFA reads a set of 2-bit substrings {10, 00, 01}, while B23DFA reads a set of 2-bit substrings {00, 10, 01}. In Fig. 3(a), a series of state transitions 0′ → 0′ → 1′ → 3′ is made by B01DFA. At the same time, in Fig. 3(b), another series of state transitions 0′ → 1′ → 3′ → 6′ is made by B23DFA. 0′ and 1′ are not accepting tiny states in B01DFA, and 0′, 1′, and 3′ are also not accepting tiny states in B23DFA. Hence, the intersection of {2} of 3′ in B01DFA and {2, 5} of 6′ in B23DFA is taken to output the original state {2}, and then the matched string {he} is reported.
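The following self-contained C++ sketch simulates this matching semantics in software (our own illustration, not the authors' engine). Instead of enumerating the tiny DFAs offline, it tracks, for each 2-bit slice, the subset of original states consistent with the slices seen so far, and intersects the accepting subsets exactly as described above; the DFA type, the slice helper, and the b = 2 setting are assumptions of the sketch.

```cpp
#include <algorithm>
#include <array>
#include <iostream>
#include <iterator>
#include <queue>
#include <set>
#include <string>
#include <vector>

static const int SIGMA = 256, B = 2, SLICES = 8 / B;

struct DFA {                                   // plain Aho-Corasick DFA
    std::vector<std::array<int, SIGMA>> go;
    std::vector<int> fail;
    std::vector<std::vector<int>> out;         // pattern ids accepted per state
    int add() { go.emplace_back(); go.back().fill(-1); fail.push_back(0); out.emplace_back(); return (int)go.size() - 1; }
    DFA(const std::vector<std::string>& pats) {
        add();
        for (int i = 0; i < (int)pats.size(); ++i) {
            int s = 0;
            for (unsigned char c : pats[i]) { if (go[s][c] < 0) go[s][c] = add(); s = go[s][c]; }
            out[s].push_back(i);
        }
        std::queue<int> q;
        for (int c = 0; c < SIGMA; ++c) { if (go[0][c] < 0) go[0][c] = 0; else { fail[go[0][c]] = 0; q.push(go[0][c]); } }
        while (!q.empty()) {
            int s = q.front(); q.pop();
            for (int c = 0; c < SIGMA; ++c) {
                int& t = go[s][c];
                if (t < 0) { t = go[fail[s]][c]; continue; }
                fail[t] = go[fail[s]][c];
                for (int id : out[fail[t]]) out[t].push_back(id);
                q.push(t);
            }
        }
    }
};

static unsigned slice(unsigned char c, int j) { return (c >> (j * B)) & ((1u << B) - 1u); }

int main() {
    std::vector<std::string> pats = {"he", "she", "his", "hers"};
    DFA dfa(pats);

    // One state subset per tiny DFA, all starting at the initial original state.
    std::vector<std::set<int>> tiny(SLICES, std::set<int>{0});

    for (unsigned char c : std::string("rhe")) {
        std::vector<std::set<int>> acc(SLICES);              // partial match vectors
        for (int j = 0; j < SLICES; ++j) {
            std::set<int> next;
            for (int s : tiny[j])
                for (int ch = 0; ch < SIGMA; ++ch)           // any byte with the same slice
                    if (slice((unsigned char)ch, j) == slice(c, j)) next.insert(dfa.go[s][ch]);
            tiny[j] = next;
            for (int s : next) if (!dfa.out[s].empty()) acc[j].insert(s);
        }
        std::set<int> full = acc[0];                         // intersect the partial vectors
        for (int j = 1; j < SLICES; ++j) {
            std::set<int> tmp;
            std::set_intersection(full.begin(), full.end(), acc[j].begin(), acc[j].end(),
                                  std::inserter(tmp, tmp.begin()));
            full = tmp;
        }
        for (int s : full) for (int id : dfa.out[s]) std::cout << "match: " << pats[id] << '\n';
    }
}
```

On the input rhe with the pattern set of Fig. 1, it reports the single match he, in agreement with the example above.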
Fig. 1. Original DFA for a set {he, she, his, hers}.
Char   Decimal   Bits 7–0
h      104       0 1 1 0 1 0 0 0
s      115       0 1 1 1 0 0 1 1
e      101       0 1 1 0 0 1 0 1
i      105       0 1 1 0 1 0 0 1
r      114       0 1 1 1 0 0 1 0

(a) Original alphabet

Σ_B01 (bits 1, 0): 00 → {h}, 01 → {e, i}, 10 → {r}, 11 → {s}
Σ_B23 (bits 3, 2): 00 → {s, r}, 01 → {e}, 10 → {h, i}, 11 → {}

(b) Tiny alphabets

Fig. 2. Original alphabet and two tiny alphabets for a set {he, she, his, hers}.
Fig. 3. Two tiny DFAs for a set {he, she, his, hers}: (a) B01DFA; (b) B23DFA.

3. Problem statement
The bit-split string matching algorithm suffers from the unnecessary state transitions problem. This is because a substring of an input character may belong to a tiny alphabet, even if the entire character does not belong to the original alphabet. In the bit-split algorithm, the 8m-bit string of each input character is divided into a set of b-bit substrings, which are fed to the corresponding tiny DFAs for string matching. The 8m-bit string is called a forest, while a b-bit substring is called a tree. Since each b-bit substring of each unique character is extracted from the original alphabet Σ to construct a tiny alphabet Σ′ = {0, 1, ..., 2^b − 1}, the substring belongs to Σ′, but the entire character may not belong to Σ.
For example, we assume that the 8-bit string of the input character c is 01100011. The first 2-bit substring is 11, and the second 2-bit substring is 00. As seen in Fig. 2, the two substrings belong to the corresponding tiny alphabets Σ_B01 and Σ_B23, whereas the entire character does not belong to the original alphabet Σ = {h, s, e, i, r}.
The bit-split algorithm performs pattern matching in a "not seeing the forest for the trees" approach. This means that each tiny DFA only processes a b-bit substring of each input character, but cannot check whether the entire character belongs to the original alphabet. This approach results in unnecessary state transitions of tiny DFAs, which in turn consume extra expensive CPU cycles and memory access bandwidth in hardware, and ultimately limit the throughput of DPI. Note that the Aho–Corasick algorithm does not have the unnecessary state transitions problem, because there are failure transitions (shown in Fig. 1) that are taken when reading any input character that is not in the original alphabet.
We illustrate an example of unnecessary state transitions in Fig. 3. When the input string is cxxn, B01DFA reads a set of 2-bit substrings {11, 00, 00, 10}, and B23DFA reads a set of 2-bit substrings {00, 10, 10, 11}. A series of state transitions 0′ → 2′ → 4′ → 1′ → 0′ is made by B01DFA, depicted by dashed lines in Fig. 3(a). At the same time, a series of state transitions 0′ → 1′ → 3′ → 5′ → 0′ is made by B23DFA in Fig. 3(b). Each 2-bit substring of c, x, and n belongs to the corresponding tiny alphabet, such as Σ_B01 and Σ_B23, but these characters do not belong to the original alphabet Σ = {h, s, e, i, r}. Consequently, B01DFA and B23DFA together make eight unnecessary state transitions, which incur extra memory accesses.
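To make the cost of this example concrete, the toy C++ count below (our own arithmetic in code form) contrasts the transitions made with and without a per-character alphabet check; as an idealization it assumes a byte filter with no false positives and ignores the single reset transition that the optimized engine of Section 4 makes per filtered-out group.

```cpp
#include <cstddef>
#include <iostream>
#include <set>
#include <string>

int main() {
    // Original alphabet of Fig. 1 and the out-of-alphabet input of the example.
    const std::set<char> sigma = {'h', 's', 'e', 'i', 'r'};
    const std::string input = "cxxn";
    const int tinyDfas = 2;                 // only B01DFA and B23DFA, as in Fig. 3

    // Bit-split: every byte drives every tiny DFA, in or out of the alphabet.
    std::size_t bitSplit = input.size() * tinyDfas;

    // Byte-filtered (idealized): only in-alphabet bytes reach the tiny DFAs.
    std::size_t byteFiltered = 0;
    for (char c : input)
        if (sigma.count(c) != 0) byteFiltered += tinyDfas;

    std::cout << "bit-split transitions:     " << bitSplit     << '\n'   // 8
              << "byte-filtered transitions: " << byteFiltered << '\n';  // 0
}
```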
To understand the above issues, we investigate the statistical distribution of unique characters in real network environments. The corpus of 8-bit characters has 2^8 = 256 unique characters; thus, the real original alphabet is a subset of the corpus. Fig. 4(a) depicts the unique characters statistics in Snort 2.7 [26], and Fig. 4(b) depicts the statistics of 55 unique characters in the second-week network traffic of the MIT DARPA 1999 evaluation set [27].

In Snort 2.7, there are about 50 types of attacks and a total of 7840 signature rules. In Fig. 4(a), the x-axis is the number of signature rules for every five types of attacks, and the y-axis is the number of unique characters. Fig. 4(a) shows that the 7840 signature rules contain 201 unique characters, so that there remain 55 unique characters not contained in the Snort rule set.

In the MIT DARPA 1999 evaluation set, the second-week network traffic includes normal background traffic and labeled attack traffic. Fig. 4(b) shows that 11.4% of all network packets contain the remaining 55 unique characters that are not in the original alphabet of Snort. Hence, malicious attackers can exploit this weakness to insert a large number of out-of-original-alphabet characters into packet payloads, e.g. 100 input strings cxxn, which results in frequent unnecessary state transitions of tiny DFAs to overload the bit-split algorithm and evade inspection.
Fig. 4. Unique characters statistics in real network applications: (a) unique characters statistics in Snort 2.7; (b) statistics of 55 unique characters in the second-week traffic of DARPA 1999 (maximum 11.4%).
4. Byte-filtered string matching
In this section, we present a byte-filtered string matching algorithm to overcome the unnecessary state transitions problem. The
core idea of the byte-filtered algorithm is that each character of
every incoming packet is filtered by checking whether the character belongs to the original alphabet, before performing string
matching. If the character is in the original alphabet, the character
is divided into a set of substrings, and all tiny DFAs make state
transitions on the substrings in parallel for bit-split string matching. If not, each tiny DFA either makes a failure transition or stops
any state transition.
Furthermore, we propose a novel optimization scheme to speed up the byte-filtered algorithm. In this scheme, given that an input character is not in the original alphabet, each tiny DFA checks whether its current state is the initial state before making a state transition. If not, the tiny DFA makes a transition to the initial state. Otherwise, the tiny DFA stops any state transition. The main purpose of the optimization scheme is to further avoid unnecessary state transitions in hardware and improve the throughput of DPI.

In the byte-filtered algorithm, we use Bloom filters as the byte filter to preprocess every input character. In real applications, a character consists of one or multiple bytes. Unicode [28] typically uses two bytes to provide a unique number for every character, no matter what the platform, program, or language. When a character has two bytes, the corpus has 65536 unique characters. If a bit vector of size 65536 is used to represent all the Unicode characters, it consumes 64 Kbits of memory, which is not suitable for the costly, small on-chip memory in hardware. Hence, since Bloom filters are a space-efficient data structure for fast membership queries, our byte filter adopts Bloom filters to quickly check each input character of multiple bytes.

In the remainder of this section, we first briefly introduce Bloom filters, and then present the architecture of the byte-filtered string matching engine. Finally, we analyze the performance gains of the byte-filtered algorithm.

4.1. Bloom filters overview

Bloom filters [29] are a simple space-efficient data structure for fast approximate set-membership queries. Bloom filters are widely used in many network applications, from Web caching to P2P collaboration to packet processing [30].

A Bloom filter represents a set S of n items by an array V of m bits, initially all set to 0. It uses k independent hash functions h1, h2, ..., hk with range [1, ..., m]. For each item s ∈ S, the bits V[hi(s)] are set to 1, for 1 ≤ i ≤ k. To query whether an item u is in S, we check whether all bits V[hi(u)] are set to 1. If not, u is not in S. If all bits V[hi(u)] are set to 1, it is assumed that u is in S.

Fig. 5 illustrates an example of Bloom filters with a bit vector of size m = 12 and k = 3 hash functions. As seen in Fig. 5(a), item s1 is hashed to the bit locations {2, 5, 9}, and item s2 is hashed to the bit locations {5, 7, 9}. Thus, the bit locations {2, 5, 9} and {5, 7, 9} are set to 1. As seen in Fig. 5(b), to query items u1 and u2, we hash u1 and u2 to the bit locations {2, 5, 9} and {5, 8, 10}, respectively. Since the bit locations {2, 5, 9} are all set to 1, u1 is reported to be in the set. But u2 is not in the set, because the bit locations {5, 8, 10} are not all set to 1.

Fig. 5. An example of Bloom filters: (a) construction process; (b) query process.

Bloom filters may yield a false positive, where they suggest that an item s is in the set even though it is not. A false positive arises because the bits checked for s may have been set by other items.
Fortunately, Bloom filters never yield a false negative. The false positive probability is computed as follows:

f = (1 − e^{−kn/m})^k    (1)
where n is the size of S, m is the size of V, and k is the number of hash functions. For a given n, we choose appropriate values of m and k to reduce the false positive probability. The false positive probability is minimized when k = ln 2 · (m/n). The corresponding minimal false positive probability is:

f = (1/2)^k = (0.6185)^{m/n}    (2)
Despite this drawback, the space savings of Bloom filters often outweigh the cost of false positives, provided the false positive probability is made sufficiently small.
The false positives of Bloom filters cannot impact the correctness of our byte-filtered string matching algorithm. When an input character generates a false positive, the character is divided into a set of equal-sized substrings, each of which is fed to the corresponding tiny DFA to make a state transition. Since the input character is not in the original alphabet, the intersection of all the original state subsets contained by the next tiny states is empty, which means there is no accepting original state. Hence, a false positive of the Bloom filters incurs only a few extra state transitions, but cannot cause a wrong match. Moreover, we can choose appropriate parameters to minimize the false positive probability of the byte-filtered algorithm, which further reduces the unnecessary state transitions of tiny DFAs.
4.2. Byte-filtered string matching engine
In this section, we present a novel architecture of byte-filtered
string matching engine, suitable for hardware implementation
such as FPGA and ASIC. We also describe the optimization schemes
for hardware design and implementation of the engine.
Fig. 6 depicts a top-level diagram of the byte-filtered string matching engine architecture. First, the byte-filtered engine reads as input a data stream with window size w. Second, each input character of multiple bytes is transferred to the byte filter, which checks whether the character is in the original alphabet, one character per clock cycle. If not, the four tiny DFAs, including B01DFA, B23DFA, B45DFA, and B67DFA, either make a transition to the initial tiny state or stop any state transition. If the input character is in the original alphabet, the bytes of the character are divided into four parts, each of which is fed to the corresponding tiny DFA. The four tiny DFAs output the partial match vectors PMV0, PMV1, PMV2, and PMV3, which are transferred to the logical bitwise AND unit to take their intersection. Finally, the full match vector is output and the matched signatures are reported to system administrators.
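The per-cycle control flow just described can be sketched in software as follows (our own model, not the hardware design; TinyDfa, Pmv, MAX_PATTERNS, and the single-state placeholder tables in main are illustrative assumptions, since the real transition tables and partial match vectors come from the bit-split construction of Section 2).

```cpp
#include <array>
#include <bitset>
#include <cstddef>
#include <iostream>
#include <vector>

constexpr std::size_t MAX_PATTERNS = 8;
using Pmv = std::bitset<MAX_PATTERNS>;            // partial match vector

struct TinyDfa {
    std::vector<std::array<int, 4>> next;         // rows: tiny states, cols: 2-bit values
    std::vector<Pmv> pmv;                         // patterns accepted in each tiny state
    int state = 0;
    Pmv step(unsigned sub) { state = next[state][sub]; return pmv[state]; }
    void reset() { state = 0; }
};

// One clock cycle of the byte-filtered engine: byte filter first, then the
// four tiny DFAs in parallel, then a bitwise AND of their partial match vectors.
Pmv processByte(unsigned char byte, bool inAlphabet, std::array<TinyDfa, 4>& dfas) {
    if (!inAlphabet) {                            // filtered out: reset (Section 4 optimization)
        for (TinyDfa& d : dfas) d.reset();
        return Pmv{};                             // no match can be reported this cycle
    }
    Pmv full;
    full.set();                                   // start from all-ones, then AND
    for (int j = 0; j < 4; ++j)
        full &= dfas[j].step((byte >> (2 * j)) & 0x3u);
    return full;                                  // the full match vector
}

int main() {
    TinyDfa stub;                                 // a single-state placeholder machine
    stub.next.push_back({0, 0, 0, 0});            // loops to itself on every substring
    stub.pmv.push_back(Pmv{});                    // and accepts nothing
    std::array<TinyDfa, 4> dfas = {stub, stub, stub, stub};

    Pmv out = processByte('h', /*inAlphabet=*/true, dfas);
    std::cout << "matches this cycle: " << out.count() << '\n';   // 0 for the placeholder
}
```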
Furthermore, we can optimize the hardware design and implementation of the byte-filtered string matching engine as in previous work [19,31]. The main purpose of this optimization is to further speed up high-speed packet content filtering.
(1) For a given character, our byte-filtered string matching engine implements four tiny DFAs on parallel processing units in hardware. Theoretical analysis of the bit-split string matching algorithm [19] shows that an 8-bit character should be divided into four 2-bit substrings, which means that there are four parallel tiny DFAs, to minimize the memory requirements of the bit-split algorithm. Thus, the byte-filtered algorithm uses the same parameters as the bit-split algorithm to achieve space efficiency.
(2) For the byte filter that uses Bloom filters, we choose appropriate parameters to speed up the byte-filtered algorithm. The Bloom filters adopt a class of universal hash functions called H3 [32], which is suitable for hardware implementation. Any item X is represented as X = {x1, x2, ..., xb}. The ith hash function hi on X is calculated as hi(X) = (di1 · x1) ⊕ (di2 · x2) ⊕ ... ⊕ (dib · xb), where · is a bitwise AND operator, ⊕ is a bitwise XOR operator, and dij is a predefined random number in the range [1, ..., m]. We choose the number of H3 hash functions as k = ⌊ln 2 · (m/n)⌋ to minimize the false positive probability, which achieves time efficiency. A software sketch of this hash family is given below.
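The sketch below shows the H3 family in software (our own illustration; the class name H3Family, the 16-bit key width, and the table sizes are assumptions rather than the paper's hardware configuration): each hash simply XORs a pre-drawn random word for every key bit that is set, which is the AND-then-XOR form given in item (2).

```cpp
#include <cstddef>
#include <cstdint>
#include <iostream>
#include <random>
#include <vector>

// H3 hash family over b-bit keys: h_i(X) XORs a pre-drawn random word d[i][j]
// for every key bit x_j that is set.
class H3Family {
public:
    H3Family(std::size_t k, unsigned bits, std::uint32_t m, std::uint32_t seed = 1)
        : m_(m), d_(k, std::vector<std::uint32_t>(bits)) {
        std::mt19937 gen(seed);
        std::uniform_int_distribution<std::uint32_t> dist(0, m - 1);
        for (auto& row : d_) for (auto& v : row) v = dist(gen);
    }
    // i-th hash of key x: XOR of d_[i][j] over all set bits j of x.
    std::uint32_t operator()(std::size_t i, std::uint32_t x) const {
        std::uint32_t h = 0;
        for (std::size_t j = 0; j < d_[i].size(); ++j)
            if ((x >> j) & 1u) h ^= d_[i][j];
        return h % m_;
    }
private:
    std::uint32_t m_;
    std::vector<std::vector<std::uint32_t>> d_;
};

int main() {
    H3Family h3(/*k=*/4, /*bits=*/16, /*m=*/64);
    std::uint32_t twoByteChar = 0x4E2D;            // an arbitrary two-byte character code
    for (std::size_t i = 0; i < 4; ++i)
        std::cout << "h" << i << " = " << h3(i, twoByteChar & 0xFFFFu) << '\n';
}
```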
4.3. Analysis
In this section, we analyze and compare the byte-filtered algorithm with the bit-split algorithm in terms of the number of state
transitions.
We assume that there is an incoming string of n characters, and that a fraction q of the characters are not in the original alphabet. Each character has 8m bits, and the byte filter uses Bloom filters with false positive probability f. When each input character is divided into a set of b-bit substrings, there are 8m/b tiny DFAs. For the string of n characters, the bit-split algorithm makes N_bs = 8mn/b state transitions, while the byte-filtered algorithm makes the following number of state transitions:
N_bf = (1 − q) · 8mn/b + f · q · 8mn/b + 8m·d/b = 8mn/b − (1 − f) · q · 8mn/b + 8m·d/b    (3)

where d is the number of groups of continuous characters that are not in the original alphabet.

Fig. 6. Byte-filtered string matching engine architecture.
Compared with the bit-split algorithm, the byte-filtered algorithm achieves a performance gain in terms of the number of state transitions as follows:

Gain = (N_bs − N_bf)/N_bs = ((1 − f) · q · n − d)/n    (4)

Formula (4) shows that the achievable performance gain depends on the false positive probability of the Bloom filters, and on the fraction and the number of groups of input characters that are not in the original alphabet.
When there is one group of continuous characters that are not in the original alphabet, the byte-filtered algorithm achieves the maximal performance gain as follows:

Gain_max = ((1 − f) · q · n − 1)/n ≈ (1 − f) · q = (1 − (1/2)^k) · q    (5)
Fig. 7 depicts the maximal performance gain of the byte-filtered algorithm. In Fig. 7, k is the number of hash functions, and q is the fraction of characters that are not in the original alphabet. When k = 10 and q = 0.1–0.9, the maximal performance gain varies from 10% to 89.9%. Fig. 7 also shows that when k = 2–10 and q = 0.1–0.9, the maximal performance gain varies from 7.5% to 89.9%.
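As a quick sanity check of these end points (our own arithmetic, assuming the optimal choice of k so that f = (1/2)^k as in Eq. (2)):

k = 10, q = 0.1:  Gain_max ≈ (1 − (1/2)^{10}) · 0.1 ≈ 0.0999 ≈ 10%
k = 10, q = 0.9:  Gain_max ≈ (1 − (1/2)^{10}) · 0.9 ≈ 0.899 ≈ 89.9%
k = 2,  q = 0.1:  Gain_max ≈ (1 − (1/2)^{2}) · 0.1 = 0.075 = 7.5%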
On the other hand, we compute the maximal number of groups of continuous characters that are not in the original alphabet as follows:

d_max = (1 − (1 − f) · q) · n,   if q ≥ 1/(2(1 − f))
d_max = (1 − f) · q · n,         if 0 ≤ q < 1/(2(1 − f))    (6)
Accordingly, for a given d_max, the byte-filtered algorithm achieves the minimal performance gain as follows:

Gain_min = 2(1 − f) · q − 1 = 2(1 − (1/2)^k) · q − 1,   if q ≥ 1/(2(1 − f))
Gain_min = 0,                                           if 0 ≤ q < 1/(2(1 − f))    (7)
5. Experimental evaluation

In this section, we present software simulation experiments on synthetic and real signature rule sets to compare the byte-filtered algorithm with the bit-split algorithm. In the experiments, we measure performance metrics including the string matching time and the number of state transitions.

We designed and implemented both the bit-split and byte-filtered algorithms in C/C++, and ran these programs on a machine with a 3 GHz Intel Pentium M CPU and 512 MB main memory. We synthesize or choose a set of signature strings, and generate a set of testing strings to evaluate the performance of string matching. The simulation process is as follows. First, the simulator converts the set of signature strings into one DFA, and then the DFA is split into a set of tiny DFAs for both the bit-split and byte-filtered algorithms. Second, the simulator feeds the set of testing strings to all the tiny DFAs, which reside only in main memory during the matching phase. The simulator records the CPU clock ticks and counts the number of state transitions of the tiny DFAs.

We conduct extensive experiments in three scenarios: the 1-bit split, 2-bit split, and 4-bit split settings. For both the bit-split and byte-filtered algorithms, the b-bit split scenario denotes that each input character of 8m bits is divided into a set of b-bit substrings, and there are 8m/b tiny DFAs. In all three scenarios, we set the number of hash functions to the constant k = 10 to minimize the false positives of the Bloom filters, which eliminates the unnecessary state transitions of tiny DFAs.
5.1. On synthetic rule set
We developed a string stream generator in C/C++ to synthesize the evaluation set, including the signature and testing strings. The generator randomly generates 20–100 signature strings and 1000 testing strings from the ASCII character subset {'0'–'9', 'a'–'z', 'A'–'Z'}. Each signature string has 10 bytes, and each testing string has 1000 bytes. The alphabet size of the signature strings varies from 10 to 40, while the alphabet of the testing strings has 42 unique characters that are randomly chosen from the specified character subset.
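A stand-in for such a generator is sketched below (our own C++ code, not the authors' tool; the fixed seed, the 20-character signature alphabet, and the counts are illustrative choices within the ranges described above).

```cpp
#include <algorithm>
#include <cstddef>
#include <iostream>
#include <random>
#include <string>
#include <vector>

// Draw a random string of the given length over the given alphabet.
std::string randomString(std::size_t len, const std::string& alphabet, std::mt19937& gen) {
    std::uniform_int_distribution<std::size_t> pick(0, alphabet.size() - 1);
    std::string s(len, '\0');
    for (char& c : s) c = alphabet[pick(gen)];
    return s;
}

int main() {
    std::string corpus;                                   // the 62-character corpus
    for (char c = '0'; c <= '9'; ++c) corpus += c;
    for (char c = 'a'; c <= 'z'; ++c) corpus += c;
    for (char c = 'A'; c <= 'Z'; ++c) corpus += c;

    std::mt19937 gen(42);
    std::shuffle(corpus.begin(), corpus.end(), gen);
    std::string sigAlphabet  = corpus.substr(0, 20);      // 20 unique signature characters
    std::string testAlphabet = corpus.substr(0, 42);      // 42 unique testing characters

    std::vector<std::string> signatures, tests;
    for (int i = 0; i < 100; ++i)  signatures.push_back(randomString(10, sigAlphabet, gen));
    for (int i = 0; i < 1000; ++i) tests.push_back(randomString(1000, testAlphabet, gen));

    std::cout << "example signature: " << signatures.front() << '\n'
              << "testing bytes:     " << tests.size() * tests.front().size() << '\n';
}
```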
Fig. 8 depicts the string matching time under variable number
of signature rules. In Fig. 8, the alphabet of signature strings has
20 unique characters. Fig. 8(a) compares the matching time of
the byte-filtered algorithm with that of the bit-split algorithm,
and Fig. 8(b) gives the matching time ratio of the bit-split algorithm to the byte-filtered algorithm.
Fig. 7. Maximal performance gain of the byte-filtered algorithm.

Fig. 8. String matching time under variable number of signature rules: (a) matching time; (b) matching time ratio.
Fig. 9. Number of state transitions under variable number of signature rules (bit-split: 8,000,000, 4,000,000, and 2,000,000; byte-filtered: 4,493,768, 2,246,884, and 1,123,442 in the 1-bit, 2-bit, and 4-bit split scenarios, respectively).

Fig. 10. Number of state transitions under variable number of characters in the alphabet (bit-split: 4,000,000; byte-filtered: 1,586,343, 2,246,884, 3,185,172, and 3,495,892 for 10, 20, 30, and 40 characters, respectively).
Fig. 8(a) shows that when the number of signature rules increases, the matching time of the bit-split algorithm increases
quickly, while that of the byte-filtered algorithm increases slowly.
For example, when the number of signature rules increases from
20 to 100 in the 1-bit split scenario, the matching time of the
bit-split algorithm increases from 21307 to 81066 unit times by
a factor of 3.8, while that of the byte-filtered algorithm increases
from 2813 to 5195 unit times only by a factor of 1.8.
Fig. 8(b) shows that when the number of signature rules increases, the matching time ratio increases, and as the split width b increases, the ratio decreases. For example, when the number of signature rules increases from 20 to 100, the matching time ratio increases from 7.6 to 15.6 in the 1-bit split scenario, while the ratio increases only from 2.6 to 4.4 in the 4-bit split scenario. Fig. 8(b) also shows that compared with the bit-split algorithm, the byte-filtered algorithm reduces the string matching time by 2.6–15.6 times.
Fig. 9 depicts the number of state transitions under variable number of signature rules. As seen in Fig. 9, the number of state transitions decreases as the split width b increases, since there are 8m/b tiny DFAs and the number of state transitions is proportional to the number of parallel tiny DFAs. Fig. 9 shows that compared with the bit-split algorithm, the byte-filtered algorithm reduces the state transitions of the tiny DFAs by about 44%.
Furthermore, we conduct experiments to explore how the number of characters in the alphabet impacts the string matching time. In these experiments, the alphabet has from 10 to 40 unique characters, which are randomly chosen from the corpus of 62 specified characters, and the split width is b = 2. Figs. 10 and 11 depict the number of state transitions and the string matching time under variable number of characters in the alphabet, respectively.

As there are four tiny DFAs in the 2-bit split scenario, when 1,000,000 bytes are read, the bit-split algorithm makes a total of 4,000,000 state transitions. Fig. 10 shows that as the number of characters grows, the number of state transitions of the byte-filtered algorithm approaches that of the bit-split algorithm.
Fig. 11(a) shows that the string matching time increases with the number of characters. For example, when there are 80 signature rules and the number of characters increases from 10 to 40, the string matching time of the bit-split algorithm increases from 37504 to 49816 unit times, while that of the byte-filtered algorithm increases from 2598 to 13375 unit times. Fig. 11(a) also shows that compared with the bit-split algorithm, the byte-filtered algorithm exhibits a slower increase in string matching time.
Fig. 11(b) shows that as the number of characters grows, the matching time ratio decreases. For example, when the number of characters increases from 10 to 40 and there are 80 signature rules, the matching time ratio decreases from 14.4 to 3.7. Fig. 11(b) also shows that when there are 10, 20, 30, and 40 unique characters in the alphabet, the matching time ratio increases from 7.5 to 14.4, from 4.5 to 8.3, from 3.3 to 5.2, and from 2.7 to 3.7, respectively.

Fig. 11. String matching time under variable number of characters in the alphabet: (a) matching time; (b) matching time ratio.
5.2. On real rule set
We use a set of Snort signature rules to compare the performance of the bit-split and byte-filtered algorithms. In Snort 2.7, there are 50 types of attacks in the rule set. We choose the first 20 types of attacks, from Scan to RPC, each of which has fewer than 100 signature rules. This is because the DFA construction has high time/space complexity: when all 50 types of attacks are compiled, it is difficult to construct one original DFA and a set of tiny DFAs due to the explosive memory requirements. In practice, Snort often partitions all the signature rules into a large number of groups in terms of the source and destination ports, which facilitates constructing separate small DFAs. Hence, the 20 types of attacks are enough to reflect real network application requirements.
Fig. 12 depicts the statistics of 20 types of attacks in Snort 2.7
rule set. In Fig. 12, the number of signature rules varies from 9 to
72. In this experiment, we synthesize 1000 testing strings, each
with the size of 1000 bytes, to examine the performance of string
matching in the 2-bit split scenario. We set the optimal parameters
of both the bit-split and byte-filtered algorithms to minimize
memory requirements and achieve time efficiency.
Fig. 13 depicts the string matching time on Snort 2.7 rule set.
Fig. 13(a) compares the matching time between the bit-split and
byte-filtered algorithms, and Fig. 13(b) gives the matching time ratio of the bit-split algorithm to the byte-filtered algorithm.
As seen in Fig. 13(a), on the 24 signature rules of DDoS, the bit-split algorithm consumes 48055 unit times, while the byte-filtered algorithm consumes only 16758 unit times. Fig. 13(b) shows that compared with the bit-split algorithm, the byte-filtered algorithm reduces the string matching time by 1.6–3.6 times on the 20 types of attacks in the 2-bit split scenario.
Fig. 14 depicts the percentage of unnecessary state transitions on the Snort 2.7 rule set. As seen in Fig. 14, on the 24 signature rules of DDoS, the byte-filtered algorithm removes unnecessary state transitions amounting to about 33% of the total state transitions. Fig. 14 shows that compared with the bit-split algorithm, the byte-filtered algorithm reduces the state transitions of the tiny DFAs by 7–54% on the 20 types of attacks in the 2-bit split scenario.
Fig. 13. String matching time on Snort 2.7 rule set: (a) string matching time; (b) matching time ratio.
Fig. 12. 20 types of attacks in Snort 2.7 rule set.

Fig. 14. Unnecessary state transitions percent on Snort 2.7 rule set.

6. Related work
String matching is one of the well-studied classic problems in computer science. It has been widely used in a range of applications, such as information retrieval, pattern recognition, and network security. String matching algorithms can be classified into single-pattern and multi-pattern algorithms.
In a single-pattern string matching algorithm, a single pattern string is searched for in the text at a time. This means that if we have s patterns to be searched, the algorithm must be repeated s times. Knuth–Morris–Pratt [33] and Boyer–Moore [34] are among the most widely used single-pattern string matching algorithms.
In a multi-pattern algorithm, a set of strings is searched for in the text at a time. This is achieved by converting the set of strings into one DFA. Typical multi-pattern string matching algorithms include Aho–Corasick [8], Commentz-Walter [35], and Aho–Corasick Boyer–Moore [36].
These string matching algorithms are suitable for software
implementation, but cannot keep up with high-speed packet processing. Thus, hardware-based string matching algorithms [10–
14] have been recently proposed to improve the throughput of
DPI by using modern embedded devices.
For instance, FPGA-based string matching algorithms map a given rule set and the corresponding state automata down to a reconfigurable and highly parallel hardware circuit by using gates and flip-flops. FPGA-based solutions [11,13,18] can achieve a throughput of about 10 Gbps, but require frequent, time-consuming recompilation and reconfiguration of the hardware circuit. ASIC-based solutions [19,20] compile the state automata into the silicon die; they can achieve about 20 Gbps throughput, but cannot support incremental updates. TCAM-based solutions [17,21,22] exploit multiple memory accesses in one I/O cycle and can achieve a high throughput of above 20 Gbps, but suffer from high cost and power consumption, which prevents their widespread application.
It is challenging for hardware-based string matching algorithms to keep up with high-speed packet processing with limited on-chip memory. In recent years, many memory-efficient string matching algorithms have been proposed to reduce the memory requirements. Tuck et al. [15] propose bitmap compression and path compression schemes to reduce the memory requirements of the Aho–Corasick algorithm and enhance its worst-case performance. Dharmapurikar and Lockwood [18] modify the Aho–Corasick algorithm to scan multiple characters at a time, and suppress unnecessary memory accesses using Bloom filters. Lu et al. [21] implement a memory-efficient multiple-character parallel DFA using Binary Content Addressable Memory. Hua et al. [25] propose a variable-stride DFA using a winnowing scheme to speed up string matching. Piyachon et al. [23] describe a memory model for parallel Aho–Corasick algorithms and find the optimal multi-byte bit-split string matching algorithm with minimal memory requirements on network processors. Van Lunteren [20] and Song et al. [24] propose the B-FSM and Cached DFA to compress the state transitions of DFAs. Dharmapurikar et al. [16] propose parallel Bloom filters to handle a set of signature strings with different lengths, in order to achieve line-speed string matching with an FPGA implementation.
The expressive power and flexibility of regular expressions have also been exploited in NIDS/NIPS, and many hardware-based regular expression matching algorithms have been proposed in the recent literature. Kumar et al. [37,38] propose a delayed input DFA by using
default transitions to significantly reduce the memory requirements of DFA. Yu et al. [39] propose an efficient algorithm for partitioning a large set of regular expressions into multiple groups,
reducing the overall memory space of DFA. Becchi and Cadambi
[40] propose a novel algorithm to merge non-equivalent states of
DFA using transition labeling, in order to achieve significant memory savings and provide worst-case speed guarantees. Smith et al.
[41,42] propose an extended finite automaton (XFA) using auxiliary variables and simple instructions per state to reduce the
explosive memory requirements of DFA.
The key idea of the byte-filtered string matching algorithm is to eliminate unnecessary state transitions using Bloom filters. Besides the bit-split algorithm, the byte filter scheme can also be applied to the hardware-based algorithms discussed above to further improve the throughput of string matching.
7. Conclusions
The existing bit-split string matching algorithm suffers from the unnecessary state transitions problem. The bit-split algorithm performs pattern matching in a "not seeing the forest for the trees" approach, where each tiny DFA only processes a substring of each input character, but cannot check whether the entire character belongs to the original alphabet. This paper proposes a byte-filtered string matching algorithm using Bloom filters. The byte-filtered algorithm preprocesses each character of every incoming packet by checking whether the character belongs to the original alphabet, before performing bit-split string matching. We further optimize the byte-filtered algorithm for hardware implementation, and theoretically analyze the performance gain in terms of the number of state transitions.

Experimental results on a synthetic rule set show that compared with the bit-split algorithm, the byte-filtered algorithm reduces the string matching time by up to 15.6 times and the number of state transitions of tiny DFAs by about 44%; as the alphabet size increases, the matching time ratio of the bit-split algorithm to the byte-filtered algorithm decreases. Experimental results on the Snort rule set demonstrate that compared with the bit-split algorithm, the byte-filtered algorithm reduces the string matching time by up to 3.6 times as well as the number of state transitions by up to 54%.
Acknowledgment
This work is supported by the National Science Foundation of
China under Grant No. 90718008, and the National Basic Research
Program of China (973) with Grant No. 2007CB310702.
References
[1] S. Staniford, V. Paxson, N. Weaver, How to own the Internet in your spare time,
in: Proceedings of the 11th USENIX Security Symposium, 2002, pp.149–167.
[2] S. Singh, C. Estan, G. Varghese, S. Savage, Automated worm fingerprinting, in:
Proceedings of the Sixth ACM/USENIX Symposium on Operating System Design
and Implementation, 2004, pp. 45–60.
[3] Frost & Sullivan, World intrusion detection and prevention systems markets,
2007.
[4] M. Roesch, Snort – Lightweight intrusion detection for networks, in:
Proceedings of the 13th Conference on Systems Administration (LISA), 1999,
pp. 229–238.
[5] V. Paxson, Bro: a system for detecting network intruders in real-time, Computer
Networks 31 (23–24) (1999) 2435–2463.
[6] Application layer packet classifier for Linux, 2008. <http://l7-filter.sourceforge.
net>.
[7] S. Sen, O. Spatscheck, D. Wang, Accurate, scalable in-network identification of
P2P traffic using application signatures, in: Proceedings of WWW 2004,
Manhattan, 2004.
[8] A.V. Aho, M.J. Corasick, Efficient string matching: an aid to bibliographic
search, Communications of the ACM 18 (6) (1975) 333–340.
[9] J.B.D. Cabrera, J. Gosar, W. Lee, R.K. Mehra, On the statistical distribution of
processing times in network intrusion detection, in: Proceedings of the 43rd
IEEE Conference on Decision and Control, 2004, pp. 75–80.
[10] I. Sourdis, D. Pnevmatikatos, Pre-decoded CAMs for efficient and high-speed
NIDS pattern matching, in: Proceedings of IEEE FCCM, 2004.
[11] Z.K. Baker, V.K. Prasanna, Time and area efficient pattern matching on FPGAs,
in: Proceedings of FPGA, 2004.
[12] Z.K. Baker, V.K. Prasanna, A methodology for synthesis of efficient intrusion
detection systems on FPGAs, in: Proceedings of IEEE FCCM, 2004.
[13] C.R. Clark, D.E. Schimmel, Scalable pattern matching on high-speed networks,
in: Proceedings of IEEE FCCM, 2004.
[14] S. Yusuf, W. Luk, Bitwise optimized CAM for network intrusion detection
systems, in: Proceedings of IEEE FPL, 2005.
[15] N. Tuck, T. Sherwood, B. Calder, G. Varghese, Deterministic memory-efficient
string matching algorithms for intrusion detection, in: Proceedings of IEEE
INFOCOM, 2004.
[16] S. Dharmapurikar, P. Krishnamurthy, T.S. Sproull, J.W. Lockwood, Deep
packet inspection using parallel Bloom filters, IEEE Micro 24 (1) (2004) 52–
61.
[17] F. Yu, R. Katz, T.V. Lakshman, Gigabit rate packet pattern-matching using
TCAM, in: Proceedings of IEEE ICNP, 2004.
[18] S. Dharmapurikar, J. Lockwood, Fast and scalable pattern matching for content
filtering, in: Proceedings of IEEE/ACM ANCS, 2005.
[19] L. Tan, B. Brotherton, T. Sherwood, Bit-split string-matching engines for
intrusion detection and prevention, ACM Transactions on Architecture and
Code Optimization 3 (1) (2006) 3–34.
[20] J. van Lunteren, High performance pattern-matching for intrusion detection,
in: Proceedings of IEEE INFOCOM, 2006.
[21] H. Lu, K. Zheng, B. Liu, X. Zhang, Y. Liu, A memory-efficient parallel string
matching architecture for high-speed intrusion detection, IEEE Journal on
Selected Areas in Communication 34 (10) (2006) 1793–1804.
[22] M. Alicherry, M. Muthuprasanna, V. Kumar, High speed pattern matching for
network IDS/IPS, in: Proceedings of IEEE ICNP, 2006.
[23] P. Piyachon, Y. Luo, Efficient memory utilization on network processors for
deep packet inspection, in: Proceedings of IEEE/ACM ANCS, 2006.
[24] T. Song, W. Zhang, D. Wang, A memory efficient multiple pattern matching
architecture for network security, in: Proceedings of IEEE INFOCOM, 2008.
[25] N. Hua, H. Song, T.V. Lakshman, Variable-stride multi-pattern matching for
scalable deep packet inspection, in: Proceedings of IEEE INFOCOM, 2009.
[26] Snort. Available from: <http://www.snort.org>.
[27] 1999 MIT/LL Evaluation Set. Available from: <http://www.ll.mit.edu/mission/
communications/ist/corpora/ideval/data/1999data.html>.
[28] Unicode. Available from: <http://www.unicode.org>.
[29] B. Bloom, Space/time tradeoffs in hash coding with allowable errors,
Communications of the ACM 13 (7) (1970) 422–426.
[30] A. Broder, M. Mitzenmacher, Network applications of Bloom filters: a survey,
Internet Mathematics 1 (4) (2004) 485–509.
[31] H. Song, S. Dharmapurikar, J. Turner, J. Lockwood, Fast hash table lookup using
extended bloom filter: an aid to network processing, in: Proceedings of ACM
SIGCOMM, 2005, pp. 181–192.
[32] M.V. Ramakrishna, E. Fu, E. Bahcekapili, Efficient hardware hashing functions
for high performance computers, IEEE Transactions on Computers 46 (12)
(1997) 1378–1381.
[33] D.E. Knuth, J.H. Morris, V.R. Pratt, Fast pattern matching in strings, SIAM
Journal on Computing 6 (2) (1977) 323–350.
[34] R.S. Boyer, J.S. Moore, A fast string searching algorithm, Communications of the
ACM 20 (10) (1977) 762–772.
[35] B. Commentz-Walter, A string matching algorithm fast on the average, in:
Proceedings of the Sixth Colloquium on Automata, Languages and
Programming, 1979.
[36] C. Coit, S. Staniford, J. McAlerney, Towards faster string matching for intrusion
detection, in: Proceedings of DARPA Information Survivability Conference and
Exhibition, 2002.
[37] S. Kumar, S. Dharmapurikar, F. Yu, P. Crowley, J. Turner, Algorithms to
accelerate multiple regular expressions matching for deep packet inspection,
in: Proceedings of ACM SIGCOMM, 2006.
[38] S. Kumar, J. Turner, J. Williams, Advanced algorithms for fast and scalable deep
packet inspection, in: Proceedings of IEEE/ACM ANCS, 2006.
[39] F. Yu, Z. Chen, Y. Diao, T.V. Lakshman, R.H. Katz, Fast and memory-efficient
regular expression matching for deep packet inspection, in: Proceedings of
ACM/IEEE ANCS, 2006.
[40] M. Becchi, S. Cadambi, Memory-efficient regular expression search using state
merging, in: Proceedings of IEEE INFOCOM, 2007.
[41] R. Smith, C. Estan, S. Jha, XFA: faster signature matching with extended
automata, in: Proceedings of IEEE Symposium on Security and Privacy, 2008.
[42] R. Smith, C. Estan, S. Jha, S. Kong, Deflating the big bang: fast and scalable deep
packet inspection with extended finite automata, in: Proceedings of ACM
SIGCOMM, 2008.