Deep packet inspection (DPI) is an advanced method of examining and managing network traffic. It is a form of packet filtering that locates, identifies, classifies, reroutes or blocks packets with specific data or code payloads that conventional packet filtering, which examines only packet headers, cannot detect.
DPI combines the functionality of an intrusion detection system (IDS) and an intrusion prevention system (IPS) with a traditional stateful firewall. This combination makes it possible to detect certain attacks that neither the IDS/IPS nor the stateful firewall can catch on its own. A stateful firewall can see the beginning and end of a packet flow, but by itself it cannot catch events that would be out of bounds for a particular application. An IDS can detect intrusions but has very little capability to block an attack in progress. DPI is used to prevent attacks from viruses and worms at wire speed. More specifically, DPI can be effective against buffer overflow attacks, denial-of-service (DoS) attacks, sophisticated intrusions, and a small percentage of worms that fit within a single packet.
DPI-enabled devices can look at Layer 2 and beyond Layer 3 of the OSI model. In some cases, DPI can be invoked to look through Layers 2-7, which covers headers and data protocol structures as well as the payload of the message. DPI functionality is invoked whenever a device inspects, or takes some other action based on, information beyond Layer 3. DPI can identify and classify traffic using a signature database that includes information extracted from the data part of a packet, allowing finer control than classification based only on header information. In many cases, however, endpoints can use encryption and obfuscation techniques to evade DPI.
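The signature-based classification described above can be sketched in a few lines of Python. This is only an illustration: the `SIGNATURES` table and the `classify_payload` function are hypothetical names, and the three patterns are simplified stand-ins for a real signature database, which would be far larger and more precise.

```python
import re
from typing import Optional

# Hypothetical signature database: label -> pattern matched against payload bytes.
# The patterns are deliberately simplified for illustration.
SIGNATURES = {
    "http": re.compile(rb"^(GET|POST|HEAD|PUT|DELETE) \S+ HTTP/1\.[01]\r\n"),
    "tls": re.compile(rb"^\x16\x03[\x00-\x04]"),  # handshake record type + version
    "ssh": re.compile(rb"^SSH-2\.0-"),            # SSH identification string
}


def classify_payload(payload: bytes) -> Optional[str]:
    """Return the label of the first signature that matches, or None."""
    for label, pattern in SIGNATURES.items():
        if pattern.search(payload):
            return label
    return None


if __name__ == "__main__":
    print(classify_payload(b"GET /index.html HTTP/1.1\r\nHost: example.com\r\n\r\n"))
    print(classify_payload(b"SSH-2.0-OpenSSH_9.0\r\n"))
```

Note that the classifier looks only at the data part of the packet, which is exactly why encrypted or obfuscated payloads can evade it.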
A classified packet may be redirected, marked or tagged (for quality of service), blocked, rate limited and, of course, reported to a reporting agent in the network. In this way, HTTP errors of different classifications may be identified and forwarded for analysis. Many DPI devices can identify packet flows rather than analyzing traffic packet by packet, which allows control actions based on accumulated flow information.
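As a rough sketch of flow-based control, the snippet below keeps per-flow counters keyed by a 5-tuple and picks an action from the accumulated totals. The `FlowTable` class, the `byte_limit` parameter, and the action names are assumptions made for this example; they are not taken from any particular DPI product.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Tuple

# A flow is identified here by (src ip, src port, dst ip, dst port, protocol).
FlowKey = Tuple[str, int, str, int, str]


@dataclass
class FlowStats:
    packets: int = 0
    bytes: int = 0


class FlowTable:
    """Accumulates per-flow statistics and decides an action per packet."""

    def __init__(self, byte_limit: int) -> None:
        self.byte_limit = byte_limit  # hypothetical per-flow byte budget
        self.flows: Dict[FlowKey, FlowStats] = defaultdict(FlowStats)

    def observe(self, key: FlowKey, size: int) -> str:
        stats = self.flows[key]
        stats.packets += 1
        stats.bytes += size
        # The decision depends on the whole flow so far, not on this packet alone.
        return "rate_limit" if stats.bytes > self.byte_limit else "forward"


if __name__ == "__main__":
    table = FlowTable(byte_limit=3000)
    flow = ("10.0.0.1", 40000, "10.0.0.2", 443, "tcp")
    for _ in range(3):
        print(table.observe(flow, 1500))
```

A per-packet filter would treat all three packets identically; the flow table changes its answer once the accumulated byte count exceeds the budget.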
TODAY'S TRAFFIC AND DPI
Today’s traffic and high-speed 100 Gbps links put severe pressure on vital security tools such as deep packet inspection (DPI) tools, which inspect traffic to block data leaks and malware. One way of solving this problem is to distribute traffic from the 100 Gbps network links across security tools running on lower-speed links. This bridges the gap between the higher data rate of the core network and the lower processing capacity of the tools, so that each tool can deliver its full functionality. Doing so requires sophisticated load balancers in the enterprise infrastructure, which increases the administration cost and the total cost of ownership (TCO) of the infrastructure. The basic architecture for deep packet inspection on 100 Gbps links is shown below:
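Independently of the architecture figure, the traffic distribution step described above can be illustrated with a small sketch. It assumes one common approach, hashing the flow identifier so that every packet of a flow, in both directions, reaches the same inspection tool. The function name `pick_tool` and the choice of CRC32 as the hash are assumptions made for this example, not a description of any specific load balancer.

```python
import zlib


def pick_tool(src_ip: str, src_port: int, dst_ip: str, dst_port: int,
              protocol: str, num_tools: int) -> int:
    """Map a flow to one of num_tools inspection tools (index 0..num_tools-1)."""
    # Sort the two endpoints so both directions of a flow produce the same key.
    a, b = sorted([(src_ip, src_port), (dst_ip, dst_port)])
    key = f"{a[0]}:{a[1]}|{b[0]}:{b[1]}|{protocol}".encode()
    # CRC32 is deterministic across runs, unlike Python's built-in str hash.
    return zlib.crc32(key) % num_tools


if __name__ == "__main__":
    forward = pick_tool("10.0.0.1", 40000, "192.0.2.7", 443, "tcp", num_tools=10)
    reverse = pick_tool("192.0.2.7", 443, "10.0.0.1", 40000, "tcp", num_tools=10)
    print(forward, reverse)  # the two values are equal
```

Keeping a flow on a single tool matters because DPI decisions often depend on accumulated flow state; splitting a flow across tools would leave each one with an incomplete view.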
FPGA'S ROLE IN DPI
Due to the increasing number of security vulnerabilities and network attacks, the number of regular expressions (REs) used in DPI is constantly growing. At the same time, network speeds are growing too: telecommunication companies have started to deploy 100 Gbps links, the 400 Gbps Ethernet standard has recently been ratified, and large data centers already call for 1 Tbps technology. Consequently, despite many proposed optimizations, existing DPI systems are still far from being able to process the traffic in current high-speed networks at line speed. The best software-based solution we are aware of achieves 100 Gbps throughput using a cluster of servers with a well-designed distribution of network traffic. Processing network traffic at such speeds in a single-box DPI system is far beyond the capabilities of software-based solutions, so hardware acceleration is needed.
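A back-of-the-envelope calculation helps show why line-rate processing is so demanding. The sketch below assumes the worst case of minimum-size 64-byte Ethernet frames, each of which also occupies 20 bytes on the wire for the preamble, start-of-frame delimiter, and inter-frame gap. The function name `packet_budget_ns` is made up for this example.

```python
PREAMBLE_AND_SFD_BYTES = 8   # preamble plus start-of-frame delimiter
INTERFRAME_GAP_BYTES = 12    # minimum idle time between frames, in byte times


def packet_budget_ns(link_gbps: float, frame_bytes: int = 64) -> float:
    """Time available per packet, in nanoseconds, at full line rate."""
    bits_on_wire = (frame_bytes + PREAMBLE_AND_SFD_BYTES + INTERFRAME_GAP_BYTES) * 8
    # bits divided by Gbit/s gives nanoseconds directly.
    return bits_on_wire / link_gbps


if __name__ == "__main__":
    for gbps in (100, 400, 1000):
        budget = packet_budget_ns(gbps)
        mpps = 1000.0 / budget  # million packets per second
        print(f"{gbps} Gbps: {budget:.2f} ns per packet, {mpps:.1f} Mpps")
```

Under these assumptions a 100 Gbps link leaves about 6.72 ns per packet, roughly 149 million packets per second, and every packet has to be checked against the full, growing set of regular expressions.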
Field-programmable gate arrays (FPGAs) are a well-suited technology for accelerating DPI. They provide high computing power and flexibility for network traffic processing, and they are increasingly being used in data centers for this purpose.
Why choose an FPGA as an acceleration platform? There are several reasons:
- Performance close to that of an ASIC for certain workloads
- Flexibility to reconfigure, change schemas, test the market, prove the solution, adjust development, and build a viable product based on customer feedback
FPGAs have their drawbacks as well. Building a solution on FPGA silicon is extremely hard, much like building an ASIC design, which has slowed the adoption of FPGAs as a default computing unit.
Let's take a deeper look at the FPGAs to understand what is under the hood of these chips.
A field-programmable gate array (FPGA) is an integrated circuit (IC) that can be programmed in the field after manufacture. The FPGA configuration is generally specified using a hardware description language (HDL), similar to that used for an application-specific integrated circuit (ASIC). FPGAs contain an array of programmable logic blocks and a hierarchy of "reconfigurable interconnects" that allow the blocks to be "wired together", like many logic gates that can be inter-wired in different configurations. Logic blocks can be configured to perform complex combinational functions or to act as simple logic gates like AND and XOR. In most FPGAs, logic blocks also include memory elements, which may be simple flip-flops or more complete blocks of memory. Many FPGAs can be reprogrammed to implement different logic functions, which allows flexible reconfigurable computing similar to what is done in software. A simplified schematic view of an FPGA chip is shown below:
Logic blocks - allow you to design digital circuits that perform computation
Interconnect - allows you to connect logic blocks into large, complex designs
IO blocks - allow interaction with different interfaces: network, storage, the server’s bus
*everything is programmable
*everything is reconfigurable. Change your firmware in milliseconds.
*once the FPGA design has been implemented successfully, you can move straight on to producing an ASIC if necessary.
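To make the idea of a configurable logic block more concrete, here is a small software model. It assumes the logic block is built around a lookup table (LUT), which is how many FPGA families implement combinational logic, though real logic blocks contain more than this. The `LookupTable` class and its methods are hypothetical names used only for illustration.

```python
from itertools import product
from typing import Callable, List, Sequence


class LookupTable:
    """Software model of a k-input LUT; the truth table is its configuration."""

    def __init__(self, num_inputs: int, truth_table: Sequence[int]) -> None:
        if len(truth_table) != 2 ** num_inputs:
            raise ValueError("truth table must have 2**num_inputs entries")
        self.num_inputs = num_inputs
        self.truth_table: List[int] = list(truth_table)

    @classmethod
    def from_function(cls, num_inputs: int, fn: Callable[..., int]) -> "LookupTable":
        # Enumerate every input combination once and store the result.
        table = [int(fn(*bits)) for bits in product((0, 1), repeat=num_inputs)]
        return cls(num_inputs, table)

    def evaluate(self, *bits: int) -> int:
        # The input bits form an index into the table (first input is the MSB).
        index = 0
        for bit in bits:
            index = (index << 1) | bit
        return self.truth_table[index]


if __name__ == "__main__":
    lut = LookupTable.from_function(2, lambda a, b: a & b)  # configured as AND
    print([lut.evaluate(a, b) for a, b in product((0, 1), repeat=2)])
    lut.truth_table = [0, 1, 1, 0]                          # reconfigured as XOR
    print([lut.evaluate(a, b) for a, b in product((0, 1), repeat=2)])
```

In this model the same structure behaves as an AND gate or an XOR gate depending only on the stored table, which mirrors how an FPGA changes function by loading a new configuration rather than changing its silicon.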