A brief analysis of the integration solution of Hyperscan in nDPI

Labs Guide

Hyperscan is a high-performance regular expression matching library from Intel that is well suited to DPI, IPS, and IDS solutions. nDPI is a widely used open source DPI library. When nDPI is customized and deployed in resource-constrained router products, its core module consumes a large amount of memory.

Part 01, Introduction to nDPI Framework

Figure 1 nDPI framework diagram

nDPI is a popular open source DPI library maintained by ntop. It is cross-platform and runs on Windows as well as Unix/Linux systems [1]. As shown in Figure 1, the nDPI library consists mainly of a network data acquisition module, a data preprocessing module, a protocol detection and matching module, and the modules related to the feature library. The data acquisition module captures data from the network card in real time or parses existing pcap files. The data preprocessing module receives the network data and groups, shapes, and filters it. The protocol detection and matching module is the core of nDPI: it matches the shaped packet data against the existing protocol rule feature library. The performance of the matching algorithm, and the memory and CPU it consumes while matching, are crucial to the entire system.
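The three stages described above can be pictured as a simple pipeline. The following is a minimal sketch only, not nDPI code: the function names and the `SIGNATURES` table are hypothetical, and real nDPI rules are far richer than byte prefixes.

```python
from typing import Iterable, Iterator, List, Optional

# Hypothetical signature table (assumption for illustration only).
SIGNATURES = {b"GET ": "HTTP", b"POST ": "HTTP", b"SSH-": "SSH"}


def acquire(source: Iterable[bytes]) -> Iterator[bytes]:
    """Acquisition stage: yield raw payloads (stand-in for a NIC or pcap reader)."""
    yield from source


def preprocess(payloads: Iterable[bytes]) -> Iterator[bytes]:
    """Preprocessing stage: filter out empty payloads before detection."""
    for payload in payloads:
        if payload:
            yield payload


def detect(payload: bytes) -> Optional[str]:
    """Detection stage: return the first protocol whose signature matches."""
    for prefix, protocol in SIGNATURES.items():
        if payload.startswith(prefix):
            return protocol
    return None


def classify(source: Iterable[bytes]) -> List[Optional[str]]:
    """Run every payload through acquisition, preprocessing, and detection."""
    return [detect(p) for p in preprocess(acquire(source))]


if __name__ == "__main__":
    packets = [b"GET / HTTP/1.1\r\n", b"", b"SSH-2.0-OpenSSH", b"\x00\x01"]
    print(classify(packets))  # ['HTTP', 'SSH', None]
```

The detection stage is where almost all of the work happens, which is why the choice of matching engine dominates both speed and memory use.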

Part 02, Integration of Hyperscan in nDPI

Figure 2 DPI framework diagram integrated with Hyperscan

The integration of Hyperscan and nDPI focuses on the following two aspects:

  • String multi-pattern matching

Multi-pattern string matching is an important step in nDPI. It quickly filters out rules that cannot match, which reduces the number of rules that must be checked one by one and so improves matching performance. nDPI uses the Aho-Corasick algorithm for multi-pattern string matching. Because the native Aho-Corasick algorithm must convert all rules into a trie structure, it occupies a large amount of memory. Hyperscan uses its own optimized matching engine, which greatly reduces memory consumption during matching. We replaced the Aho-Corasick matcher with Hyperscan, which reduced both memory consumption and CPU usage and brought a significant performance improvement.
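To make the memory cost concrete, here is a minimal textbook Aho-Corasick sketch in Python. It is not nDPI's implementation, and the class and pattern set are illustrative assumptions. It shows that every distinct pattern prefix becomes its own trie state, so the automaton grows with the rule set.

```python
from collections import deque


class AhoCorasick:
    """Toy Aho-Corasick automaton that stores one dict per trie state."""

    def __init__(self, patterns):
        self.goto = [{}]   # goto[state][char] -> next state
        self.fail = [0]    # failure link of each state
        self.out = [[]]    # patterns recognized on reaching each state
        for pattern in patterns:
            self._insert(pattern)
        self._build_failure_links()

    def _insert(self, pattern):
        state = 0
        for ch in pattern:
            if ch not in self.goto[state]:
                # Each new prefix costs one more state.
                self.goto.append({})
                self.fail.append(0)
                self.out.append([])
                self.goto[state][ch] = len(self.goto) - 1
            state = self.goto[state][ch]
        self.out[state].append(pattern)

    def _build_failure_links(self):
        # Breadth-first walk; depth-1 states keep the root as failure link.
        queue = deque(self.goto[0].values())
        while queue:
            state = queue.popleft()
            for ch, nxt in self.goto[state].items():
                queue.append(nxt)
                f = self.fail[state]
                while f and ch not in self.goto[f]:
                    f = self.fail[f]
                self.fail[nxt] = self.goto[f].get(ch, 0)
                # Inherit patterns that end at the failure target.
                self.out[nxt] = self.out[nxt] + self.out[self.fail[nxt]]

    def search(self, text):
        """Return (start_offset, pattern) for every match, in one pass."""
        state, hits = 0, []
        for i, ch in enumerate(text):
            while state and ch not in self.goto[state]:
                state = self.fail[state]
            state = self.goto[state].get(ch, 0)
            for pattern in self.out[state]:
                hits.append((i - len(pattern) + 1, pattern))
        return hits


if __name__ == "__main__":
    ac = AhoCorasick(["he", "she", "his", "hers"])
    print(len(ac.goto))         # 10 states for 4 short patterns
    print(ac.search("ushers"))  # [(1, 'she'), (2, 'he'), (2, 'hers')]
```

Four short patterns already need ten states, each with its own transition table. A rule library with thousands of longer strings multiplies that cost, and this is the overhead that a more compact matching engine avoids.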

  • HTTP preprocessing

Besides replacing the engine's matching algorithm, we also added Hyperscan to the preprocessor module. During HTTP preprocessing, Hyperscan searches the payload for the relevant keywords, which further speeds up preprocessing.
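The idea of a single-pass keyword search during HTTP preprocessing can be sketched as follows. This is an illustration under stated assumptions: Python's standard `re` module stands in for a compiled Hyperscan database, and the keyword list and function names are hypothetical, since the article does not list the keywords it searches for.

```python
import re

# Hypothetical keyword set (assumption; not taken from the article).
HTTP_KEYWORDS = [b"Host:", b"User-Agent:", b"Content-Type:"]

# One compiled alternation scanned in a single pass over the payload.
_KEYWORD_RE = re.compile(
    b"|".join(re.escape(k) for k in HTTP_KEYWORDS), re.IGNORECASE
)


def find_http_keywords(payload: bytes):
    """Return (offset, matched_keyword) for every keyword occurrence."""
    return [(m.start(), m.group(0)) for m in _KEYWORD_RE.finditer(payload)]


def extract_header_values(payload: bytes):
    """Use the keyword offsets to slice out header values directly."""
    values = {}
    for offset, keyword in find_http_keywords(payload):
        start = offset + len(keyword)
        end = payload.find(b"\r\n", start)
        if end == -1:
            end = len(payload)
        name = keyword.rstrip(b":").lower()
        values[name] = payload[start:end].strip()
    return values


if __name__ == "__main__":
    request = b"GET / HTTP/1.1\r\nHost: example.com\r\nUser-Agent: curl/8.0\r\n\r\n"
    print(extract_header_values(request))
    # {b'host': b'example.com', b'user-agent': b'curl/8.0'}
```

Because the scan reports offsets, the preprocessor can jump straight to the fields it needs instead of parsing every header line. The sketch does not check that a keyword starts a header line, so a production filter would need that extra validation.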

Part 03, Memory Optimization

We chose the MT7981B chip running OpenWrt as the test platform, used the protocol rule library file that ships with nDPI, and used packets captured from a real network card as input. As shown in Figure 3, native nDPI consumes 56MB of memory, while the nDPI + Hyperscan solution consumes only 5.7MB, about one tenth of the native figure.
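The "one tenth" figure can be checked with a few lines of arithmetic. The helper name below is hypothetical. The two inputs are the measurements reported above.

```python
def memory_saving(native_mb: float, integrated_mb: float):
    """Return (ratio of integrated to native, percentage reduction)."""
    ratio = integrated_mb / native_mb
    return ratio, (1.0 - ratio) * 100.0


if __name__ == "__main__":
    ratio, reduction = memory_saving(56.0, 5.7)
    print(f"ratio: {ratio:.3f}")          # ratio: 0.102
    print(f"reduction: {reduction:.1f}%")  # reduction: 89.8%
```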

Part 04, Conclusion

With Hyperscan integrated, nDPI consumes far less memory than the original nDPI. Memory is very tight in existing embedded network equipment, so this reduction makes it easier to deploy DPI and related products on end-side embedded network devices.
