Snort++ Worker Overview

The Snort++ developers recently enabled the packet processing threads to offload tasks to "worker" threads. The packet threads offload tasks related to the Multi-Pattern Search Engines (MPSE) for pseudo packets. In other words, the packet threads continue to do all of the decoding (codecs) and service inspection for all of the packets (real and pseudo) as well as the MPSE searches for the real packets. The only thing that the packet threads offload to the worker threads is the MPSE work for the pseudo packets. And if the MPSE finds a match with a rule's fast-pattern content, the packet threads (not the worker threads) check whether the remaining options for the rule match (recall that the MPSE only looks for the fast-pattern contents for rules and does not check any other options).
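The division of labor described above can be sketched in a few lines of C++. This is only an illustration under stated assumptions: Rule, fast_pattern_search, and evaluate_candidates are hypothetical names, std::string::find is a naive stand-in for a real multi-pattern search engine, and a single extra content string stands in for "the remaining options" of a rule.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, simplified rule: one fast-pattern plus one extra content option.
struct Rule {
    int id;
    std::string fast_pattern;   // what the MPSE stage looks for
    std::string other_content;  // stands in for "the remaining options"
};

// Stage 1 (what a worker thread would do for a pseudo packet): collect the
// rules whose fast-pattern occurs in the buffer. std::string::find is a
// naive stand-in for a real multi-pattern search.
std::vector<const Rule*> fast_pattern_search(const std::vector<Rule>& rules,
                                             const std::string& buffer)
{
    std::vector<const Rule*> candidates;
    for (const Rule& r : rules)
        if (buffer.find(r.fast_pattern) != std::string::npos)
            candidates.push_back(&r);
    return candidates;
}

// Stage 2 (always done by the packet thread): check the remaining options
// of each candidate rule and return the ids of the rules that fully match.
std::vector<int> evaluate_candidates(const std::vector<const Rule*>& candidates,
                                     const std::string& buffer)
{
    std::vector<int> matched;
    for (const Rule* r : candidates) {
        assert(r != nullptr);
        if (buffer.find(r->other_content) != std::string::npos)
            matched.push_back(r->id);
    }
    return matched;
}
```

A fast-pattern hit only nominates a rule. In this sketch, two rules can share the same fast-pattern and still differ in the second stage, which is why the second stage is needed at all.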
Let's see how the worker threads are leveraged for a TCP (HTTP) connection.

Background

TCP connections are described in much more detail in the TCP example. This is just a refresher to understand how a worker thread fits into the big picture.

The first three packets in a TCP connection are the three-way handshake. As described elsewhere, the TcpTrackers make sure that the SEQ and ACK numbers in the three-way handshake are valid. (To simplify the discussion, I will assume that the ACK packets in this TCP connection do not contain data. ACK packets without data are allowed, but in practice ACK packets often do carry data. For example, a client often sends data (i.e., payload) in the third packet (the ACK packet) of a three-way handshake.)
The first packet with data (i.e., the payload) in the HTTP connection is the client's request/header packet.
All of this data is put into one of the two TcpReassemblers' queues (specifically, the server queue, since the server will receive the packet). Nothing else is done with the data at this time. It's important to understand that this data isn't processed until Snort++ sees an ACK for it. (This is the case for Snort++ in IDS (Intrusion Detection System) mode but not in IPS (Intrusion Prevention System) mode. I do not cover Snort++'s IPS mode in this documentation.)
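The "hold the data until it is ACK'ed" behavior can be pictured with a small queue. This is a minimal sketch, not the TcpReassembler code: SegmentQueue, add, and flush_acked are hypothetical names, and sequence-number wraparound, gaps, and overlapping segments are deliberately ignored.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// Hypothetical stand-in for one direction's reassembly queue.
class SegmentQueue {
public:
    // Store a segment's payload, keyed by its starting sequence number.
    void add(std::uint32_t seq, const std::string& payload)
    {
        assert(!payload.empty());
        segments_[seq] = payload;
    }

    // Return (and remove) the payload of every queued segment that is fully
    // covered by the acknowledgment number. Nothing is released before that.
    std::string flush_acked(std::uint32_t ack)
    {
        std::string out;
        auto it = segments_.begin();
        while (it != segments_.end() && it->first + it->second.size() <= ack) {
            out += it->second;
            it = segments_.erase(it);
        }
        return out;
    }

    bool empty() const { return segments_.empty(); }

private:
    std::map<std::uint32_t, std::string> segments_;  // ordered by sequence number
};
```

With two 8-byte segments queued at sequence numbers 1000 and 1008, an ACK of 1008 releases only the first segment, and the second one stays queued until an ACK of 1016 arrives.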
When the ACK for this data does come, the server's TcpReassembler invokes the Wizard (more specifically, the Wizard's associated splitter, MagicSplitter) to determine the appropriate service Inspector for the packet's Flow. If the Wizard sees the string "GET" in the data, it will assign HttpInspect as the service Inspector. HttpInspect's associated splitter (HttpInspectSplitter) then breaks the data up into pseudo packets (PDUs). The first pseudo packet contains the request.
Snort++ interrupts the processing of the ACK packet and sends this pseudo packet through the pipeline. (Since Snort++ created the headers for the different protocols (e.g., Ethernet, IP, TCP) of a pseudo packet, it knows that the headers are fine, and the Decoders are not called to look at them.) The service Inspectors are interested in the relevant pseudo packets, and therefore HttpInspect will inspect the data within the first pseudo packet. This pseudo packet (or any portion of its payload) will not be compared against the MPSE at this time: HttpInspect waits to send the URI (not the complete request) through the MPSEs until it processes the second pseudo packet (the header). The first pseudo packet is therefore inspected next by the probe Inspectors and is then done with the pipeline.
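To make the two steps concrete (recognizing the service, then cutting the stream into PDUs), here is a rough sketch. The functions looks_like_http and split_http_request are hypothetical and far simpler than the Wizard or HttpInspectSplitter: the first only checks for a leading "GET", and the second assumes that the complete request line and header block are already in the buffer.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical wizard check: treat a stream that starts with "GET" as HTTP.
bool looks_like_http(const std::string& stream)
{
    return stream.compare(0, 3, "GET") == 0;
}

// Hypothetical splitter: cut a complete request into a request-line PDU and
// a header PDU. Returns an empty vector if the header block is incomplete.
std::vector<std::string> split_http_request(const std::string& stream)
{
    std::vector<std::string> pdus;
    const std::string::size_type line_end = stream.find("\r\n");
    const std::string::size_type hdr_end = stream.find("\r\n\r\n");
    if (line_end == std::string::npos || hdr_end == std::string::npos)
        return pdus;
    assert(hdr_end >= line_end);  // the first CRLF cannot come after the blank line

    const std::string::size_type hdr_start = line_end + 2;
    pdus.push_back(stream.substr(0, hdr_start));                        // PDU 1: request line
    pdus.push_back(stream.substr(hdr_start, hdr_end + 4 - hdr_start));  // PDU 2: headers
    return pdus;
}
```

For the stream "GET /index.html HTTP/1.1\r\nHost: example.com\r\n\r\n", the first PDU is the request line and the second is the Host header followed by the terminating blank line.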
The second pseudo packet is also sent through the pipeline.
During the "Rule Detection Engine (MPSE)" phase, the URI and the header will be offloaded to the worker thread. After the second pseudo packet has been run through the pipeline as far as the offload (but perhaps before the worker thread has finished the MPSE work), the packet thread finishes processing the ACK packet (which will be trivial since this ACK packet did not contain payload). Finally, at some later point, the worker thread will finish its MPSE work and the packet thread will finish processing this second pseudo packet.
Let's look in detail at how this works.

Worker Threads

The "worker" threads are created during Snort++'s initialization as Snort++ initializes the packet threads. Each packet thread is assigned a set number of associated worker threads. (The number of worker threads is configured with the detection.offload_threads parameter in snort.lua.) Each new worker thread executes RegexOffload::worker (it is passed as the first argument to std::thread()), which is an endless loop that simply waits for its associated packet thread to put a pseudo packet in its queue (i.e., to "offload" the pseudo packet) so that the worker thread can run the pseudo packet against the appropriate MPSEs.
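The general shape of such a loop (a thread that sleeps on a condition variable until its owner hands it work) can be sketched with the standard library. This is not the RegexOffload code: the Worker class, its single work slot, and the blocking wait_for_result() call are assumptions made for illustration, and std::string::find again stands in for the MPSE search. A real packet thread would keep processing other packets instead of blocking on the result.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <string>
#include <thread>

// Hypothetical stand-in for one worker thread plus the single work slot it
// shares with its packet thread. Not the Snort++ implementation.
class Worker {
public:
    Worker() : thread_(&Worker::loop, this) {}

    ~Worker()
    {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            stop_ = true;
        }
        work_cv_.notify_one();
        thread_.join();
    }

    // Packet thread side: hand over a buffer and the pattern to look for.
    void offload(const std::string& buffer, const std::string& pattern)
    {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            assert(!has_work_ && "previous request still pending");
            buffer_ = buffer;
            pattern_ = pattern;
            has_work_ = true;
            done_ = false;
        }
        work_cv_.notify_one();
    }

    // Packet thread side: block until the worker has finished the search.
    bool wait_for_result()
    {
        std::unique_lock<std::mutex> lock(mutex_);
        done_cv_.wait(lock, [this] { return done_; });
        return found_;
    }

private:
    // Worker side: sleep until there is work, search, publish, repeat.
    void loop()
    {
        std::unique_lock<std::mutex> lock(mutex_);
        for (;;) {
            work_cv_.wait(lock, [this] { return has_work_ || stop_; });
            if (stop_)
                return;
            found_ = buffer_.find(pattern_) != std::string::npos;
            has_work_ = false;
            done_ = true;
            done_cv_.notify_one();
        }
    }

    std::mutex mutex_;
    std::condition_variable work_cv_, done_cv_;
    std::string buffer_, pattern_;
    bool has_work_ = false, done_ = false, found_ = false, stop_ = false;
    std::thread thread_;  // declared last: starts only after the members above exist
};
```

In this sketch the worker holds the mutex while it searches, which keeps the code short but means a second offload() has to wait until the first search is finished.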

RegexRequest

The packet threads use RegexRequests to send the pseudo packets to the worker threads. Initially, all of the RegexRequests are in the packet thread's idle list.
Later, when a packet thread wishes to offload a pseudo packet to a worker thread, it moves one of the RegexRequest objects from the idle list to the busy list, links the RegexRequest object to the pseudo packet, and wakes the worker thread associated with the RegexRequest (using the notify_one() function) so that it compares the pseudo packet against the appropriate MPSEs.
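The idle/busy bookkeeping can be sketched with two std::list objects and splice(), which moves a node between lists without copying it. Request and RequestPool are hypothetical names, the buffer pointer stands in for the link to the pseudo packet, and waking the worker thread is left out to keep the focus on the list handling.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <string>

// Hypothetical request object that links an offloaded buffer to its slot.
struct Request {
    int id;
    const std::string* buffer;  // non-null only while the request is busy
};

class RequestPool {
public:
    // Initially every request sits in the idle list.
    explicit RequestPool(int count)
    {
        for (int i = 0; i < count; ++i)
            idle_.push_back(Request{i, nullptr});
    }

    // Move one request from idle to busy and attach the buffer to it.
    // Returns nullptr when every request is already in use.
    Request* get(const std::string& buffer)
    {
        if (idle_.empty())
            return nullptr;
        busy_.splice(busy_.end(), idle_, idle_.begin());  // no copy; pointers stay valid
        Request& req = busy_.back();
        req.buffer = &buffer;
        return &req;
    }

    // Move a finished request back to idle and detach its buffer.
    void put(Request* req)
    {
        for (auto it = busy_.begin(); it != busy_.end(); ++it) {
            if (&*it == req) {
                it->buffer = nullptr;
                idle_.splice(idle_.end(), busy_, it);
                return;
            }
        }
        assert(false && "request was not in the busy list");
    }

    std::size_t idle_count() const { return idle_.size(); }
    std::size_t busy_count() const { return busy_.size(); }

private:
    std::list<Request> idle_, busy_;
};
```

A pool of two requests can serve two offloads at once. A third get() returns nullptr until one of the busy requests has been handed back with put().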

ContextSwitcher and IpsContext

We've just seen how Snort++ packet threads offload a pseudo packet's MPSE work to worker threads. We've also seen how Snort++ interrupts the processing of an ACK packet to work on the pseudo packet that is generated as a result of the ACK packet. In order to keep track of which packet (real or pseudo) a packet thread is processing and which pseudo packets have been handed off to worker threads, Snort++ uses ContextSwitcher and IpsContext objects. Each packet thread has an associated ContextSwitcher, and each ContextSwitcher has 20 IpsContext objects associated with it. (This number, 20, is apparently arbitrary.) ContextSwitcher has 3 vectors in which it keeps the IpsContext objects: idle, busy, and hold. (Unfortunately, a packet thread's ContextSwitcher and RegexOffload both have idle and busy vectors, and it's sometimes difficult to keep them straight.) idle and busy are used as stacks (Last-In First-Out queues), and hold contains "slots" in which IpsContext objects can be placed (we'll see how the hold vector is used in a moment).
After Snort++'s initialization, all IpsContext objects are in the idle vector.
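A stripped-down model of the three vectors may help. Context and Switcher are hypothetical stand-ins for IpsContext and ContextSwitcher, and the method names (start, stop, suspend, resume) are chosen for this illustration rather than taken from the Snort++ source. The sketch only models which vector each context lives in.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Context {            // hypothetical stand-in for IpsContext
    std::size_t slot;       // index of this context's slot in hold
    const char* packet;     // label of the packet being processed, or nullptr
};

class Switcher {            // hypothetical stand-in for ContextSwitcher
public:
    // After construction, every context is in the idle stack.
    explicit Switcher(std::size_t count) : contexts_(count), hold_(count, nullptr)
    {
        for (std::size_t i = 0; i < count; ++i) {
            contexts_[i] = Context{i, nullptr};
            idle_.push_back(&contexts_[i]);
        }
    }

    // idle -> busy: begin processing a packet; the new context becomes active.
    Context* start(const char* packet)
    {
        assert(!idle_.empty());
        Context* c = idle_.back();
        idle_.pop_back();
        c->packet = packet;
        busy_.push_back(c);
        return c;
    }

    // busy -> idle: the active context has finished the pipeline.
    void stop()
    {
        assert(!busy_.empty());
        Context* c = busy_.back();
        busy_.pop_back();
        c->packet = nullptr;
        idle_.push_back(c);
    }

    // busy -> hold: park the active context while its MPSE work is offloaded.
    Context* suspend()
    {
        assert(!busy_.empty());
        Context* c = busy_.back();
        busy_.pop_back();
        hold_[c->slot] = c;
        return c;
    }

    // hold -> busy: the offloaded work is done; the context is active again.
    void resume(Context* c)
    {
        assert(hold_[c->slot] == c);
        hold_[c->slot] = nullptr;
        busy_.push_back(c);
    }

    // The context at the back of busy is the one moving through the pipeline.
    Context* active() const { return busy_.empty() ? nullptr : busy_.back(); }
    std::size_t idle_count() const { return idle_.size(); }

private:
    std::vector<Context> contexts_;             // owns the contexts; never resized
    std::vector<Context*> idle_, busy_, hold_;  // two stacks and one slot array
};
```

Because idle and busy are stacks, starting a second packet while the first is still busy makes the second one active, and stopping or suspending it makes the first one active again.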

Worker Threads Example

Let's revisit our HTTP example above to see how these vectors are used. When the first ACK that acknowledges payload arrives, the ACK packet is associated with an IpsContext object, which is moved from the idle vector to the busy stack. As long as a packet's IpsContext (for example, this ACK packet's) is at the back of the busy stack, that packet is moving through the pipeline.
I described the processing of an ACK packet above but let me repeat it here. The ACK packet acknowledging the first packet with payload triggers the Wizard, which attempts to find the appropriate service Inspector. In this example, that will be HttpInspect since it finds the "GET" string. HttpInspect's associated splitter (HttpInspectSplitter) then breaks the data up into pseudo packets (PDUs). The first pseudo packet contains the request.
Before Snort++ sends this pseudo packet through the pipeline, it associates an IpsContext object from the idle vector with this pseudo packet ("pseudo packet 1" in the diagram below) and moves the object to the back of the busy vector. (Since Snort++ creates the headers for the different protocols (e.g., Ethernet, IP, TCP) of a pseudo packet, it knows that the headers are acceptable, and so the Decoders are not called to look at them.)
HttpInspect is interested in the pseudo packets and will therefore inspect the data within the pseudo packet. It's important to understand that the service Inspectors inspect the pseudo packet within the packet thread. This is in contrast to the detection, which is initially done in a worker thread and completed back in the packet thread. ("Detect" is a fairly ambiguous term in the Snort++ code. The method DetectionEngine::detect() invokes the MPSE to search for the fast-patterns from all the rules and then evaluates the remaining options for the rules from the MPSE fast-pattern matches. On the other hand, the DetectionEngine class encompasses both the service Inspectors and the MPSE work. Here I use the term "detection" to refer to the MPSE work.) The work that is done in the worker thread is the MPSE fast-pattern search using the Aho-Corasick method, and the work that is done in the packet thread is the later evaluation of the remaining options for the rules from the MPSE fast-pattern matches.
That being said, pseudo packet 1 is not sent to the MPSEs for reasons explained elsewhere, so its IpsContext will be removed from the busy vector after it's done in the pipeline.
After pseudo packet 1 has gone through the pipeline, the IpsContext associated with pseudo packet 1 is bumped off the busy vector and the IpsContext at the back of the busy vector (which corresponds to the ACK packet) becomes active (i.e., continues through the pipeline). Earlier we saw that pseudo packet 1 was created when the ACK packet was active. Since there is still payload that was ACK'ed by the ACK packet, pseudo packet 2 will also be created. Pseudo packet 2 is then associated with an IpsContext and since it is at the back of the busy vector, pseudo packet 2 will become active.
Pseudo packet 2 will work its way through the pipeline just as pseudo packet 1 did, but when the "Rule Detection Engine (MPSE)" phase is reached and the payload from the pseudo packet is offloaded to a worker thread for MPSE work, the IpsContext associated with pseudo packet 2 will be moved to the hold vector. At this point, since the IpsContext associated with the ACK packet is at the back of the busy vector, the ACK packet can continue through the pipeline. For the offload itself, a RegexRequest must be moved from the offloader's idle vector to its busy vector and must then be associated with pseudo packet 2. (As mentioned earlier, both s_switcher and offloader have idle and busy vectors, and it's easy to get them confused.)
At some point, the ACK packet will be finished in the pipeline. (After the generation of the pseudo packets, there's very little that needs to be done for the ACK packet in the pipeline - therefore, the ACK packet will likely complete before the MPSE work for pseudo packet 2 has finished.)
Later, the worker thread will complete its work (which consists of comparing the URI against the PM_TYPE_KEY MPSE and the header against the PM_TYPE_HEADER MPSE) and the IpsContext for pseudo packet 2 will be moved from the hold vector back to the busy vector.
And after pseudo packet 2 goes through the rest of the pipeline (which consists primarily of verifying that the payload matches the other options in a rule), its IpsContext is returned to the idle vector.
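The whole sequence can be replayed as a timeline. The sketch below is a deliberately crude model: it tracks only packet labels in two plain vectors (busy and hold), treats the idle vector as implicit, and records which packet is active (at the back of busy) after each step. The function name replay_example and the labels are made up for this illustration.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Replays the example's steps on plain vectors of packet labels and records
// which packet is active (at the back of busy) after each step; "-" means
// that no packet is active.
std::vector<std::string> replay_example()
{
    std::vector<std::string> busy, hold, trace;
    auto record = [&] { trace.push_back(busy.empty() ? std::string("-") : busy.back()); };

    busy.push_back("ACK");   record();  // 1. ACK packet arrives: idle -> busy
    busy.push_back("PDU 1"); record();  // 2. pseudo packet 1 interrupts the ACK
    busy.pop_back();         record();  // 3. pseudo packet 1 done (no MPSE work)
    busy.push_back("PDU 2"); record();  // 4. pseudo packet 2 is created
    hold.push_back(busy.back());        // 5. pseudo packet 2 offloaded: busy -> hold
    busy.pop_back();         record();
    busy.pop_back();         record();  // 6. ACK packet done: busy -> idle
    busy.push_back(hold.back());        // 7. worker thread done: hold -> busy
    hold.pop_back();         record();
    busy.pop_back();         record();  // 8. pseudo packet 2 done: busy -> idle

    assert(busy.empty() && hold.empty());
    return trace;
}
```

The recorded trace shows the ACK packet regaining control twice, once after pseudo packet 1 finishes and once after pseudo packet 2 is parked in hold, and pseudo packet 2 becoming active again only after the ACK packet has left the pipeline.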