Computer system attackers usually comprise computer systems through taking over the execution of programs and executing their own code. Because alien code is executed when an intrusion occurs, the run-time behavior of the attacked program usually is different from what is defined by its creators. System calls are powerful tools to get computer system privileges, so usually an intrusion is accompanied with the execution of non-expected system calls. Due to the above reason, if the system-call execution pattern of a program can be collected before it is executed and is used to compare with the run-time system-call execution behavior, then non-expected execution of system calls can be detected.
PAID(Program-semantics Aware Intrusion Detection) is a kernel-compiler patch which uses the above principle to detect computer system intrusions. PAID automatically extracts system-call patterns of programs through parsing the source files and then uses the info to compare with the run-time system-call execution patterns of programs to detect intrusions.
PAID uses compiler techniques to extract from a network application the set of system calls contained within and their control flow. More concretely, given a program, PAID derives a data structure that keeps track of the set of system calls that are allowed to follow each system call in the program or that start the program execution. At run time, PAID simply checks an incoming system call against the list of allowed system calls at that program execution point, and lets the system call proceed only when it is a member of the list. Figure 1 shows the compile-time extraction of system call patterns from an application and the system call legitimacy check of the application at run time.
Given an input program, PAID first builds a function call graph (FCG) and a per-function control ow graph (PFCFG). Then it converts into empty nodes all nodes in the FCG that correspond to those functions that do not directly make system calls. This means that the associated PFCFGs are ignored in subsequent processing. Finally it reduces each PFCFG into a form that contains only the control ow information and the system calls. That is, all other instructions are removed. Given a program's reduced FCG and reduced PFCFGs, PAID starts with the root of the reduced FCG, and performs a breadth-first traversal until it reaches a system call in each path. As the FCGs being traversed, each FCG node is expanded into its associated PFCFG. The set of system calls reached from the start of a program form the initial system call set, which is the set of system calls that are allowed to start this program. For each member in the initial system call set, PAID performs the same breadth-first traversal to compute the following system call set for the system call member in question. The result of this algorithm is thus a following system call set for each system call in the program as well as the initial system call set associated with the program's starting point.
Simply enforcing an application's system call sequencing pattern at run time does not provide much deterrence because the attacker's code, once taking control, can emulate the victim application's system call ow until it reaches the desired system call. Of course, this is assuming that the attacker can exploit one of the system calls contained in the victim application to in ict damages on the victim system. PAID eliminates this security hole by in-lining each system call with the corresponding sequence of instructions, including the software trap instruction (INT 80H in the case of Intel X86 architecture) that transfers control to the kernel. Through procedure in-lining, each system call site in a application now has a unique ID, which is the address of the software trap instruction. Invocations of the same system call at different places in an application program actually have different IDs. Accordingly, each member of the initial system call set and of the follower system call sets is now represented in terms of its ID, rather than its system call type.
When the software trap instruction of a system call is executed, the address of the instruction immediately following the software trap instruction is stored as the return address of the software trap. Therefore, when a software trap occurs, PAID only needs to check the associated return address against the initial or follower system call sets to determine whether the corresponding system call should be allowed to proceed. With the augmentation of per-system-call ID, PAID makes sure that each legitimate system call has to be issued from a proper memory location in the application's address space AND be preceded by a proper sequence of system calls. The check on on system call ID makes it impossible for the attacker code to emulate the statically determined system call ow, because it cannot get the control back once it transfers the control to a particular system call site in the program. The check on system call sequencing constraint eliminates the possibility that the attacker code can directly jump to a system call in the application that has damaging power.
The fundamental assumption underlying the compiler-driven system call
pattern checking mechanism is that it is possible to statically determine the
exact system call sites and patterns of a network application. This assumption
does not hold for programs that contain JUMP instructions whose target address
is not known until run time. Such jump instructions could result from function
pointers or switch statements in C. Because of these dynamic constructs, the
compiler can only statically compute a partial follower system call set for
system call instances that are immediately followed by a JUMP instruction with
dynamic target addresses. To complete these follower system call sets at run
time, PAID inserts a notification system call before each dynamic JUMP
instruction, e.g., a function call that is invoked through a function pointer,
to inform the kernel the actual target address of the JUMP. With this
information, Paid can augment each partial follower system call with the set of
system calls that can follow the target of the JUMP. Although these
notification system calls incur additional performance overhead, this overhead
is expected to be minor, because the occurrence frequency of dynamic JUMP
instructions is relatively low in real-world network applications. Because
notification calls are themselves system calls and therefore are protected
through the system call site/pattern check mechanism, attackers cannot insert
code to issue notification system calls and explicitly change the system call
pattern.
Currently our software works under Redhat 7.2 and 7.3. We did not test our software thoroughly, and it may still contain a lot of bugs. If you want to use the software, you need to download the Linux kernel and the module from this site and recompile your kernel.