RDB: Repairable Database Systems


Project Members



Goals of the Project

Suppose you are a database administrator and you have just found out that your database was corrupted by a malicious intrusion or a human data-entry error that occurred 24 hours ago. What do you do in such a situation? You can roll the database back to its image from 24 hours ago, throwing away all the "good" transactions that took place in the meantime, or you can laboriously sift through the transactions to separate the "good" ones from the "bad" ones by hand, which lengthens the repair and reduces the database's availability. RDB is a technology that gets you out of this dilemma, and it does so in a way that is completely transparent to the DBMS server and its clients. RDB has been ported to Oracle, Sybase, and PostgreSQL, and its measured run-time overhead is between 8% and 13%.

The recovery mechanisms of existing database management systems are inadequate for repairing database damage caused by malicious attacks and operational errors, because these events cannot be detected immediately after they occur. An intrusion-tolerant database management system is one that can restore its consistency after an intrusion while preserving as much useful data as possible. RDB is a portable intrusion resilience framework that adds intrusion tolerance to existing commercial database management systems without requiring any modifications to their source code. Equipped with such a mechanism, a database management system can nullify the effects of malicious or erroneous transactions while preserving the results of legitimate ones. The proposed mechanism supports fast and automatic post-intrusion database recovery, thus greatly improving a database management system's overall availability. It also lets administrators take part in the repair process through an interactive interface and a visualization aid.

System Architecture

RDB includes an inter-transaction dependency tracking mechanism that maintains the dependency relationships among transactions at run time, and a selective undo mechanism that, at repair time, rolls back only those transactions determined to be corrupt. The system architecture of the inter-transaction dependency tracking mechanism in RDB is shown in Figure 1.

Assume a database connectivity standard such as JDBC is used to connect a database application to a database. The JDBC proxy driver in Figure 1, which can reside on either the client or the server machine, intercepts and possibly modifies the SQL statements sent from a client to the server, together with the associated query responses, and transparently infers information about transaction dependencies. For example, transaction T1 depends on transaction T2 if T1 reads some data modified by T2. This inter-transaction dependency tracking component is highly portable across DBMSs and needs little or no modification to tailor it to a specific DBMS.
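The read-from rule above can be illustrated with a minimal sketch. It is not RDB's implementation: the class `DependencyTracker`, its methods, and the row keys are hypothetical names chosen for illustration, and the sketch assumes the proxy can somehow learn which rows each transaction reads and writes.

```python
from collections import defaultdict


class DependencyTracker:
    """Toy read-from dependency tracker (hypothetical, not RDB's code)."""

    def __init__(self):
        # row key -> id of the transaction that last wrote that row
        self.last_writer = {}
        # transaction id -> set of transaction ids it has read from
        self.depends_on = defaultdict(set)

    def record_write(self, txn_id, row_key):
        # Remember which transaction last modified this row.
        self.last_writer[row_key] = txn_id

    def record_read(self, txn_id, row_key):
        # A reader depends on the last writer of the row, if that
        # writer is a different transaction.
        writer = self.last_writer.get(row_key)
        if writer is not None and writer != txn_id:
            self.depends_on[txn_id].add(writer)


tracker = DependencyTracker()
tracker.record_write("T2", ("accounts", 42))  # T2 modifies a row
tracker.record_read("T1", ("accounts", 42))   # T1 reads it: T1 depends on T2
tracker.record_write("T1", ("accounts", 7))
tracker.record_read("T3", ("accounts", 7))    # T3 depends on T1
```

In this toy model the dependency information lives in memory; a real proxy would have to persist it so that it survives until repair time.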

At repair time, RDB first reads the transaction log of the underlying DBMS (e.g., Oracle's, by using its LogMiner utility), builds a complete transaction dependency graph, presents a visualization of this graph to the DBA, collects feedback, and performs a selective undo of the transactions chosen by the DBA and the system. The only DBMS-dependent part is the transaction log parsing. The interactive visualization tool for the transaction dependency graph gives the DBA the flexibility to incorporate domain-specific and application-specific knowledge into the database repair process. A sample visualized dependency graph is shown below.
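One plausible way to derive the undo set from the dependency graph is sketched below. It rests on two assumptions that the text does not state: every transaction that transitively read from a corrupt transaction is itself treated as corrupt, and the selected transactions are undone in reverse commit order. The function name `transactions_to_undo` and the sample data are hypothetical.

```python
from collections import defaultdict, deque


def transactions_to_undo(depends_on, corrupt, commit_order):
    """Return the corrupt transactions plus everything that transitively
    depends on them, latest commit first (assumed undo order)."""
    # Invert the "depends on" edges so we can walk from a corrupt
    # transaction to the transactions that read from it.
    dependents = defaultdict(set)
    for txn, sources in depends_on.items():
        for src in sources:
            dependents[src].add(txn)

    # Breadth-first search from the initially corrupt transactions.
    tainted = set(corrupt)
    queue = deque(corrupt)
    while queue:
        current = queue.popleft()
        for nxt in dependents[current]:
            if nxt not in tainted:
                tainted.add(nxt)
                queue.append(nxt)

    return [t for t in reversed(commit_order) if t in tainted]


# T2 read from T1, T3 read from T2; T4 and T5 are unrelated to T1.
depends_on = {"T2": {"T1"}, "T3": {"T2"}, "T4": set(), "T5": {"T4"}}
commit_order = ["T1", "T2", "T3", "T4", "T5"]
undo = transactions_to_undo(depends_on, {"T1"}, commit_order)
print(undo)  # ['T3', 'T2', 'T1']
```

In the workflow described above, the DBA could add or remove transactions from this set after inspecting the visualized graph, which is where domain-specific knowledge would enter.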

Publications and Presentations


Alexey Smirnov 2005-04-03