Tainted Bugs (or, Automatically detecting XSS security holes)

With apologies to Gloria Jones and a variety of others...

Sometimes I feel there has to be a way
To improve securi-tay
To automatically prevent attacks
The bugs we fix seem not to help one bit
To make the exploit-tays
Not come back. They should stay away!
Oh! Tainted bugs!

As part of Acquia's security testing for Acquia Drupal, I've been experimenting with automated methods for detecting security vulnerabilities in Drupal and contributed modules. The time has come to report on my progress. If you want to learn more about this and are going to DrupalCon Hungary 2008, vote for my session proposal.

Data tainting is a kind of dynamic program analysis (it operates while the program is running) that can automatically detect one of the most frequent sources of security vulnerabilities: insufficiently validated user input. The idea behind data tainting is that when data is received from an untrusted source (such as GET request parameters or some parts of the database), it is marked as "tainted." Any data derived from tainted data (such as by string concatenation, passing function arguments, etc.) is also marked tainted. When tainted data is passed to a security-sensitive destination (such as the HTML response to a page request), a taint error is raised. Finally, when tainted data is validated in specific ways, the taint mark is removed so the data can be used freely.

What I am calling "Taint Drupal" is based on Taint PHP work by Wietse Venema along with some Drupal-specific customization particularly regarding the database. For more details, keep reading.