Auditing APIs

One of my current projects is to write a general-purpose auditing framework for instrumenting open source and enterprise-level applications for audit event logging. This is the ARF component of the bandit project. It sounds easy, but (as they say) the devil is in the details. What sort of API can log events for any application, any domain, any process?

My first thought was to create a tuple-based API that would allow anyone to add any number and combination of name/value pairs to an audit record before submitting that record to the record stream. This is a very general-purpose interface. In fact, add a little hierarchy and it looks a lot like the SAX or DOM interfaces for creating and managing XML documents. And that’s where I decided that this was probably not the right approach. We don’t need yet another XML interface. I could simply use SAX or DOM, but that seems like overkill, even for something as generic as general-purpose audit instrumentation.
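To make the tuple-based idea concrete, here is a minimal sketch of what such an interface might have looked like. The names AuditRecord, RecordStream, add and submit are hypothetical, invented for this illustration; they are not part of ARF or any existing library.

```python
from dataclasses import dataclass, field
from typing import Any, List, Tuple


@dataclass
class AuditRecord:
    """Hypothetical free-form record: an ordered list of (name, value) pairs."""
    pairs: List[Tuple[str, Any]] = field(default_factory=list)

    def add(self, name: str, value: Any) -> "AuditRecord":
        # Any name and any value are accepted; nothing is validated.
        self.pairs.append((name, value))
        return self


class RecordStream:
    """Hypothetical record stream; here it just collects records in memory."""

    def __init__(self) -> None:
        self.records: List[AuditRecord] = []

    def submit(self, record: AuditRecord) -> None:
        self.records.append(record)


stream = RecordStream()
record = AuditRecord().add("event", "login").add("user", "alice").add("success", True)
stream.submit(record)
```

The flexibility is also the weakness: nothing stops one application from writing "user" while another writes "username", so whoever consumes the stream has to guess at the meaning of every field.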

The second approach that presented itself to me was to define a common audit record taxonomy, or a hierarchy of record formats that define the data necessary for most audit records required in today’s world. Done as a hierarchy, this is also an expandable option: more record formats can be added later to accommodate new uses of audit information.
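As a rough sketch of the taxonomy idea, a hierarchy of record formats could be modeled as a small class hierarchy in which each format declares the fields it requires. The class and field names below are assumptions made up for this example; they are not taken from ARF or from any real audit taxonomy.

```python
from dataclasses import dataclass, asdict


@dataclass
class AuditEvent:
    """Hypothetical root format: fields every audit record carries."""
    timestamp: str
    source: str


@dataclass
class AuthenticationEvent(AuditEvent):
    """Hypothetical format for login attempts."""
    user: str
    success: bool


@dataclass
class FileAccessEvent(AuditEvent):
    """Hypothetical format added later; the existing formats are untouched."""
    user: str
    path: str
    operation: str


login = AuthenticationEvent(
    timestamp="2006-05-01T12:00:00Z",
    source="web-app",
    user="alice",
    success=True,
)
fields = asdict(login)  # plain dict of field names to values, ready to serialize
```

Because every field is declared, leaving one out fails when the record is constructed rather than surfacing later as a malformed entry in the audit log. Adding a new use of audit information means adding a new subclass.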

It so happens that Novell has just acquired eSecurity, a world-class auditing company. eSecurity is a great fit for Novell because they excel at collecting and classifying audit event data. However, they don’t have an API for application instrumentation. Given their strategy before Novell, this was a great way to limit their scope to an area in which they could excel. It maintains focus and allows their engineers to create a product that can’t be beat.

The nice thing (from my perspective) about the eSecurity acquisition is that the ARF project can be used to front the solid eSecurity audit event taxonomy. Hand in glove, so to speak.

An API built on that taxonomy would look much more like the taxonomy’s fields than like simple tuples. This is better for all involved. Data submitted to ARF will look more structured and less free-form (as is the case with XML, schema notwithstanding).