Data Integrity: AIDE for Host Based Intrusion Detection

It used to be all the talk: everyone knew it and accepted it, but few did anything about it, and even today very few do. What is it? Data integrity. Not in the form we usually think of, though; it is not backups, RAID management or similar. It is host based intrusion detection.

What is host based intrusion detection (hIDS)? In its simplest form it is the monitoring of a file system for added, deleted or modified content, for the purpose of intrusion detection and (post) compromise forensic analysis. At one time hIDS was a very popular topic with a lot of emphasis placed on it by the security community, and although it is still an area of religious focus for some, it is generally a very underutilized part of a well rounded security and management policy. Note how I said management policy there too: hIDS is not just about intrusion detection but can also play a vital role in the day-to-day operations of any organization by providing “change monitoring” capabilities. This can play out in many scenarios, the simplest being that it allows you to track changes to file systems made through regular administration tasks such as software installations, updates or, more importantly, administrative mistakes. The topic of change monitoring could be a whole article in and of itself, but to me hIDS is vitally important in both respects, as an intrusion detection AND change monitoring resource.

I cannot get around the fact that even I, over the years, have let hIDS fall by the wayside. I used to be the biggest fan of tripwire and would use it on everything. However, over time tripwire became a time consuming, bloated and difficult tool to manage; it is also tediously slow and would cause very undesirable loads on larger systems. As a result hIDS fell out of my regular security and management habits, which in turn had a way of sneaking up on me and biting me in the butt whenever a system got compromised or an administrator made an “oopsy” on a server.

A few years back I experimented with a tool called AIDE (Advanced Intrusion Detection Environment). At the time it was the new kid on the block but showed incredible potential, with a very simplified configuration approach, fast database build times and reasonably modest resource usage. By tripwire standards it was exactly what I was looking for: simple and fast. AIDE has since grown up a bit; many of the small issues I used to have with it are now fixed, and it is available in the package management for most major distributions including FreeBSD, Ubuntu, Fedora & RHEL (CentOS).

The configuration and deployment scenario we are going to look at today is one that is suitable for web and application servers but can really be applied to just about any system. We are going to sacrifice a few monitoring attributes on files in the name of performance and usability, while still maintaining a complete picture of added, deleted and modified files. So, let’s jump right in.

The first task is to install AIDE. For the purpose of this article I am assuming you are using Fedora or a RHEL based OS (e.g. CentOS); otherwise, please refer to your distribution's package management, or download and compile the sources if a binary version is not available for you.


# yum install -y aide


The default binary installation places the configuration at /etc/aide.conf, the executable at /usr/sbin/aide and the databases at /var/lib/aide/. The important part is obviously the configuration file, so let's get a handle on that first. The configuration defaults are a little loud and intensive, and in my opinion will overwhelm anyone who has never used hIDS before; even for me the defaults were just too much. That said, we are going to back up the default configuration for reference purposes and download my own custom aide.conf:


# cp /etc/aide.conf /etc/aide.conf.default
# wget -O /etc/aide.conf
# chmod 600 /etc/aide.conf


This configuration was created for a WHM/cPanel server. It is generalized in nature and can apply to almost any server, but it will require modification to keep noise to a minimum. I want to stress that point about noise: hIDS reports can get very loud if you do not tune them, and that can lead to them being ignored as a nuisance, but more on that later. Let's take a look at the configuration file we just downloaded and I will attempt to break it down for you section by section:

# nano -w /etc/aide.conf
( or your preferred editor *ahem vi* )

The first 10 or so lines of the file declare the output and database paths for AIDE and should not be edited. The first parts we want to look at follow:

# Whether to gzip the output to database

# Verbose level of message output - Default 5

These options speak for themselves. Do we want to gzip the output databases? No, we do not, as the management script that we will run from cron (covered later) takes care of that for us. Next is the verbosity level (0-255, less to more), which defaults to 5. The verbosity is fine left at the default. You can lower it to 2 if you want strictly added/deleted/modified info in the reports with NO EXTENDED information on which attributes were modified on files (e.g. user, group, permissions, size, md5), which may suit a very simplified change management policy. If set to 20, reports become exceedingly detailed in item-by-item change information and can grow massive, so I recommend leaving it at the default of 5 for the best balance of detail and noise reduction.

Next the configuration file lists, in comments, the supported attributes that can be monitored on files and paths and then our default monitoring rules of what attributes we will actually use; this list shows the depth of AIDE and should be reviewed in brief for at least a fundamental understanding of what you are working with:

# These are the default rules.
#p:     permissions
#i:     inode:
#n:     number of links
#u:     user
#g:     group
#s:     size
#b:     block count
#m:     mtime
#a:     atime
#c:     ctime
#S:     check for growing size
#md5:    md5 checksum
#sha1:   sha1 checksum
#rmd160: rmd160 checksum
#tiger:  tiger checksum
#haval:  haval checksum
#gost:   gost checksum
#crc32:  crc32 checksum
#E:     Empty group
#>:     Growing logfile p+u+g+i+n+S

# You can create custom rules like this.

LOG = p+u+g
DIR = p+u+g+md5

The important parts here that we will be using, as can be seen from the custom rules, are p, u, g, s and md5, for permissions, user, group, size and md5 hashes. How does this work in our interest? Permissions, user and group are fundamentals we always want to be notified of changes to, as those are attributes that should never change without an administrator doing so intentionally (e.g. /etc/shadow gets set 666). Then there are size and md5, which tell us that a file's content has been modified. We are not specifically tracking mtime (modified time); it is not strictly needed, since md5 will tell us when even a single bit has changed in a file, and mtime is an easily forged attribute (although feel free to add m to the R= list for mtime tracking if you desire it).
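To make the idea concrete, here is a rough sketch of what tracking p+u+g+s+md5 amounts to. It is only an illustration built from stat and md5sum (GNU coreutils syntax, as found on Fedora/RHEL); it is not how AIDE stores or compares data internally, and the fingerprint function name is made up for the example.

```shell
#!/bin/sh
# Illustration only: NOT AIDE's internal format.
# fingerprint FILE prints: permissions user group size md5
fingerprint() {
    # stat -c is GNU coreutils syntax
    printf '%s %s\n' "$(stat -c '%a %U %G %s' "$1")" \
                     "$(md5sum < "$1" | cut -d' ' -f1)"
}

demo=$(mktemp)
echo "root:x:0:0" > "$demo"
chmod 600 "$demo"
before=$(fingerprint "$demo")

# The "/etc/shadow gets set 666" scenario: only the permission field moves.
chmod 666 "$demo"
after_chmod=$(fingerprint "$demo")

# Same length, different content: size stays put, md5 gives it away.
echo "toor:x:0:0" > "$demo"
after_edit=$(fingerprint "$demo")

printf 'before:      %s\n' "$before"
printf 'after chmod: %s\n' "$after_chmod"
printf 'after edit:  %s\n' "$after_edit"
rm -f "$demo"
```

The last step is the reason size alone is not enough: a same-length edit leaves the size field untouched and only the hash changes.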

Then we have the paths to be monitored. You’ll note we are not monitoring the top level ‘/’ itself but a specific list of paths instead. Although you can monitor from the top level, it is not recommended on very large servers. If you do choose to monitor from the top level, be sure to add ‘!/home’ and other heavily modified user paths to your ignore list (covered next), especially in a shared hosting environment. Keep in mind, this is not about monitoring every single user level change but rather integrity at the system (root) or critical application/content level.

/etc    NORMAL
/boot   NORMAL
/bin    NORMAL
/sbin   NORMAL
/lib    NORMAL
/opt    NORMAL
/usr    NORMAL
/root   NORMAL
/var    NORMAL
/var/log      LOG

## monitoring /home can create excessive run-time delays
# /home   DIR

As mentioned above, monitoring /home is not the best of ideas, especially on larger servers with hundreds of users. The exception to this rule is smaller servers that are task oriented towards mission critical sites or applications. In these situations, such as my employer's web server and even my own, which have no other task than to host a few sites, monitoring /home can be invaluable in detecting intrusions into your web site and web applications. This is especially true if you run billing, support forums, help desks and similar web applications on a single server dedicated to your business's corporate web presence. So, the takeaway here is: monitor /home sparingly and evaluate it on a case-by-case basis.
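One rough way to do that case-by-case evaluation is to compare how many files sit under a candidate path against one you already monitor, since AIDE has to read and hash every one of them. This is just a sketch; count_files is a name I made up, and file count is only a proxy for run time.

```shell
#!/bin/sh
# Count regular files under a path, staying on one filesystem (-xdev).
count_files() {
    find "$1" -xdev -type f 2>/dev/null | wc -l | tr -d ' '
}

# Compare a path you already monitor with one you are unsure about.
for path in /etc /home; do
    if [ -d "$path" ]; then
        printf '%-8s %s files\n' "$path" "$(count_files "$path")"
    fi
done
```

If /home turns out to hold orders of magnitude more files than everything else combined, that is a good hint it belongs in the ignore list on that server.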

Now, on to our ignore list, which is as simple as it gets: any paths that are not subject to monitoring for whatever reason, be it that they are too heavily modified or just administratively not suitable to be reported on.


Generally speaking, you want to limit the paths ignored, as every ignored path is a potential place for an attacker to store malicious software. That said, we are trying to strike a balance: reports that alert us to intrusions while still being short enough to be regularly reviewed. The important thing to remember is that although an attacker can hide content in these ignored paths, to effectively compromise or backdoor a server the attacker needs to replace and modify a broad set of binaries and logs on the server, which will stand out clearly in our reports. Nevertheless, remove any paths from the ignore list that do not apply to your environment, or add to it as appropriate.
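When tuning the ignore list it helps to remember that, as I understand the aide.conf format, each rule is a regular expression matched from the start of the path, so ‘!/home’ also covers anything else that begins with /home. The sketch below imitates that matching with grep so you can try patterns before putting them in the configuration; the two patterns and the is_ignored helper are hypothetical examples, not the contents of my ignore list, and grep -E is only an approximation of AIDE's own regex engine.

```shell
#!/bin/sh
# Hypothetical patterns, written as they would appear after "!" in aide.conf.
IGNORE_PATTERNS='/home
/var/log/.*\.gz$'

# is_ignored PATH: succeeds if any pattern matches from the start of PATH.
is_ignored() {
    while IFS= read -r pat; do
        [ -n "$pat" ] || continue
        if printf '%s\n' "$1" | grep -Eq "^$pat"; then
            return 0
        fi
    done <<EOF
$IGNORE_PATTERNS
EOF
    return 1
}

for p in /home/bob/public_html/index.php /homework/notes /etc/passwd \
         /var/log/messages /var/log/messages.1.gz; do
    if is_ignored "$p"; then
        echo "ignored:   $p"
    else
        echo "monitored: $p"
    fi
done
```

Note how /homework/notes is caught by the /home pattern; a broad prefix ignores more than its name suggests, which is exactly the kind of blind spot to keep small.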

That’s it for the configuration side of AIDE. Hopefully you found it straightforward and not too overwhelming; if you did, then google tripwire and you’ll thank me later 😉

The next part of our AIDE installation is the management and reporting component. The approach we will be taking is using a management script executed through cron daily, weekly or monthly to perform maintenance tasks and generate reports, which can optionally be emailed. The maintenance consists of compressing and rotating our old AIDE databases and logs to time stamped backups along with deleting data that has aged past a certain point.
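To give a feel for the rotation step, here is a minimal sketch of moving a file aside under a time stamped name and compressing it. It is not the cron script from this article, just an illustration of the idea; the rotate_aide_file name is made up, and the demo runs against a throwaway directory rather than /var/lib/aide.

```shell
#!/bin/sh
# rotate_aide_file FILE: rename FILE to FILE.YYYYMMDD-HHMMSS, gzip it,
# and print the name of the resulting backup.
rotate_aide_file() {
    [ -f "$1" ] || return 1
    stamp=$(date +%Y%m%d-%H%M%S)
    mv "$1" "$1.$stamp" && gzip "$1.$stamp" && echo "$1.$stamp.gz"
}

# Demo in a scratch directory so nothing real is touched.
scratch=$(mktemp -d)
echo "example report" > "$scratch/aide.log"
backup=$(rotate_aide_file "$scratch/aide.log")
echo "rotated to: $backup"
ls -l "$scratch"
rm -rf "$scratch"
```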

# wget -O /etc/cron.weekly/aide
# chmod 755 /etc/cron.weekly/aide

The default for this article is to run AIDE on a weekly basis. This is what I recommend, as I have found that daily runs create too many reports, which become a burden to check, and monthly runs create reports that are far too large and noisy; weekly strikes the right balance in report size and frequency. The cron script has two variables in it that can be modified, for the email address and the max age of databases/logs, so go ahead and open /etc/cron.weekly/aide with your preferred editor and modify them as you see fit.

# email address for reports

# max age of logs and databases in hours
# default 2160 = 90 days

The e-mail address variable can be left blank to not send any emails. If you choose this, reports can be viewed manually at /var/lib/aide/aide.log and are rotated into time stamped backups after each execution (e.g. aide.log.20110315-162841). The maxage variable, in hours, is the age at which AIDE logs and databases will be deleted; I think 90 days is a reasonable length of retention. However, I strongly recommend, for a number of reasons, that you make sure /var/lib/aide is included in your remote backups, so that if you ever need to, you can pull in older databases for compromise or change analysis across a wider time range than the default last-execution comparison reports.
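For reference, 2160 hours divided by 24 is the 90 days mentioned above. An age-based cleanup of that kind can be sketched with find, which counts in minutes, so the hours get multiplied by 60. This is an illustration under my own assumptions rather than the cron script itself: purge_old is a made-up name, and the name pattern assumes backups carry a numeric time stamp suffix as in the example file name above (-maxdepth and -delete are GNU find options).

```shell
#!/bin/sh
# purge_old DIR HOURS: delete time stamped aide.* backups in DIR whose
# modification time is more than HOURS hours in the past.
purge_old() {
    # 'aide.*.[0-9]*' only matches names with a ".<digit>..." suffix, so the
    # live aide.log and database files are never candidates.
    find "$1" -maxdepth 1 -type f -name 'aide.*.[0-9]*' \
         -mmin "+$(( $2 * 60 ))" -print -delete
}

maxage=2160
echo "$maxage hours = $(( maxage / 24 )) days"

# Demo in a scratch directory; touch -d is GNU touch syntax.
scratch=$(mktemp -d)
touch -d '100 days ago' "$scratch/aide.log.20110315-162841"
touch "$scratch/aide.log.20110620-010101" "$scratch/aide.log"
purge_old "$scratch" "$maxage"
ls "$scratch"
rm -rf "$scratch"
```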

Although it is not needed, you can go ahead and give the cron job a first run, or simply wait till the end of the week. Let’s assume you're like me though and want to play with your new toy 🙂 We will run it through the time command so you can get an idea of how long execution will take in your environment. It might also be a good idea to open a second console and run top to see what the resource hit is like for you. It is typically all CPU, but the script runs AIDE at nice 19, which is the lowest system priority, meaning other processes can use the CPU before AIDE if they request it.
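If you want to see what nice 19 looks like for yourself, the following small sketch wraps any command at that priority. The run_low_priority wrapper is a hypothetical helper for illustration, not something taken from the cron script.

```shell
#!/bin/sh
# Run any command at the lowest CPU scheduling priority.
run_low_priority() {
    nice -n 19 "$@"
}

# `nice` with no arguments prints the niceness it is running at, so this
# shows the value the wrapped command actually gets.
run_low_priority nice
```

The same wrapper could be put in front of a manual AIDE run when you do not want it competing with production workloads.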


# time sh /etc/cron.weekly/aide


Let it run; it may take anywhere from 10 to 60 minutes depending on the server's specs and amount of data. For very large servers, especially if you choose to monitor /home, do not be surprised by run times beyond 60 minutes. Once it has completed, check your email or the /var/lib/aide/aide.log file for your first report, and that’s it, you are all set.

Two small warnings about report output. The first is that when you perform software updates, or your control panel (e.g. WHM/cPanel) does so automatically, you can obviously expect a very loud report to be generated. You can optionally force the database to regenerate when you run server updates by executing ‘/usr/sbin/aide --init’, and this will keep the next report nice and clean. The second warning is that sometimes the first report can be exceedingly noisy with all kinds of attribute warnings. If this happens, give the cron script (/etc/cron.weekly/aide) a second run and you should receive a nice clean report free of warnings and noise.
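One detail worth knowing about regenerating by hand: AIDE writes a freshly initialized database to its configured output path, and the next check only benefits once that file is in place as the database AIDE reads. If your setup does not already handle that, the step can be sketched as below. The promote_db helper and the two file names in the usage comment are assumptions for illustration; use whatever database paths your aide.conf actually declares.

```shell
#!/bin/sh
# promote_db NEW CURRENT: put a freshly built database in place of the one
# the next check will read. Refuses to act if NEW is missing or empty.
promote_db() {
    if [ ! -s "$1" ]; then
        echo "no new database at $1" >&2
        return 1
    fi
    mv -f "$1" "$2"
}

# After a planned update (as root), something along these lines; the paths
# are assumed and must match your aide.conf:
#   /usr/sbin/aide --init && \
#       promote_db /var/lib/aide/aide.db.new /var/lib/aide/aide.db
```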

For convenience, I have also made a small installer script that will take care of everything above in my defaults and install AIDE/cron script for you, suitable for use on additional servers after you’ve run through this on your first server.

# wget
# sh install_aide "you@example.com"

I hope AIDE proves to be as useful for you as it has been for me. hIDS is a critical component in any security and management policy, and you should take the time to tweak the configuration for your specific environment. If you find the reports are too noisy, then please ignore the paths that are problematic before you ditch AIDE; if you give AIDE a chance it will be good to you, and one day it may very well save you in a compromise or administrative “oops” situation.