R-fx Networks

My Blog

Data Integrity: AIDE for Host Based Intrusion Detection

by on Mar.17, 2011, under HowTo, My Blog

It used to be all the talk, everyone knew it, accepted it but few did anything about it and still even today, very few do anything about it. What is it? Data Integrity. But it is not in the form of how we usually look at data integrity; it is not backups, raid management or similar — it is host based intrusion detection.

What is host based intrusion detection (hIDS)? In it simplest form it is basically the monitoring of a file system for added, deleted or modified content, for the purpose of intrusion detection and (post) compromise forensic analysis. At one time hIDS was a very popular topic with allot of emphasis pushed on it from the security community and although it still is an area of religious focus for some, it is generally a very under utilized part of a well rounded security and management policy. Note how I said management policy there also, as hIDS is not just about intrusion detection but can also play a vital role in day-to-day operations of any organization by providing “change monitoring” capabilities. This can play out in many scenarios but the simplest being that it allows you to track changes to file systems made through regular administration tasks such as software installations, updates or more importantly administrative mistakes. Though the topic of change monitoring can be a whole article in of itself, hIDS to me is vitally important in both respects as an intrusion detection AND change monitoring resource.

I can not beat around the fact that even myself, over the years, have let hIDS fall to the wayside, I used to be the biggest fan of tripwire and would use it on everything. However, over time tripwire became a time consuming, bloated and difficult tool to manage, it is also tediously slow and would cause very undesirable loads on larger systems. This made for hIDS falling out of my regular security and management habits which in turn had a way about sneaking up on me and biting me in the butt whenever a system got compromised or an administrator would make a “oopsy” on a server.

A few years back I experimented with a tool called AIDE (advanced intrusion detection environment), at the time it was the new kid on the block but showed incredible potential with a very simplified configuration approach, fast database build times and reasonably modest resources usage — by tripwire standards it was exactly what I was looking for, simple and fast. AIDE has since grown up a bit, many of the small issues I used to have with it are now fixed and it is now available in the package management for most major distributions including FreeBSD, Ubuntu, Fedora & RHEL (CentOS).

The configuration and deployment scenario we are going to look at today is one that is suitable for web and application servers but really can be broadly applied to just about any system. We are going to slightly sacrifice some monitoring attributes from files on the system in the name of increasing performance and usability while maintaining a complete picture of added, deleted and modified files. So, let’s jump right on in….

The first task is we need to install AIDE, for the purpose of this article I am assuming you are using Fedora or an RHEL based OS (i.e: CentOS), so please refer to your distributions package management or download and compile the sources at http://aide.sourceforge.net/ if a binary version is not available for you.

# yum install -y aide

The binary default installation paths for AIDE place the configuration at /etc/aide.conf , executable at /usr/sbin/aide and databases at /var/lib/aide/. The obviously important part being the configuration file so lets get a handle on that for the moment. The configuration defaults are a little loud, intensive and in my opinion will overwhelm anyone who has never used hIDS before; even for myself the defaults were just too much. That said, we are going to backup the default configuration for reference purposes and download my own custom aide.conf:

# cp /etc/aide.conf /etc/aide.conf.default
# wget http://www.rfxn.com/downloads/aide.conf -O /etc/aide.conf
# chmod 600 /etc/aide.conf

This configuration was created for a WHM/Cpanel server, it is however generalized in nature and can apply to almost any server but will require modification to keep noise to a minimum. Now I stress that fact, noise — hIDS reports can get very loud if you do not tune them and that can lead to them being ignored as a nuisance but more on that later. Lets take a look at the configuration file we just downloaded and I will attempt to break it down for you by each section:

# nano -w /etc/aide.conf
( or your preferred editor *ahem vi* )

The first 10 or so lines of the file declare the output and database paths for AIDE, they should not be edited, the first parts we want to look at follow:

# Whether to gzip the output to database

# Verbose level of message output - Default 5

These options speak for themselves; do we want to gzip the output databases? No, we do not as our management script that we will run from cron and look at later is going to take care of that for us. Next is is the verbosity level (0-255 — less to more) which defaults at 5. The verbosity is fine left at the default, you can lower it to 2 if you want strictly add/delete/modified info in the reports with NO EXTENDED information on what attributes were modified on files (i.e: user, group, permissions, size, md5) — suitable maybe for a very simplified change management policy. If set to 20 then reports will be exceedingly detailed in item-by-item change information and reports can become massive — so I recommend leaving it at the default of 5 for the best balance of detail and noise reduction.

Next the configuration file lists, in comments, the supported attributes that can be monitored on files and paths and then our default monitoring rules of what attributes we will actually use; this list shows the depth of AIDE and should be reviewed in brief for at least a fundamental understanding of what you are working with:

# These are the default rules.
#p:     permissions
#i:     inode:
#n:     number of links
#u:     user
#g:     group
#s:     size
#b:     block count
#m:     mtime
#a:     atime
#c:     ctime
#S:     check for growing size
#md5:    md5 checksum
#sha1:   sha1 checksum
#rmd160: rmd160 checksum
#tiger:  tiger checksum
#haval:  haval checksum
#gost:   gost checksum
#crc32:  crc32 checksum
#E:     Empty group
#>:     Growing logfile p+u+g+i+n+S

# You can create custom rules like this.

LOG = p+u+g
DIR = p+u+g+md5

The important parts here that we will be using, and can be seen from the custom rules, are p,u,g,s,md5 for permissions, user, group, size and md5 hashes. How does this work in our interest? The basics of permission, user, and group are fundamentals we would always want to be notified of changes on, as really, those are attributes that shouldn’t ever change without an administrator doing so intentionally (i.e: /etc/shadow gets set 666). Then there is size and md5 which will tell us that a file has been modified, though we are not specifically tracking mtime (modified time), it is not strictly needed as md5 will tell us when even a single bit has changed in a file and mtime is an easily forged attribute (although feel free to add m to the R= list for mtime tracking if you desire it).

Then we have the paths to be monitored which you’ll note we are not monitoring on the top level ‘/’ itself but instead a specific list. Although you can monitor from the top level, it is not recommended on very large servers, if you do choose to monitor from the top level then be sure to add ‘!/home’ and other heavily modified user paths into your ignore list (covered next), especially if you have a shared hosted environment. Keep in mind, this is not about monitoring every single user level change but rather the integrity at the system (root) or critical application/content level.

/etc    NORMAL
/boot   NORMAL
/bin    NORMAL
/sbin   NORMAL
/lib    NORMAL
/opt    NORMAL
/usr    NORMAL
/root   NORMAL
/var    NORMAL
/var/log      LOG

## monitoring /home can create excessive run-time delays
# /home   DIR

As mentioned above, monitoring of /home is not the best of ideas, especially on larger servers with hundreds of users. The exception to this rule is smaller servers that are task oriented towards mission critical sites or applications. In these situations, such as my employers and even my own web server that have no other task than to host a few sites, monitoring of /home can be invaluable in detecting intrusions in your web site and web applications. This is especially true if you run billing, support forums, help desks and similar web applications on a single server dedicated to your businesses corporate web presence. So, the take away here is — monitor /home sparingly and evaluate it on a case-by-case basis.

Now, onto our ignore list which is as simple as it gets — any paths that are not subject to monitoring for whatever reason, be it too heavily modified or just administratively not suitable to be reported on.


Generally speaking, you do want to limit the paths ignored as every ignored path is a potential area that an attacker can store malicious software. That said though, we are trying to strike a balance in our reports that alert us to intrusions while still being reasonable enough in length to be regularly reviewed. The important thing to remember is although an attacker can hide content in these ignored paths, to effectively compromise or backdoor a server, the attacker needs to replace and modify a broad set of binaries and logs on the server, which will stand out clearly in our reports. Nevertheless, remove any paths from the ignore list that may not apply to your environment or add too it as appropriate.

That’s it for the configuration side of AIDE, hopefully you found it straight forward and not too overwhelming, if you did then google tripwire and you’ll thank me later πŸ˜‰

The next part of our AIDE installation is the management and reporting component. The approach we will be taking is using a management script executed through cron daily, weekly or monthly to perform maintenance tasks and generate reports, which can optionally be emailed. The maintenance consists of compressing and rotating our old AIDE databases and logs to time stamped backups along with deleting data that has aged past a certain point.

# wget http://rfxn.com/downloads/cron.aide -O /etc/cron.weekly/aide
# chmod 755 /etc/cron.weekly/aide

The default for this article will be to run AIDE on a weekly basis, this is what I recommend as I have found that daily creates too many reports that become a burden to check and monthly creates reports that are far too large and noisy — weekly strikes the right balance in report size and frequency. The cron has two variables in it that can be modified for email and max age of databases/logs, so go ahead and open /etc/cron.weekly/aide with your preferred editor and modify them as you see fit.

# email address for reports

# max age of logs and databases in hours
# default 2160 = 90 days

The e-mail address variable can be left blank to not send any emails, if you choose this then reports can be manually viewed at /var/lib/aide/aide.log and are rotated into time stamped backups after each execution (i.e: aide.log.20110315-162841). The maxage variable, in hours, is the frequency at which aide logs and databases will be deleted, which I think 90 days is a reasonable length of retention. However, I strongly recommend for a number of reasons that you make sure /var/lib/aide is included in your remote backups so that if you ever need it, you can pull in older databases for compromise or change analysis across a wider time range than the default last-execution comparison reports.

Although it is not needed, you can go ahead and give the cron job a first run, or simply wait till the end of the week. Let’s assume your like me though and want to play with your new toy πŸ™‚ We will run it through the time command so you can get an idea of how long execution will take in your environment, might also be a good idea to open a second console and top it to see what the resource hit is like for you — typically all CPU but the script runs AIDE as nice 19 which is the lowest system priority meaning other processes can use CPU before AIDE if they request it.

# time sh /etc/cron.weekly/aide

Let it run, it may take anywhere from 10 to 60 minutes depending on the servers specs and amount of data, for very large servers, especially if you choose to monitor /home, do not be surprised with run times beyond 60 minutes. Once completed check your email or the /var/lib/aide/aide.log file for your first report and that’s it, you are all set.

Two small warnings about report output, the first is that when you perform software updates or your control panel (i.e: WHM/Cpanel) does so automatically, you can obviously expect to see a very loud report generated. You can optionally force the database to regenerate when you run server updates by executing ‘/usr/sbin/aide –init’ and this will keep the next report nice and clean. The second warning is that sometimes the first report can be exceedingly noisy with all kinds of attribute warnings, if this happens give the cron script (/etc/cron.weekly/aide) a second run and you should receive a nice clean report free of warnings and noise.

For convenience, I have also made a small installer script that will take care of everything above in my defaults and install AIDE/cron script for you, suitable for use on additional servers after you’ve run through this on your first server.

# wget http://www.rfxn.com/downloads/install_aide
# sh install_aide "you@domain.com"

I hope AIDE proves to be as useful for you as it has been for me, hIDS is a critical component in any security and management policy and you should take the time to tweak the configuration for your specific environment. If you find the reports are too noisy then please ignore paths that are problematic before you ditch AIDE; if you give AIDE a chance it will be good to you and one day it may very well save you in a compromise or administrative “oops” situation.

5 Comments :, more...

LMD 1.3.9: Quietly Awesome

by on Mar.16, 2011, under Development, My Blog

It has been a busy couple of weeks for the LMD project, lots of late nights and sleepless days behind me and I can say I am a ‘little’ happier with where things are in the project now πŸ™‚

This release has no major feature changes or additions other than a modification in the default hexdepth that is used to scan malware; increased from 15,736 to 61,440 (1024*60). This enables LMD to better detect threats that it was having a little difficulty with due to the byte size of some malware. At the moment there is no byte-offset feature that would allow us to create more targeted hex signatures, which kind of fly’s in the face of my goal of improving performance — but it is what it is and for the moment things are O.K. With the new hex-depth value, the miss rate on valid malware that rules already exist for is below 1% and I can live with that till I put together a new scanner engine/logic. You can apply this update to your installations by using the -d|–update-ver flags.

In light of performance concerns with the current incarnation of LMD, I felt it prudent, and also has been requested, to create a set of signatures compatible with ClamAV. This allows those who wish to do so, to leverage the LMD signature set with ClamAV’s very impressive scanner performance. Although there are performance concerns with LMD on large file sets, it does need to be said that for day-to-day operations of the cron initiated or inotify real-time scans, there is little to no performance issues. This is strictly a situation where if you choose to scan entire home directory trees, where file lists exceed tens or hundreds of thousands, you now have an option to take advantage of our signatures within a scanner engine that can handle those large file sets.

The LMD converted ClamAV signatures ship as part of the current release of LMD and are stored at /usr/local/maldetect/sigs/ and are named rfxn.ndb and rfxn.hdb. The ClamAV signatures will not be updated with the usage of -u|–update but rather are static files placed inside the release package when it is rebuilt nightly with new signatures. As such, the latest versions of the rfxn.ndb|hdb files are always available at the following URL’s (these are updated whenever LMD base signature are updated — typically daily):

To make use of these signatures in ClamAV using the clamscan command, you would run a command similar to the following:
clamscan -d /usr/local/maldetect/sigs/rfxn.ndb -d /usr/local/maldetect/sigs/rfxn.hdb -d /var/clamav/ –infected -r /path/to/scan

The -d options specify the virus databases to use, when you use the -d option it will exclusively use those databases for virus scans and ignore the default ClamAV virus database, so we redefine -d /var/clamav/ to also have all of the default ClamAV signatures included in our scans. The –infected option will only display those files that are found to be infected and the -r option is for recursive scanning (descend directory tree’s).

That all said, the real guts of changes recently have been in the signatures themselves, we have in the last two weeks went from 6,083 to 6,769 signatures, an increase of 686 — one of the largest updates to signatures to date for a single month. A great deal of these signatures have come from the submissions queue which ended up in such a backlogged state that there was 2,317 items pending review. Currently I have managed to get the queue cut down to 1,381 and I have committed myself to eliminating the backlog by the end of the month or shortly there after (ya we’ve heard that before right ?). It should be understood that reviewing the submissions queue is an exceedingly tedious task as every file needs to be manually reviewed, assessed on its merits as malware (or deleted), then ultimately hex match patterns created, classify the malware, hash it and insert it into the signatures list. As painful of a process as it is at times, I do enjoy it, it is just very time consuming so please show patience when you submit malware with the checkout option.

Finally, I have allot in store for LMD going forward, as always time is the biggest factor but be assured the project will continue to grow and improve in the best interest of detect malware on your servers. Likewise, for any who have followed previous blogs, the dailythreats website is still a work in progress and will compliment the project once it is released, in the very near future. As always, thank you to everyone who uses my work and please consider a donation whenever possible.

Did You Know? Stats:
Active Installations (Unique IP daily update queries): 5,903
Total Downloads (Project to date): 32,749
Total Malware Signatures: 6,769
rfxn.com Malware Repository: 21,569 files / 1.6G data
Tracked Malware URL’s: 18,304
Tracked IRC C&C Botnets: 421

Leave a Comment :, more...

Happy Birthday APF: 8 Years Strong

by on Mar.09, 2011, under My Blog

On this day eight years ago, Advanced Policy Firewall (APF) version 0.5 for Linux was publicly released. Since then, APF has stood the test of time and still remains to this day, one of the most widely used Linux firewall solutions, with especially high usage in the web hosting industry.

I was 18 years old when APF first met the world, I was a very different person back then and so to was the web hosting industry. There was but a handful of dedicated server providers, it was a time when Cobalt RAQ’s still dominated a large part of the leased server market and white-box leased servers were quickly starting to pickup momentum from providers such as rackshack.net. As every other person tried to start a web hosting business, it quickly became clear on many industry forums and websites that there lacked an easy-to-use firewall solution that was also comprehensive. This is where APF came in, it gave to the masses a simple, usable and stable firewall suite for servers that were typically managed by individuals with very little to no experience in Linux ipchains and later iptables firewalling.

The project has seen in excess of 400,000 downloads to date or 55% of all downloads in the last 10 years on rfxn.com, the latest versions which retrieves data daily from rfxn.com reports that there is over 24,000 active APF installations with these features enabled, countless thousands more with them disabled or legacy releases, and countless more applications ship with APF integrated. In excess of 1,700 separate corporate, governmental and organizational networks report using APF (through ASN tracking) and roughly 260,000 web sites are directly protected by APF (through domainsbyip.com tracking). Any google search on APF or related terms quickly brings up tens of thousands of references to the project in an assortment of installation , usage and best practice guides.

It is clear APF is a force, it is here to stay, sure there is much that can be improved upon it and that will come with time but for now, let us acknowledge that this has been a good 8 years for APF and here is to many more, happy birthday APF and Long live open source software πŸ™‚

7 Comments : more...

Raid Management: Know Whats Really Going On

by on Feb.14, 2011, under HowTo, My Blog

In today’s hosting environment it is common place for servers to have hardware based raid cards but what is not common place is having a reliable method for checking the status of the raid arrays. Few would question the value to data integrity by making use of raid technology but very few organizations and businesses implement the tools required to proactively maintain raid arrays, they simply hope for a DC tech to hear a raid alarm and assume the technician will handle the failure. The reality is very different, data centers are loud and increasingly server-dense so hearing a raid alarm let alone pin-pointing the server with the alarm going off, is a daunting task. I remember more than a few times where I found myself with a paper towel tube to my ear listening server to server to try find that troubled box with the annoying alarm going off. This is not how servers should be managed.

As server administrators or web host operators, it is your responsibility, your duty, to have tools in place that can proactively monitor the status of raid arrays and alert you when an array becomes degraded. That way you can have a paper trail of sorts when something has went wrong, submit a ticket to your data center technicians and have the situation corrected before a degraded array from a single disk failure turns into a multi-disk failure and failed array with data loss.

I have created and been using for sometime a script that can query the status of raid controllers from Areca, 3Ware & MegaRaid. The MegaRaid support is mostly intended for Dell PowerEdge PERC cards, however it should work for most MegaRaid based controllers, ymmv though.

The principle is very simple, the package contains the proprietary command line tools from Areca, 3Ware & MegaRaid that can query the status of respective controllers and then an accompanied ‘check’ script handles determining what raid controller is on the system and then runs the appropriate tool in order to get the raid status and if it is degraded or in any state other than a consistent one, it will dispatch an alert to a configured e-mail address.

Download and extract the package:

# wget http://www.rfxn.com/downloads/raid_check_pub.tar.gz
# tar xvfz raid_check_pub.tar.gz

The package will extract to raid_check/, you should place this under /root/ as the check script expects to be run from /root/raid_check/. If you wish to change the path then please modify ‘raid_check/check’.

With the package now setup under /root/raid_check/ you need to modify the ‘raid_check/check’ script to set an email address that alerts are going to get sent too. Once this is done you should symlink the check script to cron.daily so that raid failures will be picked up on daily cron runs, you may change this to cron.hourly if so desired.

# ln -s /root/raid_check/check /etc/cron.daily/raid_check

That’s it, you can give the check script a run to see if things are working. If there is a failure or inconsistency detected then it will be shown on console in addition to the email alert being sent. If everything is OK and there is no issues detected, then no output will be presented.

# sh /root/raid_check/check

Tip: You can check if your server has a raid card by running the following command:

# cat /proc/scsi/scsi  | grep Vendor

If you see Vendors listed as ATA followed by hard drive model names (i.e: WDC, HD etc…) then your servers disks are directly connected and there is no raid controller present. If on the other hand you see vendor names such as Areca, AMC, 3ware, or MegaRaid then you have a hardware raid controller.

3 Comments :, , more...

Donations: By The Numbers

by on Nov.18, 2010, under My Blog

I was recently asked by someone about the donations that rfxn.com receives, more specifically what it amounts to. In the interest of answering this person and anyone else who may be curious, I thought I would put together a small post about it.

Firstly, what needs to be said is that although the projects have been active for nearly 8 years and will pass 700,000 downloads sometime in January 2011, I have only been accepting donations since late January of 2006. Around this time, is when rfxn.com had a shift from a for-profit managed services provider which offered our projects as a contribution back to the community, to simply a community site dedicated to the projects as I moved into full-time employment.

In that time, almost 5 years, there has been less than 100 donations to the projects (73 to be exact) or an average of 14 donations per year or little more than 1 per month. The average donation amount is $35 and there are typically only two donations per year totaling more than $100. The yearly donation average is $521 and monthly average is $43. There has only been 3 repeat donors and they donate on average of once every 18 months. The sum of all donations to date since January of 2006 is $2,608 USD, with $73 of that going towards transaction fees.

The context of this is that there are currently 10 maintained projects, comprising 23,706 lines of code, of which the projects receive an average of 8,000 downloads per month. There has been a little over 690k downloads to date and of the projects that access rfxn.com servers regularly we can derive that there is currently at least 24,629 unique IP addresses (servers) running one or more projects (only 3 projects regularly access data on rfxn.com post-install, so that figure is an order of magnitude larger relative to total downloads).

I have recently added a donation roll page that lists every donation to rfxn.com projects to date and a widget on all pages that displays the 5 most recent donations. This is an effort to further acknowledge the minority of individuals that contribute financially to the projects and to be completely transparent about donations. As the above clearly shows, the donations are not a means for survival of the projects let alone a means for me personally to survive, the projects are developed in my spare time between my full-time job and life. This spare-time nature to maintaining the projects has over the years resulted in a fair share of less-than-constructive comments about the projects and frequency of updates, although some of these comments do have merit, in general the projects are mature, stable and have stood the test of time to prove they are relevant just as much today as they were yesterday.

I hope this post has helped to better illustrate the personal commitment I have towards the projects, what (financial) incentives exist surrounding the projects and to provide a clear and transparent account of donations.

2 Comments more...

LMD: One Year Later

by on Nov.08, 2010, under Development, My Blog

With my move back to Canada behind me and adjusting to some new routines with life, its about time to get back into the mix with the projects. Though things have been slow the last couple of months, it has not stopped me from making sure regular and prompt malware updates are released.

Today, we reflect on the first year of Linux Malware Detect, which was released in a very infantile beta release about a year ago. The project has evolved in allot of ways from its original goals, it has certainly changed in every way for the better. What was originally to be a closed project, relegated to mostly internal work related needs, ended up like most of my projects morphing into a public release. The first release saw the world with less than 200 signatures, no reliable signature update method, manual upgrade options and very flawed scanning and detection methods (v 0.7<). Now, we sit at version 1.3.6, with 4,813 signatures, a scanning method that though still needs some work, is far superior than what was originally in place, a detection routine based on solid md5 hashes and hex signatures. We have cleaner rules that can clean some nasty injected malware, we got a fully functional quarantine and restore system, reporting system, real-time file based monitoring, integrated signature updater and version updater and a vibrant community of users that regularly submit malware for review. Yes, LMD has grown up! The most grown-up part of LMD has to be how signatures are handled and how the processing of them is almost an entirely automated process now, this was detailed a little more in Signature Updates & Threat Database posted in September. The key part here though, is “almost entirely automated”, everyday that the processing scripts run to bring in new malware, there is always a number of files that cant be processed automatically and these are moved to a manual review queue. With how busy life has been the last couple of months, the review queue has slowly risen to 1,097 files pending review. This queue is at the top of my list for tackling over the next couple of days and weeks, its allot of work to review that much malware but it will get done. Many of the files to review are actual user-submissions so if you did submit something and find its still not detected by LMD, this would be why :).

There is still allot on the to-do list for LMD going forward, with the upcoming release of version 2.0 we will see some changes in how LMD does business. The first and to me the biggest will be optional usage statistics, which will allow users to have LMD report anonymized statistics back to rfxn.com. These statistics will show us which malware hits are found on your servers, which in turn contributes towards better focus on what type of malware threats are prioritized in the daily processing queue for hashing & review. The statistics will also help create informative profiles on the soon-to-be-released dailythreats.org web site about how maldet is used and what are the most prevalent threats in the wild.

Other additions to LMD 2.0 will be a refined scanner that will provide greater speed with large file sets (50k – 1M+ files), an ability to fork scans to the background, better and more predictable logging format for 3rd party processing of LMD log data, redesigned reporting system, full BSD support, ability to create custom signatures from the LMD command line, expanded cleaner rules, wildcard support for exclude paths, a number of security and bug refinements and as always, more signatures.

If you have any feature requests for LMD 2.0, go ahead and post them as a comment and I will make sure they get added to the list. Thank you to everyone who continues to support rfxn.com projects through donations, feedback and by just using & spreading the word about the projects. I look forward to another year of LMD and seeing it become the premier malware detection tool for Linux and all Unix variant OS’s.

7 Comments :, , , more...

Looking for something?

Use the form below to search the site:

Still not finding what you're looking for? Drop a comment on a post or contact us so we can take care of it!

Visit our friends!

A few highly recommended friends...