R-fx Networks

Linux Malware Detection

Posted on Oct. 19, 2009, under Development

[ UPDATE: Linux Malware Detect has been released ]
For the last few weeks I have been working on a new project for malware detection on Linux web servers. A pre-release version is already in use at work, and it has shown phenomenal promise.

Right to it, some background. On a daily basis the network I manage receives a large number of attacks. Most of these are web-based abuses of common web application vulnerabilities, which inject or upload to servers an array of malware such as phishing content, defacement tools, privilege escalation exploits and IRC C&C bots. All of these actions are typically logged by our network edge Snort setup, which got me thinking: if we started to catalog some of the injected malware, I could hash it and then detect it on servers.

Now, some might be thinking: “Network edge IDS? Why not convert it to IPS and stop the attacks right away?” Though this is something I am actually in the process of doing, there is a much larger problem, and that is content encoding. A lot of malware attacks these days come in as base64 and gzip encoded payloads, which Snort, or any other IDS/IPS product for that matter, is currently NOT capable of decoding without fancy transparent proxy setups that are outside the scope of standard network edge intrusion detection/prevention.
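To make the encoding problem concrete, here is a minimal Python sketch. It is an illustration only and not part of the project: the payload and the signature string are made up. It shows that a plain-text signature cannot match a gzip + base64 encoded payload until the payload has been decoded.

```python
import base64
import gzip

# Hypothetical plain-text pattern an IDS rule might look for.
SIGNATURE = b"eval($_POST["


def encode_payload(raw: bytes) -> bytes:
    """Gzip-compress, then base64-encode, as an attacker might before upload."""
    return base64.b64encode(gzip.compress(raw))


def decode_payload(encoded: bytes) -> bytes:
    """Reverse the encoding: base64-decode, then gunzip."""
    return gzip.decompress(base64.b64decode(encoded))


# Made-up example payload.
raw = b"<?php eval($_POST['cmd']); ?>"
wire = encode_payload(raw)

# The signature is invisible in the encoded form seen on the wire...
print("match on wire:   ", SIGNATURE in wire)
# ...and only reappears once the payload has been decoded.
print("match on decoded:", SIGNATURE in decode_payload(wire))
```

The base64 alphabet contains no `(` or `$`, so a byte-level match against the encoded stream can never hit this signature.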

So, this brings us to a host-based solution for malware detection, which as it turns out is not so easy. There are no simple sites that actually track malware specifically targeting web applications, and the malware trackers that do exist focus primarily on Windows-based malware, which is utterly useless here. To address this shortcoming, I have written a set of tools that extract the payload data of attacks from specific IDS events (decoding it if needed) and save or download the content attackers are trying to inject. I review this data for false positives every couple of days and then create MD5 hash definitions of the malware for the detection tool. The hashes are compiled in two ways: the first is a straight MD5 hash of the data, and the second is a set of hashes of “chunked” elements of the data, taken in specific increments and formats, so as to detect commonly occurring malware code in otherwise unique files and content types.
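The two kinds of definitions can be sketched roughly as follows. This is a simplified illustration and not the project's actual code: the 512-byte increment, the function names and the sample data are all assumptions, since the real increments and formats are not given here.

```python
import hashlib

# Assumed chunk increment for illustration; the real value is not stated.
CHUNK_SIZE = 512


def full_md5(data: bytes) -> str:
    """Straight MD5 of the whole content."""
    return hashlib.md5(data).hexdigest()


def chunk_md5s(data: bytes, size: int = CHUNK_SIZE) -> set:
    """MD5 of each full, non-overlapping, fixed-size chunk of the content.

    A trailing chunk shorter than `size` is skipped so that tiny
    fragments do not produce trivial matches.
    """
    return {
        hashlib.md5(data[i:i + size]).hexdigest()
        for i in range(0, len(data) - size + 1, size)
    }


def is_malware(data: bytes, full_defs: set, chunk_defs: set) -> bool:
    """Hit if the whole file matches, or if any chunk matches a known chunk."""
    if full_md5(data) in full_defs:
        return True
    return not chunk_defs.isdisjoint(chunk_md5s(data))


# Made-up sample: a known malware body, and a variant with a unique trailer.
known = b"\x90" * CHUNK_SIZE
variant = known + b"unique trailer"

full_defs = {full_md5(known)}
chunk_defs = chunk_md5s(known)

print(is_malware(variant, full_defs, set()))       # full hash alone misses the variant
print(is_malware(variant, full_defs, chunk_defs))  # chunk hashes catch it
```

Note that fixed-offset chunks like these only match when the shared code lines up on the same chunk boundaries, which is presumably one reason for using specific increments and formats rather than a single fixed size.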

The scanner portion of the malware detection tool comes in three varieties. The first is a standard “scan all” feature, which scans an entire defined path. The second is a “scan recent” feature, which scans a path for content created in the last X days (e.g. /home/*/public_html content created in the last 7 days). The third is a real-time monitoring service that uses the inotify feature of the Linux kernel to detect file create/move/modify operations and scan content immediately as it is created under user web paths (default /home[2]/user/public_html).
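A “scan recent” style file selection could look something like the sketch below. It is only an approximation under stated assumptions: Linux does not reliably expose a file creation time, so the sketch uses the inode change time (st_ctime) as a stand-in, and the function name recent_files is hypothetical.

```python
import glob
import os
import stat
import time


def recent_files(root: str, days: float):
    """Yield regular files under `root` whose inode change time
    falls within the last `days` days."""
    cutoff = time.time() - days * 86400
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)  # lstat: do not follow symlinks
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if stat.S_ISREG(st.st_mode) and st.st_ctime >= cutoff:
                yield path


if __name__ == "__main__":
    # Path pattern and 7-day window are the example values from the text above.
    for root in glob.glob("/home/*/public_html"):
        for path in recent_files(root, days=7):
            print(path)
```

Each path yielded would then be handed to the hash check. Because st_ctime also changes on chmod or rename, this selects slightly more than strictly “created” content, which errs on the side of scanning too much.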

Malware hit management is a very simple anti-virus-like quarantine system that moves offending files to ‘INSTALL_PATH/quarantine/’ and logs the exact source path and the destination file name in the quarantine locker, in case you need to restore any data due to false positives (though this should never happen since we are using hashed detection). In addition, the quarantine function can optionally search the process table for running tasks that contain the file name of the offending malware and kick off a kill -9 against them.
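The move-and-log idea can be sketched in a few lines of Python. This is a hedged illustration, not the tool's implementation: the function names, the tab-separated log format and the random suffix on quarantined file names are all assumptions.

```python
import os
import shutil
import time
import uuid


def quarantine(path: str, quarantine_dir: str, log_path: str) -> str:
    """Move `path` into `quarantine_dir` and record source and destination."""
    os.makedirs(quarantine_dir, mode=0o700, exist_ok=True)
    src = os.path.abspath(path)
    # A random suffix keeps files with the same name from colliding.
    dest = os.path.join(
        quarantine_dir, f"{os.path.basename(src)}.{uuid.uuid4().hex}"
    )
    shutil.move(src, dest)
    # Assumed log format: timestamp, source path, quarantine path.
    # Paths containing tabs or newlines would need escaping.
    with open(log_path, "a", encoding="utf-8") as log:
        log.write(f"{int(time.time())}\t{src}\t{dest}\n")
    return dest


def restore(dest: str, log_path: str) -> str:
    """Move a quarantined file back to the source path recorded in the log."""
    with open(log_path, encoding="utf-8") as log:
        for line in log:
            _ts, src, logged_dest = line.rstrip("\n").split("\t")
            if logged_dest == dest:
                shutil.move(dest, src)
                return src
    raise KeyError(f"no quarantine record for {dest}")
```

Logging the exact source path at quarantine time is what makes the restore step a simple lookup.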

Event management is handled in two ways. For manual, user-invoked scans from cron or the command line, emails are dispatched directly with the scan results, including quarantine details; nothing really fancy here. The monitor component that uses inotify, on the other hand, has the potential to generate a lot of quarantine events in rapid succession, so a standard email on every hit isn’t appropriate. Instead, a daily cron job runs an internal option in the malware detect tool that reads ONLY new lines from a quarantine hit list and dispatches a daily event summary if any quarantine hits are found. Since we only read new lines from the hit list, we avoid repetitive daily alerts for events we already know about and retain the hit list as an “all-time” record that can later be used to derive trending data or phone-home features for global trending.
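One common way to read only the new lines of an append-only file is to remember the byte offset reached on the previous run. The sketch below assumes that approach; the offset file and the function name are hypothetical and may differ from how the tool actually tracks its position.

```python
import os


def read_new_hits(hit_list: str, offset_file: str) -> list:
    """Return complete lines appended to `hit_list` since the last call,
    then store the new byte offset in `offset_file`."""
    try:
        with open(offset_file, encoding="utf-8") as f:
            offset = int(f.read().strip() or 0)
    except (FileNotFoundError, ValueError):
        offset = 0  # first run, or unreadable offset: start from the top

    try:
        with open(hit_list, "rb") as f:
            size = f.seek(0, os.SEEK_END)
            if size < offset:
                offset = 0  # file was truncated or replaced; start over
            f.seek(offset)
            data = f.read()
    except FileNotFoundError:
        return []  # no hit list yet, so nothing to report

    # Only consume up to the last newline, so a line that is still
    # being written is picked up whole on the next run.
    end = data.rfind(b"\n") + 1
    with open(offset_file, "w", encoding="utf-8") as f:
        f.write(str(offset + end))
    return data[:end].decode("utf-8", errors="replace").splitlines()
```

A daily job would call this once and send a summary only if the returned list is non-empty, while the hit list itself keeps growing as the all-time record.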

Finally, the project also contains an internal update function that checks for new hashes and runs in the daily cron task, along with a simple check that determines whether inotify-based monitoring is running; if it is not, it kicks off a scan of /home[2]/user/public_html for content created in the last 48 hours.

8 Comments for this entry

  • cheap web design

    Great blog thank you for sharing.

  • Mayank Bhatnagar


  • Mayank Bhatnagar

    hi Ryan,
    I understand the scanning of received data against md5 hashes of commonly occurring malware code. However, why not try to make it almost real time at the proxy level? A proxy sitting in between at the network edge can handle content-encoded traffic and the various HTTP content headers, and you can apply your logic at that location. Why not try this?

  • Tony Kammerer

    Sounds awesome Ryan, you know I am down with helping to test this!
    Very glad to see you cranking out wonderful stuff again!

  • Max Rathbone

    Ryan, I’d be willing to test this with you as well. Our corporate servers number above 100, and we have multiple datacenters so we may provide a good test bed. thanks

  • Ryan M.

    Open testing for the project will go out later this or next week.

  • Ricardo J. Barberis

    Hi, as an admin for a web hosting company, this is something that could be really useful.

    I’ve actually been thinking of implementing some kind of scanner (probably a manual scan with clamav + some home made scripts, nothing as complete as what you’re pursuing) since we’re facing the same kind of problems/abuses as you describe.

    I’ll be glad to beta test this tool whenever you have it, even if not production ready.

    I can also send you some common scripts uploaded to our servers, as I have collected many of them through all these years 🙂

  • Peter Abraham

    Ryan, when will this project be open for any beta testers?

    I would be willing to test it out on CentOS 3, 4, and 5 servers; we also have some RedHat Enterprise 3 and 4 servers.

    Thank you.
