Signature Updates & Threat Database

It has been a very active month. Those who pay attention to the signatures as they are released may have noticed a sudden spike about two weeks ago, from roughly 2,500 signatures to the current 4,425. The vast majority of these signatures were published in MD5 format, as a great many are variants of “known” malware; they were extracted by processing the last 90 days of historical threat data from clean-mx.de, sorted by unique hashes. I also did some leg work on my processing scripts, which can now handle base64 and gzip decoding of POST payloads from IPS data, and that is generating a marked increase in new malware and known malware variants. Together, this has added 1,806 MD5 and 31 HEX signatures in the last 45 days, bringing us to the current total of 4,425 signatures (2,808 MD5 / 1,617 HEX).
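As a rough illustration of the base64 and gzip handling mentioned above, here is a minimal Python sketch. It is not the actual processing script: the function names, the layer cap and the "try gzip, then base64" heuristic are all assumptions made for the example.

```python
import base64
import binascii
import gzip
import hashlib
import zlib

MAX_LAYERS = 8  # assumed cap so crafted input cannot keep the loop busy


def decode_payload(raw: bytes) -> bytes:
    """Peel gzip and base64 layers off a captured POST payload (hypothetical helper)."""
    data = raw
    for _ in range(MAX_LAYERS):
        if data[:2] == b"\x1f\x8b":  # gzip magic bytes
            try:
                data = gzip.decompress(data)
                continue
            except (OSError, EOFError, zlib.error):
                break  # looked like gzip but was not; keep what we have
        try:
            decoded = base64.b64decode(data, validate=True)
        except (binascii.Error, ValueError):
            break  # not valid base64, so treat the data as fully decoded
        if not decoded:
            break
        data = decoded
    return data


def md5_signature(raw: bytes) -> str:
    """Return the MD5 hex digest of the fully decoded payload."""
    return hashlib.md5(decoded := decode_payload(raw)).hexdigest()
```

Note that this is only a heuristic: a short plain-text payload that happens to be valid base64 would be decoded once more than intended, so a real pipeline would want stricter checks.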

In addition to the above, the daily processing scripts have been rewritten and combined into a single task on the processing server. This has brought together what were previously 9 different scripts into a single, streamlined and much more efficient task. Things got to the point of 9 different scripts updating various elements of the back end processing server because the LMD project developed very fluidly over the last year: every time I had a new idea or added a new feature, I created a new script to support it. Over time this naturally was not sustainable, and what we have now is exactly that: sustainable.

For those interested, here is the output report generated and sent to my inbox at the end of each daily malware update task:

started daily malware update tasks at 2010-09-13 00:09:35
running daily malware fetch... finished in 710s
running daily ftp malware fetch... finished in 6s
regenerating signatures from daily malware HEX hits... finished in 95s
propagating signature files... finished in 2s
generating sqlfeed data... finished in 88s
running mysql inserts for sqlfeed on praxis... finished in 42s
syncing & updating malware source data (master-urls.dat).... finished in 27s
syncing & updating irc c&c nets... finished in 15s
rebuilding maldetect-current... finished in 3s
pushing maldetect-current and signatures to web... finished in 4s
completed daily malware update tasks at 2010-09-13 00:26:05 (990s)
processed 156 malware url's
retrieved 40 malware files
extracted and hashed 16 new signatures
extracted 59 new irc c&c networks
queued 24 unknown files for review
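
For readers curious how a single combined task can produce a report in this shape, here is a minimal Python sketch. It is an assumption about structure, not the real task: the runner, its parameters and the step callables are hypothetical, and only the line format is taken from the report above.

```python
import time
from typing import Callable, Iterable, Tuple

TIME_FORMAT = "%Y-%m-%d %H:%M:%S"


def run_tasks(
    tasks: Iterable[Tuple[str, Callable[[], None]]],
    clock: Callable[[], float] = time.time,
    out: Callable[[str], None] = print,
) -> int:
    """Run each (label, step) pair in order and report how long each took."""
    started = clock()
    stamp = time.strftime(TIME_FORMAT, time.localtime(started))
    out(f"started daily malware update tasks at {stamp}")
    for label, step in tasks:
        t0 = clock()
        step()  # a failing step raises and stops the run in this sketch
        out(f"{label}... finished in {int(clock() - t0)}s")
    finished = clock()
    total = int(finished - started)
    stamp = time.strftime(TIME_FORMAT, time.localtime(finished))
    out(f"completed daily malware update tasks at {stamp} ({total}s)")
    return total
```

A call such as `run_tasks([("running daily malware fetch", fetch), ("propagating signature files", propagate)])`, where `fetch` and `propagate` are placeholder functions, would print one timed line per step between the start and completion lines.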

An important part of streamlining the daily update tasks was rewriting some of the basic processing scripts to better log and store information on malware sources. That information includes date, source URL, file MD5, sig name, top level domain, online state, IP, ASN, netowner and more. All malware is also now processed through an IRC extraction script that checks for IRC server details in malware files and adds them to an IRC command & control list with details such as date, source file MD5, source file sig name, IRC server, IRC port, IRC chan, online state, IP, ASN, netowner and more. The “online state” fields in both the malware source and IRC C&C databases are backed by active checks. For the malware source database, the check simply verifies that a URL is still active and/or its domain still resolves; for the IRC C&C database, a bot connects to the IRC network and verifies that the network and channels are online and populated. All IRC users, host masks and a sampling period of channel activity are also recorded from each active IRC C&C network. This information is not included in the database at this time because a lot of it requires sanitizing: many IRC C&C networks don’t mask connecting hosts, and the channel activity reveals exceedingly sensitive information about actively vulnerable web sites and servers. I am working on adding it, but it is a difficult task and will take some time. The malware signatures database has also been populated but requires a little more work, mainly adding metadata that describes each signature in more detail than the single-word descriptions included in the signature naming scheme.
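
The “online state” check for malware sources could look something like the following Python sketch. The state labels ("offline", "resolves-only", "online"), the function names and the injectable `resolver`/`opener` parameters are assumptions for illustration; the real check is not shown here. Anything like this should only be run from an isolated analysis host, since it contacts live malware URLs.

```python
import socket
import urllib.error
import urllib.request
from urllib.parse import urlsplit


def domain_resolves(host, resolver=socket.gethostbyname):
    """True if the host name (or IP literal) can be resolved."""
    try:
        resolver(host)
        return True
    except OSError:  # socket.gaierror is a subclass of OSError
        return False


def url_responds(url, timeout=10.0, opener=urllib.request.urlopen):
    """True if the URL answers with a 2xx/3xx status within the timeout."""
    try:
        with opener(url, timeout=timeout) as resp:
            return 200 <= resp.status < 400
    except (urllib.error.URLError, OSError, ValueError):
        return False


def online_state(url, resolver=socket.gethostbyname, opener=urllib.request.urlopen):
    """Classify a malware source URL using the two checks above."""
    host = urlsplit(url).hostname
    if not host or not domain_resolves(host, resolver):
        return "offline"
    if url_responds(url, opener=opener):
        return "online"
    return "resolves-only"
```

Keeping the resolver and opener injectable makes the check easy to exercise without touching the network.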

Together, the malware signature database, the malware source database and the IRC C&C networks database will tie into a single threat portal, to be released in the next couple of weeks (I hope), allowing seamless correlation between data in all 3 databases. For example, one could query all malware sourced from a specific IP, ASN or netowner, find all the source URLs for a specific malware’s MD5 signature, or query the signature database to find more information on a specific signature. There are a great many options that will be available for reviewing, cross referencing and exporting data from the databases.
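
To make the kind of correlation described above concrete, here is a small Python sketch using an in-memory SQLite database. Everything in it is assumed for the example: the table and column names, the use of SQLite, and the demo rows (which use reserved documentation addresses, domains and AS numbers, not real data).

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create a tiny in-memory database with a hypothetical schema and made-up rows."""
    db = sqlite3.connect(":memory:")
    db.executescript(
        """
        CREATE TABLE malware_sources (
            url TEXT, md5 TEXT, sig_name TEXT, ip TEXT, asn INTEGER, netowner TEXT
        );
        CREATE TABLE irc_cnc (md5 TEXT, server TEXT, port INTEGER, chan TEXT);
        """
    )
    db.executemany(
        "INSERT INTO malware_sources VALUES (?, ?, ?, ?, ?, ?)",
        [
            ("http://example.com/a.txt", "a" * 32, "perl.ircbot.demo", "192.0.2.10", 64496, "Example Net A"),
            ("http://example.net/a.txt", "a" * 32, "perl.ircbot.demo", "198.51.100.7", 64497, "Example Net B"),
            ("http://example.com/b.txt", "b" * 32, "php.cmdshell.demo", "192.0.2.10", 64496, "Example Net A"),
        ],
    )
    db.execute("INSERT INTO irc_cnc VALUES (?, ?, ?, ?)", ("a" * 32, "irc.example.org", 6667, "#demo"))
    return db


def urls_for_md5(db: sqlite3.Connection, md5: str) -> list:
    """All source URLs seen serving a file with the given MD5."""
    rows = db.execute("SELECT url FROM malware_sources WHERE md5 = ? ORDER BY url", (md5,))
    return [row[0] for row in rows]


def cnc_for_asn(db: sqlite3.Connection, asn: int) -> list:
    """IRC C&C networks referenced by malware hosted inside the given ASN."""
    rows = db.execute(
        """
        SELECT DISTINCT c.server, c.port, c.chan
        FROM malware_sources AS s
        JOIN irc_cnc AS c ON c.md5 = s.md5
        WHERE s.asn = ?
        ORDER BY c.server
        """,
        (asn,),
    )
    return rows.fetchall()
```

The file MD5 acts as the join key between the two tables, which is one plausible way the databases could be tied together.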

These databases are all already completed, active and receiving updates; all that is left for me to do is create the front end that will find its home on http://www.dailythreats.com. The signature database, as expected, has 4,526 entries, the malware source database has 7,859 entries and the IRC C&C database has 386 entries. There are currently 511 files pending review in the malware queue. There have been 3,592 malware files reviewed in the last 45 days, of which 1,806 were unique files; the 511 files in the queue for review are files that could not be auto-hashed against a known threat or matched to a variant threat through HEX pattern matches.
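
The triage described above (exact hash match, then HEX pattern match, otherwise the review queue) can be sketched in a few lines of Python. This is an illustration only: the dictionaries, the return format and the signature names are hypothetical, and LMD's actual signature file format is not reproduced here.

```python
import hashlib


def classify(data: bytes, md5_sigs: dict, hex_sigs: dict):
    """Return ("md5", name) or ("hex", name) on a match, or None to queue for review.

    md5_sigs maps MD5 hex digests to signature names; hex_sigs maps hex-encoded
    byte patterns to signature names. Both layouts are assumed for this sketch.
    """
    digest = hashlib.md5(data).hexdigest()
    if digest in md5_sigs:
        return ("md5", md5_sigs[digest])
    for pattern, name in hex_sigs.items():
        # Compare raw bytes so a match cannot start in the middle of a byte.
        if bytes.fromhex(pattern) in data:
            return ("hex", name)
    return None
```

A file that returns `None` here corresponds to the "could not be auto-hashed" case and would land in the manual review queue.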

The biggest pitfall of all these changes has been the explosion in the review queue that I must tend to daily. It has started to back up on me as I am in the middle of moving from Michigan to Montreal, but as soon as I am done with my move in a couple of weeks, I plan to get that queue under control and work on some more back end scripts to help streamline its processing.

Well that’s it for now, keep an eye out for details to come on the dailythreats.com site, it’s going to be exciting 🙂

Tracking & Killing Bot Networks

In a previous blog I discussed how one of the more enjoyable parts of my day-to-day malware rituals involves the tracking and killing of command and control bot networks. Recently I have begun automating this process a bit. I have created a series of scripts that extract IRC servers, port numbers and channels from malware as it comes in and then check whether the IRC server is still online. A custom bot then logs into the server, queries the active channels and determines how many zombies are active on the network. If an IRC server is determined to be active with zombies connected, the server is reported to the abuse address listed in the whois information for the server’s IP address.
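
The extraction step described above can be approximated with a few regular expressions. The Python sketch below is a simplified assumption, not the real extraction scripts: it only recognises Perl/PHP style `$server`/`$port` assignments and quoted `#channel` strings, and those variable names are guesses about how a typical bot script is laid out.

```python
import re

# Hypothetical patterns for Perl/PHP style bot configuration lines.
SERVER_RE = re.compile(r"""\$server\w*\s*=\s*["']([A-Za-z0-9.-]+)["']""", re.IGNORECASE)
PORT_RE = re.compile(r"""\$port\w*\s*=\s*["']?(\d{2,5})["']?""", re.IGNORECASE)
CHAN_RE = re.compile(r"""["'](#[^\s"',]{1,50})["']""")


def extract_irc(source: str):
    """Return server, port and channels found in a script, or None if no server is set."""
    server = SERVER_RE.search(source)
    if server is None:
        return None
    port = PORT_RE.search(source)
    return {
        "server": server.group(1),
        "port": int(port.group(1)) if port else None,
        "channels": sorted(set(CHAN_RE.findall(source))),
    }
```

Quoted strings such as HTML colour codes would also match the channel pattern, so results from a heuristic like this still need a sanity check before a bot is pointed at them.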

The automation of this process is something I have had on my todo list for a little while, but I finally stopped procrastinating and got it done. The real advantage of automation is that I can easily generate a tangible set of information showing how many bot networks are present in the malware I process daily, weekly and monthly, how many of those networks are still active and, more importantly, how many of them have active zombies still connected. Likewise, as I’ve discussed previously, I am working on a threat portal, and having the IRC C&C data processing automated will make it easier to put that information on the portal and integrate it into the aggregate threat feed that the portal will offer for route/firewall/DNSBL drops.

Here are some statistics on IRC command and control networks as seen in the malware processed by me in the last 30 days:
Total Processed Malware (30d): 607
Total IRC C&C Servers: 251
Total Online IRC C&C Servers (as of 08/17/10): 118
Total Online IRC C&C Servers with Active Zombie Hosts: 30
Total Zombies Observed on Online IRC C&C Servers: 1,679 (55 average per server)

There are some notable observations. Out of the total of 251 noted IRC C&C servers, only 118 are still online. Of those 118, 64 utilize free DNS naming services and/or dynamic DNS services; the other 54 create C&C channels on established public IRC networks or use the DNS name of compromised hosts running an IRC server. Nearly every one of the 133 now inactive IRC servers used IP addresses within the host malware script; a small minority used DNS names of compromised hosts.
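
A breakdown like the one above depends on sorting each extracted C&C host into "IP address", "dynamic DNS name" or "other DNS name". Here is a minimal Python sketch of that sorting step; the category labels are made up for the example, and the list of dynamic DNS suffixes is a placeholder that would have to be supplied from a real provider list.

```python
import ipaddress
from collections import Counter


def host_kind(host: str, dynamic_suffixes=()) -> str:
    """Classify a C&C host string as "ip", "dynamic-dns" or "dns"."""
    try:
        ipaddress.ip_address(host)
        return "ip"
    except ValueError:
        pass  # not an IPv4/IPv6 literal, so treat it as a DNS name
    name = host.lower().rstrip(".")
    if any(name == s or name.endswith("." + s) for s in dynamic_suffixes):
        return "dynamic-dns"
    return "dns"


def tally(hosts, dynamic_suffixes=()) -> Counter:
    """Count how many hosts fall into each category."""
    return Counter(host_kind(h, dynamic_suffixes) for h in hosts)
```

For example, `tally(hosts, dynamic_suffixes=("dyn.example",))` would give the per-category counts for a list of extracted hosts, with `dyn.example` standing in for a real dynamic DNS domain.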

It goes without saying that using public DNS services / dynamic DNS services gives attackers the flexibility to quickly recover a C&C server and its participating zombies in the event of the host server being shut down. Further, a number of more mature IRC C&C bots will periodically retry the connection when disconnected from the host C&C server, increasing the chance that the attacker fully recovers the zombie network.

PHP is also becoming more common as a language of choice for C&C bot agents, though Perl agents are still vastly more popular. The LMD project has currently classified 44 unique C&C bot agents comprising 286 agent scripts/binaries: 14 classes (38 scripts) are PHP based, 21 classes (213 scripts) are Perl based and 9 classes (35 scripts/binaries) are Other (c/ruby/java).

Currently an average of 6 bot networks are abuse reported per day; of those, only about 2-3 per day ever receive any form of followup and/or shutdown of the host running the network. That is a rate of less than 50% on average, which is abysmal to say the least. When the threat management portal goes up in the coming weeks, these networks will find themselves at the top of the threat feed and planted squarely on the front page of the portal. We might not be able to shut them down but we sure can filter them off our networks.

Signature Updates: Month In Review

Since I will be busy this coming week with other priorities, I am posting an early month in review blog on signature updates.

In the last 3 weeks we have not seen a whole lot of action on in-the-wild malware; most of what is propagating at the moment is variants of already detected content. That is not to say there have been no new signatures extracted: a lot of this month’s signatures have come from account level compromises on vulnerable e107, WordPress and Joomla installations, along with user submissions. There are not a whole lot of ground breaking malware threats; it is more of the usual, such as mass mailers, perl/php command shells, IRC bots and php socket flooding tools.

In total, in the 3 weeks ending Sat July 24th, there have been 128 new signatures in 54 classifications, with 65 signatures added in the last 7 days. This brings us to a total of 2,588 (1002 MD5 / 1586 HEX) signatures, an increase of 117 signatures over the last blog post on signature updates. For those paying attention, there is a discrepancy of -11 signatures between the 128 new signatures and the +117 change since the last update; this is because 11 signatures have also been removed for poor performance/false positives.
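
The bookkeeping above is easy to check mechanically. The short Python sketch below uses only numbers from the signature update posts (the previous total of 2,471 comes from the prior signature update); the function name is just for illustration.

```python
def reconcile(previous_total: int, added: int, removed: int):
    """Return (net change, new total) after adding and removing signatures."""
    net = added - removed
    return net, previous_total + net


# 2,471 signatures before, 128 added, 11 removed for false positives.
net_change, new_total = reconcile(2471, 128, 11)
```

With those inputs the net change works out to 117 and the new total to 2,588, which also matches the 1002 MD5 + 1586 HEX split.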

As always, new signatures are automatically updated daily or can be manually updated with the -u|--update command line options. The 128 new signatures fall into the following classification groups:

base64.inject.unclassed    exp.linux.unclassed
perl.cmdshell.n0va         perl.ircbot.Arabhack
perl.ircbot.BaMbY          perl.ircbot.devil
perl.ircbot.fx29           perl.ircbot.genol
perl.ircbot.karawan        perl.ircbot.oldwolf
perl.ircbot.plasa          perl.ircbot.putr4XtReme
perl.ircbot.rafflesia      perl.ircbot.UberCracker
perl.md5browser.avi        perl.shell.cgitelnet
php.cmdshell.antichat      php.cmdshell.avi
php.cmdshell.aZRaiL        php.cmdshell.c100
php.cmdshell.DxShell       php.cmdshell.h4ntu
php.cmdshell.hackru        php.cmdshell.KAdot
php.cmdshell.lama          php.cmdshell.Macker
php.cmdshell.mic22         php.cmdshell.myshell
php.cmdshell.NCC           php.cmdshell.r3v3ng4ns
php.cmdshell.r57           php.cmdshell.s72
php.cmdshell.Safe0ver      php.cmdshell.SimShell
php.cmdshell.SRCrew        php.cmdshell.Storm7
php.cmdshell.unclassed     php.cmdshell.winx
php.cmdshell.wls           php.cmdshell.xakep
php.cmdshell.ZaCo          php.cpcrack.Aria
php.exe.globals            php.include.remote
php.ircbot.NewLive         php.mailer.DALLAS
php.mailer.unclassed       php.mailer.YoUngEST
php.nested.base64          php.pktflood.unclassed
php.rshell.0wned           web.malware.unclassed
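
The classification names above follow a three-part dotted scheme, which appears to be language or platform, then threat type, then variant name. Assuming that reading is right, a listing like this can be summarised with a few lines of Python; the function names are hypothetical.

```python
from collections import Counter


def split_signature(name: str):
    """Split a name such as "perl.ircbot.fx29" into (platform, kind, variant)."""
    platform, kind, variant = name.split(".", 2)
    return platform, kind, variant


def count_by_kind(names) -> Counter:
    """Count classifications per "platform.kind" prefix, e.g. "php.cmdshell"."""
    counts = Counter()
    for name in names:
        platform, kind, _variant = split_signature(name)
        counts[f"{platform}.{kind}"] += 1
    return counts
```

Feeding the list above through `count_by_kind` makes it easy to see at a glance which families dominate a given update.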

Signatures For The Masses

Today I found the time and energy, despite how tedious it was, to go over the last two weeks’ worth of malware submissions and missed edge IPS data from when I was away. This resulted in a total of 126 new signatures (67 MD5 / 59 HEX), which brings LMD to a total of 2,471 signatures (894 MD5 / 1577 HEX). This now also gives the project a unique distinction among anti-virus and malware detection offerings, as the single largest project, commercial or open source, detecting Linux malware.

To further illustrate the lapse in coverage by other vendors, we can turn to CYMRU analysis of the MD5 hashes in LMD. As discussed on the LMD home page, CYMRU provides malware data to vendors such as Trend Micro, Symantec, Kaspersky, Microsoft, Google and more.

KNOWN MALWARE:       301
 % AV DETECT (AVG):  57
 % AV DETECT (LOW):  58
 % AV DETECT (HIGH): 71
 UNKNOWN MALWARE:    593

In short, this shows that of all the vendors that CYMRU provides data for, only 301 of LMD’s 894 MD5 signatures are detected by competing solutions and, of those threats detected, on average only 57% of vendors detect each threat. This information really has no other significance than to reinforce the validity of this project and the time I am investing into it. Chalk one up for stroking my own ego!
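
Put as a percentage, the known/unknown split above is straightforward to work out. This small Python sketch uses only the two counts from the CYMRU summary; the function name is for illustration.

```python
def detection_coverage(known: int, unknown: int) -> float:
    """Percentage of hashes that at least one other vendor already detects."""
    total = known + unknown
    return round(100.0 * known / total, 1)


coverage = detection_coverage(301, 593)  # 301 known out of 894 MD5 signatures
```

That comes to roughly a third of the MD5 signatures being known to any other vendor, with the remaining two thirds unknown.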

New signatures in this update are classified into the following groups. You will notice A LOT of command shells in this update, including an interesting addition: a JSP command shell!

base64.inject.unclassed     exp.linux.unclassed
jsp.cmdshell.zerocnbct      perl.cmdshell.n0va
perl.ircbot.Arabhack        perl.ircbot.BaMbY
perl.ircbot.devil           perl.ircbot.genol
perl.ircbot.karawan         perl.ircbot.rafflesia
perl.ircbot.UberCracker     perl.md5browser.avi
php.cmdshell.antichat       php.cmdshell.avi
php.cmdshell.aZRaiL         php.cmdshell.DxShell
php.cmdshell.h4ntu          php.cmdshell.hackru
php.cmdshell.KAdot          php.cmdshell.lama
php.cmdshell.Macker         php.cmdshell.myshell
php.cmdshell.NCC            php.cmdshell.r3v3ng4ns
php.cmdshell.s72            php.cmdshell.Safe0ver
php.cmdshell.SimShell       php.cmdshell.SRCrew
php.cmdshell.unclassed      php.cmdshell.winx
php.cmdshell.wls            php.cmdshell.xakep
php.cmdshell.ZaCo           php.include.remote
php.mailer.DALLAS           php.rshell.0wned

I am Back: Signature Updates

I am back, fresh off a trip home to Montreal, which I must say was an absolutely amazing time. It has left me reflecting on a lot of things, most importantly that there really is no place like home — I miss Montreal more than I can even describe. That said though, time to get back into the mix of things — there is a mountain of malware submissions to review, 91 to be exact. Today I really could not find the energy or time to go through them all but I did process the edge IPS data to extract some in the wild signature data which generated 8 new signatures that are now live. In the coming days, I will work through the malware submissions and get those signatures out as soon as possible.