Understanding Signatures

The signature naming scheme for LMD is a little confusing and something I’ve received more than a few questions about, more so about what the *.unclassed signatures mean. The naming scheme (to me) is straight forward and breaks down as follows:

{SIG_FORMAT}lang/vector.type.name.ID#

The ‘SIG_FORMAT’ is either HEX or MD5 reflecting the internal format of the signature, the ‘lang/vector’ is the language or attack vector of the malware, ‘type’ is a short descriptive field for what the malware does (i.e: ircbot, mailer, injection etc…), ‘name’ is a short descriptive name unique to the piece of malware and ‘ID#’ is the internal signature ID number.

What some people appear confused about is signatures such as ‘{HEX}base64.inject.unclassed.7’ that use the term “unclassed” for the name field. Essentially, signatures that are unclassed represent a group of malware that is not necessarily unique from each other but that follows the same attack vector, such as base64 encoded scripts; there are hundreds of these scripts and in encoded form it doesn’t really matter what they do, we are detecting the encoded format not the decoded, so they get lumped together. In other instances, I will throw some malware into an unclassed group when it is very new and I have not had time yet for processing it into its own classification, for example the web.malware.unclassed is a dumping ground for allot of malware that is newly submitted, which I have reviewed and confirmed IS MALWARE but have not yet classified it or determined if it is a variant of an existing malware classification.

It needs to be understood that the processing of malware is mostly a manual task, though there are some elements of it that are automated, the actual review of each malware file is done by hand to remove the chance of false positives — keeping LMD accurate and reliable. As such, not all malware makes it into a classification group right away, the important part is that malware is reviewed, verified and signatures generated for it in a timely fashion. I process malware daily from the network edge IPS system at work, from user submitted files and from various malware news groups / web sites and the priority is getting the signatures up for in the wild threats. The signature name/classification serves informative purposes, yes it is important but not as important as the actual verification and signature generation.