rfxn
maldet bash malware detection architecture

Linux Malware Detect 2.x: From Zero to Protected in 28 Seconds

Ryan MacDonald · 60 min talk · 34 slides

Abstract

Linux Malware Detect started in 2005 as a 500-line shell script that scanned shared hosting infrastructure for PHP webshells. Twenty years later it runs on roughly 348,000 hosts, is a required LPIC-3 competency, ships in Gentoo Portage and the AUR, and still fits in a tarball you can read in an afternoon. The 2.x rewrite was a chance to keep the auditability and lose almost everything else.

This deep dive walks the architecture end to end: why the new engine is written in pure bash, what the seven detection stages actually do, how the compound signature language works, and where the 43x speedup over 1.6.6 comes from. It also answers the question that keeps coming back: if you are willing to use ClamAV, why would you ever use this instead? The short answer is 22x less memory, 2.1x higher detection on the sample set we care about, and the ability to run on a 1 GB VPS without the OOM killer firing.

The long answer is the talk.

Key Takeaways

  • Bash is a bad language and an excellent orchestrator. The native toolchain (grep -F, awk, xargs -P, od, sha256sum) is remarkably fast once you stop fighting it.
  • The 2.x engine scans 10,000 files in 28 seconds with a 44 MB memory footprint and zero external dependencies. ClamAV uses 998 MB for the same work.
  • Seven detection stages, not one. Hash match, hex pattern, compound signatures, ClamAV (optional), archive inspection, heuristics, and the learning queue are each optimised for a different class of payload.
  • Signatures come from real network-edge intrusion telemetry on a 6-hour update cycle. The sample set is production traffic, not a lab corpus.
  • Compound signatures are boolean detection in shell. AND/OR/threshold rules over grep, awk, and sort, with 2.1x higher detection than ClamAV on webshells and skimmers.
  • Portable bash is a 20-year problem. CentOS 6 has coreutils at /bin/; Rocky 9 has them at /usr/bin/. The 2.x engine runs on both without a conditional.
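
To make the compound-signature takeaway concrete, here is a minimal sketch of a threshold rule built on grep -F. It is not maldet's actual signature format or engine code. The function names count_hits and match_threshold, the rule layout, and the sample patterns are all hypothetical, chosen only to illustrate the idea that AND, OR, and N-of-M rules reduce to counting fixed-string hits.

```bash
#!/usr/bin/env bash
# Hypothetical sketch of a compound (threshold) rule; not maldet's real format.

# count_hits FILE PATTERN...
# Print how many of the given fixed-string patterns occur at least once in FILE.
count_hits() {
    local file=$1 pat hits=0
    shift
    for pat in "$@"; do
        # -F: literal match, -q: quiet and stop at first hit,
        # -e: keeps patterns that start with '-' from being read as options.
        if grep -Fq -e "$pat" -- "$file"; then
            hits=$((hits + 1))
        fi
    done
    printf '%s\n' "$hits"
}

# match_threshold FILE MIN PATTERN...
# Succeed when at least MIN patterns are present.
# MIN=1 behaves as OR; MIN equal to the pattern count behaves as AND.
match_threshold() {
    local file=$1 min=$2
    shift 2
    [ "$(count_hits "$file" "$@")" -ge "$min" ]
}

# Demo: a toy webshell-like line, checked with a 2-of-3 rule.
sample=$(mktemp)
printf '%s\n' '<?php eval(base64_decode($_POST["x"])); ?>' > "$sample"
if match_threshold "$sample" 2 'eval(' 'base64_decode(' 'gzinflate('; then
    echo "HIT: 2-of-3 rule matched"
fi
rm -f "$sample"
```

The sketch runs one grep per pattern, which keeps the logic readable but is slower than a single pass; an engine aiming for the throughput quoted above would presumably batch patterns (for example with grep -F -f) and post-process the hits with awk or sort.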
