Analyzing Windows Defender's signatures for fun

Windows Defender's signature database can be extracted with WDExtract. A deep analysis of the database structure is available on GitHub, under windows-defender/VDM in the commial/experiments repository (master branch).

A signature can be parsed with a struct like this:

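struct {
    uint8_t sig_type;    // signature type ID
    uint8_t size_low;    // low 8 bits of the value length
    uint16_t size_high;  // upper 16 bits of the value length
    uint8_t value[size_low | (size_high << 8)];  // payload
} sig_entry;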

The length of value differs from one signature to the next, so it has to be recomputed from size_low and size_high every time a new signature is read.
To make the analysis easier, I wrote a simple script that parses the db and prints information about each signature (tested with the AV signatures only; the AS (anti-spyware) signatures weren't tested).
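
The record walk described above can be sketched in a few lines of Python. This is not the script mentioned above, only a minimal illustration of the struct. It assumes the blob has already been extracted and decompressed and that the fields are little-endian; SigEntry and iter_signatures are names made up for this example.

```python
import struct
from typing import Iterator, NamedTuple


class SigEntry(NamedTuple):
    sig_type: int  # raw type byte (the ID behind names like SIGNATURE_TYPE_PEHSTR_EXT)
    value: bytes   # payload, sliced to the length declared in the header


def iter_signatures(blob: bytes) -> Iterator[SigEntry]:
    """Yield entries from a decompressed signature blob (little-endian assumed)."""
    offset = 0
    # Stop when fewer than 4 bytes remain: there is no room for another header.
    while offset + 4 <= len(blob):
        # 4-byte header: sig_type (u8), size_low (u8), size_high (u16)
        sig_type, size_low, size_high = struct.unpack_from("<BBH", blob, offset)
        size = size_low | (size_high << 8)
        offset += 4
        if offset + size > len(blob):
            raise ValueError(f"truncated entry at offset {offset - 4}")
        yield SigEntry(sig_type, blob[offset:offset + size])
        offset += size
```

Because the length is re-read from every header, the loop never needs to know the signature types in advance; mapping sig_type to a readable name can be done afterwards with a lookup table.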

An example after parsing the whole db:

===================Threat BEGIN===================
SigType: SIGNATURE_TYPE_THREAT_BEGIN
s_low: 32 s_high: 0 len: 32
Value: @["c", "\\xAF", "\\x02", "\\x80", "\\x00", "\\x00", "\\x01", "\\x00", "\\x06", "\\x00", "\\x0A", "\\x00", "\\x84", "!Wkysol.J", "\\x00", "\\x00", "\\x01", "@", "\\x05", "\\x82", "B", "\\x00", "\\x04", "\\x00"]

SigType: SIGNATURE_TYPE_PEHSTR_EXT
s_low: 192 s_high: 0 len: 192
Value: @["\\x04", "\\x00", "\\x04", "\\x00", "\\x06", "\\x00", "\\x00", "\\x01", "\\x00", "-", "\\x00", "SOFTWARE\\Microsoft\\Windows\\CurrentVersion\\Run", "\\x01", "\\x00", "\\x0C", "\\x01", "/put.asp?nm=", "\\x01", "\\x00", "\\x15", "\\x01", "/get.asp?nm=index.dat", "\\x01", "\\x00", "\\x1C", "\\x01", "Rj", "\\x00", "j", "\\x00", "j", "\\x00", "j", "\\x00", "j", "\\x00", "j", "\\x00", "h ", "\\x02", "\\x00", "\\x00", "j j", "\\x02", "\\x8D", "E", "\\xDC", "P", "\\xFF", "\\x15", "\\x01", "\\x00", "\\x1C", "\\x03", "\\x80", "\\xC9", "\\x80", "\\x89", "\\x8D", "\\x90", "\\x01", "\\x04", "j", "\\x04", "\\x8D", "\\x95", "\\x90", "\\x1B", "\\x00", "Rj", "\\x1F", "\\x8B", "\\x85", "\\x90", "\\x01", "\\x04", "P", "\\xFF", "\\x15", "\\x90", "\\x00", "\\x01", "\\x00", "\\x19", "\\x01", "\\x8D", "U", "\\xA8", "RSSSSSSh ", "\\x02", "\\x00", "\\x00", "j j", "\\x02", "\\x8D", "E", "\\xDC", "P", "\\xFF", "\\x15", "\\x00", "\\x00"]

SigType: SIGNATURE_TYPE_THREAT_END
s_low: 4 s_high: 0 len: 4
Value: @["c", "\\xAF", "\\x02", "\\x80"]
-------------------Threat END---------------------
  • Malware name: !Wkysol.J
  • Detection method: Find strings in PE file (SIGNATURE_TYPE_PEHSTR_EXT)
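
The PEHSTR_EXT payload above appears to have a regular shape. The sketch below splits it into its individual patterns. The layout is only my reading of this one dump, not a documented format: a 7-byte header whose third 16-bit field equals the number of patterns (6 here), followed by records made of a 16-bit value, a length byte, one extra byte, and then that many pattern bytes. With that reading the sizes add up to the reported 192 bytes. The names SubRule, weight, code and parse_pehstr_ext are hypothetical.

```python
import struct
from typing import List, NamedTuple


class SubRule(NamedTuple):
    weight: int     # hypothetical name for the leading 16-bit value
    code: int       # hypothetical name for the byte after the length
    pattern: bytes  # raw pattern bytes (may contain non-literal encodings)


def parse_pehstr_ext(value: bytes) -> List[SubRule]:
    """Split a PEHSTR_EXT payload into sub-rules, under the assumed layout."""
    # Assumed header: two u16 fields of unknown meaning, a u16 count, one byte.
    _unk0, _unk1, count, _unk2 = struct.unpack_from("<HHHB", value, 0)
    offset = 7
    rules = []
    for _ in range(count):
        # Assumed record header: u16 value, u8 pattern length, u8 code
        weight, length, code = struct.unpack_from("<HBB", value, offset)
        offset += 4
        pattern = value[offset:offset + length]
        if len(pattern) != length:
            raise ValueError("sub-rule runs past the end of the payload")
        rules.append(SubRule(weight, code, pattern))
        offset += length
    return rules
```

Applied to the dump above, this reading gives three plain strings (the Run key and the two .asp paths) and three byte patterns that look like machine code. "RSSSSSSh" only occurs in the middle of the last pattern.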

Next, test a fake “malicious” binary containing these strings against the scanners (I used VirusTotal).
Create a simple Nim file:

$ cat test.nim
const
  s1 = "SOFTWARE\\Microsoft\\Windows\\CurrentVersion\\Run"
  s2 = "/put.asp?nm="
  s3 = "/get.asp?nm=index.dat"
  s4 = "RSSSSSSh"

Compile PE file: nim c -d:mingw test.nim
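
Compilers can drop constants that are never referenced, so before uploading it may be worth confirming that the strings really ended up in the compiled file. A minimal sketch of such a check follows; find_markers and report are hypothetical helper names, and test.exe is only the assumed output name of the compile command above.

```python
from pathlib import Path
from typing import Dict, Iterable

# The four strings from test.nim, as raw bytes
MARKERS = [
    b"SOFTWARE\\Microsoft\\Windows\\CurrentVersion\\Run",
    b"/put.asp?nm=",
    b"/get.asp?nm=index.dat",
    b"RSSSSSSh",
]


def find_markers(data: bytes, markers: Iterable[bytes]) -> Dict[bytes, int]:
    """Map each marker to its first offset in data, or -1 if it is absent."""
    return {m: data.find(m) for m in markers}


def report(path: str) -> None:
    """Print, for each marker, whether it occurs in the file at path."""
    data = Path(path).read_bytes()
    for marker, pos in find_markers(data, MARKERS).items():
        status = f"found at offset {pos}" if pos >= 0 else "MISSING"
        print(f"{marker!r}: {status}")
```

Calling report("test.exe") lists each string with its offset or marks it as missing. A missing string would mean the scan result says nothing about that part of the signature.
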
Result: VirusTotal

The result is interesting: Windows Defender did not flag this fake binary as malicious. The signature's conditions are stored in compiled form, so we don't really know what Windows Defender actually checks when matching a binary (it could be specific ranges, offsets, or anything else). However, some machine learning engines did flag the file, which is a false positive.