leviathan ee3e7dd9a7 skeletonkey: --explain MODULE — single-page operator briefing
One command that answers 'should we worry about this CVE here,
what would patch it, and what would the SOC see if someone tried
it'. Renders, for the specified module:

  - Header: name + CVE + summary
  - WEAKNESS: CWE id and MITRE ATT&CK technique (from CVE metadata)
  - THREAT INTEL: CISA KEV status (with date_added if listed) and
    the upstream-curated kernel_range
  - HOST FINGERPRINT: kernel + arch + distro from ctx->host plus
    every relevant capability gate (userns / apparmor / selinux /
    lockdown)
  - DETECT() TRACE (live): runs the module's detect() with verbose
    stderr enabled so the operator sees the gates fire in real
    time — 'kernel X is patched', 'userns blocked by AppArmor',
    'no readable setuid binary', etc.
  - VERDICT: the result_t with a one-line operator interpretation
    that varies by outcome (OK / VULNERABLE / PRECOND_FAIL /
    TEST_ERROR each get their own framing)
  - OPSEC FOOTPRINT: word-wrapped .opsec_notes paragraph (from
    last commit) showing what an exploit would leave behind on
    this host
  - DETECTION COVERAGE: which of auditd/sigma/yara/falco have
    embedded rules for this module, with pointers to the
    --module-info / --detect-rules commands that dump the bodies

Targeted at every audience the project is meant to serve:
  - Red team: opsec footprint + 'would this even reach' verdict
    in one screen
  - Blue team: paste-ready triage ticket with CVE / CWE / ATT&CK /
    KEV header and detection-coverage matrix
  - Researchers: the live trace shows the reasoning chain
    (predates check, kernel_range_is_patched lookup, userns gate)
    that drove the verdict — auditable without reading source
  - SOC analysts / students: a single self-contained briefing per
    CVE, no cross-referencing needed

Implementation:
  - New mode MODE_EXPLAIN, new flag --explain MODULE
  - cmd_explain() composes the page from the existing module
    struct, cve_metadata_lookup() (federal-source triage data),
    ctx->host (cached fingerprint), and a live detect() call
  - print_wrapped() helper word-wraps the long .opsec_notes
    paragraph at 76 cols / 2-space indent
  - Help text + README quickstart + DETECTION_PLAYBOOK single-host
    recipe all updated to mention --explain

Smoke tests:
  - macOS: --explain nf_tables shows full briefing; trace says
    'Linux-only module — not applicable here'; verdict
    PRECOND_FAIL with the generic-precondition interpretation
  - Linux (docker gcc:latest): --explain nf_tables on a 6.12 host
    fires '[+] nf_tables: kernel 6.12.76-linuxkit is patched';
    verdict OK with the 'this host is patched' interpretation
  - Both: --explain nope (unknown module) returns 1 with a clear
    'no module ... Try --list' error
  - Both: 87 tests still pass (33 kernel_range + 54 detect on Linux,
    33 + 0 stubbed on macOS)

Closes the metadata + opsec + explain trio. The three together
answer the 'best tool for red team, blue team, researchers, and
more' framing.
2026-05-23 10:49:46 -04:00

# SKELETONKEY detection playbook
Operational guide for blue teams using SKELETONKEY defensively. Pairs
with `docs/DEFENDERS.md` (the "what" reference) — this is the "how to
make it part of your daily ops" guide.
## The lifecycle
```
           ┌─────────────┐
           │  inventory  │  ← skeletonkey --list (what's bundled?)
           └──────┬──────┘
           ┌─────────────┐
           │    scan     │  ← skeletonkey --scan --json (what am I vulnerable to?)
           └──────┬──────┘
           ┌─────────────┐
           │ fleet scan  │  ← skeletonkey-fleet-scan.sh hosts.txt
           └──────┬──────┘
     ┌────────────┼────────────┐
     ▼            ▼            ▼
┌────────┐  ┌─────────┐  ┌──────────┐
│ deploy │  │ mitigate│  │ upgrade  │  ← three responses
│ rules  │  │ (pre-fix│  │ (kernel  │
│(SIEM)  │  │ stopgap)│  │ patch)   │
└────┬───┘  └─────┬───┘  └─────┬────┘
     └────────────┼────────────┘
           ┌─────────────┐
           │   monitor   │  ← ausearch -k skeletonkey-* / SIEM alerts
           └─────────────┘
```
## Recipes by team size
### Single host (workstation / single server)
```bash
# Daily/weekly hygiene check
sudo skeletonkey --scan
# Investigate a specific finding (one-page operator briefing)
sudo skeletonkey --explain nf_tables # whichever module came back VULNERABLE
# Shows: CVE / CWE / MITRE ATT&CK / CISA KEV status, live detect() trace,
# OPSEC footprint (what an exploit would leave behind), detection-rule
# coverage, mitigation. Paste into the triage ticket.
# If anything's VULNERABLE, deploy detections + apply mitigation
sudo skeletonkey --detect-rules --format=auditd | sudo tee /etc/audit/rules.d/99-skeletonkey.rules
sudo augenrules --load
sudo skeletonkey --mitigate copy_fail # or whichever module fired
```
The `--explain` output is also useful as a learning artifact: each
module's `--explain` block is a self-contained CVE briefing with the
reasoning chain the detect() function walked, so analysts can verify
SKELETONKEY's verdict against their own understanding of the bug.
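For unattended hygiene checks, the scan output can be reduced to just the findings that need attention. The following is a minimal sketch, assuming `skeletonkey --scan --json` emits a top-level `modules` array whose entries carry `name`, `cve`, and `result` fields (the shape the `jq` filters elsewhere in this playbook imply). The helper name `vulnerable_modules` is hypothetical and not part of SKELETONKEY.

```bash
# vulnerable_modules: read scan JSON on stdin and print one
# "CVE<TAB>module" line per VULNERABLE finding.
# Hypothetical helper; the field names (.modules[].cve/.name/.result)
# are assumed from the jq filters used in this playbook.
vulnerable_modules() {
  jq -r '.modules[] | select(.result == "VULNERABLE") | [.cve, .name] | @tsv'
}

# Example (not run here):
#   sudo skeletonkey --scan --json --no-color | vulnerable_modules
```

Each printed module name can then be passed to `--explain` to produce the briefing for the triage ticket.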
### Small fleet (~10-100 hosts, SSH-reachable)
Use `tools/skeletonkey-fleet-scan.sh`:
```bash
# Hosts list — one per line; user@host:port supported
cat > hosts.txt <<EOF
prod-web-01
prod-web-02
deploy@bastion-01
ops@db-01:2222
EOF
# Scan; binary scp'd, run, cleaned up. Output is one JSON doc.
./skeletonkey-fleet-scan.sh \
--binary ./skeletonkey \
--ssh-key ~/.ssh/ops_key \
--parallel 8 \
hosts.txt > fleet-scan-$(date +%F).json
# Show me hosts with any VULNERABLE finding
jq '.hosts[] | select(.scan.modules | map(.result == "VULNERABLE") | any) | .host' \
fleet-scan-*.json
# Show summary across the fleet
jq '.summary' fleet-scan-*.json
```
Output shape:
```json
{
  "generated_at": "2026-05-16T22:00:00Z",
  "n_hosts": 4,
  "summary": {
    "ok": 4,
    "failed": 0,
    "vulnerable": [
      { "cve": "CVE-2024-1086", "name": "nf_tables", "count": 2 },
      { "cve": "CVE-2023-0458", "name": "entrybleed", "count": 4 }
    ]
  },
  "hosts": [...]
}
```
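The summary gives counts per CVE; during triage it often helps to go the other way and list the hosts behind one count. A minimal sketch, assuming each entry of `hosts` has a `host` field and a `scan.modules` array (as the `jq` filter above implies). The function name `hosts_for_cve` is hypothetical.

```bash
# hosts_for_cve CVE-ID < fleet-scan.json
# Prints every host whose scan holds a VULNERABLE result for that CVE.
# Hosts with no scan data (e.g. an SSH failure) are skipped by the
# "[]?" operator instead of aborting the filter.
hosts_for_cve() {
  jq -r --arg cve "$1" '
    .hosts[]
    | select(any(.scan.modules[]?; .cve == $cve and .result == "VULNERABLE"))
    | .host'
}

# Example (not run here):
#   hosts_for_cve CVE-2024-1086 < fleet-scan-2026-05-16.json
```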
### Larger fleet (>100 hosts)
`skeletonkey-fleet-scan.sh` is intentionally simple (parallel ssh). For
fleets too large for SSH-fan-out, wrap it in your config-management
tool of choice:
- **Ansible**: ship the binary via `copy:`, run via `command:`, parse
JSON with `jq` in a follow-on task
- **SaltStack**: `cmd.run` returning JSON; `salt-call --return` to your
SIEM
- **Fabric / Mitogen**: same shape, just Python-side

Sample Ansible task:
```yaml
- name: scan with skeletonkey
  copy:
    src: skeletonkey
    dest: /tmp/skeletonkey
    mode: '0755'
- name: run --scan --json
  command: /tmp/skeletonkey --scan --json --no-color
  register: scan
  changed_when: false
  failed_when: false   # skeletonkey exit codes are semantic, not errors
- name: collect
  set_fact:
    skeletonkey_scan: "{{ scan.stdout | from_json }}"
- name: cleanup
  file:
    path: /tmp/skeletonkey
    state: absent
```
## SIEM integration patterns
### Splunk
```
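# splunk input config (inputs.conf)
# index must match the index queried in the search below and must already exist
[script:///opt/skeletonkey/skeletonkey-cron-scan.sh]
interval = 86400
index = skeletonkey
source = skeletonkey
sourcetype = skeletonkey:scan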
```
`skeletonkey-cron-scan.sh`:
```bash
#!/bin/bash
/usr/local/bin/skeletonkey --scan --json --no-color
```
Search the indexed events:
```spl
index=skeletonkey sourcetype="skeletonkey:scan" modules{}.result=VULNERABLE
| stats count by host modules{}.cve
```
### Elastic / OpenSearch
Use Filebeat to read the per-host scan JSON files (one per day) and
index them into a `skeletonkey-*` index pattern. A standard Kibana
visualization on `modules.cve` over time tracks the vulnerability lifecycle.
### Sigma → your platform
```bash
# Ship Sigma rules into your platform
skeletonkey --detect-rules --format=sigma | sudo tee /etc/sigma/skeletonkey.yml >/dev/null
# Convert to your target (Sentinel, Elastic, etc.) via sigmac
sigmac -t elastic /etc/sigma/skeletonkey.yml
```
### YARA artifact scanning
YARA rules catch the **post-fire** state — page-cache shellcode
overwrites, malicious `.deb` drops, `/etc/passwd` UID flips. Run them
as a scheduled scan against sensitive paths:
```bash
# Ship YARA rules
sudo skeletonkey --detect-rules --format=yara | sudo tee /etc/yara/skeletonkey.yar
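# Scheduled scan via cron: catches the page-cache and /tmp artifacts.
# yara treats only its last argument as the scan target, so loop over the paths.
# A cron.d entry must stay on one line (cron has no backslash continuation).
# /etc/cron.d/skeletonkey-yara
*/15 * * * * root for t in /etc/passwd /tmp /usr/bin/su /usr/bin/passwd; do yara -r /etc/yara/skeletonkey.yar "$t"; done 2>>/var/log/skeletonkey-yara.log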
```
What each rule catches:
| Rule | Triggers on |
|---|---|
| `etc_passwd_uid_flip` | Non-root user line in `/etc/passwd` with a zero-padded UID (`0000+`). Canonical Copy Fail / Dirty Frag / Dirty Pipe / DirtyDecrypt outcome. |
| `etc_passwd_root_no_password` | `root` line with empty password field — DirtyDecrypt's intermediate corruption step. |
| `pwnkit_gconv_modules_cache` | Small `gconv-modules` text file with a `module UTF-8// X// /tmp/…` redefinition. |
| `dirty_pipe_passwd_uid_flip` | Same UID-flip pattern (Dirty Pipe-specific tag). |
| `dirtydecrypt_payload_overlay` | First 28 bytes of `/usr/bin/su` (or similar) match the embedded 120-byte ET_DYN shellcode the V12 PoC overlays. |
| `fragnesia_payload_overlay` | Same shape for the 192-byte Fragnesia payload. |
| `pack2theroot_malicious_deb` | `.deb` ar-archive in `/tmp` with the SUID-bash postinst. |
| `pack2theroot_suid_bash_drop` | `/tmp/.suid_bash` exists and is a real bash ELF. |
The page-cache overlay rules (`dirtydecrypt_payload_overlay`,
`fragnesia_payload_overlay`) are particularly high-signal: no
legitimate ELF starts with those exact 28 bytes, so a hit means the
exploit landed.
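When the list of sensitive paths grows, a small wrapper keeps the scan readable. This is a sketch under two assumptions: that the `yara` CLI treats only its final argument as the scan target (so each path needs its own invocation), and that missing paths should be skipped quietly. The function name `scan_targets` and the `YARA_BIN` override are hypothetical conveniences, not part of SKELETONKEY or YARA.

```bash
# scan_targets RULES TARGET...
# Runs one yara invocation per existing target and prints any matches.
# YARA_BIN (hypothetical override, handy for dry runs) defaults to "yara".
# The body runs in a subshell so its variables do not leak to the caller.
scan_targets() (
  rules=$1
  shift
  for target in "$@"; do
    # Skip paths that do not exist on this host.
    [ -e "$target" ] || continue
    "${YARA_BIN:-yara}" -r "$rules" "$target"
  done
)

# Example (not run here):
#   scan_targets /etc/yara/skeletonkey.yar /etc/passwd /tmp /usr/bin/su
```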
### Falco runtime detection
Falco catches the exploit **as it fires** by hooking syscalls and
namespace events. It is best suited to K8s / container hosts but works
on any modern Linux:
```bash
sudo skeletonkey --detect-rules --format=falco \
| sudo tee /etc/falco/rules.d/skeletonkey.yaml
sudo falco --validate /etc/falco/rules.d/skeletonkey.yaml
sudo systemctl reload falco # or restart, depending on distro
```
What each rule catches:
| Rule | Triggers on |
|---|---|
| `Pwnkit-style pkexec invocation` | `pkexec` spawned with empty argv (the bug's hallmark). |
| `Pwnkit-style GCONV_PATH injection` | Non-root sets `GCONV_PATH=` / `CHARSET=` before spawning a setuid binary. |
| `AF_ALG authenc keyblob installed by non-root` | `socket(AF_ALG)` by non-root — Copy Fail / GCM variant primitive. |
| `XFRM NETLINK_XFRM bind from unprivileged userns` | XFRM SA setup from non-root userns — Dirty Frag / Fragnesia primitive. |
| `/etc/passwd modified by non-root` | Post-fire signal for the whole page-cache-write family. |
| `Dirty Pipe splice from setuid/sensitive file by non-root` | `splice()` of `/etc/passwd` or `/usr/bin/su` by non-root. |
| `AF_RXRPC socket created by non-root` | DirtyDecrypt primitive — `socket(AF_RXRPC)` is nearly unheard-of in production. |
| `rxrpc security key added` | `add_key("rxrpc", …)` by non-root — DirtyDecrypt handshake setup. |
| `TCP_ULP=espintcp set by non-root` | Fragnesia trigger — flipping a TCP socket to espintcp ULP. |
| `SUID bash dropped to /tmp` | Pack2TheRoot postinst landing `/tmp/.suid_bash`. |
| `dpkg invoked by PackageKit on behalf of non-root caller` | Pack2TheRoot chain — `packagekitd → dpkg` installing a /tmp `.pk-*.deb`. |
## Day-to-day operational shape
### What "good" looks like in the SIEM
- Daily `skeletonkey --scan --json` from every host indexed
- Trend dashboard: count of VULNERABLE results by CVE over time
- Goal: every VULNERABLE → OK transition within SLA (e.g., 14 days for
patched-mainline bugs, 24h for actively-exploited)
- Alert on: any host with a result not seen yesterday (could indicate
a config drift, a new install, or a disabled mitigation)
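
The "result not seen yesterday" alert can be prototyped on a single host before building it in the SIEM. A minimal sketch, assuming each day's `--scan --json` output is saved to its own file and that entries in `modules` carry `name` and `result` fields. The function name `scan_delta` and the file names are hypothetical.

```bash
# scan_delta YESTERDAY.json TODAY.json
# Prints "module<TAB>result" lines present today but absent yesterday,
# i.e. new modules and modules whose result changed.
# Runs in a subshell so the temp-file variable stays local.
scan_delta() (
  filter='.modules[] | [.name, .result] | @tsv'
  old=$(mktemp) || exit 1
  jq -r "$filter" "$1" | sort > "$old"
  # comm -13 keeps only lines unique to the second input (today, on stdin).
  jq -r "$filter" "$2" | sort | comm -13 "$old" -
  rm -f "$old"
)

# Example (not run here):
#   scan_delta scan-2026-05-15.json scan-2026-05-16.json
```

An empty result means nothing changed since the previous day; any line is a candidate for the config-drift investigation described above.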
### Auditd events from the embedded rules
After deploying `skeletonkey --detect-rules --format=auditd`:
```bash
# By module key
sudo ausearch -k skeletonkey-copy-fail -ts today
sudo ausearch -k skeletonkey-dirty-pipe -ts today
sudo ausearch -k skeletonkey-pwnkit -ts today
sudo ausearch -k skeletonkey-nf-tables-userns -ts today
sudo ausearch -k skeletonkey-overlayfs -ts today
# Anything skeletonkey-tagged in the last 10 minutes ("recent").
# ausearch -k matches substrings, not shell globs, so search on the prefix.
sudo ausearch -k skeletonkey- -ts recent
# Forward to syslog (rsyslog example)
# /etc/rsyslog.d/skeletonkey.conf:
:msg, contains, "skeletonkey-" @@your-siem.example.com:514
```
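To see at a glance which rule keys are firing most, the `ausearch` output can be tallied per key. A minimal sketch, assuming records carry the rule key as `key="skeletonkey-..."` (raw format) or `key=skeletonkey-...` (interpreted format). The function name `count_by_key` is hypothetical.

```bash
# count_by_key: read audit records on stdin and print "count key" for
# every skeletonkey-* rule key, busiest first.
count_by_key() {
  grep -oE 'key="?skeletonkey-[A-Za-z0-9_-]+' \
    | sed 's/^key=//; s/^"//' \
    | sort | uniq -c | sort -rn \
    | awk '{ print $1, $2 }'
}

# Example (not run here):
#   sudo ausearch -k skeletonkey- -ts today | count_by_key
```

A key that suddenly dominates the tally is a good first candidate for the tuning filters discussed later in this playbook.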
### When a VULNERABLE result fires
Decision tree:
```
A scan reports VULNERABLE for module X
├── Q: Can I patch the underlying kernel / package?
│   ├── YES → schedule patch window. In the meantime:
│   │         skeletonkey --mitigate X (if supported)
│   │         Verify auditd rule for X is loaded.
│   │         Monitor for the rule key.
│   └── NO (legacy LTS, embedded device, prod freeze) →
│             skeletonkey --mitigate X (essential)
│             Compensating control: tighten LSM (SELinux/AppArmor)
│             Document in risk register
└── Q: Was this VULNERABLE before? When?
    ├── First time → config drift; investigate why detection now
    │                produces this result
    └── Persistent → mitigation isn't applied OR is being reverted
                     by config management; fix the config baseline
```
### Mitigation reverts
Mitigations can break legitimate functionality:
| Mitigation | Side effect |
|---|---|
| `copy_fail` blacklist algif_aead | strongSwan / IPsec breaks |
| `copy_fail` blacklist esp4/esp6 | IPsec breaks |
| `copy_fail` blacklist rxrpc | AFS / kAFS clients break |
| `copy_fail` AppArmor restrict userns=1 | bubblewrap, podman rootless break |
If you applied a mitigation and now need to revert (e.g., the kernel
patch has rolled out fleet-wide):
```bash
sudo skeletonkey --cleanup copy_fail
# OR manually:
sudo rm /etc/modprobe.d/dirtyfail-mitigations.conf
sudo rm /etc/sysctl.d/99-dirtyfail-mitigations.conf
# Reload affected modules / sysctls per your distro
```
## Per-module detection coverage
Across the 4 rule formats:
| Module | CVE | auditd | sigma | yara | falco |
|---|---|:-:|:-:|:-:|:-:|
| copy_fail | CVE-2026-31431 | ✓ | ✓ | ✓ | ✓ |
| copy_fail_gcm | (variant) | ✓ | ✓ | ✓ | ✓ |
| dirty_frag_esp | CVE-2026-43284 | ✓ | ✓ | ✓ | ✓ |
| dirty_frag_esp6 | CVE-2026-43284 | ✓ | ✓ | ✓ | ✓ |
| dirty_frag_rxrpc | CVE-2026-43500 | ✓ | ✓ | ✓ | ✓ |
| dirty_pipe | CVE-2022-0847 | ✓ | ✓ | ✓ | ✓ |
| dirtydecrypt | CVE-2026-31635 | ✓ | ✓ | ✓ | ✓ |
| fragnesia | CVE-2026-46300 | ✓ | ✓ | ✓ | ✓ |
| pwnkit | CVE-2021-4034 | ✓ | ✓ | ✓ | ✓ |
| pack2theroot | CVE-2026-41651 | ✓ | ✓ | ✓ | ✓ |
| Other 21 modules | various | ✓ | partial | — | — |
Full 4-format coverage on the 10 highest-value modules; auditd
covers everything. YARA / Falco expansion to the remaining 21 modules
is incremental contributor work (each module's `detect_yara` /
`detect_falco` field in the module struct just needs a string).
## Correlation across formats
Single-format detections are useful; the high-confidence signal is
the **correlation across formats** for the same module in a short
window. Each exploit leaves a recognisable multi-format trail:
| Exploit | falco fires | auditd fires | yara confirms |
|---|---|---|---|
| Pwnkit | `pkexec` empty argv | `execve /usr/bin/pkexec` + `GCONV_PATH=` env | gconv-modules cache in /tmp |
| Dirty Pipe | `splice()` from `/etc/passwd` | splice + write to `/etc/passwd` | UID flip in `/etc/passwd` |
| Copy Fail | `socket(AF_ALG)` | algif_aead + `ALG_SET_KEY` | UID flip in `/etc/passwd` |
| Dirty Frag (ESP) | NETLINK_XFRM sendto + TCP_ULP | XFRM_MSG_NEWSA | UID flip in `/etc/passwd` |
| DirtyDecrypt | `socket(AF_RXRPC)` + `add_key(rxrpc)` | AF_RXRPC + add_key | 120-byte ELF overwrites `/usr/bin/su` |
| Fragnesia | `TCP_ULP=espintcp` from non-root | XFRM + setsockopt(TCP_ULP) | 192-byte ELF overwrites `/usr/bin/su` |
| Pack2TheRoot | dpkg invoked by packagekitd with /tmp/.pk-*.deb | new `.deb` in `/tmp` + `chmod 4755` on `/tmp/.suid_bash` | malicious `.deb` + SUID bash both present |
If **three of the four signals** (falco, auditd, yara, plus the sigma
rule in your SIEM) fire for the same module in the same window, the
exploit landed. **One signal alone** in a noisy environment is more
likely a tuning false positive; **three signals** warrant incident
response.
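
The three-signal rule can be prototyped outside the SIEM with a sliding-window count. This is a sketch under stated assumptions: alerts have been exported as whitespace-separated lines of `epoch_seconds source module`, where `source` is one of the rule formats. That export format, the function name `correlate_signals`, and the 300-second default window are all hypothetical.

```bash
# correlate_signals [WINDOW_SECONDS] < events
# Input lines: "<epoch> <source> <module>", e.g. "1716000000 falco pwnkit".
# Prints "<module> <n>" the first time n >= 3 distinct sources have fired
# for that module within WINDOW_SECONDS of each other.
correlate_signals() {
  sort -n | awk -v w="${1:-300}" '
    {
      t = $1; src = $2; mod = $3
      last[mod, src] = t          # most recent time this source fired
      srcs[src] = 1               # every source seen so far
      n = 0
      for (s in srcs)
        if ((mod, s) in last && t - last[mod, s] <= w) n++
      if (n >= 3 && !(mod in flagged)) { flagged[mod] = 1; print mod, n }
    }'
}

# Example (not run here):
#   correlate_signals 600 < exported-alerts.txt
```

Because events are processed in time order, checking each source's most recent timestamp against the current event is enough to find any window that contains three distinct sources.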
## Worked example: catching DirtyDecrypt end-to-end
A SOC operator gets a Falco page:
```
CRITICAL AF_RXRPC socket() by non-root (user=alice proc=poc pid=44231)
```
1. **Confirm via auditd.** Pull events keyed on the family:
   ```bash
   sudo ausearch -k skeletonkey-dirtydecrypt-rxrpc -ts recent
   ```
   Expect: `socket(...,33,...)` (33 is `AF_RXRPC`) + subsequent `add_key("rxrpc",...)`.
2. **Confirm via yara.** Scan setuid binaries for the page-cache
   overlay (yara takes one scan target per invocation):
   ```bash
   for f in /usr/bin/su /usr/bin/passwd; do yara /etc/yara/skeletonkey.yar "$f"; done
   ```
   If `dirtydecrypt_payload_overlay` matches `/usr/bin/su`, **the
   exploit landed**: the binary's page cache has been overwritten
   with the 120-byte shellcode.
3. **Recover.** The on-disk binary is intact; only the page cache is
   corrupted. Drop it:
   ```bash
   sudo skeletonkey --cleanup dirtydecrypt # or: echo 3 | sudo tee /proc/sys/vm/drop_caches
   ```
4. **Sigma hunt for lateral / repeat.** Query your SIEM with the
   sigma rule ID `7c1e9a40-skeletonkey-dirtydecrypt` over the last 7
   days to find any other hosts.
5. **Patch.** DirtyDecrypt's mainline fix is commit `a2567217` in
   Linux 7.0. See [`CVES.md`](../CVES.md) for distro backports.
6. **Harden.** `rxrpc` is rarely needed on non-AFS hosts:
   ```bash
   echo "blacklist rxrpc" | sudo tee /etc/modprobe.d/blacklist-rxrpc.conf
   sudo update-initramfs -u   # Debian/Ubuntu; on RHEL-family hosts use: sudo dracut -f
   ```
The same shape applies to every module: pick the auditd key, the
yara rule for the artifact, the falco rule for the runtime signal,
and the sigma rule for the hunt.
## Common false positives + tuning
| Rule key | False positive | Fix |
|---|---|---|
| `skeletonkey-copy-fail-afalg` | strongSwan, libcrypto using kernel crypto | `-F auid=` exclude service account UIDs |
| `skeletonkey-dirty-pipe-splice` | nginx, HAProxy, kTLS | `-F gid!=33 -F gid!=99` exclude web service accounts |
| `skeletonkey-pwnkit-execve` | gnome-software, polkit's own re-exec | Correlate by parent process; pkexec via gnome dbus is benign |
| `skeletonkey-nf-tables-userns` | docker rootless, podman, snap confined apps | Whitelist known userns-using service GIDs |
| `skeletonkey-overlayfs` | docker / containerd mounting overlayfs as root | The rule is intended for unprivileged-userns overlayfs mounts; add `-F auid>=1000` |
## Pre-patch quarantine pattern
If a CVE is in active exploitation and you can't patch immediately:
```bash
# Stage 1: detect
sudo skeletonkey --scan --json | jq '.modules[] | select(.cve == "CVE-XXXX")'
# Stage 2: mitigate (where supported)
sudo skeletonkey --mitigate <module>
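# Stage 3: monitor (auditd rules already deployed)
# ausearch -k matches substrings, not globs, so search on the prefix
sudo ausearch -k skeletonkey- -ts today | grep '<module>'
# Stage 4: contain by temporarily restricting the trigger surface
# e.g., for nf_tables CVE-2024-1086 (distro-specific knob; Debian/Ubuntu kernels):
echo 0 | sudo tee /proc/sys/kernel/unprivileged_userns_clone
# OR (Ubuntu kernels that ship the AppArmor userns restriction)
sudo sysctl -w kernel.apparmor_restrict_unprivileged_userns=1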
# Stage 5: alert
# When auditd or sigma rule fires, page on-call
```
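Because both Stage 4 knobs are distro-specific, it is worth confirming which ones the host actually exposes before relying on them. A minimal sketch that only reads the two sysctls named above; the function name `userns_knobs` and its optional root-directory argument are hypothetical (the argument exists so the check can be exercised against a test directory).

```bash
# userns_knobs [SYSCTL_ROOT]
# Prints "knob=value" for each userns containment sysctl, or
# "knob=absent" when the running kernel does not expose it.
# SYSCTL_ROOT defaults to /proc/sys. Runs in a subshell to keep
# its variables local.
userns_knobs() (
  root=${1:-/proc/sys}
  for knob in kernel/unprivileged_userns_clone \
              kernel/apparmor_restrict_unprivileged_userns; do
    if [ -r "$root/$knob" ]; then
      printf '%s=%s\n' "$knob" "$(cat "$root/$knob")"
    else
      printf '%s=absent\n' "$knob"
    fi
  done
)

# Example (not run here):
#   userns_knobs
```

Note that values written with `tee` or `sysctl -w` do not survive a reboot, so re-run the check after any restart during the quarantine period.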
## Maintenance contract
When SKELETONKEY ships a new module:
1. CI test passes on at least one vulnerable + patched kernel pair
2. Detection rules ship alongside (auditd + sigma minimum)
3. CVES.md row added with patch status
4. NOTICE.md credits original researcher
5. ROADMAP.md updated
Treat these as the SLA for any blue-team-facing deliverable.
## When you find a new false positive
File an issue at https://github.com/KaraZajac/SKELETONKEY/issues with:
- The exact ausearch line that fired
- The legitimate process that produced it
- Distro / kernel version
Most false-positive fixes are a `-F` filter on the embedded rule —
small, mergeable.