Every kernel-LPE module that uses Linux-only headers (splice, posix_fadvise,
linux/netlink.h, sys/ptrace.h, etc.) now follows the same #ifdef __linux__
pattern the new modules already used: Linux body in the ifdef, stub
detect/exploit/cleanup returning SKELETONKEY_PRECOND_FAIL on non-Linux,
platform-neutral rule strings + module struct + register fn left outside.
14 modules wrapped:
dirty_pipe (already done above), af_packet, af_packet2,
cgroup_release_agent, cls_route4, dirty_cow, fuse_legacy,
netfilter_xtcompat, nf_tables, nft_fwd_dup, nft_payload,
overlayfs, overlayfs_setuid, ptrace_traceme.
Several modules previously had ad-hoc partial stubs (af_packet2 faked
SIOCSIFFLAGS/MAP_LOCKED, netfilter_xtcompat faked sysv-msg syscalls,
the nft_* modules had 3 partial __linux__ islands each, fuse_legacy /
nf_tables had inner-only ifdef blocks) — all replaced with the uniform
outer-wrap shape from dirty_pipe / dirtydecrypt / fragnesia / pack2theroot.
Where a module includes core/kernel_range.h, core/finisher.h, or
core/offsets.h, those are now inside the ifdef block as well — silences
clangd's "unused-includes" LSP warning on macOS while keeping them
present for the real Linux build.
No exploit logic, constant, struct, shellcode byte, or rule string was
modified — only include placement and ifdef markers.
Build verification:
macOS (local): make clean && make → Mach-O x86_64, 31 modules
registered, --scan reports each Linux-only module as
"Linux-only module — not applicable here".
Linux (docker gcc:latest + libglib2.0-dev): make clean && make →
ELF 64-bit, 31 modules. Exploit code paths unchanged.
The README has been claiming "each module credits the original CVE
reporter and PoC author in its NOTICE.md" since v0.1.0, but only
copy_fail_family actually shipped one. Fixed.
modules/<name>/NOTICE.md (×19 new + 1 existing): per-module
research credit covering CVE ID, discoverer, original advisory
URL where public, upstream fix commit, IAMROOT's role.
iamroot.c: new --dump-offsets subcommand. Resolves kernel offsets
via the existing core/offsets.c four-source chain (env →
/proc/kallsyms → /boot/System.map → embedded table), then emits
a ready-to-paste C struct entry for kernel_table[]. Run once
as root on a target kernel build; upstream via PR. Eliminates
fabricating offsets — every shipped entry traces back to a
`iamroot --dump-offsets` invocation on a real kernel.
docs/OFFSETS.md: documents the --dump-offsets workflow.
CVES.md: notes the NOTICE.md convention + offset dump tool.
iamroot.c: bump IAMROOT_VERSION 0.3.0 → 0.3.1.
Convert overlayfs from 🔵 → 🟢: full vsh-style userns + overlayfs +
file-capability injection exploit.
Sequence:
1. mkdtemp workdir; gcc-compile a minimal payload that
setresuid(0,0,0) + execle(/bin/sh, -p)
2. fork child; child unshares(CLONE_NEWUSER | CLONE_NEWNS),
writes /proc/self/{setgroups,uid_map,gid_map} mapping outer uid
to userns-root
3. child mounts overlayfs with lower/upper/work layout
4. child copies payload binary into merged/payload — this writes
to host's upper/payload via the overlay
5. child writes security.capability xattr with VFS_CAP_REVISION_2
blob granting cap_setuid+ep on merged/payload — the BUG persists
this xattr to the host fs entry
6. child exits; parent verifies xattr via getxattr on upper/payload
7. parent execve's upper/payload from outside userns → has
cap_setuid effective → setuid(0) → /bin/sh -p with uid=0
- libcap-less setcap: build VFS_CAP_REVISION_2 blob in-place
(cap_setuid bit 7, cap_setgid bit 6, effective flag set in
magic_etc), write via setxattr(security.capability).
- which_gcc() fallback to /usr/bin/cc, /bin/gcc, etc.; tries
-static first, falls back to dynamic link if static unavailable.
- Re-runs detect() to refuse on patched / non-Ubuntu hosts.
- Cleanup on failure: rmdir/unlink the workdir tree.
- Removed unused write_uid_gid_map() helper (logic now inline in
child since we self-write the maps post-unshare).
Verified end-to-end on Debian kctf-mgr:
iamroot --exploit overlayfs --i-know
→ 'not Ubuntu — bug is Ubuntu-specific' → 'refusing'. Correct.
Path buffers oversized vs. mkdtemp template to silence GCC
-Wformat-truncation noise.
CVES.md: overlayfs 🔵 → 🟢.
10th module. Ubuntu-specific userns + overlayfs LPE that injects file
capabilities cross-namespace.
- modules/overlayfs_cve_2021_3493/iamroot_modules.{c,h}:
- is_ubuntu() — parses /etc/os-release for ID=ubuntu or
ID_LIKE=ubuntu. Non-Ubuntu hosts get IAMROOT_OK immediately (the
bug is specific to Ubuntu's modified overlayfs).
- unprivileged_userns_clone gate — sysctl=0 → PRECOND_FAIL
- Active probe (--active): forks a child that enters userns +
mountns and attempts the overlayfs mount inside /tmp. Mount
success on Ubuntu = VULNERABLE. Mount denied = patched / AppArmor
block. Child-isolated so parent's namespace state is untouched.
- Version fallback: kernel < 5.13 = vulnerable-by-inference for
Ubuntu kernels; recommend --active for confirmation.
- Exploit: detect-only stub. Reference vsh's exploit-cve-2021-3493
for full version (mount overlayfs in userns, drop binary with
cap_setuid+ep into upper layer, re-exec outside ns).
- Embedded auditd rules: mount(overlay) syscall + security.capability
xattr writes (the exploit's two-step footprint).
Verified end-to-end on kctf-mgr (Debian):
iamroot --scan → 'not Ubuntu — bug is Ubuntu-specific' → IAMROOT_OK
Module count: 10. Active-probe pattern now applies to dirty_pipe,
entrybleed, and overlayfs (and copy_fail_family via existing
dirtyfail_active_probes global). Detect quality across the corpus
materially improved this session.