Files
SKELETONKEY/modules/copy_fail_family/docs/RESEARCH.md
T

325 lines
15 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
# DIRTYFAIL — research notes
This document captures kernel-source audits and analysis adjacent to
the published CVEs (CVE-2026-31431 / CVE-2026-43284 / CVE-2026-43500).
It's a living research log, not a vendor advisory: findings here are
based on reading mainline kernel source and the disclosed write-ups,
and may need re-verification as the kernel evolves.
---
## §1. Adjacent kernel paths — audit for the same skb_cow_data() bypass pattern
### TL;DR
Ten kernel paths beyond the published CVEs were audited for the
same in-place-AEAD-over-splice-pinned-pages bug class. **All ten
are structurally immune.** No undisclosed CVE candidates surfaced
in this audit; the bug class is genuinely tightly scoped to the
three published sinks plus the algif_aead authencesn/rfc4106-gcm
primitives.
### The vulnerable pattern
The CVE-2026-43284-class bug requires all four of:
1. **In-place AEAD**`aead_request_set_crypt(req, src, dst, ...)`
where `src == dst` or the scatterlists alias the same memory.
2. **Conditional skip-COW** — input handler has a branch that bypasses
`skb_cow_data()` on certain skb shapes (typically: non-linear with
no frag_list).
3. **`skb_to_sgvec` over skb frags** — the scatterlist passed to the
AEAD is built directly from the skb's frags, so splice-pinned page
references end up in it.
4. **Userspace path to the skb's frags**`splice(2)`, `sendfile(2)`,
or `sendmsg(MSG_SPLICE_PAGES)` can deliver attacker-controlled
page-cache pages into those frags.
Removing any one of the four breaks the chain. The published CVEs are
the three sinks where all four conditions align (esp_input, esp6_input,
rxkad_verify_packet_1) plus the algif_aead authencesn / rfc4106-gcm
primitives that share the in-place destination scatterlist pattern.
### §1.1 Path-by-path verdict
| Path | In-place crypto? | skb_cow_data | Splice-reachable? | Verdict |
|---|---|---|---|---|
| esp_input (esp4) | ✅ | conditional skip | yes | **CVE-2026-43284** (patched) |
| esp6_input | ✅ | conditional skip | yes | **CVE-2026-43284 v6** (patched) |
| algif_aead authencesn | ✅ | n/a (different path) | yes via splice→AF_ALG | **CVE-2026-31431** (patched) |
| algif_aead rfc4106-gcm | ✅ | n/a | yes | **Copy Fail GCM variant** (patched as side-effect of CF revert) |
| rxkad_verify_packet_1 | ✅ | conditional skip | yes via RxRPC handshake | **CVE-2026-43500** (NOT patched as of 2026-05-09) |
| **ah_input (ah4 + ah6)** | ✅ (HMAC, not decrypt) | **UNCONDITIONAL** | n/a | NOT vulnerable — structurally immune |
| **ipcomp_input** | ❌ (decompress, separate output pages) | conditional skip | n/a (output is fresh page) | NOT vulnerable — separate dst |
| **macsec_decrypt** | ✅ | **UNCONDITIONAL** | no — rx skbs come from netdev | NOT vulnerable — structurally immune |
| **tls_sw recv decrypt** | ✅ | unconditional, also rx-only | no — rx skbs come from TCP rx ring | NOT vulnerable |
| **tls_sw send encrypt + MSG_SPLICE_PAGES** | YES (read-only on user pages) | n/a (msg_en allocated separately) | yes (msg_pl) but only as src | NOT vulnerable — separate src/dst |
| **WireGuard `decrypt_packet`** | ✅ ChaCha20Poly1305 in-place | **UNCONDITIONAL** at line 252 | yes via UDP rx (but COW protects) | NOT vulnerable — structurally immune |
| **algif_skcipher `_skcipher_recvmsg`** | ✅ symmetric in-place possible | n/a (different module structure) | src yes (TX SGL), dst no (recv iovec) | NOT vulnerable — separate src/dst |
| **espintcp** (ESP-in-TCP) | n/a (delegates) | n/a | reaches esp_input via xfrm_rcv_encap | inherits f4c50a4034e6 patch — NOT a new CVE |
| **OpenVPN kernel offload `ovpn_aead_decrypt`** | ✅ AEAD in-place | **UNCONDITIONAL** at line 210 | yes via UDP rx (but COW protects) | NOT vulnerable — structurally immune |
| **SCTP-AUTH `sctp_auth_calculate_hmac`** | HMAC only (no decrypt, no destination write into skb data frags) | n/a | n/a — digest writes to auth chunk header (kernel-allocated), not data frags | NOT vulnerable — read-only over data |
### §1.2 Eliminated paths — why each is immune
**`ah_input` (net/ipv4/ah4.c, net/ipv6/ah6.c)** — IPsec Authentication
Header. Calls `skb_cow_data(skb, 0, &trailer)` UNCONDITIONALLY before
`skb_to_sgvec_nomark` builds the HMAC scatterlist. No skip-cow branch.
Splice-pinned pages would always be copied into a private buffer
before HMAC verification.
**`xfrm_ipcomp.c`** — IPCOMP decompression has a conditional skip-cow
branch, but the output is allocated as a fresh kernel page
(`alloc_page(GFP_ATOMIC)`) and the destination scatterlist `dsg` is
built separately from the input scatterlist `sg`. Even with
splice-pinned input pages, decompression output goes to fresh pages.
Not in-place over input.
**`macsec_decrypt` (drivers/net/macsec.c)** — MACsec receive AEAD.
Calls `skb_cow_data(skb, 0, &trailer)` unconditionally before
`skb_to_sgvec` and the in-place decrypt. Additionally: macsec rx
skbs come from netdev rx, not from userspace splice — the attacker
has no path to plant a page-cache page reference.
**`tls_sw_recvmsg` (net/tls/tls_sw.c)** — kTLS receive AEAD.
kernel.org docs: "To decrypt 'in place' kTLS calls skb_cow_data()."
COW is unconditional on the rx path. Additionally: TLS rx skbs come
from the TCP rx queue, not from splice — the only way a user can put
a page-cache page reference into a TCP rx skb is via rare
`SO_PEEK_OFF` / `MSG_PEEK` paths or kernel-side socket forwarding,
neither of which gives the attacker control.
### §1.3 kTLS send via MSG_SPLICE_PAGES — closest near-miss
The kTLS *send* path was modified in 2023 ("splice, net: Handle
MSG_SPLICE_PAGES in AF_TLS", LWN 933386) to support
`MSG_SPLICE_PAGES`, which is the same primitive Dirty Frag and Copy
Fail abuse. This was the most plausible adjacent candidate.
**Resolved: not vulnerable.** Direct reading of `net/tls/tls_sw.c`:
- `tls_sw_sendmsg_splice()` adds the user's spliced pages to `msg_pl`
(the plaintext sk_msg buffer) via `sk_msg_page_add()`.
- `tls_alloc_encrypted_msg()` calls
`sk_msg_alloc(sk, msg_en, len, 0)`**fresh kernel pages** for the
encrypted buffer.
- `tls_push_record()` chains the scatterlists:
```c
sg_chain(rec->sg_aead_out, 2, &msg_en->sg.data[i]);
```
- `tls_do_encryption()`:
```c
aead_request_set_crypt(aead_req, rec->sg_aead_in,
rec->sg_aead_out, data_len, rec->iv_data);
```
- `sg_aead_in` (chained from msg_pl, contains user's spliced page)
≠ `sg_aead_out` (chained from msg_en, kernel-allocated pages).
The encrypt READS the user's spliced /etc/passwd page but WRITES
ciphertext to `msg_en`'s kernel-allocated pages. The user's
page-cache page is never modified. This is exactly the defense the
algif_aead patch (a664bf3d603d) implemented when it reverted to
out-of-place AEAD; kTLS has had it from inception.
Compare to the vulnerable `esp_input` pattern:
```c
/* vulnerable: src == dst */
skb_to_sgvec(skb, sg, ...);
aead_request_set_crypt(req, sg, sg, ...);
```
```c
/* safe: src ≠ dst */
sg_chain(sg_aead_in, ..., msg_pl); /* user spliced pages */
sg_chain(sg_aead_out, ..., msg_en); /* kernel private pages */
aead_request_set_crypt(req, sg_aead_in, sg_aead_out, ...);
```
### §1.3a WireGuard receive — `decrypt_packet()`
ChaCha20Poly1305 in-place AEAD on incoming UDP skbs. Confirmed
**not vulnerable** — `drivers/net/wireguard/receive.c:232277`:
```c
static bool decrypt_packet(struct sk_buff *skb, struct noise_keypair *keypair)
{
struct scatterlist sg[MAX_SKB_FRAGS + 8];
/* ... */
offset = -skb_network_offset(skb);
skb_push(skb, offset);
num_frags = skb_cow_data(skb, 0, &trailer); /* line 252, UNCONDITIONAL */
/* ... */
sg_init_table(sg, num_frags);
if (skb_to_sgvec(skb, sg, 0, skb->len) <= 0)
return false;
if (!chacha20poly1305_decrypt_sg_inplace(sg, skb->len, NULL, 0,
PACKET_CB(skb)->nonce,
keypair->receiving.key))
return false;
```
`skb_cow_data` at line 252 is UNCONDITIONAL — no skip-cow branch. By
the time the in-place AEAD runs, any splice-pinned pages have already
been copied into kernel-private pages. Same defensive pattern as
AH, MACsec, kTLS rx.
### §1.3b algif_skcipher — `_skcipher_recvmsg()`
The companion module to algif_aead, exposing symmetric ciphers
(AES-CBC, AES-CTR, etc.) over AF_ALG. Same author and patchset era
as the in-place optimization that introduced Copy Fail (2017,
72548b093ee3); the Copy Fail upstream fix only reverted algif_aead,
so worth verifying algif_skcipher independently.
`crypto/algif_skcipher.c:151152`:
```c
skcipher_request_set_crypt(&areq->cra_u.skcipher_req, areq->tsgl,
areq->first_rsgl.sgl.sgt.sgl, len, ctx->iv);
```
- `areq->tsgl` = TX SGL, populated via `af_alg_pull_tsgl()`. CAN
contain user-spliced page-cache pages (sendmsg + splice path).
- `areq->first_rsgl.sgl.sgt.sgl` = RX SGL, populated via
`af_alg_get_rsgl(sk, msg, ...)` from the user's `recv()` iovec,
via `iov_iter_get_pages` mapping the calling process's anonymous
memory.
The cipher operation reads from `tsgl` (potentially user-spliced
page-cache pages) and writes to `rsgl` (user's recv buffer in their
own anonymous memory). **src ≠ dst; output never lands on
splice-pinned page-cache pages.**
Why this differs from algif_aead's Copy Fail: the algif_aead bug was
specifically about the `authencesn` template internally chaining TAG
pages into the destination SGL extension (`req->dst` extends past
the end of `req->src`'s last page into chained tag pages, which
happen to be the source's spliced pages). Plain skcipher has no AEAD
tags, no chained scratch — clean src/dst separation. **Not
vulnerable.**
### §1.3c espintcp — IPsec ESP over TCP
`net/xfrm/espintcp.c` is a *transport-layer wrapper* — it does no
cryptographic work itself. The `handle_esp()` function delegates
straight to `xfrm6_rcv_encap` / `xfrm4_rcv_encap`, which call into
the standard `esp_input()` / `esp6_input()` handlers. Any skb that
reaches the ESP path through espintcp is processed by the same code
that was patched by f4c50a4034e6 (SKBFL_SHARED_FRAG check).
**Verdict: not a separate CVE.** On unpatched kernels, espintcp is
just an alternative transport for the existing CVE-2026-43284 sink
(esp_input). On patched kernels the same fix covers both UDP and TCP
encapsulation. The SHARED_FRAG flag is set wherever splice can plant
pages into TCP send buffers, and the producer-side flagging
propagates through TCP into the espintcp path.
### §1.3d OpenVPN kernel offload — `ovpn_aead_decrypt()`
New module in 6.16+ implementing OpenVPN's data channel
(ChaCha20Poly1305 / AES-GCM) in the kernel. Receive AEAD path is in
`drivers/net/ovpn/crypto_aead.c`:
```c
/* line ~210 */
nfrags = skb_cow_data(skb, 0, &trailer); /* UNCONDITIONAL */
/* ... */
/* line ~228 */
skb_to_sgvec_nomark(skb, sg + 1, payload_offset, payload_len);
/* ... */
/* line ~239 */
aead_request_set_crypt(req, sg, sg, payload_len + tag_size, iv);
```
In-place AEAD (`sg, sg`) — but `skb_cow_data()` is called
unconditionally before `skb_to_sgvec_nomark` builds the scatterlist.
Splice-pinned pages always copied to kernel-private memory before
the AEAD runs. **Not vulnerable.** Same defensive pattern as
WireGuard, AH, MACsec, kTLS rx.
### §1.3e SCTP-AUTH HMAC validation
`net/sctp/auth.c:sctp_auth_calculate_hmac()` (lines 606642) computes
HMAC over an SCTP AUTH chunk:
```c
data_len = skb_tail_pointer(skb) - (unsigned char *)auth;
digest = (u8 *)(&auth->auth_hdr + 1);
hmac_sha1_usingrawkey(asoc_key->data, asoc_key->len,
(const u8 *)auth, data_len, digest);
```
The HMAC is computed READ-ONLY over the skb's chunk data. The
digest output is written to the auth chunk's digest field
(`&auth->auth_hdr + 1`), which on the SEND path lives in
kernel-allocated chunk header memory — not in any user-spliced
data fragment. On the RECEIVE path, verification computes HMAC
over received data and compares to the sender-provided digest in a
private buffer — pure read.
The bug class requires a kernel-side WRITE to a splice-pinned page;
SCTP-AUTH only ever READS from skb data and writes the digest to a
kernel-allocated chunk header. **Not vulnerable.**
### §1.4 The protective patterns that distinguish safe from vulnerable
Every safe path on the list achieves immunity through one of three
mechanisms, each of which removes one of the four required conditions:
1. **Unconditional `skb_cow_data()`** before any in-place crypto —
AH, MACsec, kTLS rx. (Removes condition 2.)
2. **Separate destination scatterlist** allocated from kernel-private
pages — kTLS tx, IPCOMP, post-patch algif_aead.
(Removes condition 1.)
3. **The in-place crypto target is fundamentally not a splice-able
skb** — kTLS rx skbs come from TCP rx, not user splice.
(Removes condition 4.)
### §1.5 Out-of-scope or low-value candidates
The candidates that remained after §1.3a-e were all eliminated as
not worth a deeper audit:
- **AF_SMC encryption** — uses kTLS/ULP underneath, already covered
by the kTLS audit (§1.3 / §1.4b).
- **io_uring crypto extensions** — would inherit AF_ALG semantics,
already covered by the algif_skcipher audit (§1.3b).
- **Bluetooth CMTP/HIDP crypto** — privileged-only (HCI device
access), not an unprivileged-LPE vector.
- **Kernel TLS NIC offload** — encryption runs on the NIC firmware,
different threat surface entirely (firmware-side bug, not
page-cache-write).
- **dm-crypt / fscrypt** — block-layer / filesystem-layer
encryption. Different threat model; user can't splice arbitrary
page-cache pages into block requests in any meaningful way.
### §1.6 Methodology
For each candidate path, read the input handler and ask:
1. Does it call `skb_cow_data()` BEFORE building the AEAD
scatterlist?
2. Is there a conditional branch (typically based on `skb_cloned`,
`skb_has_frag_list`, `skb_is_nonlinear`) that bypasses (1)?
3. Is the resulting scatterlist used as BOTH src AND dst of
`aead_request_set_crypt()` / equivalent?
4. Can a userspace primitive (`splice(2)`, `sendfile(2)`,
`sendmsg(MSG_SPLICE_PAGES)`, AF_ALG send) deliver
attacker-controlled pages into the input skb's frags?
All four must be true for the bug class to apply. A single "no" is
sufficient for "not vulnerable."
---
## §2. References
- V4bel/dirtyfrag write-up — [github.com/V4bel/dirtyfrag/blob/master/assets/write-up.md](https://github.com/V4bel/dirtyfrag/blob/master/assets/write-up.md)
- Theori/Xint Copy Fail disclosure — [xint.io/blog/copy-fail-linux-distributions](https://xint.io/blog/copy-fail-linux-distributions)
- LWN — Replace sendpage with sendmsg(MSG_SPLICE_PAGES) — [lwn.net/Articles/928487](https://lwn.net/Articles/928487/)
- LWN — Handle MSG_SPLICE_PAGES in AF_TLS — [lwn.net/Articles/933386](https://lwn.net/Articles/933386/)
- TLS 1.3 Rx improvements (Kicinski) — [people.kernel.org/kuba/tls-1-3-rx-improvements-in-linux-5-20](https://people.kernel.org/kuba/tls-1-3-rx-improvements-in-linux-5-20)
- 0xdeadbeefnetwork Copy_Fail2 (GCM variant) — [github.com/0xdeadbeefnetwork/Copy_Fail2-Electric_Boogaloo](https://github.com/0xdeadbeefnetwork/Copy_Fail2-Electric_Boogaloo)
- Linux source (torvalds/master) — `net/ipv4/ah4.c`, `net/ipv6/ah6.c`, `net/xfrm/xfrm_ipcomp.c`, `drivers/net/macsec.c`, `net/tls/tls_sw.c`