fix(httpsig): percent-encode non-token keys and body-part names #995

Open
timothynodes wants to merge 1 commit into permaweb:edge from timothynodes:fix/httpsig-weird-key-header-encoding

@timothynodes

Summary

Two related fixes for AO-Core message keys whose names are not valid HTTP header field names (spaces, emoji, uppercase, etc.).

hb_escape.erl

Add is_http_token/1 and encode_http_key/1. A key is percent-encoded only when it is not already a valid lowercase HTTP token, so already-valid keys stay byte-identical and human-readable on the wire while everything else becomes a legal header name. The transformation is reversed by the existing decode/1.
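The encode-only-when-needed rule can be illustrated with a short Python sketch. This is not the Erlang code from the PR: the accepted alphabet (lowercase letters, digits, `-`, `_`, `.`) is taken from the Compatibility note below, lowercase hex digits in the escapes are an assumption, and `decode_http_key` is a hypothetical stand-in for the existing decode/1.

```python
from urllib.parse import unquote

# Assumed "valid lowercase token" alphabet, taken from the PR's Compatibility
# note; the real hb_escape:is_http_token/1 may accept a different set.
TOKEN_CHARS = frozenset("abcdefghijklmnopqrstuvwxyz0123456789-_.")


def is_http_token(key: str) -> bool:
    """Return True if key is non-empty and uses only the assumed alphabet."""
    return bool(key) and all(ch in TOKEN_CHARS for ch in key)


def encode_http_key(key: str) -> str:
    """Percent-encode key only when it is not already a valid token."""
    if is_http_token(key):
        return key  # already legal: stays byte-identical
    out = []
    for byte in key.encode("utf-8"):
        ch = chr(byte)
        if ch in TOKEN_CHARS:
            out.append(ch)
        else:
            # Lowercase hex is an assumption, chosen so the encoded name
            # itself contains no uppercase characters.
            out.append(f"%{byte:02x}")
    return "".join(out)


def decode_http_key(encoded: str) -> str:
    """Reverse encode_http_key (hypothetical stand-in for decode/1)."""
    return unquote(encoded)
```

Under these assumptions `content-type` is returned untouched, while `my tag` becomes `my%20tag` and decodes back to the original. Because `%` is outside the assumed alphabet, a key that already contains a literal `%` is itself escaped, which is what keeps the round trip unambiguous.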

dev_httpsig_conv.erl

  • encode_ids/1 now percent-encodes any non-token key instead of only ID-shaped keys (which were matched by byte size). Previously a tag named e.g. my <emoji> tag was emitted verbatim as an illegal header and rejected downstream — observed as a 502 from a fronting proxy.
  • encode_body_part/4 / from_body_part/3 now symmetrically percent-encode and decode the Content-Disposition name. Large (> MAX_HEADER_LENGTH) and/or nested values are lifted into their own body part keyed by their full flat path, a name encode_ids/1 never sees, so their raw bytes reached the wire and crashed the structured-field parser on decode (parse_string/2 only accepts 0x20 to 0x7E). This mirrors the existing percent-encoding of committed key names in dev_httpsig_siginfo. The / path separator is preserved so nested-path splitting still works.
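The body-part name handling can be sketched the same way. The Python below is an illustration under stated assumptions, not the code in dev_httpsig_conv: every function name is hypothetical, the safe alphabet comes from the Compatibility note, and the range check covers only the 0x20 to 0x7E constraint mentioned above rather than the full structured-field string grammar.

```python
from urllib.parse import unquote

# Assumed alphabet for characters that may stay verbatim (from the
# Compatibility note); "/" is handled separately as the path separator.
SAFE = frozenset("abcdefghijklmnopqrstuvwxyz0123456789-_.")


def encode_segment(segment: str) -> str:
    """Percent-encode every UTF-8 byte outside the assumed safe alphabet."""
    return "".join(
        chr(b) if chr(b) in SAFE else f"%{b:02x}"
        for b in segment.encode("utf-8")
    )


def encode_part_name(path: str) -> str:
    """Encode each path segment while keeping "/" as the separator."""
    return "/".join(encode_segment(s) for s in path.split("/"))


def decode_part_name(name: str) -> list:
    """Split on "/" first, then percent-decode each segment."""
    return [unquote(s) for s in name.split("/")]


def in_sf_string_range(name: str) -> bool:
    """Return True if every character lies in 0x20 to 0x7E."""
    return all(0x20 <= ord(c) <= 0x7E for c in name)
```

With a flat path such as `data/my 🙂 tag`, the raw name fails the range check, whereas the encoded form `data/my%20%f0%9f%99%82%20tag` passes it and still splits into the two original segments. Splitting before decoding is the design point: an escaped byte can never be mistaken for a path separator.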

Compatibility

Keys/part-names that are already valid (lowercase letters, digits, -_. and the / separator) are byte-unchanged, so existing signed messages and their IDs are unaffected. Only names that previously could not round-trip at all change form.

Tests

Round-trip tests added for both fixes; existing tests still pass.

  • hb_escape: is_http_token_test, encode_http_key_test
  • dev_httpsig_conv: encode_ids_round_trips_weird_keys_test, encode_body_part_escapes_weird_name_test, encode_large_nested_weird_key_round_trips_test (end-to-end httpsig round-trip of a >4096-byte nested weird-named value)

🤖 Generated with Claude Code

Two related fixes for AO-Core keys whose names are not valid HTTP header
field names (spaces, emoji, uppercase, etc.).

hb_escape: add is_http_token/1 and encode_http_key/1. A key is encoded
only when it is not already a valid lowercase HTTP token, so already-valid
keys stay human-readable on the wire while everything else becomes a legal
header name. Reversed by the existing decode/1.

dev_httpsig_conv: generalize encode_ids/1 to percent-encode any non-token
key rather than only ID-shaped keys (matched by byte size). Previously a
tag named e.g. "my <emoji> tag" was emitted verbatim as an illegal header
and rejected downstream, surfacing as a 502 from a fronting proxy.
Symmetrically, escape the Content-Disposition `name' in encode_body_part/4
and decode it in from_body_part/3: large (> MAX_HEADER_LENGTH) and/or
nested values are lifted into their own body part keyed by their full flat
path, which encode_ids/1 never sees, so their raw bytes reached the wire
and crashed the structured-field parser on decode. This mirrors the
existing percent-encoding of committed key names in dev_httpsig_siginfo.

Add round-trip tests for both fixes; existing keys are byte-unchanged.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@timothynodes

timothynodes commented Jun 27, 2026

Author

Test description, with a concise expected result:


Test: httpsig non-token-key header encoding (502 regression)

Purpose
Verify /~cache@1.0/read no longer returns 502 when a cached item has a tag whose key name is not a valid HTTP token (space, emoji, uppercase). Exercises commit 097268a.

Setup

  • hb.timothynode.com → node with the fix.
  • hc.timothynode.com → node without the fix.
  • Cache key: UB33qavPDZkjCQ0Ezs0TqIsJs3XVjPpyibumrKSL7F8 (contains non-token tag names).

Steps
curl -s -o /dev/null -w "%{http_code}\n" "https://hb.timothynode.com/~cache@1.0/read?read=UB33qavPDZkjCQ0Ezs0TqIsJs3XVjPpyibumrKSL7F8"

curl -s -o /dev/null -w "%{http_code}\n" "https://hc.timothynode.com/~cache@1.0/read?read=UB33qavPDZkjCQ0Ezs0TqIsJs3XVjPpyibumrKSL7F8"

Expected result

┌──────┬─────┬────────┐
│ Node │ Fix │ Status │
├──────┼─────┼────────┤
│ hb │ yes │ 200 │
├──────┼─────┼────────┤
│ hc │ no │ 502 │
└──────┴─────┴────────┘

Observed: ✅ 200 (fixed) vs 502 (unfixed), matching the expected result.
