Skip to content

Fix blob column handling for binary data with embedded nulls#6

Merged
hellerve merged 1 commit into
masterfrom
claude/fix-blob-handling
Jun 20, 2026
Merged

Fix blob column handling for binary data with embedded nulls#6
hellerve merged 1 commit into
masterfrom
claude/fix-blob-handling

Conversation

@carpentry-agent

Copy link
Copy Markdown

Summary

The sql_blob case in SQLiteColumn.to-carp was using from-text (backed by sqlite3_column_text/SQLiteColumn_from_str), which treats data as a null-terminated C string. Binary blobs containing embedded null bytes were silently truncated.

Changes:

  • Type.Blob now wraps (Array Byte) instead of String, preserving exact blob length
  • Added blob_len field to SQLiteColumn struct to carry the byte count through the C layer
  • Added SQLiteColumn_from_blob to convert blob data into a length-aware Carp Array
  • Fixed sqlite3_bind_blob call to use blob_len instead of strlen, which also truncated on bind
  • Added test: blob with embedded nulls ([0b 1b 0b 2b 3b]) round-trips correctly

Breaking change: Type.Blob now holds (Array Byte) instead of String. Code constructing or pattern-matching on Blob values needs updating.


Opened by the carpentry-org heartbeat agent (Claude). Veit has not reviewed this yet.

Change Type.Blob from String to (Array Byte) so binary data with
embedded null bytes is not truncated. Add SQLiteColumn_from_blob to
return blob data as a length-aware Carp Array, and use the stored
blob_len in sqlite3_bind_blob instead of strlen.

@carpentry-reviewer carpentry-reviewer Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Build & Tests

Build: compiles cleanly on ARM Linux.
Tests: all 26 tests pass, including the new blob round-trip test ([0b 1b 0b 2b 3b] with embedded nulls).
CI: passing on both macos-latest and ubuntu-latest.

Findings

None. The fix is correct and addresses the root cause at every layer:

  1. Type change (sqlite3.carp:77): Type.Blob now wraps (Array Byte) instead of String — the right representation for binary data.
  2. Blob retrieval (sqlite3_helper.h:194–199): uses sqlite3_column_blob + sqlite3_column_bytes instead of sqlite3_column_text. Allocates exactly len bytes (no null terminator) and stores blob_len in the column struct.
  3. Blob binding (sqlite3_helper.h:252): uses val.blob_len instead of strlen(val.s) — the critical fix that prevented truncation on write.
  4. SQLiteColumn_from_blob (sqlite3_helper.h:70–76): creates a Carp Array from the column's s pointer and blob_len. Ownership transfer is correct — the SQLiteColumn is consumed by value, so no double-free.
  5. SQLiteColumn_blob (sqlite3_helper.h:62–68): takes the Carp Array by value, stores a.len and a.data. Again, ownership is clean.

Minor note: blob_len is uninitialized for non-blob column types (int, float, text, null). This is fine since it's only read in the blob code path, but it adds sizeof(int) to every SQLiteColumn. Not a concern in practice.

Breaking change (Blob String → Blob (Array Byte)) is clearly documented in the PR description.

Verdict: merge

Correct fix for a real data-corruption bug, CI passing, well-tested.

@hellerve hellerve merged commit ffa1601 into master Jun 20, 2026
2 checks passed
@hellerve hellerve deleted the claude/fix-blob-handling branch June 20, 2026 13:01
@carpentry-agent carpentry-agent Bot mentioned this pull request Jun 21, 2026
9 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant