Antalya 26.3: support external paths in Iceberg tables #1859

Merged
zvonand merged 2 commits into antalya-26.3 from feat/antalya-26.3/90740
Jun 2, 2026

zvonand commented Jun 1, 2026

Changelog category (leave one):

  • Improvement

Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):

Support Iceberg tables that have data files outside the table location or on a different object storage. Cherry-picked from ClickHouse#90740 (by @zvonand).

CI/CD Options

Exclude tests:

  • Fast test
  • Integration Tests
  • Stateless tests
  • Stateful tests
  • Performance tests
  • All with ASAN
  • All with TSAN
  • All with MSAN
  • All with UBSAN
  • All with Coverage
  • All with Aarch64
  • All Regression
  • Disable CI Cache

Regression jobs to run:

  • Fast suites (mostly <1h)
  • Aggregate Functions (2h)
  • Alter (1.5h)
  • Benchmark (30m)
  • ClickHouse Keeper (1h)
  • Iceberg (2h)
  • LDAP (1h)
  • Parquet (1.5h)
  • RBAC (1.5h)
  • SSL Server (1h)
  • S3 (2h)
  • S3 Export (2h)
  • Swarms (30m)
  • Tiered Storage (2h)

Port ClickHouse#90740 to antalya-26.3.

Iceberg tables may now reference files (data files, manifests, manifest
lists) located outside the table location, including on a different
object storage backend. Metadata paths are treated as absolute URIs and
resolved at read/delete time via new object-storage helpers
(`SchemeAuthorityKey`, `resolveObjectStorageForPath`, `SecondaryStorages`),
with the cluster-function protocol bumped to
`DBMS_CLUSTER_PROCESSING_PROTOCOL_VERSION_WITH_ICEBERG_ABSOLUTE_PATH`.
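
To illustrate the idea of keying storages by scheme and authority, here is a minimal standalone sketch. It is not the ClickHouse `SchemeAuthorityKey` implementation: `UriParts`, `splitAbsoluteUri` and `sameStorage` are hypothetical names, and the sketch assumes paths of the plain form `scheme://authority/key` without normalizing scheme aliases such as `s3` and `s3a`.

```cpp
#include <optional>
#include <string>
#include <string_view>

// Hypothetical illustration only; not the ClickHouse `SchemeAuthorityKey` type.
struct UriParts
{
    std::string scheme;     // e.g. "s3"
    std::string authority;  // e.g. a bucket name or host[:port]
    std::string key;        // object key inside that storage, no leading '/'
};

// Splits "scheme://authority/key" into its parts.
// Returns std::nullopt when the input does not have that shape.
std::optional<UriParts> splitAbsoluteUri(std::string_view uri)
{
    const auto scheme_end = uri.find("://");
    if (scheme_end == std::string_view::npos || scheme_end == 0)
        return std::nullopt;

    const std::string_view rest = uri.substr(scheme_end + 3);
    const auto slash = rest.find('/');
    const std::string_view authority = rest.substr(0, slash);
    if (authority.empty())
        return std::nullopt;

    UriParts parts;
    parts.scheme = std::string(uri.substr(0, scheme_end));
    parts.authority = std::string(authority);
    if (slash != std::string_view::npos)
        parts.key = std::string(rest.substr(slash + 1));
    return parts;
}

// Two paths can be served by the same storage client when both the
// scheme and the authority match (no alias normalization in this sketch).
bool sameStorage(const UriParts & a, const UriParts & b)
{
    return a.scheme == b.scheme && a.authority == b.authority;
}
```

Under these assumptions, a data file on another bucket produces a different (scheme, authority) pair than the table location, which is the signal that a secondary storage would be needed for it.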

Adds the `s3_propagate_credentials_to_other_storages` setting to optionally
copy base S3 credentials when creating secondary storages.

Notes on porting to this branch:
- Skipped the `ExpireSnapshotsExecute`, `RemoveOrphanFilesExecute` and
  `SnapshotFilesTraversal` files: this functionality does not exist in
  `antalya-26.3`. The `executeCommand` branch using them was dropped and
  the existing `expireSnapshots` implementation is kept.
- Dropped the `S3UriStyle uri_style` `S3::URI` parameter (from an unrelated
  upstream change not in this branch); only `enable_url_encoding` is added.
- Dropped the upstream-only `_path` virtual column `storage_id` field,
  which is not present in `VirtualsForFileLikeStorage` here.
- Folded the metadata-path preference into the existing `getFileIdentifier`
  helper in the stable task distributor rather than the upstream inline
  call sites.
- Updated `Mutations.cpp` (`expireSnapshots`) callers for the new
  `getManifestList` / `getManifestFileEntriesHandle` signatures.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
zvonand added the port-antalya and forwardport labels Jun 1, 2026

github-actions Bot commented Jun 1, 2026

Workflow [PR], commit [85dc463]

Fixes `04034_iceberg_spark_style_location` (S3_ERROR 404 reading
`warehouse/db/spark_table/metadata/snap-*.avro`).

When an Iceberg table's metadata `location` differs from where the files
actually live (e.g. a Spark-relocated table whose `location` is
`s3a://spark-bucket/warehouse/db/spark_table` while the objects are in the
configured base storage), the manifest-list / manifest / data paths in the
metadata are spelled with that foreign prefix.

`tryResolveObjectStorageForPath` matched such a path against `table_location`
and returned the raw URI key on the base storage, so reads hit a
non-existent key and failed with a 404. The raw key is only valid for paths
whose bucket matches the base storage (handled by the earlier base-bucket
branch). For a path that matches `table_location` but not the base bucket,
only `IcebergPathResolver::resolve` can map it (strip `table_location`,
prepend `table_root`), so defer to it by returning `std::nullopt`.
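
The branch order described above can be sketched as follows. This is a simplified standalone model, not the actual `tryResolveObjectStorageForPath` or `IcebergPathResolver::resolve` code: `Resolution`, `hasPathPrefix`, `classify` and `remapToTableRoot` are hypothetical names, and the real functions work with storage objects rather than plain strings.

```cpp
#include <string>
#include <string_view>

// Hypothetical outcome of the lookup described in the commit message.
enum class Resolution
{
    RawKeyOnBase,         // bucket matches the base storage: raw key is valid
    DeferToPathResolver,  // matches table_location only: caller must remap
    OtherStorage          // neither: some other storage would be needed
};

// True when `path` equals `prefix` or continues it at a '/' boundary,
// so "…/spark_table" does not match "…/spark_table2".
bool hasPathPrefix(std::string_view path, std::string_view prefix)
{
    while (!prefix.empty() && prefix.back() == '/')
        prefix.remove_suffix(1);
    if (path.substr(0, prefix.size()) != prefix)
        return false;
    return path.size() == prefix.size() || path[prefix.size()] == '/';
}

// The base-bucket branch runs first; only then is table_location checked.
Resolution classify(std::string_view path, std::string_view base_bucket_uri, std::string_view table_location)
{
    if (hasPathPrefix(path, base_bucket_uri))
        return Resolution::RawKeyOnBase;
    if (hasPathPrefix(path, table_location))
        return Resolution::DeferToPathResolver;
    return Resolution::OtherStorage;
}

// Fallback mapping: strip table_location, prepend table_root.
// Precondition (assumed): hasPathPrefix(path, table_location) is true.
std::string remapToTableRoot(std::string_view path, std::string_view table_location, std::string_view table_root)
{
    while (!table_location.empty() && table_location.back() == '/')
        table_location.remove_suffix(1);
    std::string_view suffix = path.substr(table_location.size());
    while (!suffix.empty() && suffix.front() == '/')
        suffix.remove_prefix(1);

    std::string result(table_root);
    if (!result.empty() && result.back() != '/')
        result += '/';
    result += suffix;
    return result;
}
```

In this model, a path under `s3a://spark-bucket/warehouse/db/spark_table` with a base storage on a different bucket is classified as `DeferToPathResolver`, which corresponds to returning `std::nullopt` in the fix, and the remapping step then yields a key relative to the table root on the base storage.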

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
zvonand merged commit 6a83f97 into antalya-26.3 Jun 2, 2026
306 of 313 checks passed

zvonand commented Jun 2, 2026

no related fails

zvonand added the verified label Jun 2, 2026

Labels

antalya-26.3, antalya-26.3.10.20001, forwardport, port-antalya, verified
