Fix aggregation flow with remote initiator #1872
@codex review
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: e72ebfe177
    if (getClusterName(context).empty() // Not cluster request
        && !(context->getSettingsRef()[Setting::object_storage_remote_initiator] // Not request with remote initiator
            && !context->getSettingsRef()[Setting::object_storage_remote_initiator_cluster].value.empty()))
Keep unsupported storages at FetchColumns
When object_storage_remote_initiator is set for a configuration that does not support cluster mode, such as icebergLocal or other local data-lake storages where getClusterName is empty because isClusterSupported is false, this new exception makes getQueryProcessingStage report WithMergeableState. IStorageCluster::read still immediately falls back to pure_storage for those configurations, and StorageObjectStorage::read ignores processed_stage, so aggregate queries can be planned as if partial aggregation happened remotely even though only raw rows were read. Please only return the distributed stage when the subsequent read path will actually use the remote/cluster execution path.
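A minimal sketch of the guard this comment asks for, written against hypothetical stand-in types rather than the real ClickHouse classes. `RequestInfo`, `Stage`, and `chooseStage` are illustrative names that do not exist in the codebase, and the `cluster_supported` flag only models the assumption that `read` falls back to the plain storage when cluster mode is unsupported.

```cpp
#include <string>

// Hypothetical stand-in for QueryProcessingStage::Enum (illustration only).
enum class Stage { FetchColumns, WithMergeableState };

// Hypothetical summary of the inputs the stage decision depends on.
struct RequestInfo
{
    std::string cluster_name;              // models getClusterName(context); empty if not a cluster request
    bool remote_initiator = false;         // models object_storage_remote_initiator
    std::string remote_initiator_cluster;  // models object_storage_remote_initiator_cluster
    bool cluster_supported = true;         // models isClusterSupported for the configuration
};

// Report the distributed stage only when the read path would really be remote.
Stage chooseStage(const RequestInfo & r)
{
    // Assumption: unsupported configurations always read locally,
    // so they must keep reporting FetchColumns.
    if (!r.cluster_supported)
        return Stage::FetchColumns;

    const bool via_cluster = !r.cluster_name.empty();
    const bool via_remote_initiator = r.remote_initiator && !r.remote_initiator_cluster.empty();

    // Neither a cluster request nor a remote-initiator request: local read.
    if (!via_cluster && !via_remote_initiator)
        return Stage::FetchColumns;

    return Stage::WithMergeableState;
}
```

With this shape, a request that sets the remote initiator on a configuration without cluster support still reports `FetchColumns`, which matches what the local read path actually produces.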
Changelog category (leave one):
Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):
Fix aggregation flow with remote initiator
Documentation entry for user-facing changes
With `object_storage_remote_initiator` set but without the `object_storage_cluster` setting, `StorageObjectStorageCluster::getQueryProcessingStage` returned `QueryProcessingStage::Enum::FetchColumns`. As a result, the nodes sent all rows to the initiator and the aggregation was executed on the initiator. Now the method returns `QueryProcessingStage::Enum::WithMergeableState` in the proper cases, and pre-aggregation is executed on the nodes.
CI/CD Options
Exclude tests:
Regression jobs to run: