What's the issue?
When using .fetch_column_metadata() together with DBTProjectComponent, Dagster can emit incorrect upstream AssetKeys in column lineage metadata.
The lineage builder in _build_column_lineage_metadata() resolves upstream dependencies by calling dagster_dbt_translator.get_asset_key(parent_resource_props) in dagster_dbt/core/dbt_cli_event.py. However, when assets are defined through DBTProjectComponent, the effective translation logic is applied through DbtProjectComponentTranslator.get_asset_spec(), not through get_asset_key().
Because of that mismatch, the upstream AssetKeys recorded in column lineage metadata can differ from the actual translated asset keys registered in the asset graph. The result is that Dagster emits upstream column references that point to non-existent assets, so column lineage does not resolve correctly.
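To make the mismatch concrete, here is a minimal, dependency-free sketch. The classes below are simplified stand-ins I wrote for illustration, not the real `DagsterDbtTranslator` or `DbtProjectComponentTranslator`. In particular, the single-argument `get_asset_spec(props)` signature and the `"analytics"` key prefix are assumptions chosen to keep the example short.

```python
from dataclasses import dataclass
from typing import Any, Mapping, Tuple

Key = Tuple[str, ...]  # hypothetical stand-in for dagster.AssetKey


@dataclass(frozen=True)
class Spec:
    """Hypothetical stand-in for dagster.AssetSpec (only the key matters here)."""
    key: Key


class BaseTranslator:
    """Simplified stand-in for the base dbt translator."""

    def get_asset_key(self, props: Mapping[str, Any]) -> Key:
        # Default dbt-to-Dagster key: just the model name.
        return (props["name"],)

    def get_asset_spec(self, props: Mapping[str, Any]) -> Spec:
        return Spec(key=self.get_asset_key(props))


class ComponentTranslator(BaseTranslator):
    """Simplified stand-in for a component translator that customizes
    the key inside get_asset_spec() only, leaving get_asset_key() untouched."""

    def get_asset_spec(self, props: Mapping[str, Any]) -> Spec:
        base = super().get_asset_spec(props)
        return Spec(key=("analytics", *base.key))  # assumed custom prefix


translator = ComponentTranslator()
parent = {"name": "stg_orders"}

# Keys registered in the asset graph come from get_asset_spec().
graph_keys = {translator.get_asset_spec(parent).key}

# The lineage path described above calls get_asset_key() instead.
lineage_key = translator.get_asset_key(parent)

print("lineage key found in asset graph:", lineage_key in graph_keys)
```

In this sketch the lineage key is the default-derived one, so it does not match anything registered in the graph, which mirrors the behavior reported here.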
Relevant references:
- _build_column_lineage_metadata() calls dagster_dbt_translator.get_asset_key(parent_resource_props): python_modules/libraries/dagster-dbt/dagster_dbt/core/dbt_cli_event.py
- .fetch_column_metadata() builds lineage through _build_column_lineage_metadata(): python_modules/libraries/dagster-dbt/dagster_dbt/core/dbt_event_iterator.py
- DbtProjectComponentTranslator applies component translation in get_asset_spec(): python_modules/libraries/dagster-dbt/dagster_dbt/components/dbt_project/component.py
- DagsterDbtTranslator.get_asset_spec() uses spec.key, while get_asset_key() remains the default asset-key path: python_modules/libraries/dagster-dbt/dagster_dbt/dagster_dbt_translator.py
What did you expect to happen?
I expected column lineage to use the same translated upstream AssetKeys that DBTProjectComponent uses when defining assets, so lineage metadata would point to real upstream assets in the asset graph.
How to reproduce?
- Define dbt assets using DBTProjectComponent.
- Configure component translation so asset keys differ from the default dbt-to-Dagster translation.
- Materialize the assets with .fetch_column_metadata().
- Inspect the emitted column lineage metadata.
- Observe that upstream column dependencies reference default-derived AssetKeys instead of the translated keys actually present in the asset graph.
This causes lineage entries to point to non-existent upstream assets.
Dagster version
1.13.3
Deployment type
Dagster Cloud
Deployment details
No response
Additional information
I traced this to the interaction between column-lineage generation and component-based translation:
- _build_column_lineage_metadata() constructs TableColumnDep entries using dagster_dbt_translator.get_asset_key(parent_resource_props).
- DbtProjectComponentTranslator applies custom translation through get_asset_spec(), including user-provided component translation that can change the final asset key.
- There are already tests asserting that DbtProjectComponentTranslator must be preserved in asset spec metadata so execution uses the component translator rather than the base translator: python_modules/libraries/dagster-dbt/dagster_dbt_tests/components/test_dbt_project_component.py
- Existing column-lineage tests appear to cover standard dbt translation paths, but I did not find coverage for .fetch_column_metadata() + DBTProjectComponent + custom translated keys.
It seems like lineage construction should derive the upstream key from the translated asset spec (or otherwise use the same translation path as asset definition), rather than calling get_asset_key() directly.
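A rough sketch of that direction, using stub classes rather than the real dagster-dbt types: `upstream_key_for_lineage` is a hypothetical helper name, and the one-argument `get_asset_spec(props)` signature is a simplification I assumed for the example. The actual fix would need to pass whatever arguments the real `get_asset_spec()` requires.

```python
from typing import Any, Mapping, Tuple

Key = Tuple[str, ...]  # hypothetical stand-in for dagster.AssetKey


class StubSpec:
    """Hypothetical stand-in for an asset spec; only carries a key."""

    def __init__(self, key: Key) -> None:
        self.key = key


class StubComponentTranslator:
    """Stub translator whose custom key logic lives in get_asset_spec()."""

    def get_asset_key(self, props: Mapping[str, Any]) -> Key:
        # Default-derived key, unaware of component translation.
        return (props["name"],)

    def get_asset_spec(self, props: Mapping[str, Any]) -> StubSpec:
        # Assumed custom translation: prefix every key with "analytics".
        return StubSpec(("analytics", props["name"]))


def upstream_key_for_lineage(translator: Any, parent_props: Mapping[str, Any]) -> Key:
    """Hypothetical helper: resolve the upstream key through the same
    path used when defining assets, instead of get_asset_key()."""
    return translator.get_asset_spec(parent_props).key


stub = StubComponentTranslator()
parent_props = {"name": "stg_orders"}

registered = {stub.get_asset_spec(parent_props).key}  # keys in the asset graph
resolved = upstream_key_for_lineage(stub, parent_props)

print("resolved upstream key is registered:", resolved in registered)
```

With the key taken from the spec, the upstream reference in this sketch lines up with the registered asset, whereas the `get_asset_key()` result would not.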
Message from the maintainers
Impacted by this issue? Give it a 👍! We factor engagement into prioritization.