Bug: column lineage uses incorrect upstream AssetKeys with DBTProjectComponent #33856

@stevenayers

What's the issue?

When .fetch_column_metadata() is used together with DbtProjectComponent, Dagster can emit incorrect upstream AssetKeys in column lineage metadata.

The lineage builder, _build_column_lineage_metadata() in dagster_dbt/core/dbt_cli_event.py, resolves upstream dependencies by calling dagster_dbt_translator.get_asset_key(parent_resource_props). However, when assets are defined through DbtProjectComponent, the effective translation logic is applied in DbtProjectComponentTranslator.get_asset_spec(), not in get_asset_key().

Because of that mismatch, the upstream AssetKeys recorded in column lineage metadata can differ from the actual translated asset keys registered in the asset graph. The result is that Dagster emits upstream column references that point to non-existent assets, so column lineage does not resolve correctly.
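To make the mismatch concrete, here is a minimal sketch in plain Python. The classes are simplified stand-ins that I wrote for illustration, not the real dagster-dbt classes: `Spec`, `BaseTranslator`, `ComponentTranslator`, the single-argument method signatures, and the `warehouse` prefix are all assumptions. It only shows how a key customized in get_asset_spec() can diverge from the key returned by get_asset_key().

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Spec:
    """Stand-in for an asset spec; only the key matters here."""
    key: tuple


class BaseTranslator:
    """Simplified stand-in for the base translator (not the real class)."""

    def get_asset_key(self, props: dict) -> tuple:
        # Default-style key: just the model name.
        return (props["name"],)

    def get_asset_spec(self, props: dict) -> Spec:
        return Spec(key=self.get_asset_key(props))


class ComponentTranslator(BaseTranslator):
    """Stand-in for a translator that customizes keys only in get_asset_spec()."""

    def get_asset_spec(self, props: dict) -> Spec:
        base = super().get_asset_spec(props)
        # Hypothetical user-provided translation: prefix every key.
        return Spec(key=("warehouse", *base.key))


translator = ComponentTranslator()
parent = {"name": "stg_orders"}

# Key the asset graph would contain (translation applied via get_asset_spec).
registered_key = translator.get_asset_spec(parent).key
# Key the lineage builder would record (get_asset_key called directly).
lineage_key = translator.get_asset_key(parent)

print(registered_key)  # ('warehouse', 'stg_orders')
print(lineage_key)     # ('stg_orders',)
```

In this toy model the lineage key never picks up the prefix, which is the same shape of failure described above.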

Relevant references:

  • _build_column_lineage_metadata() calls dagster_dbt_translator.get_asset_key(parent_resource_props):
    python_modules/libraries/dagster-dbt/dagster_dbt/core/dbt_cli_event.py
  • .fetch_column_metadata() builds lineage through _build_column_lineage_metadata():
    python_modules/libraries/dagster-dbt/dagster_dbt/core/dbt_event_iterator.py
  • DbtProjectComponentTranslator applies component translation in get_asset_spec():
    python_modules/libraries/dagster-dbt/dagster_dbt/components/dbt_project/component.py
  • DagsterDbtTranslator.get_asset_spec() uses spec.key, while get_asset_key() remains the default asset-key path:
    python_modules/libraries/dagster-dbt/dagster_dbt/dagster_dbt_translator.py

What did you expect to happen?

I expected column lineage to use the same translated upstream AssetKeys that DbtProjectComponent uses when defining assets, so that lineage metadata points to real upstream assets in the asset graph.

How to reproduce?

  1. Define dbt assets using DbtProjectComponent.
  2. Configure component translation so asset keys differ from the default dbt-to-Dagster translation.
  3. Materialize the assets with .fetch_column_metadata().
  4. Inspect the emitted column lineage metadata.
  5. Observe that upstream column dependencies reference default-derived AssetKeys instead of the translated keys actually present in the asset graph.

This causes lineage entries to point to non-existent upstream assets.
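For steps 4 and 5, a small check along these lines can help spot the dangling references. This is a sketch under assumptions: `ColumnDep` is a plain stand-in that mirrors the asset key and column name pair carried by each upstream column dependency, `find_dangling_deps` is a hypothetical helper, and the sample keys are made up. In a real run you would populate the inputs from the emitted lineage metadata and the asset graph.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ColumnDep:
    """Plain stand-in for one upstream column reference."""
    asset_key: tuple
    column_name: str


def find_dangling_deps(deps_by_column: dict, graph_keys: set) -> dict:
    """Return, per column, the upstream deps whose asset key is not in the graph."""
    dangling = {}
    for column, deps in deps_by_column.items():
        missing = [dep for dep in deps if dep.asset_key not in graph_keys]
        if missing:
            dangling[column] = missing
    return dangling


# Hypothetical data: graph keys carry a "warehouse" prefix, while one
# lineage entry was built from a default-derived key.
graph_keys = {("warehouse", "stg_orders"), ("warehouse", "orders")}
lineage = {
    "order_id": [ColumnDep(("stg_orders",), "id")],
    "amount": [ColumnDep(("warehouse", "stg_orders"), "amount")],
}

for column, deps in find_dangling_deps(lineage, graph_keys).items():
    print(column, [dep.asset_key for dep in deps])
```

With the sample data only `order_id` is reported, because its upstream key lacks the prefix that the graph keys have.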

Dagster version

1.13.3

Deployment type

Dagster Cloud

Deployment details

No response

Additional information

I traced this to the interaction between column-lineage generation and component-based translation:

  • _build_column_lineage_metadata() constructs TableColumnDep entries using dagster_dbt_translator.get_asset_key(parent_resource_props).
  • DbtProjectComponentTranslator applies custom translation through get_asset_spec(), including user-provided component translation that can change the final asset key.
  • There are already tests asserting that DbtProjectComponentTranslator must be preserved in asset spec metadata so execution uses the component translator rather than the base translator:
    python_modules/libraries/dagster-dbt/dagster_dbt_tests/components/test_dbt_project_component.py
  • Existing column-lineage tests appear to cover standard dbt translation paths, but I did not find coverage for the combination of .fetch_column_metadata(), DbtProjectComponent, and custom translated keys.

It seems like lineage construction should derive the upstream key from the translated asset spec (or otherwise use the same translation path as asset definition), rather than calling get_asset_key() directly.
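A rough sketch of that idea, again in plain Python rather than against the real dagster-dbt code: `resolve_upstream_key` is a hypothetical helper, `Spec` and `PrefixingTranslator` are stand-ins, and the two-argument get_asset_spec(manifest, unique_id) signature is a simplification I am assuming for illustration. The point is only that the upstream key is read from the translated spec, so any customization applied there carries over to lineage.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Spec:
    """Stand-in for an asset spec; only the key matters here."""
    key: tuple


def resolve_upstream_key(translator, manifest: dict, parent_unique_id: str) -> tuple:
    """Derive the upstream key from the translated spec, not a separate key path."""
    return translator.get_asset_spec(manifest, parent_unique_id).key


class PrefixingTranslator:
    """Hypothetical translator whose customization lives in get_asset_spec()."""

    def get_asset_spec(self, manifest: dict, unique_id: str) -> Spec:
        name = manifest["nodes"][unique_id]["name"]
        return Spec(key=("warehouse", name))


# Made-up manifest fragment with a single parent model.
manifest = {"nodes": {"model.shop.stg_orders": {"name": "stg_orders"}}}

key = resolve_upstream_key(PrefixingTranslator(), manifest, "model.shop.stg_orders")
print(key)  # ('warehouse', 'stg_orders')
```

Whether the real fix should call get_asset_spec() directly or reuse already-built specs is a design choice for the maintainers; this only illustrates the intended behavior.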

Labels

type: bug
