dbt-core This Week: DuckDB Catalog Stack and Warehouse Adapter Fixes
dbt-core landed 56 commits on main this week, touching 464 files. The headline work is a two-part DuckDB catalog stack that routes models through DuckLake, local filesystem, Iceberg REST, Horizon, and Unity catalogs with explicit write strategies. Warehouse adapters picked up fixes that matter in production, and state:modified got less noisy for seeds.
DuckDB catalog stack and write strategies
The biggest chunk of activity sits in the DuckDB adapter. Part 1 introduced the catalogs.yml v2 model, catalog-scoped metadata listing, and Jinja materializations that branch on a single duckdb_write_strategy value instead of nested booleans. DuckLake catalogs now set attached_database so +catalog_name routes models to the attached catalog rather than silently landing in the built-in profile database.
Part 2 lifts Horizon and Unity from read-only to read-write attach. Horizon and Unity catalogs require DuckDB 1.5.4. The commit adds DuckDB Iceberg write-compat keys to the config allow list: stage_create_tables, disable_multi_table_commit, skip_create_table_metadata_updates, remove_files_on_delete, and default_region.
The write path logic lives in catalog_relation.rs. A new DuckDbWriteStrategy enum collapses three paths:
- create_as_select for plain DuckDB tables and DuckLake
- direct_create for Iceberg catalogs that use an empty CREATE plus INSERT (Iceberg REST does not support ALTER ... RENAME)
- direct_create_as_select when stage_create_tables: true opts into staged creates
Horizon rejects staged creates by default, so the FSM defaults there to direct_create. Unity gets DISABLE_MULTI_TABLE_COMMIT when the user does not set it. Materializations read duckdb_write_strategy from Jinja and skip the temp table rename dance on Iceberg paths where rename is impossible.
# catalogs.yml excerpt (Horizon write compat keys)
config:
  duckdb:
    stage_create_tables: false
    disable_multi_table_commit: true
If you run DuckDB catalogs today, plan on upgrading to 1.5.4 before turning on Horizon or Unity writes.
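The strategy selection described in this section can be sketched in a few lines. This is an illustration in Python, not the Rust code in catalog_relation.rs: the catalog type labels, the function names, and the choice to treat an unset stage_create_tables as false for every Iceberg-style catalog are assumptions made for the sketch. Only the three strategy names, the Horizon default, and the Unity default come from the notes above.

```python
from enum import Enum
from typing import Optional


class WriteStrategy(Enum):
    """The three write paths; values match the strings listed above."""
    CREATE_AS_SELECT = "create_as_select"
    DIRECT_CREATE = "direct_create"
    DIRECT_CREATE_AS_SELECT = "direct_create_as_select"


# Hypothetical catalog type labels, chosen for this sketch only.
ICEBERG_STYLE = {"iceberg_rest", "horizon", "unity"}


def choose_write_strategy(
    catalog_type: str, stage_create_tables: Optional[bool] = None
) -> WriteStrategy:
    """Pick a write strategy from the catalog type and the staged-create flag."""
    if catalog_type not in ICEBERG_STYLE:
        # Plain DuckDB tables and DuckLake keep CREATE TABLE AS SELECT.
        return WriteStrategy.CREATE_AS_SELECT
    if stage_create_tables:
        # Explicit opt-in to staged creates.
        return WriteStrategy.DIRECT_CREATE_AS_SELECT
    # Assumption: an unset flag behaves like false, so empty CREATE plus INSERT.
    return WriteStrategy.DIRECT_CREATE


def with_catalog_defaults(catalog_type: str, options: dict) -> dict:
    """Fill in defaults without overriding anything the user set explicitly."""
    merged = dict(options)
    if catalog_type == "unity":
        merged.setdefault("disable_multi_table_commit", True)
    return merged
```

A materialization would then branch on the single returned value, for example choose_write_strategy("horizon").value, rather than on several booleans.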
Warehouse adapter fixes operators will notice
Several commits fix footguns that show up outside local dev.
Redshift cross-database models now switch database scope per node via USE <database> and RESET USE in adapter_impl.rs. The adapter gates those calls to runtime Redshift and drops the thread-local connection if reset fails, which avoids leaving a connection stuck on the wrong database.
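The scope-then-reset pattern is easier to see in miniature. The sketch below is Python rather than the adapter's Rust, and FakeConnection and database_scope are hypothetical names invented for illustration. It also assumes the reset error is swallowed once the connection is dropped, which the notes above do not confirm.

```python
from contextlib import contextmanager


class FakeConnection:
    """Stand-in for a warehouse connection; records the SQL it is asked to run."""

    def __init__(self, fail_on_reset=False):
        self.executed = []
        self.closed = False
        self.fail_on_reset = fail_on_reset

    def execute(self, sql):
        if self.fail_on_reset and sql == "RESET USE":
            raise RuntimeError("reset failed")
        self.executed.append(sql)

    def close(self):
        self.closed = True


@contextmanager
def database_scope(conn, database):
    """Run a node's statements inside a database scope, then restore it."""
    conn.execute(f"USE {database}")
    try:
        yield conn
    finally:
        try:
            conn.execute("RESET USE")
        except Exception:
            # A connection that cannot be reset may still point at the wrong
            # database, so drop it rather than hand it to the next node.
            conn.close()
```

Wrapping each node as with database_scope(conn, "analytics"): ... keeps the reset on every exit path, including when the node's own SQL raises.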
BigQuery schema DDL now qualifies schema names with the target project in create_schema and drop_schema. When execution_project differs from the target project, unqualified CREATE SCHEMA IF NOT EXISTS my_schema ran against the execution project and created datasets in the wrong place. The fix renders target-project.schema so DDL hits the intended project regardless of where queries bill.
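As a rough illustration of the qualification, the helpers below build schema DDL with the project included. The function names are hypothetical and the exact SQL the adapter emits is not shown in the commit summary, so treat this as a sketch of the idea only.

```python
def _quote(identifier: str) -> str:
    """Wrap an identifier in backticks, which hyphenated project IDs require."""
    if "`" in identifier:
        raise ValueError(f"unexpected backtick in identifier: {identifier!r}")
    return f"`{identifier}`"


def create_schema_sql(target_project: str, schema: str) -> str:
    """Qualify the schema so the DDL does not depend on the execution project."""
    return f"CREATE SCHEMA IF NOT EXISTS {_quote(target_project)}.{_quote(schema)}"


def drop_schema_sql(target_project: str, schema: str) -> str:
    """Same qualification for the drop path."""
    return f"DROP SCHEMA IF EXISTS {_quote(target_project)}.{_quote(schema)}"
```

With the project written into the statement, the dataset lands in the target project even when the query is billed to a different execution project.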
Other adapter notes from the same window:
- Databricks incremental constraint SQL errors fixed in the shared adapter layer
- Spark Connect no longer requires a user field in connection config
- Agate struct column conversion stops panics on map of struct columns during result handling
state:modified and manifest parity
Selective builds depend on state:modified not crying wolf. A cluster of seed config fixes in nodes.rs compares unrendered configs instead of rendered values when Jinja makes the same project look different per target.
Seeds with environment-aware column_types in dbt_project.yml no longer get flagged as modified when the unrendered Jinja is identical. Related commits extend the same pattern to meta and quote_columns on seeds.
Operation nodes now carry alias, database, and schema in the manifest. Test unique_id parity for singular and schema tests keeps dbt-core and Mantle aligned on identifiers. Teams running dbt build --select state:modified+ against a cached manifest should see fewer false positives on seed config that varies only by rendered target output.
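The seed comparison can be sketched as follows. This is Python over plain dictionaries shaped like manifest nodes, not the logic in nodes.rs; the function name is hypothetical, and the exact fallback condition (use rendered values when either side lacks the unrendered key) is an assumption.

```python
from typing import Any, Mapping


def seed_config_modified(
    key: str, current: Mapping[str, Any], previous: Mapping[str, Any]
) -> bool:
    """Compare one config key, preferring unrendered values when both sides have them."""
    cur_unrendered = current.get("unrendered_config") or {}
    prev_unrendered = previous.get("unrendered_config") or {}
    if key in cur_unrendered and key in prev_unrendered:
        # Identical Jinja source means no change, whatever it renders to per target.
        return cur_unrendered[key] != prev_unrendered[key]
    # Assumed fallback for artifacts that carry no unrendered value for this key.
    cur_rendered = (current.get("config") or {}).get(key)
    prev_rendered = (previous.get("config") or {}).get(key)
    return cur_rendered != prev_rendered
```

For a seed whose column_types is a Jinja expression on target.name, the dev and prod manifests render different types but share the same unrendered string, so the comparison reports no modification.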
dbt docs server v2 API surface
The embedded docs server gained endpoints the v2 UI needs to stop hardcoding behavior.
GET /api/v1/capabilities now exposes has_dbt_state, resolved at startup from DBT_ENGINE_MANAGE_STATE and flags.manage_state in dbt_project.yml. GET /api/v1/identity returns is_logged_in (always false for now, JWT deferred) and analytics_enabled based on DO_NOT_TRACK. A follow-up commit resolves send_anonymous_usage_stats consent for dbt docs serve.
Handler test files under crates/dbt-docs-server/src/handlers/ saw broad updates as AppState picked up has_dbt_state and do_not_track. The UI bundle synced to dbt-ui@862e5855aaa4 in a separate chore commit.
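To make the identity response concrete, here is a small Python sketch of how such a payload could be derived. The two field names come from the notes above; the function names and the rule for parsing DO_NOT_TRACK (anything other than empty, "0", or "false" opts out) are assumptions, and the send_anonymous_usage_stats consent from the follow-up commit is not modeled.

```python
from typing import Mapping


def do_not_track(env: Mapping[str, str]) -> bool:
    """Assumed parsing rule: any value other than empty, "0", or "false" opts out."""
    value = env.get("DO_NOT_TRACK", "").strip().lower()
    return value not in ("", "0", "false")


def identity_payload(env: Mapping[str, str]) -> dict:
    """Build the body a GET /api/v1/identity handler might return."""
    return {
        # Login is deferred until JWT support lands, so this is constant for now.
        "is_logged_in": False,
        "analytics_enabled": not do_not_track(env),
    }
```

Passing the environment in as a mapping, for example identity_payload(os.environ), keeps the function easy to test without touching process state.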
What to watch
Preview 2.0.0-preview.193 shipped with a version bump and changelog. DuckDB Horizon and Unity writes need DuckDB 1.5.4 and correct stage_create_tables settings per catalog type. BigQuery projects that split execution and target projects should replay adapter tests after pulling the schema qualification fix.
If you rely on selective builds, rerun a state:modified dry run after upgrading. The unrendered config comparisons only help when manifests carry unrendered_config; older artifacts may still fall back to rendered comparison. More catalog-typed fields are planned in catalog_relation.rs TODOs, so expect further enum migrations beyond DuckDB.