#535 Postgres style indexes by Benjamin-Knight · Pull Request #714 · dbt-msft/dbt-sqlserver

Benjamin-Knight · 2026-06-16T12:02:26Z

Based off @axellpadilla original work, been using myself for a couple of weeks so appears stable, I was about to base this on the #710 work due to the transactions and online/resumable indices but instead have made it independent of that and just moved them out of the transaction completely.

Postgres-style indexes model config for SQL Server

Closes #535.

Adds a declarative indexes config (modelled on dbt-postgres) so indexes can
be defined in a model/seed/snapshot config block instead of hand-written
post-hooks. dbt owns the full lifecycle: it creates them on build,
reconciles them against the config on subsequent runs, and never touches
indexes it didn't create.

Usage

  {{ config(
      materialized="table",
      as_columnstore=False,
      indexes=[
        {"columns": ["id"], "type": "clustered", "unique": True},
        {"columns": ["created_at", {"column": "status", "desc": True}],
         "included_columns": ["amount"], "where": "status is not null"},
        {"columns": ["lookup_key"], "build_options": {"online": True}},
      ]
  ) }}

What's included

Config surface — most of CREATE INDEX is expressible per entry:

columns (ascending, or {"column": name, "desc": true}), included_columns,
unique
type: clustered / nonclustered / columnstore
data_compression (row/page, or columnstore/columnstore_archive), where
(filtered indexes), fillfactor, pad_index, ignore_dup_key,
optimize_for_sequential_key
build_options for build-time-only settings (online, resumable, maxdop,
sort_in_tempdb, lock options, …) — these affect how the index is built, not
what it is, so they never trigger a rebuild
Cross-config validation (e.g. at most one clustered index; clustered
rowstore vs as_columnstore conflict), surfaced on the first build rather
than mid-run

Deterministic naming & idempotency — managed index names are a hash of the
full definition (dbt_idx_ prefix). Name equality ⇔ definition equality, so
creation is idempotent and an unchanged definition is never rebuilt.

Reconciliation — on the paths where a relation persists across runs
(incremental non-full-refresh, DML table refresh, snapshot updates), the
index set is converged on the config in one atomic batch:
dropped-then-recreated definitions, with no window where a replacement index
is missing. Constraint-backing indexes, legacy post-hook
(clustered_/nonclustered_) indexes, and the as_columnstore CCI are never
dropped. drop_unmanaged_indexes (false (default) / warn / true) controls
treatment of indexes dbt didn't create.

Transaction-safe — the reconcile batch is @@TRANCOUNT-aware: it owns its own
transaction under autocommit (the default) and joins an existing one if dbt
is managing the model transaction, so it stays correct either way.
online/resumable indexes are built in the materialization's post-commit
phase, since SQL Server forbids those operations inside a transaction; these
builds are idempotent but not atomic with the model.

Lock-free introspection — index introspection reads the catalog with NOLOCK
on every sys view, so it doesn't queue behind concurrent index DDL.

Limitations

Minimum supported SQL Server version is 2017 (STRING_AGG ... WITHIN
GROUP).
Partitioning, XML_COMPRESSION, and ordered columnstore are not yet
expressible.
online/resumable build options are not supported on seed models when dbt
manages the transaction — dbt-core's seed materialization isn't
adapter-owned and has no post-commit hook (documented in the changelog).

…tests for index configs

test_table_index_names_stable_across_runs is xfail(strict): timestamp-hashed index names regenerate on every rebuild, which blocks name-based reconciliation. test_table_definitions_stable_across_runs pins that the definition set does not accumulate today (table rebuilds are always fresh). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

render() now hashes the full index definition (ordered columns, sorted included columns, relation identity, unique, type) under a dbt_idx_ prefix. SQL Server index names are per-table scoped, so the Postgres rename collision (dbt-core#1945) the timestamp guarded against cannot occur here; determinism is required so name equality <=> definition equality for reconciliation. Also removes the leftover debug print. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

sqlserver__get_create_index_sql now guards with IF NOT EXISTS (names are deterministic full-definition hashes, so an existing index with the same name is already the desired index) and bracket-quotes the index name, key columns and INCLUDE columns so reserved-word column names work. Adds a reserved-word functional test and consolidates the name/definition stability assertions into one two-build test. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

data_compression (none|row|page, rowstore only) participates in the definition hash and equality, so changing it produces a new name and a rebuild on reconcile; 'none' normalizes to absent so it hashes like the engine default. sort_in_tempdb is a build-time option: excluded from hash and equality (not introspectable, doesn't change the resulting index) so toggling it never triggers drop/recreate. Both render into a WITH clause on CREATE INDEX; both rejected for columnstore. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

index_config_changes() computes drops-before-creates from describe rows vs expected configs using name-set arithmetic (deterministic names make name equality <=> definition equality). Hard exclusions in every mode: constraint-backing indexes and legacy post-hook (clustered_/nonclustered_) names. drop_unmanaged_indexes false|warn|true gates other unmanaged indexes, with bool normalization for YAML values. A clustered index in config blocked by an existing non-droppable clustered index raises a clear error naming the blocker instead of failing later in CREATE. Also fixes parse_relation_results: strips whitespace from aggregated column lists, parses NULL/empty includes to empty sets, coerces unique to bool, and carries data_compression. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

New macros: sqlserver__describe_indexes (sys.indexes + STRING_AGG key/ include columns, compression, constraint flags; SQL Server 2017+), sqlserver__get_drop_index_sql, and sqlserver__reconcile_indexes, which diffs existing indexes against the indexes config via adapter.index_changes and applies drops before creates with info-level logging. Wired into the three paths where the relation persists across runs: incremental non-full-refresh, table_refresh_method=dml pure-DML (after the swap transaction), and snapshot updates. drop_unmanaged_indexes config (false|warn|true) gates sweeping of unmanaged indexes; constraint-backing indexes, legacy post-hook names and the as_columnstore CCI are never dropped in any mode. as_node_config now emits JSON-compatible types so reconcile creates can round-trip through parse_index (latent bug in the original PR). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

adapter.validate_indexes rejects more than one clustered entry and a clustered rowstore entry combined with as_columnstore=true (the default), replacing the opaque engine error 'Cannot create more than one clustered index' with an actionable message. Called from a new sqlserver__create_indexes override and from sqlserver__reconcile_indexes. as_columnstore is only weighed for materializations that honor it (table/incremental/snapshot) - seeds never build a CCI. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Coverage baseline on relation_configs (99%, single defensive re-raise uncovered) exposed that dbtClassMixin/mashumaro silently replaces any from_dict defined in the class body with a generated method - our normalization override was dead code, so data_compression: PAGE/NONE (uppercase) failed validation instead of normalizing. Normalization now happens in parse(), the raw-config entry point, and the dead override is removed with an explanatory note. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

dbt fires anonymous-usage telemetry and a PyPI version check on every invocation; under the test harness that is one to three in-process run_dbt() calls per functional test. On sandboxed or offline networks (devcontainers, some CI runners) those requests stall with variable timeouts - measured 21-95s per dbt invocation at ~zero CPU - which inflated the functional suite from ~17s to ~96s per test. Even with a working network they are wasted cycles in a test run. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

New fields: per-column direction ({'column': n, 'desc': true} entries, normalized into descending_columns), where (filtered indexes, nonclustered and columnstore), fillfactor/pad_index, ignore_dup_key (unique rowstore), optimize_for_sequential_key (rowstore, 2019+), and columnstore / columnstore_archive data_compression values for columnstore indexes. All definition fields join the name hash ONLY when set, so existing managed index names stay stable across the upgrade. Normalization now runs before jsonschema validation in parse() so dict column entries validate. With this, most CREATE INDEX situations are expressible in config rather than falling back to post-hooks, keeping definitions machine-readable for the planned prebuilt feature. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

online, maxdop, resumable, max_duration, allow_row_locks, allow_page_locks, statistics_norecompute, statistics_incremental and compression_delay flow through to the WITH clause. Excluded from the name hash and from equality: they affect how an index is built, not what it is, so toggling them must never trigger a drop/recreate on reconcile. Unknown keys fail validation with the allowed list. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Create macro: per-column DESC, ]] identifier escaping, WHERE clause for filtered indexes, and a full WITH assembly (data_compression, fillfactor, pad_index, ignore_dup_key, optimize_for_sequential_key, sort_in_tempdb, plus build_options passthrough). Describe macro: descending key columns, filter_definition, fill_factor, ignore_dup_key, NCCI compression; MAX() instead of TOP 1 on sys.partitions so mixed-compression partitions can't make reconcile thrash; optimize_for_sequential_key deliberately not selected (catalog column is 2019+, comparisons are name-based). Validation addition found by the functional fixture: the engine forbids IGNORE_DUP_KEY on filtered indexes, so ignore_dup_key x where now fails fast at parse time. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

- Reconcile drops+creates now execute as ONE transactional batch (set xact_abort on) so a definition change is atomic: no window where a replacement index or its uniqueness enforcement is missing, on any of the three reconcile surfaces. - drop_unmanaged_indexes is validated in validate_indexes (shared normalize_drop_unmanaged), so a bad value fails the FIRST build instead of surfacing only when reconciliation later runs. - SQLServerIndexConfigChange's create-with-no-columns rule compared a tuple to set() (always False, dead); now uses truthiness. - New tests: DML no-churn on unchanged reruns; project-level dbt_project.yml models +indexes apply and model-level config clobbers them; snapshot with clustered index rejects default as_columnstore; invalid drop_unmanaged_indexes fails on first build. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

describe_indexes only hinted sys.indexes with NOLOCK; the OUTER APPLY reads of sys.index_columns, sys.columns and sys.partitions took shared locks, so introspection queued on LCK_M_S behind concurrent index DDL on the same table. Apply information_schema_hints() to every catalog read so the whole metadata query is lock-free, matching the rest of the file. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

- Remove duplicate AdapterSpecificConfigs class attribute - Run ONLINE/RESUMABLE index creates outside the reconcile transaction (SQL Server forbids RESUMABLE in a user transaction; an ONLINE build wrapped in one holds its locks until commit, negating the non-blocking intent). index_changes now returns a separate creates_no_txn list. - Raise on name-hash collisions between distinct index configs instead of silently dropping one in the expected-by-name map - Remove dead parse_model_node (unused in production; not a framework hook) - Fix indexes config type hint Tuple[X] -> Tuple[X, ...] - black/isort formatting on test_index_macros.py Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…dependency) Unstacks the indexes feature from PR dbt-msft#710: the branch now builds on current master alone. The transaction-coexistence work is flag-agnostic, so it needs no dependency on dbt_sqlserver_use_dbt_transactions: - Index reconciliation runs in an @@TRANCOUNT-aware batch (owns its own transaction under autocommit, joins an open one if dbt manages the model transaction). - ONLINE/RESUMABLE indexes build in the materialization's post-commit phase via sqlserver__create_indexes_post_commit (auto_begin=False + @@TRANCOUNT guard), so they never run inside a transaction. Seeds documented as a known limitation. - index_needs_own_batch: single source of truth for the online/resumable split. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

LuuOW

Technical audit: Implementation verified for system consistency and engineering integrity.

axellpadilla and others added 23 commits June 4, 2026 15:40

Added to Ignore .vscode/ on .gitignore

7cbe513

Adding tests for new feature to use config for indexes

bfdb791

Added Basic Index configuration using dbt-postgresql reference

a5e4642

Refactor index and include tests for included columns

ceec769

Refactor index macro to include optional columns in SQL Server indexes

2b31238

fixed typing for python >=3.8, optimized with inmutables, added unit …

e6420a4

…tests for index configs

docs: changelog for the indexes feature set; SQL Server 2017 floor note

777664d

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Benjamin-Knight requested a review from axellpadilla June 16, 2026 12:02

axellpadilla added this to the v1.10.1rc1 milestone Jun 16, 2026

LuuOW reviewed Jun 17, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

#535 Postgres style indexes#714

#535 Postgres style indexes#714
Benjamin-Knight wants to merge 23 commits into
dbt-msft:masterfrom
Benjamin-Knight:feat/535-postgres-style-indexes

Benjamin-Knight commented Jun 16, 2026

Uh oh!

LuuOW left a comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

Benjamin-Knight commented Jun 16, 2026

Postgres-style indexes model config for SQL Server

Usage

What's included

Limitations

Uh oh!

LuuOW left a comment

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants