columns
- Models
- Sources
- Seeds
- Snapshots
- Analyses
version: 2
models:
- name: <model_name>
columns:
- name: <column_name>
data_type: <string>
description: <markdown_string>
quote: true | false
data_tests: ...
config:
tags: ...
meta: ...
- name: <another_column>
...
version: 2
sources:
- name: <source_name>
tables:
- name: <table_name>
columns:
- name: <column_name>
description: <markdown_string>
data_type: <string>
quote: true | false
data_tests: ...
config:
tags: ...
meta: ...
- name: <another_column>
...
version: 2
seeds:
- name: <seed_name>
columns:
- name: <column_name>
description: <markdown_string>
data_type: <string>
quote: true | false
data_tests: ...
config:
tags: ...
meta: ...
- name: <another_column>
...
version: 2
snapshots:
- name: <snapshot_name>
columns:
- name: <column_name>
description: <markdown_string>
data_type: <string>
quote: true | false
data_tests: ...
config:
tags: ...
meta: ...
- name: <another_column>
version: 2
analyses:
- name: <analysis_name>
columns:
- name: <column_name>
description: <markdown_string>
data_type: <string>
- name: <another_column>
Columns are not resources in and of themselves. Instead, they are child properties of another resource type. They can define sub-properties that are similar to properties defined at the resource level:
tags
meta
data_tests
description
Because columns are not resources, their tags
and meta
properties are not true configurations even when nested under a config
block. They do not inherit the tags
or meta
values of their parent resources. However, you can select a generic test, defined on a column, using tags applied to its column or top-level resource; see test selection examples.
Columns may optionally define a data_type
, which is necessary for:
- Enforcing a model contract
- Use in other packages or plugins, such as the
external
property of sources anddbt-external-tables
quote
The quote
field can be used to enable or disable quoting for column names.
- Models
- Sources
- Seeds
- Snapshots
- Analyses
version: 2
models:
- name: model_name
columns:
- name: column_name
quote: true | false
version: 2
sources:
- name: source_name
tables:
- name: table_name
columns:
- name: column_name
quote: true | false
version: 2
seeds:
- name: seed_name
columns:
- name: column_name
quote: true | false
version: 2
snapshots:
- name: snapshot_name
columns:
- name: column_name
quote: true | false
version: 2
analyses:
- name: analysis_name
columns:
- name: column_name
quote: true | false
Default
The default quoting value is false
Explanation
This is particularly relevant to those using Snowflake, where quoting can be particularly fickle.
This property is useful when:
- A source table has a column that needs to be quoted to be selected, for example, to preserve column casing
- A seed was created with
quote_columns: true
(docs) on Snowflake - A model uses quotes in the SQL, potentially to work around the use of reserved words
select user_group as "group"
Without setting quote: true
:
- Data tests applied to this column may fail due to invalid SQL
- Documentation may not render correctly, e.g.
group
and"group"
may not be matched as the same column name.
Example
Add data tests to a quoted column in a source table
This is especially relevant if using Snowflake:
version: 2
sources:
- name: stripe
tables:
- name: payment
columns:
- name: orderID
quote: true
data_tests:
- not_null
Without quote: true
, the following error will occur:
$ dbt test -s source:stripe.*
Running with dbt=0.16.1
Found 7 models, 22 tests, 0 snapshots, 0 analyses, 130 macros, 0 operations, 0 seed files, 4 sources
13:33:37 | Concurrency: 4 threads (target='learn')
13:33:37 |
13:33:37 | 1 of 1 START test source_not_null_stripe_payment_order_id............ [RUN]
13:33:39 | 1 of 1 ERROR source_not_null_stripe_payment_order_id................. [ERROR in 1.89s]
13:33:39 |
13:33:39 | Finished running 1 tests in 6.43s.
Completed with 1 error and 0 warnings:
Database Error in test source_not_null_stripe_payment_order_id (models/staging/stripe/src_stripe.yml)
000904 (42000): SQL compilation error: error line 3 at position 6
invalid identifier 'ORDERID'
compiled SQL at target/compiled/jaffle_shop/schema_test/source_not_null_stripe_payment_orderID.sql
This is because dbt is trying to run:
select count(*)
from raw.stripe.payment
where orderID is null
Instead of:
select count(*)
from raw.stripe.payment
where "orderID" is null