SelectDB (Doris) configurations
Config syntax:
{{ config(
    materialized = "table",
    duplicate_key = [ "<column-name>", ... ],
    partition_by = [ "<column-name>", ... ],
    partition_type = "<engine-type>",
    partition_by_init = [ "<partition-init>", ... ],
    distributed_by = [ "<column-name>", ... ],
    buckets = <int>,
    properties = { "<key>": "<value>", ... },
    ...
) }}
The following configurations are available when modeling on Doris/SelectDB:
materialized
How the model is materialized in Doris. Must be table to create a table model.
Required, can be defined in metadata
duplicate_key
The list of key columns for the Doris 'duplicate' table model.
Required
partition_by
The partition key list of Doris.
Optional
partition_type
The partition type of Doris.
Optional (default: RANGE)
partition_by_init
The partition rule, or a list of concrete partition definitions.
Optional
distributed_by
The bucket key list of Doris.
Required
buckets
The bucket number in one Doris partition.
Required
properties
Other Doris table properties.
Required
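As a concrete illustration of the options listed above, here is a minimal sketch of a table model. The model name `stg_events`, the column names, and the bucket count are hypothetical and chosen only for illustration; `replication_num` is a standard Doris table property, but the value you need depends on your cluster.

```sql
-- Hypothetical table model using the Doris 'duplicate' key model.
-- The referenced model and its columns are assumptions for illustration.
{{ config(
    materialized = "table",
    duplicate_key = ["event_id"],
    distributed_by = ["event_id"],
    buckets = 8,
    properties = {"replication_num": "1"}
) }}

select
    event_id,
    user_id,
    event_time
from {{ ref('stg_events') }}
```

The optional partition settings are left out here, so the sketch produces an unpartitioned table that is bucketed by `event_id`.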
Incremental model
An incremental Doris table must use the 'unique' table model and is configured using the following syntax:
Incremental table configuration
{{ config(
    materialized = "incremental",
    unique_key = [ "<column-name>", ... ],
    partition_by = [ "<column-name>", ... ],
    partition_type = "<engine-type>",
    partition_by_init = [ "<partition-init>", ... ],
    distributed_by = [ "<column-name>", ... ],
    buckets = <int>,
    properties = { "<key>": "<value>", ... },
    ...
) }}
Available configurations:
materialized
How the model is materialized in Doris. Must be incremental to create an incremental model.
Required
partition_type
The partition type of Doris.
Optional (default: RANGE)
partition_by_init
The partition rule, or a list of concrete partition definitions.
Optional
buckets
The bucket number in one Doris partition.
Required
Best practices
Here is a sample Doris model that is updated incrementally by partition:
{#
  unique_key:        key columns of the Doris Unique Key table; the partition
                     column is included because Doris requires partition
                     columns to be key columns in this table model
  partition_by:      partition by the date column
  partition_type:    RANGE or LIST
  partition_by_init: initial partition value
  distributed_by:    distribute rows by the order_id column
  buckets:           number of buckets per partition
  properties:        example Doris table properties
#}
{{
    config(
        materialized = "incremental",
        unique_key = ["order_id", "date"],
        partition_by = ["date"],
        partition_type = "RANGE",
        partition_by_init = ["2025-01-01"],
        distributed_by = ["order_id"],
        buckets = 10,
        properties = {
            "compression": "lz4",
            "replication_num": "2"
        }
    )
}}

SELECT
    id AS order_id,
    DATE(ordered_at) AS date,
    order_total,
    ordered_at,
    NOW() AS updated_at
FROM
    {{ source('ecom', 'raw_orders') }}
{% if is_incremental() %}
WHERE
    DATE(ordered_at) = '{{ var("dt") }}' -- Process only the current date's data
{% endif %}
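For orientation, a unique-key, RANGE-partitioned, hash-bucketed configuration like the one above corresponds roughly to Doris DDL of the following shape. This is a hand-written sketch, not the adapter's actual output: the table name, column types, partition name, and partition bound are all assumptions made for illustration.

```sql
-- Hypothetical Doris table matching the shape of the sample model.
-- Names, types, and the partition bound are illustrative assumptions.
CREATE TABLE orders_incremental (
    order_id    BIGINT,
    `date`      DATE,
    order_total DECIMAL(18, 2),
    ordered_at  DATETIME,
    updated_at  DATETIME
)
UNIQUE KEY (order_id, `date`)          -- key columns come first in the schema
PARTITION BY RANGE (`date`) (
    PARTITION p20250101 VALUES LESS THAN ('2025-01-02')
)
DISTRIBUTED BY HASH (order_id) BUCKETS 10
PROPERTIES (
    "replication_num" = "2"
);
```

Because the table uses the Unique Key model, re-loading one day's rows replaces existing rows that share the same key rather than duplicating them, which is what makes a per-date incremental run safe to repeat.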
For more usage details, see the official Doris documentation.