Model configurations


While ReOrc provides the metadata panel for basic model configuration, you can use the config() macro (provided by dbt) to implement more granular and powerful controls over your model's behavior. This macro allows you to define model-specific configurations directly in your SQL files, giving you precise control over how your models are built and materialized.

Key features of the config() macro

  1. Materialization control: The config() macro enables users to determine how a model will be materialized in the analytics database. Common materialization strategies include:

    • Table: The model is created as a table.

    • View: The model is created as a view.

    • Incremental: The model is built incrementally, allowing for efficient updates to large datasets.

  2. Granular control: Configurations set with the config() macro can inherit or override settings defined in other locations, such as the model's metadata panel. You can define various configurations within the config() macro, including:

    • Setting unique keys for incremental models

    • Specifying incremental strategy

    • Specifying partition for large data volume

    • Configuring pre-hook and post-hook

The configuration block should be placed at the beginning of the model script:

{{ config(
    materialized='incremental',
    unique_key='id'
) }}

SELECT *
FROM {{ ref('source_table') }}
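Hooks can be set in the same block. The sketch below is illustrative only: the audit table, column names, and grant statement are hypothetical, though {{ this }} resolving to the current model's relation inside a hook string is standard dbt behavior:

-- pre_hook runs before the model is built; post_hook runs after.
-- audit_log and role_analyst are hypothetical names.
{{ config(
    materialized='table',
    pre_hook="insert into audit_log (model_name, run_at) values ('my_model', current_timestamp)",
    post_hook="grant select on {{ this }} to role_analyst"
) }}

SELECT *
FROM {{ ref('source_table') }}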

Configure incremental models

One popular usage of config() is to define the incremental strategy for data models.

Incremental is a materialization strategy designed to efficiently update tables in a data warehouse by only transforming and loading new or changed data since the last run. This approach significantly reduces the time and resources required for data transformations, making it especially useful for large datasets.

In a model's metadata panel, you can set the materialization option to incremental. By default, this option results in append-only behavior, where new data is simply added to the table with each build. To control how new and changed records are handled, specify the following configurations:

  • unique_key: Defines the unique identifier for records in the model, which helps dbt determine whether to insert new records or update existing ones.

  • incremental_strategy: Specifies how dbt should handle incremental updates. Common strategies include:

    • merge: Updates existing records and inserts new ones based on the unique key.

    • append: Simply adds new records without updating existing ones.

    • insert_overwrite: Overwrites existing records with new data based on specified conditions.

  • is_incremental(): This macro is essential for filtering which rows should be processed during an incremental run. It allows you to specify conditions to select only new or updated records since the last execution.

For example, in the orders table, we can implement incremental materialization to process only new or updated orders since the last run, reducing the processing time and ensuring the table remains up-to-date:

{{ config(
    materialized='incremental',
    unique_key='id',
    incremental_strategy='merge'
) }}

select * 
from {{ source("public", "raw_orders") }}
{% if is_incremental() %} -- only process new records
    where ordered_at >= (select max(ordered_at) from {{ this }})
{% endif %}
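For comparison, the following sketch assumes a BigQuery destination and uses the insert_overwrite strategy, which replaces whole partitions rather than matching individual rows, so no unique_key is needed. The partition_by form shown follows dbt's BigQuery adapter and differs on other platforms:

-- insert_overwrite rebuilds entire partitions; the partition_by
-- syntax below is specific to the BigQuery adapter.
{{ config(
    materialized='incremental',
    incremental_strategy='insert_overwrite',
    partition_by={'field': 'ordered_at', 'data_type': 'timestamp', 'granularity': 'day'}
) }}

select *
from {{ source("public", "raw_orders") }}
{% if is_incremental() %} -- rebuild only partitions with new data
    where ordered_at >= (select max(ordered_at) from {{ this }})
{% endif %}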

Platform-specific configurations

As different database platforms use varying approaches to optimize data processing, some configurations are designed and applied specifically for each platform. For detailed information on these platform-specific configurations and behaviors, refer to the pages under Platform Specific.
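As one illustration of such differences, this sketch assumes a BigQuery destination: dbt's BigQuery adapter accepts a cluster_by option that most other adapters do not (the model and column names here are hypothetical):

-- cluster_by is specific to the BigQuery adapter; other platforms
-- such as Doris/SelectDB expose their own options instead.
{{ config(
    materialized='table',
    cluster_by=['customer_id']
) }}

select *
from {{ ref('stg_orders') }}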