Overview


Last updated 15 days ago

The Pipeline module in ReOrc enables you to orchestrate data workflows efficiently. Rather than manually running models to retrieve results, you can define pipelines that automate the entire data transformation process. Pipelines represent your data workflows; with them you can schedule periodic runs, manage dependencies between models, and monitor workflow status in real time.

In ReOrc, a pipeline is another type of asset that can contain multiple models. When you create a pipeline, ReOrc automatically determines the relationships between models, derived from data lineage, and visualizes them as a directed acyclic graph (DAG).
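To illustrate the idea, here is a minimal sketch of how model dependencies taken from lineage form a DAG whose topological order is a valid run order. The model names and the `lineage` mapping are hypothetical examples, not ReOrc's actual internals; the sketch uses Python's standard-library `graphlib`.

```python
from graphlib import TopologicalSorter

# Hypothetical lineage: each model mapped to the set of upstream
# models it depends on (illustrative names, not from ReOrc).
lineage = {
    "stg_orders": set(),
    "stg_customers": set(),
    "int_order_items": {"stg_orders"},
    "mart_revenue": {"int_order_items", "stg_customers"},
}

# A pipeline run must execute each model only after its upstream
# dependencies; a topological sort of the DAG gives one valid order.
run_order = list(TopologicalSorter(lineage).static_order())
print(run_order)
```

Because the graph is acyclic, every model appears after all of its dependencies, so staging models run first and the final mart runs last.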

The Pipeline module is designed to provide:

  • A high-level view: Pipelines provide an accessible, high-level visualization of the data workflow, allowing data practitioners and stakeholders to easily understand the workflow's objectives and outcomes.

  • End-to-end validation: Pipelines automatically trigger data tests for models and assets during each run as a form of regression analysis. This helps you identify any issues introduced when chaining together several transformations.

  • Simplified troubleshooting: Results and logs are instantly displayed in the Pipeline Health dashboard, where you can inspect details of each task to trace errors and resolve issues efficiently.

Key features

The module currently supports the following features:

  • Modeling pipeline: Create pipelines from data models.

  • Job: Configure jobs and schedule runs from pipelines.