
Job



To run a pipeline, you can create a job that describes how it should be executed. Job configuration can include:

  • Target environment: Choose between running the pipeline in the development environment for testing, or applying it in the production environment after thorough validation.

  • Schedule time: Select a scheduling strategy for the pipeline, or trigger it manually.

  • Variable settings: Configure values for variables referenced in the models.

  • Notifications: Set up notifications for job failures via emails or third-party integrations.
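Purely as an illustration of what these settings cover (this is not ReOrc's actual configuration format; the field names and values below are hypothetical), the information a job bundles together might be summarized like this:

```python
# Hypothetical summary of a job's configuration; field names and values are
# illustrative only and do not reflect ReOrc's real schema or UI fields.
job = {
    "name": "daily_sales_refresh",
    "target_environment": "development",        # or "production" after validation
    "schedule": "0 6 * * *",                    # crontab-style schedule, or None to trigger manually
    "variables": {"start_date": "2024-01-01"},  # values for variables referenced in the models
    "notify_on_failure": ["email", "slack"],    # channels used when the job fails
}
```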

Tasks

During a job run, every node is wrapped inside a task - the basic unit of work that represents the operation to be performed on the asset. Tasks are executed in a specific order based on the assets' dependencies, which are derived from the upstream and downstream relationships in the DAG.
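As a rough illustration of that ordering rule (a minimal sketch using Python's standard library, not ReOrc's internal scheduler), upstream tasks always run before their downstream dependents:

```python
# Minimal sketch: derive an execution order from a DAG of asset dependencies.
# The graph below is hypothetical; ReOrc builds the real one from your project's lineage.
from graphlib import TopologicalSorter

dag = {
    "stg_orders": set(),
    "stg_customers": set(),
    "fct_sales": {"stg_orders", "stg_customers"},  # depends on both staging assets
    "sales_report": {"fct_sales"},
}

print(list(TopologicalSorter(dag).static_order()))
# e.g. ['stg_orders', 'stg_customers', 'fct_sales', 'sales_report']
```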

Depending on the pipeline type, the functionality of the task may differ.

For a modeling pipeline, each task performs two main steps (sketched below):

  1. Materialize the asset: execute the transformation query specified in the asset.

  2. Run data tests: run the associated data tests of the asset.
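In pseudocode terms, such a task boils down to something like the following; the `Asset` shape and the `execute` helper are hypothetical stand-ins, not ReOrc APIs:

```python
from dataclasses import dataclass, field

@dataclass
class Asset:
    name: str
    transformation_sql: str
    data_tests: list[str] = field(default_factory=list)

def execute(sql: str) -> None:
    print(f"running: {sql}")  # stand-in for submitting the query to the warehouse

def run_modeling_task(asset: Asset) -> None:
    # Step 1: materialize the asset by executing its transformation query.
    execute(asset.transformation_sql)
    # Step 2: run the data tests associated with the asset.
    for test_sql in asset.data_tests:
        execute(test_sql)
```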

An advanced pipeline can involve various types of operators and transformations, such as the TransferOperator for extracting and loading data, or the SQLOperator for performing ad-hoc data transformations.

After a task has finished running, you can inspect its execution logs in Task details.

By design, when a task fails, the entire job run is also marked as failed and all downstream tasks won't be executed.
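The sketch below illustrates that rule with a toy dependency graph (the graph and status labels are illustrative, not ReOrc's internal state model): once a task fails, everything downstream of it is skipped.

```python
def resolve_statuses(dag: dict[str, set[str]], failed: str) -> dict[str, str]:
    """Mark the failed task and skip every task downstream of it, directly or transitively."""
    statuses = {task: "success" for task in dag}
    statuses[failed] = "failed"
    changed = True
    while changed:
        changed = False
        for task, deps in dag.items():
            if statuses[task] == "success" and any(statuses[d] != "success" for d in deps):
                statuses[task] = "skipped"
                changed = True
    return statuses

dag = {"stg_orders": set(), "fct_sales": {"stg_orders"}, "sales_report": {"fct_sales"}}
print(resolve_statuses(dag, failed="stg_orders"))
# {'stg_orders': 'failed', 'fct_sales': 'skipped', 'sales_report': 'skipped'}
```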

Create a job

When you create a job from a pipeline, the job registers only the most recently published versions of the pipeline and associated models. To ensure that all changes are included, we recommend publishing the pipeline and models before creating the job.

Follow these steps to create a job:

  1. From the Pipelines tab, select a pipeline.

  2. Switch to the Jobs tab and click + Create a job.

  3. Provide the configuration for the job.

Provide the name, select the target environment, and customize the associated model variables.

Provide the schedule for the job and the trigger type:

  • Standard setup: trigger the run at a specific time on a recurring interval.

  • Advanced setup: specify the schedule in crontab format, for those familiar with the cron scheduler (see the examples after these steps).

  • Manually trigger: no scheduling; the job is triggered manually from the Pipeline Health dashboard.

Here you can configure notifications on job failure. By default, notifications are sent by email.

  4. Click Create.

The new job will be displayed in the Jobs section.
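For the advanced schedule setup mentioned above, crontab expressions use the standard five fields (minute, hour, day of month, month, day of week). A few common expressions, collected here purely for reference:

```python
# Standard crontab expressions: minute hour day-of-month month day-of-week.
EXAMPLE_SCHEDULES = {
    "0 6 * * *":  "every day at 06:00",
    "0 * * * *":  "every hour, on the hour",
    "30 9 * * 1": "every Monday at 09:30",
    "0 0 1 * *":  "at midnight on the first day of each month",
}
```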


You can visit the Pipeline Health dashboard to monitor the execution status of the job or, as a shortcut, click on the job name to navigate to Job details.
