Workload Run
A Workload Run is created each time a Workload Job Definition is triggered. The Workload Run page shows the overall execution status of a Workload Job Definition, with detailed information on each Run. A Workload Job Definition that has only Data Sources returns the Data Source loading history and the overall status of the Workload Job Definition, while a system-defined Workload Job Definition returns the Data Source loading history and the status of the related Data Pipeline. This is one of the main functions of the Data Pipeline service.
A Workload Job Definition can have multiple Workload Runs (based on the trigger).
What happens during a Workload Run?
- Data Sources are loaded into the destination Data Lake.
- Once loading is completed, the Data Pipeline service is triggered and, where applicable, initiates the Argo Workflow.
- Once the process is completed, the Workload Run is updated with the relevant statuses and details.
The Workload Run page includes:
1. Workload Run summary
2. Data Sources - Indicates the load status and the destination Data Lake of each connected Data Source.
3. Actions - Indicates the Action execution history.
4. View Logs option to navigate to the Workload Log page.
5. View Workload option to navigate back to the related Workload Job Definition page.
1. Workload Run summary
Data Properties:
Property | Description |
---|---|
Workload Name | Name of the respective Workload Job Definition. |
Start Time | Specifies the timestamp when the Workload Job Definition starts Loading. |
End Time | Specifies the timestamp when the execution of the Workload Job Definition is completed during each Run. |
Status | This is the execution status of the Workload Job Definition, based on the Workload Run. |
Error Message | Error message related to a Workload Run. |
Workload Job Definition - Status transition
Starting Status | Transitioned Status | Context |
---|---|---|
New | Loading | New Workload Job Definition's connected Data Sources are loading. |
Loading | Loading Error | Error while loading connected Parquet Data Sources. |
Loading | Executing | When Parquet Data Source loading is completed, Workload Job Definition's execution is triggered by the scheduler. |
Executing | Executed | Workload Job Definition is executed successfully and Data Source load history is updated. |
Executing | Executed with Errors | Workload Job Definition execution was not successful. Scenario 1: All Data Sources are loaded successfully, but a Workflow execution error occurred. Scenario 2: No Data Sources are defined in the Workload Job Definition. Scenario 3: Some Data Sources are not loaded successfully, so an execution error occurred. |
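The transitions in the table above can be read as a small state machine. The following is a minimal sketch that encodes only the transitions listed in the table; the names `WorkloadStatus`, `ALLOWED_TRANSITIONS`, and `can_transition` are hypothetical and are not part of the product. Statuses with no outgoing transition in the table are treated here as final for a Run, which is an assumption.

```python
from enum import Enum


class WorkloadStatus(Enum):
    """Status values taken from the transition table (names are illustrative)."""
    NEW = "New"
    LOADING = "Loading"
    LOADING_ERROR = "Loading Error"
    EXECUTING = "Executing"
    EXECUTED = "Executed"
    EXECUTED_WITH_ERRORS = "Executed with Errors"


# Only the transitions documented in the table; anything else is rejected.
ALLOWED_TRANSITIONS = {
    WorkloadStatus.NEW: {WorkloadStatus.LOADING},
    WorkloadStatus.LOADING: {
        WorkloadStatus.LOADING_ERROR,
        WorkloadStatus.EXECUTING,
    },
    WorkloadStatus.EXECUTING: {
        WorkloadStatus.EXECUTED,
        WorkloadStatus.EXECUTED_WITH_ERRORS,
    },
}


def can_transition(current: WorkloadStatus, target: WorkloadStatus) -> bool:
    """Return True if the table lists a transition from current to target."""
    return target in ALLOWED_TRANSITIONS.get(current, set())
```

For example, `can_transition(WorkloadStatus.LOADING, WorkloadStatus.EXECUTING)` is accepted, while a jump from `New` straight to `Executed` is not, because the table has no such row.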
2. Data Sources Tab
Indicates the load history of each Data Source that's included in the Workload Job Definition.
Data Properties:
Property | Description |
---|---|
Data Source | This is the name of the Parquet Data Source that is included in the relevant Workload Job Definition. |
Queue Time | This timestamp indicates when the Parquet Data Source loading job is pushed into the queue. |
Load Start Time | The timestamp when the Data Pump starts loading the related Data Source. |
Load End Time | The timestamp when the Data Pump completes the loading of the Parquet Data Source. |
Load Status | The status of the Data Source load, updated as it transitions during the loading process. |
Destination | The destination Data Lake into which the respective Data Source is loaded. |
3. Actions Tab
Indicates the Action run history.
Data Properties:
Property | Description |
---|---|
Name | The identifier of the specific Workflow. |
Type | Specifies the Workload Action type in a Workload Job Definition. |
Instance Name | Unique name of the Argo Workflow that is started for the relevant Action. |
Start Time | Specifies the timestamp when a request is invoked to start the Workflow. |
Run Start Time | Specifies the time when the Workflow status goes from 'Pending' to 'Running', after the request is invoked. |
End Time | This specifies the end time of the Workflow Run. |
Status | Transitioned status of a Workflow during each Workload Action Run. |
Spark Log | The Apache Spark job log that the Data Pipeline service returns when the logs for an executed Workflow are requested. |
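The difference between Start Time and Run Start Time is easiest to see as two durations: the time a Workflow spends in 'Pending' after the request is invoked, and the time it spends running. The sketch below assumes the three timestamps are available as `datetime` values; the `ActionRun` class and its field names are hypothetical and only mirror the table columns.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class ActionRun:
    """Illustrative record mirroring the Start Time, Run Start Time, and End Time columns."""
    start_time: datetime                       # request invoked to start the Workflow
    run_start_time: Optional[datetime] = None  # status went from 'Pending' to 'Running'
    end_time: Optional[datetime] = None        # Workflow Run ended

    def pending_duration(self) -> Optional[timedelta]:
        """Time between the start request and the Workflow entering 'Running'."""
        if self.run_start_time is None:
            return None
        return self.run_start_time - self.start_time

    def run_duration(self) -> Optional[timedelta]:
        """Time between the Workflow entering 'Running' and its end."""
        if self.run_start_time is None or self.end_time is None:
            return None
        return self.end_time - self.run_start_time
```

A Workflow that has not yet left 'Pending' has no Run Start Time, so both methods return `None` for it in this sketch.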
Workload Action Status transition
Starting Status | Transitioned Status | Context |
---|---|---|
Pending | Running | Workflow execution is initiated. |
Running | Succeeded | Workflow execution completed successfully. |
Running | Error | Workflow execution completed with errors. |
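As an illustration, a recorded sequence of Workload Action statuses can be checked against the table above. This is a sketch only: `ACTION_TRANSITIONS` and `is_valid_history` are hypothetical names, and the status strings are taken directly from the table.

```python
# Transitions listed in the Workload Action status table.
ACTION_TRANSITIONS = {
    "Pending": {"Running"},
    "Running": {"Succeeded", "Error"},
}


def is_valid_history(history):
    """Return True if every consecutive pair of statuses is a listed transition.

    An empty or single-status history has no transitions to check and is
    therefore treated as valid.
    """
    return all(
        nxt in ACTION_TRANSITIONS.get(cur, set())
        for cur, nxt in zip(history, history[1:])
    )
```

With this check, `["Pending", "Running", "Succeeded"]` passes, while `["Pending", "Succeeded"]` does not, since the table has no direct transition from 'Pending' to 'Succeeded'.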
How to navigate
To navigate to the Workload Run page, use the View Runs or Details option on the Workload Job Definitions page or the Workload Job Definition page. The Workload Run page then displays Workload Runs filtered according to the option used:
Selection | Results in Workload Run Page |
---|---|
View Runs option from Workload Job Definitions page and Workload Job Definition page (page level) | Runs are filtered based on the Workload ID |
Details option from the Workload Runs section of the Workload Job Definition page | Runs are filtered based on the Workload Run ID |
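The two filters in the table can be sketched as follows. The `WorkloadRun` record and the `filter_runs` helper are hypothetical and stand in for however the page actually queries Runs; only the idea of filtering by Workload ID or by Workload Run ID comes from the table.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class WorkloadRun:
    """Illustrative Run record; field names are assumptions."""
    run_id: int
    workload_id: int


def filter_runs(
    runs: List[WorkloadRun],
    workload_id: Optional[int] = None,
    run_id: Optional[int] = None,
) -> List[WorkloadRun]:
    """Apply the Workload ID filter (View Runs) and/or the Run ID filter (Details)."""
    result = runs
    if workload_id is not None:
        result = [r for r in result if r.workload_id == workload_id]
    if run_id is not None:
        result = [r for r in result if r.run_id == run_id]
    return result
```

In this sketch, View Runs corresponds to passing only `workload_id`, which can return several Runs, and Details corresponds to passing `run_id`, which narrows the result to a single Run.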