Documentation/Scheduling/Batch Schedules and Jobs

Revision as of 19:36, 8 October 2020 by Shaun.mccreery (talk | contribs)

Batch Schedules allow multiple scripts to be executed in a specific order at predefined times. This is useful when certain scripts depend on the successful completion of others, especially when these scripts are distributed across a number of hosts.

Batch Schedule Jobs can have any number of dependencies on other scripts completing. They can additionally be set to run at specific times, or on the presence, absence or content of a file.
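To illustrate what a file-based condition means in practice, here is a minimal Python sketch. It is not ScriptRunner's implementation: the function `file_condition_met`, the mode names, and the choice to treat "content" as "the file contains a given piece of text" are all assumptions made for the example.

```python
import tempfile
from pathlib import Path


def file_condition_met(path, mode, expected_text=""):
    """Hypothetical check for a file-based trigger.

    mode is one of "presence", "absence" or "content" (names assumed here).
    """
    target = Path(path)
    if mode == "presence":
        return target.exists()
    if mode == "absence":
        return not target.exists()
    if mode == "content":
        # Assumption: "content" means the file contains expected_text.
        return target.is_file() and expected_text in target.read_text()
    raise ValueError(f"unknown mode: {mode}")


with tempfile.TemporaryDirectory() as folder:
    flag = Path(folder) / "etl.done"  # hypothetical marker file
    print(file_condition_met(flag, "absence"))               # True, not created yet
    flag.write_text("status=OK\n")
    print(file_condition_met(flag, "presence"))              # True
    print(file_condition_met(flag, "content", "status=OK"))  # True
```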

Configuration and Usage




Batch Instances

A Batch Instance is a copy of a Batch Schedule, together with its jobs, that is used to run that schedule.

At the Batch Generation Time (configured in Application Configuration), the Batch Controller goes through all Batch Schedules and Batch Schedule Jobs and copies them into Batch Instances, setting the status of each copy to BUILDING.

The Batch Controller will then go through the newly created Batch Instance and join all the script dependencies together. Any Batch Instance Jobs that have dependencies on other jobs completing will be updated to have an AWAITING_DEPENDENCY status. The remaining jobs will then be set to a status of READY. At this point, the Batch Instance will start running.
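The status assignment described above can be sketched in Python. This is an illustration only, not the Batch Controller's actual code: the `Job` class, its fields, the script names and `resolve_statuses` are hypothetical, and only the status names BUILDING, AWAITING_DEPENDENCY and READY come from the text.

```python
from dataclasses import dataclass, field


@dataclass
class Job:
    """Hypothetical stand-in for a Batch Instance Job."""
    name: str
    predecessors: list = field(default_factory=list)  # jobs this one waits on
    status: str = "BUILDING"  # every freshly copied job starts as BUILDING


def resolve_statuses(jobs):
    """Move each job out of BUILDING once its dependencies are known."""
    for job in jobs:
        if job.predecessors:
            job.status = "AWAITING_DEPENDENCY"
        else:
            job.status = "READY"
    return {job.name: job.status for job in jobs}


instance = [
    Job("first.sh"),
    Job("second.sh", predecessors=["first.sh"]),
]
statuses = resolve_statuses(instance)
print(statuses)
```

Jobs without predecessors become READY straight away; everything else waits in AWAITING_DEPENDENCY until the jobs it depends on have completed.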

Batch Controller

If running Batch Schedules, you will need to designate one of your Script Hosts to be the Batch Controller. Only one host can be a Batch Controller at any given time.

The Batch Controller is responsible for generating the Batch Instances each day, and for managing the dependencies between all Batch Instance Jobs. If this host is offline, Batch Instances may not progress.

Example Schedule

This example demonstrates a simple use case for Batch Scheduled Scripts. In this example we have three scripts:

- housekeeping.sh
- ETL.sh
- cacheClear.sh [website]

Every night, we need to run housekeeping.sh, followed by ETL.sh, and finally cacheClear.sh twice with different arguments. This gives a schedule flow like the one below.

Example Batch Schedule

                                                 ==============================
                                                 +                            +
                                           -->   + cacheClear.sh website1.com +
===================         ============         +          (host 1)          +
+                 +         +          +         ==============================
+ housekeeping.sh +   -->   +  ETL.sh  +      
+    (host 1)     +         + (host 2) +         ==============================
===================         ============         +                            +
                                           -->   + cacheClear.sh website2.com +
                                                 +          (host 1)          +
                                                 ==============================


This schedule ensures that ETL.sh does not run before housekeeping.sh has finished, and that neither run of cacheClear.sh starts until ETL.sh has completed. It also allows both cacheClear.sh runs to happen in parallel. All of this works even though the scripts run on different hosts.
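The ordering in the diagram can be reproduced with a small dependency graph. The following Python sketch uses the standard library's `graphlib.TopologicalSorter` (Python 3.9 or later) to group the four jobs into waves, where jobs in the same wave have no dependency on each other and could run in parallel. The `schedule` dictionary and `run_waves` function are illustrative names, not part of ScriptRunner, and host assignment is left out.

```python
from graphlib import TopologicalSorter

# Each job maps to the set of jobs it must wait for (taken from the diagram).
schedule = {
    "housekeeping.sh": set(),
    "ETL.sh": {"housekeeping.sh"},
    "cacheClear.sh website1.com": {"ETL.sh"},
    "cacheClear.sh website2.com": {"ETL.sh"},
}


def run_waves(graph):
    """Group jobs into waves; jobs in one wave may run in parallel."""
    sorter = TopologicalSorter(graph)
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # jobs whose predecessors are done
        waves.append(ready)
        sorter.done(*ready)  # mark the whole wave as completed
    return waves


waves = run_waves(schedule)
for number, wave in enumerate(waves, start=1):
    print(number, wave)
```

This produces three waves: housekeeping.sh alone, then ETL.sh alone, then both cacheClear.sh runs together.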

To do this, we first need to create the Batch Schedule to run these jobs under. From the menu, select Schedules > Batch Schedules > Manage Batch Schedules, and then click the New button.

CreateBatchScheduleExample1.png

We can fill in this schedule as shown above.

Once this is created, we can add our scripts, each of which is referred to as a Batch Schedule Job. From the menu, select Schedules > Batch Schedules > Manage Batch Schedule Jobs, and click the New button.

CreateBatchScheduleJobExample1Job1.png

We can create the first job as shown above. We have set this to run at 23:00 every night. Once this is created, we can then add the second job.

CreateBatchScheduleJobExample1Job2.png

As shown above, this time we need to set the first script as a predecessor job of this script. This ensures that ETL.sh does not start until housekeeping.sh has completed successfully. If the Allow Failures option is enabled, ETL.sh runs even if housekeeping.sh reports that it failed.
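The effect of the Allow Failures option on a dependent job can be summarised in a few lines of Python. This is a sketch of the behaviour described above, not ScriptRunner's code: the function `can_start` and the outcome names "SUCCESS", "FAILED" and "RUNNING" are assumptions chosen for the example.

```python
def can_start(predecessor_outcome, allow_failures):
    """Decide whether a dependent job may start.

    predecessor_outcome is a hypothetical label for how the predecessor ended:
    "SUCCESS", "FAILED", or anything else if it has not finished yet.
    """
    if predecessor_outcome == "SUCCESS":
        return True
    if predecessor_outcome == "FAILED":
        # A failed predecessor only unblocks the job if failures are allowed.
        return allow_failures
    return False  # predecessor still pending or running


print(can_start("SUCCESS", allow_failures=False))  # True
print(can_start("FAILED", allow_failures=False))   # False
print(can_start("FAILED", allow_failures=True))    # True
print(can_start("RUNNING", allow_failures=True))   # False
```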

We can now create the two jobs that run our cacheClear.sh script, one for each of our websites.

CreateBatchScheduleJobExample1Job3.png

CreateBatchScheduleJobExample1Job4.png

In this example, we have also set the cacheClear.sh jobs to not start before 00:05 on the following day, as shown by the Start Time and Day Offset. If ETL.sh completes before this time, these jobs will wait until 00:05 to run.
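The interaction between Start Time, Day Offset and dependencies can be worked through with Python's `datetime` module. This is a sketch under assumptions: the function names are hypothetical, the batch date and the ETL.sh finish time are made-up sample values, and Day Offset is assumed to count whole days from the date the batch starts.

```python
from datetime import date, datetime, time, timedelta


def earliest_start(batch_date, start_time, day_offset):
    """Combine Start Time and Day Offset into a 'not before' timestamp."""
    return datetime.combine(batch_date + timedelta(days=day_offset), start_time)


def actual_start(dependencies_done_at, not_before):
    """A job starts once its dependencies are done AND its start time is reached."""
    return max(dependencies_done_at, not_before)


batch_date = date(2020, 10, 8)                 # sample date the batch starts (23:00 run)
not_before = earliest_start(batch_date, time(0, 5), day_offset=1)
etl_finished = datetime(2020, 10, 8, 23, 40)   # sample value: ETL.sh finishes early

print(actual_start(etl_finished, not_before))  # 2020-10-09 00:05:00
```

If ETL.sh instead finished after 00:05, for example at 00:30, the same calculation would return the later of the two times, so the cacheClear.sh jobs would start as soon as ETL.sh completed.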

The example Batch Schedule is now set up and will be ready to start at 23:00.