Rocoto

Workflow Management System


Introduction

Workflow Management is a concept that originated in the 1970s to handle business process management. Workflow management systems were developed to manage complex collections of business processes that must be carried out in a particular way, with complex interdependencies and requirements. Scientific Workflow Management is much newer and is very much like its business counterpart, except that it is usually data oriented rather than process oriented. That is, scientific workflows are driven by the scientific data that "flows" through them. Scientific workflow tasks are usually triggered by the availability of some kind of input data, and a task's result is usually some kind of data that is fed as input to another task in the workflow. The individual tasks themselves are scientific codes that perform some kind of computation, or retrieve or store some type of data for a computation. So, whereas a business workflow is comprised of a diverse set of processes that have to be completed in a certain way, sometimes carried out by a machine and sometimes by a human being, a scientific workflow is usually comprised of a set of computations that are driven by the availability of input data.

Why Workflow Management?

The day when a scientist could conduct his or her numerical modeling and simulation research by writing, running, and monitoring the progress of a modest Fortran code or two is quickly becoming a distant memory. Researchers now often have to make hundreds or thousands of runs of a numerical model to get a single result. In addition, each end-to-end "run" of the model often entails running many different codes for pre- and post-processing in addition to the model itself. And, in some cases, multiple models and their associated pre- and post-processing tasks are coupled together to build a larger, more complex model. The codes that comprise these end-to-end modeling systems often have complex interdependencies that dictate the order in which they can be run. And, in order to run the end-to-end system efficiently, concurrency must be used when dependencies allow it. The problem of scale and complexity is exacerbated by the fact that these codes are usually run on high performance machines that are notoriously difficult for scientists to use, and which tend to exhibit frequent failures. As machines get larger and larger, the failure rate of hardware and software components increases commensurately. Ad-hoc management of the execution of a complex modeling system is often difficult even for a single end-to-end run on a machine that never fails. Multiply that by the thousands of runs needed to perform a scientific experiment, in a hostile computing environment where hardware and facility outages are not uncommon, and you have a very challenging situation. For simulations that must run reliably in realtime, the situation is almost hopeless. The traditional ad-hoc techniques for automating the execution of modeling systems (e.g. driver scripts, batch job chains or trees) do not provide sufficient fault tolerance for the scale and complexity of current and future workflows, nor are they reusable; each modeling system requires a custom automation system.

A Workflow Management System addresses the problems of complexity, scale, reliability, and reusability by providing two things:

  • A high-level means by which to describe the various codes that need to be run, along with their runtime requirements and interdependencies.
  • An automation engine for reliably managing the execution of the workflow.

Prerequisites

Depending on how the components of a modeling system and the existing software for running them are designed, some changes may be necessary to make use of a workflow management system. To take full advantage of the features offered by a workflow management system, the modeling system components must be well designed. In particular, the following best practices should be followed:

  • Each workflow task must correctly check for its successful completion and must return a non-zero exit status upon failure. An exit status of 0 is interpreted as success, regardless of what actually happened.
  • No workflow task should contain automation features. Automation is the workflow management system's responsibility, and a workflow management system cannot manage tasks or jobs that it is not aware of.
  • Enable reuse of workflow tasks by using principles of modular design to build autonomous model components with well-defined interfaces for input and output that can be run stand-alone.
  • Prefer the construction of small model components that do only one thing. It is easy to combine several small, well-designed components to build a larger, more complex workflow task; it is generally much more difficult to divide large, complex model components into smaller ones to form multiple workflow tasks.
  • Avoid combining serial and parallel processing in the same workflow task unless the serial processing is very short in duration.
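
For example, a minimal task script that follows the first practice might look like the sketch below (the paths and the output check are hypothetical; the key point is that the script verifies its own work and returns a non-zero exit status on failure):

  #!/bin/sh
  # Hypothetical sketch: run the work, then verify the output before reporting success
  ${WRF_HOME}/bin/run_model || exit 1                # propagate a failed model run immediately
  [ -s ${WRF_HOME}/wrfprd/wrfout_final ] || exit 2   # fail if the expected output is missing or empty
  exit 0                                             # report success only after the checks pass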

How The Rocoto Workflow Management Engine Works

Our workflow management system, Rocoto, works differently than most other workflow management systems. It is designed to be a self-contained system that runs entirely in user space. That is, it is easy for end-users to install and run without help from systems administrators. Rocoto interfaces to the local resource management system. It does not do any job scheduling itself; it merely submits jobs to the HPC system as task dependencies allow. Local policies and users' allocations of HPC resources are enforced by the local resource management system (e.g. PBS, Torque, MOAB, SGE, LSF, etc.), so Rocoto has no control over when jobs start to run. Finally, Rocoto is designed to run weather and climate workflows; it is not a general purpose workflow management engine. Rocoto runs one instance of the workflow for each of a set of user-defined "cycles". A "cycle" usually corresponds to a model analysis or initialization time.

Describing Workflows

Rocoto currently supports one method for describing workflows. Other methods may be made available in the future. Presently, users must define their workflows using a simple, custom XML language. The custom XML language consists of a set of tags and attributes that define what tasks are to be run and what their interdependencies are. They also define the runtime requirements of the tasks, such as batch queueing options and environment variables, as well as automation controls. Details of the XML language are given below.

Executing Workflows

Rocoto is a Ruby program that interfaces to the underlying batch system. It does all the bookkeeping necessary to submit tasks when their dependencies are satisfied and tracks the progress of the workflow. It will automatically resubmit failed tasks and can recover from system outages without human intervention.

The rocotorun command is used to "run" a workflow. And, as we describe below, this command must be executed many times in order to "run" a workflow to completion.

rocotorun -w /path/to/workflow/xml/file -d /path/to/workflow/database/file

The rocotorun command requires two arguments. The -w flag specifies the name of the workflow definition file. For now, this must be an XML file containing the workflow definition. The -d flag specifies the name of the database file that is to be used to store the state of the workflow. The database file is a binary file created and used only by Rocoto and need not exist prior to the first time the command is run.

It is very important to understand that the process of "running" a workflow is iterative. Each time the above command is executed, Rocoto performs the following actions:

  1. Read the last known state of the workflow from the database file specified by the -d flag
  2. Query the batch system to acquire the current state of the workflow
  3. Take actions based on new state information. This includes things such as
    1. Resubmit jobs that crashed
    2. Submit jobs for tasks whose dependencies have just become satisfied
  4. Save the current state of the workflow to the file specified by the -d flag
  5. Quit

This means that each time Rocoto is run, there is an opportunity, but not a guarantee, for advancing the workflow's state toward completion. The amount of progress made each time Rocoto is run is nondeterministic and depends on many factors, including the nature of the workflow being processed, load on the HPC system, the batch system scheduler, delivery of data from external sources, etc. Rocoto typically runs for less than a minute each time it is invoked. It is important to understand that workflow state does not advance except when Rocoto is run; that is why it needs to be run repeatedly in order to run a workflow to completion. The primary advantage of this strategy of running workflows is that it is fault tolerant. Since each execution of the rocotorun command typically takes only a few seconds, the successful completion of a workflow does not hinge upon a machine being operational for the entire duration of the workflow (which could be days, months, or years!). And, since the last known state of the workflow is saved on disk, rather than kept in volatile RAM, failure of the machine on which Rocoto is running does not require restarting the workflow from scratch or changing the workflow definition. Instead, as soon as the machine returns to operation, the workflow can be resumed from where it left off without having to modify the workflow definition at all. Further, if the database file is kept on a shared filesystem, the workflow can be resumed immediately using a different machine as long as that machine has access to the batch system.

Since the process of "running" workflows is iterative, it is usually necessary to run Rocoto as a cron job. Although running Rocoto repeatedly by hand is useful for testing and debugging, it is not practical for production workflows. Rocoto should be run from cron at some regular interval, usually once every 2-10 minutes. The appropriate interval can be determined by looking at how long most jobs in the workflow take to run, and/or how crucial it is for new jobs to start as soon as their prerequisite jobs finish. A typical cron entry looks something like this:

*/5 * * * * /path/to/rocotorun -w /path/to/workflow/definition/file -d /path/to/workflow/database/file

It is important to note that the interval chosen in the crontab has nothing to do with when jobs are submitted. Rocoto determines when jobs are launched based on the cycle times and dependencies defined in the workflow definition file. The interval chosen for running Rocoto corresponds to the maximum amount of time that can pass between when a task's dependencies become satisfied and when that task gets submitted. The longest acceptable interval should be chosen; excessively short intervals between runs of Rocoto place unnecessary load on the system and do not allow the workflow to run any faster. In most cases it is inappropriate for a non-realtime workflow to use an interval of less than 5 minutes.

Checking the status of workflows

Rocoto provides two tools for checking the status of workflows. These tools display status information and can help diagnose workflow problems.

The rocotostat tool allows the user to query the status of a set of cycles and tasks.

rocotostat [-h] [-v #] -d database_file -w workflow_document [-c cycle_list] [-t task_list] [-m metatask_list] [-a] [-s] [-T]

The -w and -d options identify which workflow run you want to query. The -c option allows you to select specific cycles to query. The -t option allows you to select specific tasks to query (some use of regular expression patterns is allowed). The -T option sorts the output by task name rather than by cycle. The -s option provides a summary report of the status of the cycles themselves rather than information about the tasks. The -m option allows you to select specific metatasks to query.
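
For example, the following invocation (paths are illustrative) reports the status of the wrf task for the 00Z January 1, 2011 cycle:

  rocotostat -w /path/to/workflow.xml -d /path/to/workflow.db -c 201101010000 -t wrf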

The rocotocheck tool allows users to query detailed information about a specific task for a specific cycle.

rocotocheck [-h] [-v #] -d database_file -w workflow_document [-c cycle_list] [-t task_list] [-m metatask_list] [-a]

The -w and -d options identify which workflow run you want to query. The -c option identifies the cycle of the tasks to query. The -t option identifies the name(s) of the task(s) to query. The -m option identifies the metatask(s) to query. This command provides a great deal of detail about the particular tasks, including the reasons (if any) why they cannot be submitted.
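
For example, the following invocation (paths are illustrative) prints detailed information about the wrf task for the 00Z January 1, 2011 cycle:

  rocotocheck -w /path/to/workflow.xml -d /path/to/workflow.db -c 201101010000 -t wrf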

Forcing tasks to run

Rocoto provides a tool, rocotoboot, that can be used to force a task to be submitted, regardless of whether its dependencies are satisfied or throttling violations would occur. It CANNOT be used to force tasks to run that have expired due to deadlines being reached or cycle life spans being exceeded.

rocotoboot [-h] [-v #] -d database_file -w workflow_document [-c cycle_list] [-t task_list] [-m metatask_list] [-a]

The -w and -d options identify which workflow run you want to boot. The -c option identifies the cycle(s) of the task(s) to boot. The -t option identifies the name(s) of the task(s) to boot. The -m option identifies the metatask(s) to boot.
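
For example, the following invocation (paths are illustrative) forces submission of the wrf task for the 00Z January 1, 2011 cycle:

  rocotoboot -w /path/to/workflow.xml -d /path/to/workflow.db -c 201101010000 -t wrf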

Rerunning tasks (version 1.2.2 and higher)

Rocoto provides a tool, rocotorewind, that can be used to "undo" the execution of tasks. It allows the user to undo the execution of portions of a workflow in the event an error was found, for example. The workflow can then be resumed, and the tasks rerun, after the error is corrected. Although users should always ensure their tasks are idempotent, it is sometimes necessary to perform cleanup when a task is rewound. See the section on the <rewind> tag below for a description of how to define rewind actions for tasks that need to perform cleanup when they are rewound.

rocotorewind [-h] [-v #] -d database_file -w workflow_document [-c cycle_list] [-t task_list] [-m metatask_list] [-a]

The -w and -d options identify which workflow run you want to rewind. The -c option identifies the cycle for the task(s) to be rewound. The -t option identifies the task(s) to rewind. The -m option identifies the metatask(s) to rewind.
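
For example, the following invocation (paths are illustrative) undoes the wrf task for the 00Z January 1, 2011 cycle so that it can be rerun:

  rocotorewind -w /path/to/workflow.xml -d /path/to/workflow.db -c 201101010000 -t wrf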

Forcing tasks to complete (version 1.3.0 and higher)

Rocoto provides a tool, rocotocomplete, that can be used to force the completion of tasks. It allows the user to mark portions of a workflow as successfully completed, for example to ignore previous errors. The workflow can then be resumed, and the remaining tasks run to their normal completion.

rocotocomplete [-h] [-v #] -d database_file -w workflow_document [-c cycle_list] [-t task_list] [-m metatask_list] [-a]

The -w and -d options identify which workflow run you want to modify. The -c option identifies the cycle for the task(s) to be completed. The -t option identifies the task(s) to mark as complete. The -m option identifies the metatask to mark as completed.
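
For example, the following invocation (paths are illustrative) marks the wrf task for the 00Z January 1, 2011 cycle as complete:

  rocotocomplete -w /path/to/workflow.xml -d /path/to/workflow.db -c 201101010000 -t wrf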

Vacuuming databases (version 1.2.2 and higher)

Rocoto provides a tool, rocotovacuum, that can be used to shrink the size of large databases. This is sometimes needed if a database becomes very large and consumes too much disk space or slows performance. However, Rocoto vacuums databases automatically by default, so this command is rarely used. Rocotovacuum allows the user to remove database entries for cycles that have long since completed or expired and are no longer needed. The workflow can then be resumed with a much smaller database. Use this command with caution.

rocotovacuum [-h] [-v #] -d database_file -w workflow_document -a age

The -w and -d options identify which workflow run you want to vacuum. The -a option specifies the maximum age of cycles for which you want to keep data.

Rocoto XML Language

Rocoto uses a very simple XML language to define workflows. It is specifically targeted for weather and climate simulations, and is not intended to be a catch-all for any type of simulation research. Therefore, it may not be amenable to certain types of research.

XML Header

Every XML file must have this text at the top:

  <?xml version="1.0"?> 
  <!DOCTYPE workflow
  [
  ]>

This is required by XML parsers, and it tells the parsers what kind of document this is. It has no relevance to anything else in the workflow, and can be pretty much ignored.

ENTITIES

The only exception to the above is that it is often very useful to define constants, called ENTITIES, that can be referenced in other parts of the document. The use of ENTITIES is crucial for creating documents that are easy to maintain. The idea is that you use the ENTITY to represent a value that is used in lots of places in the XML. Then, if you ever have to change it, you only need to change it in one place. The definitions of the ENTITIES go between the [ and the ] in the XML header shown above. For example,

  <?xml version="1.0"?>
  <!DOCTYPE workflow
  [
      <!ENTITY WRF_HOME "/lfs0/projects/wrf/arw/13km">
  ]>

In the above example, an ENTITY called WRF_HOME is defined and given a value of /lfs0/projects/wrf/arw/13km. Now, anytime we need to use /lfs0/projects/wrf/arw/13km later in the document, we can use the WRF_HOME ENTITY instead. Then, if the ENTITY definition for WRF_HOME is changed, all places where it is referenced will see the updated value. This has huge benefits for building XML documents that are easy to maintain.

An ENTITY is referenced by the following syntax: &ENTITY_NAME;

And, you can reference an ENTITY when defining another ENTITY. For example,

  <?xml version="1.0"?>
  <!DOCTYPE workflow
  [ 
      <!ENTITY WRF_HOME  "/lfs0/projects/wrf/arw/13km">
      <!ENTITY LOG       "&WRF_HOME;/log"> 
  ]>

In the above example, the LOG ENTITY references the WRF_HOME entity in its definition. So, changing the value of WRF_HOME will also change the value of LOG. ENTITIES can be used almost anywhere, but cannot be used to hold the values of tag or attribute names. They can be used to hold attribute values, however. For example,

  <task name="&TASK_NAME;" maxtries="3">

In the above example, the TASK_NAME ENTITY (assumed to have been declared in the header) is used when defining the value of the name attribute of a task tag.

The <workflow> tag

This is the main tag for defining workflows. Everything except for the header described above must be contained within the <workflow> tag. It has attributes for specifying the scheduler and throttling parameters.

The realtime attribute

The realtime attribute defines whether the workflow is to be run in realtime mode or in retrospective mode. If realtime is set to "T" or "True", the workflow will be run in realtime. If it is set to "F" or "False", it will be run in retrospective mode. The difference between realtime and retrospective mode is in how cycles are activated. In realtime mode, cycles are activated based on the current time of day. That is, the 00Z cycle for Jan 1, 2009 will be activated as soon as the current time of day is actually 00Z Jan 1, 2009. It will never be activated sooner or later than that time. In retrospective mode, there is no time dependency that must be adhered to. Thus, cycles will be activated immediately, in chronological order, subject to throttling constraints (described below). Each time Rocoto runs, new cycles may be activated if the throttling parameters allow it.

The following example illustrates how to set a workflow to run in realtime mode:

  <workflow realtime="T">

      Everything else goes in here

  </workflow>

The scheduler attribute

The scheduler attribute must be set to one of: sge, lsf, torque, moabtorque, or moab. This attribute tells Rocoto which batch system to use when managing the workflow. It is currently not possible to use more than one batch system within the same workflow. On Jet and Zeus, the value should be moabtorque. On Gaea, the value should be moab. On Yellowstone and Geyser, the value should be lsf. On WCOSS, the value should be lsf. NOTE: In order for Rocoto to operate correctly with the Torque and MoabTorque schedulers those resource management systems must be configured to keep information about completed jobs long enough such that Rocoto can retrieve the information after the jobs have finished. It is recommended that the keep time be set to a minimum of 24 hours.

  <workflow realtime="F" scheduler="moabtorque">

      Everything else goes here 

  </workflow>

The cyclelifespan attribute

The cyclelifespan attribute specifies how long a cycle can be active before it expires. The length of time is specified in dd:hh:mm:ss format. This attribute is most useful for realtime workflows in situations where you want to stop processing a cycle after some deadline, regardless of whether it has completed successfully or not. Wall clock time requests are automatically capped such that they will not exceed the time when a cycle expires.

For example, the following will cause cycles to expire one hour after they are activated

  <workflow realtime="F" scheduler="moabtorque" cyclelifespan="0:01:00:00">

      Everything else goes here 

  </workflow>

The cyclethrottle attribute

The cyclethrottle attribute allows you to limit how many cycles may be active at one time. This is probably most useful for retrospective workflows and helps users manage the cpu and disk resources their workflows consume. Cycles are active if they have not expired and not all of their tasks have completed successfully.

For example, the following will prevent more than 5 cycles from being active at one time.

  <workflow realtime="F" scheduler="moabtorque" cyclelifespan="0:01:00:00" cyclethrottle="5">

      Everything else goes here 

  </workflow>

The corethrottle attribute

The corethrottle attribute allows you to limit the total number of cores that may be consumed by jobs submitted to the batch system. This is probably most useful for retrospective workflows and helps users manage the cpu and disk resources their workflows consume. Any job that is submitted to the batch system, whether it is running or queued, counts against this total.

For example, the following will prevent more than 3 cores from being consumed by queued or running jobs.

  <workflow realtime="F" scheduler="moabtorque" cyclelifespan="0:01:00:00" cyclethrottle="5" corethrottle="3">

      Everything else goes here 

  </workflow>

The taskthrottle attribute

The taskthrottle attribute allows you to limit how many tasks may be active at one time. This is probably most useful for retrospective workflows and helps users manage the cpu and disk resources their workflows consume. A task is active if a job has been submitted for it and that job has not finished. Any job that is submitted to the batch system, whether it is running or queued, counts against this total.

For example, the following will prevent more than 5 tasks from being active at one time.

  <workflow realtime="F" scheduler="moabtorque" cyclelifespan="0:01:00:00" cyclethrottle="5" taskthrottle="5">

      Everything else goes here 

  </workflow>

The <log> tag

The <log> tag defines the path and name of Rocoto log(s). It can be anything, but usually it is best to define the name of the log to be dynamically dependent on the cycle being processed. This will put everything for a particular cycle in its own log file.

  <workflow realtime="T" scheduler="moabtorque" cyclelifespan="0:01:00:00" cyclethrottle="5" corethrottle="3">

      <log><cyclestr>&LOG;/workflow_@Y@m@d@H.log</cyclestr></log>

      Everything else goes here 

  </workflow>

In the above example, the workflow log is defined to be named /lfs0/projects/wrf/arw/13km/log/workflow_yyyymmddhh.log where yyyymmddhh represents the cycle time. Note the use of the LOG ENTITY to represent the path of the log file. The <cyclestr> tag is special and is described below.

The <cyclestr> tag

It is often necessary to refer to the various components of the current cycle time when defining various aspects of the workflow. For example, when defining the name of the workflow log as in the previous example, we need to specify the year, month, day, and hour of the cycle. Since the cycle is a dynamic quantity, we can't just put a fixed value in there; we need a special tag to represent it. The <cyclestr> tag accomplishes this. The <cyclestr> tag uses flags to represent the various time components of the current cycle:

   @a The abbreviated weekday name ("Sun")
   @A The full weekday name ("Sunday")
   @b The abbreviated month name ("Jan")
   @B The full month name ("January")
   @c The preferred local date and time representation ("Thu Jul  5 11:27:59 2012")
   @d Day of month (01..31)
   @H Hour of the day, 24 hour clock (00..23)
   @I Hour of the day, 12 hour clock (01..12)
   @j Day of the year (001..366)
   @m Month of the year (01..12)
   @M Minute of the hour (00..59)
   @p Meridian indicator ("AM" or "PM")
   @P Meridian indicator ("am" or "pm")
   @s Number of seconds since January 1, 1970 00:00:00 UTC
   @S Second of the minute (00..60)
   @U Week number of the current year, starting with the first Sunday as the first day of the first week (00..53)
   @W Week number of the current year, starting with the first Monday as the first day of the first week (00..53)
   @w Day of week (Sunday is 0, 0..6)
   @x Preferred representation for the date alone ("07/05/12")
   @X Preferred representation for the time alone ("11:34:58")
   @y Year without century (00..99)
   @Y Year with century
   @Z Time zone name

Rocoto may be processing multiple cycles at once. Any time one of the above flags appears inside a <cyclestr> ... </cyclestr> block, those flags will be replaced with the appropriate date/time component of the current cycle being processed. These flags can be used in any combination to represent any date/time string desired. For example,

  <cyclestr>@Y@m@d@H</cyclestr>

One very nice feature is that other text can appear within a <cyclestr> tag. For example,

  <log><cyclestr>/my/path/to/the/log/file/workflowlog_@Y@m@d@H</cyclestr></log>

The line above shows how the <cyclestr> tag might be used inside a <log> tag to specify the names of Rocoto log files.

The offset attribute

Sometimes it is necessary to be able to represent not just the current cycle, but some offset before or after the current cycle. This can be accomplished through the use of the offset attribute. The offset attribute should be set to the time that needs to be added to the current cycle time to yield the time that is desired. The format of the offset attribute is dd:hh:mm:ss where dd=days, hh=hours, mm=minutes, and ss=seconds. Also, leading fields whose values are 0 do not need to be specified. For example, all of the following are equivalent:

  <cyclestr offset="3600">@Y@m@d@H</cyclestr>

  <cyclestr offset="1:00:00">@Y@m@d@H</cyclestr>

  <cyclestr offset="00:60:00">@Y@m@d@H</cyclestr>

  <cyclestr offset="60:00">@Y@m@d@H</cyclestr>

All of the above represent the yyyymmddhh of the current cycle, plus one hour. The offset can also be negative. To represent one hour earlier than the cycle time, simply make the first character of the offset a minus sign. Use of the full time representation is highly recommended to make your workflows easier to read. Consider, for example, which of the following two equivalent XML snippets is easier to interpret:

  <cyclestr offset="-9:00:00">@H</cyclestr>

  <cyclestr offset="-32400">@H</cyclestr>

Both of the above specify the hour that is 9 hours previous to the current cycle, but most people don't know off the top of their heads that there are 32400 seconds in 9 hours, so the first form is more readable.

The <cycledef> tag

The <cycledef> tag defines the set of cycles the workflow is to be run on. You can have as many <cycledef> tags as you want. In some cases, it may be necessary to use multiple <cycledef> tags to represent the set of cycles desired. Rocoto uses the union of all sets of cycles specified by the <cycledef> tags to create the overall cycle pool. If there is overlap between definitions, it will not cause cycles to be run twice. There are two ways to specify cycles, and you may mix and match the two as needed.

The start, stop, step method

The first method for specifying cycles is to specify a start cycle, an end cycle, and an increment. The format of the start and stop cycles is yyyymmddhhmm. The format of the increment is dd:hh:mm:ss.

For example,

  <workflow realtime="T">

      <log><cyclestr>&LOG;/workflow_@Y@m@d@H.log</cyclestr></log>

      <cycledef>201101010000 201112311800 06:00:00</cycledef> 

  </workflow>

The <cycledef> tag in the example above specifies all 00Z, 06Z, 12Z, and 18Z cycles beginning with January 1, 2011 00:00:00 and ending with December 31, 2011 18:00:00.

The crontab-like method

The second method for specifying cycles is to use a crontab-like format to represent groups of cycles. There are six fields that must be defined: minute, hour, day, month, year, and weekday. Each field can be a single value, a range of values, a comma separated list of values, etc.

  <workflow realtime="T">

     <log><cyclestr>&LOG;/workflow_@Y@m@d@H.log</cyclestr></log>

     <cycledef group="15min">*/15 * * * 2006-2010 *</cycledef> 

     <cycledef group="hourly">0 * * * 2006-2010 *</cycledef> 

     <cycledef group="3hourly">0 */3 * * 2006-2010 *</cycledef> 

     <cycledef group="6hrlyJanFeb">0 */6 * 1,2 2006-2010 *</cycledef> 

  </workflow>

The above example shows four <cycledef> tags. Each of these defines a list of minutes, hours, days, months, years, and weekdays. A * is shorthand for all values of that field. So, the * in the second field means all hours, and the * in the third field means all days. In the example above, the first <cycledef> defines a set of cycles consisting of every 15 minutes of every hour of every day and every month of the years 2006-2010. The second <cycledef> tag defines a set of cycles consisting of minute 0 of every hour of every day for the years 2006 through 2010.

The group attribute

Sometimes it is necessary to define distinct sets of cycles because some tasks should only be run for certain subsets of cycles. The group attribute of the <cycledef> tag allows you to assign a set of cycles to a group that can later be used to control which tasks are run for which cycles. Multiple <cycledef> tags may be assigned to the same group, but a <cycledef> tag may not be assigned to more than one group.
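
For example, the following two <cycledef> tags (with illustrative values) are both assigned to the same group, so a task whose cycledefs attribute lists "gfs" would run for the union of the two sets of cycles:

  <cycledef group="gfs">0 0,12 * * 2011 *</cycledef> 
  <cycledef group="gfs">0 6,18 * * 2011 *</cycledef>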

The activation offset attribute (version 1.2 and higher)

Sometimes it is necessary, in realtime mode, for a cycle to be activated before or after the wall clock time reaches the cycle time. In these cases the activation_offset attribute may be used to specify an offset from the cycle time that changes when cycles are activated. Note that the cycle's life span starts when it is activated, not when the wall clock time is equal to the cycle time. The cycle activation offset is in DD:HH:MM:SS format and can be either positive or negative.


  <cycledef activation_offset="-01:00:00">0 */6 * * 2016 *</cycledef>

In the above example, the 00Z, 06Z, 12Z, and 18Z cycles would be activated at 23Z, 05Z, 11Z, and 17Z respectively.

The <task> tag

The <task> tag is the bread and butter of the workflow. It defines the computations that you want to run. The example below shows the basic form of the <task> tag. The contents of the <task> tag have been left out for now for clarity.

  <workflow realtime="T">

     <log><cyclestr>&LOG;/workflow_@Y@m@d@H.log</cyclestr></log>

     <cycledef>201101010000 201112311800 06:00:00</cycledef> 
     <cycledef group="15min">*/15 * * * 2006-2010 *</cycledef> 
     <cycledef group="hourly">0 * * * 2006-2010 *</cycledef> 
     <cycledef group="3hourly">0 */3 * * 2006-2010 *</cycledef> 
     <cycledef group="6hrlyJanFeb">0 */6 * 1,2 2006-2010 *</cycledef> 

     <task name="wrf" cycledefs="3hourly,6hrlyJanFeb" maxtries="3" final="false">

     </task>

  </workflow>

In the example above, a task named wrf has been defined. The task will run only for the cycles belonging to the 3hourly and 6hrlyJanFeb cycle groups, Rocoto will make no more than 3 attempts to run the task if it fails, and the task is not designated as a final task. Each of these attributes is described below.

The name attribute

Every task must have a unique name defined. It can be almost anything, but there can never be two tasks in the same workflow with the same name. This name is used to define task dependencies which are described later.

The cycledefs attribute

The cycledefs attribute is optional. If set, its value must be a comma separated list of <cycledef> tag group names. If it is not set, the task will be run for every cycle in the general pool of cycles. The general pool of cycles is the union of all sets of cycles, both with, and without group ids, specified in the <cycledef> tags. If the cycledefs attribute is set, the task will only be run for cycles that are defined by the <cycledef> tags having the group ids listed.

The maxtries attribute

The maxtries attribute is optional. When Rocoto detects that a task has failed, it will attempt to resubmit it. The maxtries attribute limits the number of times a task can be retried before Rocoto gives up.

HINT: If the number of tries has been exhausted, but you want to retry it one more time, just modify the XML on-the-fly and increment this value. Rocoto will rerun the task again unless the cycle has expired.

The throttle attribute (version 1.1 and higher)

The throttle attribute is optional. If set, its value must be a positive integer. The throttle limits the number of instances of the task which may be queued or running at any one time. There is one instance of the task per cycle. Therefore, the task throttle limits the number of cycles for which the task may be active at any one time.
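
For example, the following (illustrative) task would be limited to two queued or running instances, and therefore two active cycles, at any one time:

  <task name="wrf" cycledefs="3hourly" maxtries="3" throttle="2">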

HINT: The <task> throttle attribute should be avoided. In most situations, the various <workflow> throttles and/or the <metatask> throttle provide far better throttling control and should always be preferred. Do not use the <task> throttle unless the other throttling methods cannot meet the throttling requirements of the workflow.

The final attribute (version 1.2 and higher)

The final attribute is optional. If set, its value must be either "true" or "false". The final attribute allows you to designate one or more tasks as "final" tasks. A final task is a task that will, upon successful completion, cause the entire cycle to be considered complete and deactivated. This can be useful for preventing stale cycles from accumulating in cases where not every task will execute for all cycles. This scenario can happen in situations where runtime conditions dictate which tasks get executed. Multiple tasks may be designated as "final" tasks, but the first one to complete successfully will cause the cycle to be deactivated. Once a cycle is deactivated due to a "final" task completing successfully, any running tasks will continue to be tracked until they complete, but no tasks will be retried and no other tasks will be started.
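
For example, the following (illustrative) task is designated as a final task; when it completes successfully, its cycle is considered complete and is deactivated:

  <task name="archive" cycledefs="3hourly" maxtries="3" final="true">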

The <command> tag

Every <task> tag must contain a <command> tag. This is the command that carries out the task's work and is the command that Rocoto will submit to the batch system for execution. Command arguments are permitted.

  <task name="wrf" cycledefs="3hourly,6hrlyJanFeb" maxtries="3">

     <command><cyclestr>&SCRIPTS;/wrf.ksh -c @Y@m@d@H</cyclestr></command>

  </task >

The <account> tag

Every <task> tag should contain an <account> tag. However, it is optional to allow for situations where a default may be used. This defines the batch system account/project that Rocoto will use when submitting the task to the batch system for execution. This is usually not the same thing as the user's login name.

  <task name="wrf" cycledefs="3hourly,6hrlyJanFeb" maxtries="3">

     <command><cyclestr>&SCRIPTS;/wrf.ksh -c @Y@m@d@H</cyclestr></command>

     <account>dtc</account>

  </task >

The <queue> tag

Every <task> tag should contain a <queue> tag. However, it is optional to allow for situations where a default may be used. This defines the batch system queue or QOS that Rocoto will submit the task to for execution. The queue/QOS name is inherently machine specific.

  <task name="wrf" cycledefs="3hourly,6hrlyJanFeb" maxtries="3">

     <command><cyclestr>&SCRIPTS;/wrf.ksh -c @Y@m@d@H</cyclestr></command>

     <account>dtc</account>

     <queue>nserial</queue>

  </task >

The <partition> tag (version 1.3.0 and higher)

A <task> tag may contain a <partition> tag for use on batch systems that support partitions (SGE, Moab, Torque, Slurm). This defines the batch system partition that Rocoto will submit the task to for execution. The partition name is inherently machine specific.

  <task name="wrf" cycledefs="3hourly,6hrlyJanFeb" maxtries="3">

     <command><cyclestr>&SCRIPTS;/wrf.ksh -c @Y@m@d@H</cyclestr></command>

     <account>dtc</account>

     <partition>service</partition>

  </task >

The <cores> tag

Every <task> tag must contain one <cores> tag, one <nodes> (see below) tag, or one <native> (see below) tag. The <cores> tag defines the number of cores that Rocoto will request when submitting the task for execution.

  <task name="wrf" cycledefs="3hourly,6hrlyJanFeb" maxtries="3">

     <command><cyclestr>&SCRIPTS;/wrf.ksh -c @Y@m@d@H</cyclestr></command>

     <account>dtc</account>

     <queue>nserial</queue>

     <cores>1</cores>

  </task >

The <nodes> tag (version 1.1 and higher)

Every <task> tag must contain one <nodes> tag, one <cores> (see above) tag, or one <native> (see below) tag. The <nodes> tag defines the task geometry (a list of node counts and cores per node) that Rocoto will request when submitting the task for execution. The <cores> tag should always be preferred. The <nodes> tag should only be used if the task has a requirement to use less than all of the available cores on at least one of its nodes. The <nodes> tag is primarily intended to support OpenMP/MPI hybrid codes, MPMD codes that consist of multiple components that must not share nodes, and parallel codes that require some ranks to be on their own nodes for memory consumption reasons.

The format of the contents of the <nodes> tag is very similar to the syntax of the Moab/Torque "-l nodes=..." option. The differences are that the "ppn" (processes per node) is always required, and there is an optional "tpp" (threads per process) that should be used for OpenMP/MPI hybrid tasks. The format of <nodes> is easiest to illustrate with examples.

For single threaded codes (i.e. processes do not spawn threads):

  <nodes>1:ppn=1</nodes>                  <!-- request 1 core on 1 node -->
  <nodes>1:ppn=4</nodes>                  <!-- request 4 cores on 1 node -->
  <nodes>10:ppn=4</nodes>                 <!-- request 4 cores on each of 10 nodes -->
  <nodes>1:ppn=1+10:ppn=12</nodes>        <!-- request 1 core on the first node and 12 cores on each of the next 10 nodes -->
  <nodes>1:ppn=1+2:ppn=4+3:ppn=8</nodes>  <!-- request 1 core on the first node, 4 cores on each of the next 2 nodes, and 8 cores on each of the last 3 nodes -->

For threaded applications (e.g. OpenMP/Hybrid codes), you must specify the number of threads each process will spawn:

  <nodes>1:ppn=1:tpp=12</nodes>                       <!-- request 1 core on 1 node; each process will spawn 12 threads -->
  <nodes>1:ppn=4:tpp=3</nodes>                        <!-- request 4 cores on 1 node; each process will spawn 3 threads -->
  <nodes>10:ppn=4:tpp=3</nodes>                       <!-- request 4 cores on each of 10 nodes; each process will spawn 3 threads -->
  <nodes>1:ppn=1+10:ppn=12</nodes>                    <!-- request 1 core on the first node and 12 cores on each of the next 10 nodes; by default none of the processes are threaded -->
  <nodes>1:ppn=1+2:ppn=4:tpp=4+3:ppn=8:tpp=2</nodes>  <!-- request 1 core on the first node, 4 cores and 4 threads per process on each of the next 2 nodes, and 8 cores and 2 threads per process on each of the last 3 nodes -->

Finally, the following shows a simple example in context of the <task> tag.

  <task name="wrf" cycledefs="3hourly,6hrlyJanFeb" maxtries="3">

     <command><cyclestr>&SCRIPTS;/wrf.ksh -c @Y@m@d@H</cyclestr></command>

     <account>dtc</account>

     <queue>nserial</queue>

     <nodes>1:ppn=1+10:ppn=12</nodes>

  </task >

The <walltime> tag

Every <task> tag must contain a <walltime> tag. This defines the amount of wallclock time that Rocoto will request when submitting the task for execution. The requested walltime is automatically reduced so as not to exceed the amount of time remaining before the task expires.

  <task name="wrf" cycledefs="3hourly,6hrlyJanFeb" maxtries="3">

     <command><cyclestr>&SCRIPTS;/wrf.ksh -c @Y@m@d@H</cyclestr></command>

     <account>dtc</account>

     <queue>nserial</queue>

     <cores>1</cores>

     <walltime>00:00:10</walltime>

  </task >

The <memory> tag

The <memory> tag is usually only needed for serial tasks. This defines the amount of memory that Rocoto will request when submitting the task for execution.

  <task name="wrf" cycledefs="3hourly,6hrlyJanFeb" maxtries="3">

     <command><cyclestr>&SCRIPTS;/wrf.ksh -c @Y@m@d@H</cyclestr></command>

     <account>dtc</account>

     <queue>nserial</queue>

     <cores>1</cores>

     <walltime>00:00:10</walltime>

     <memory>512M</memory>

  </task >

The <jobname> tag

Every <task> tag should contain a <jobname> tag because it helps in visual tracking of jobs in queue status outputs. However, it is optional to allow for situations where a default may be used. This defines the name that will be assigned to the job when submitting the task for execution.

  <task name="wrf" cycledefs="3hourly,6hrlyJanFeb" maxtries="3">

     <command><cyclestr>&SCRIPTS;/wrf.ksh -c @Y@m@d@H</cyclestr></command>

     <account>dtc</account>

     <queue>nserial</queue>

     <cores>1</cores>

     <walltime>00:00:10</walltime>

     <memory>512M</memory>

     <jobname>test</jobname>

  </task >

The <deadline> tag

Every <task> tag may contain a <deadline> tag to specify a time by which the task must complete successfully. This time should be in YYYYMMDDHHMM format, which means that its contents should usually be a <cyclestr> tag with a positive offset. Wall clock time requests specified in <walltime> tags are automatically capped at job submission such that they will not exceed the deadline.

  <task name="wrf" cycledefs="3hourly,6hrlyJanFeb" maxtries="3">

     <command><cyclestr>&SCRIPTS;/wrf.ksh -c @Y@m@d@H</cyclestr></command>

     <account>dtc</account>

     <queue>nserial</queue>

     <cores>1</cores>

     <walltime>00:00:10</walltime>

     <memory>512M</memory>

     <jobname>test</jobname>

     <deadline><cyclestr offset="2:00:00">@Y@m@d@H@M</cyclestr></deadline>

  </task >

The <stdout>, <stderr>, and <join> tags

Every <task> tag should contain either both <stdout> and <stderr> tags, or a <join> tag, because this helps in tracking and finding the output of your jobs. However, they are optional to allow for situations where defaults may be used. These tags define the location of the stdout and stderr output of the job that executes the task. If either or both of <stdout> and <stderr> are used, <join> may not be used. If <join> is used, neither <stdout> nor <stderr> may be used. The <join> tag is used when you want both stdout and stderr to go to the same place. DO NOT set the value of <stdout> and <stderr> to the same thing; use <join> to do that.

  <task name="wrf" cycledefs="3hourly,6hrlyJanFeb" maxtries="3">

     <command><cyclestr>&SCRIPTS;/wrf.ksh -c @Y@m@d@H</cyclestr></command>

     <account>dtc</account>

     <queue>nserial</queue>

     <cores>1</cores>

     <walltime>00:00:10</walltime>

     <memory>512M</memory>

     <jobname>test</jobname>

     <join>/home/harrop/test/log/test.join</join>

  </task >

Or

  <task name="wrf" cycledefs="3hourly,6hrlyJanFeb" maxtries="3">

     <command><cyclestr>&SCRIPTS;/wrf.ksh -c @Y@m@d@H</cyclestr></command>

     <account>dtc</account>

     <queue>nserial</queue>

     <cores>1</cores>

     <walltime>00:00:10</walltime>

     <memory>512M</memory>

     <jobname>test</jobname>

     <stdout>/home/harrop/test/log/test.out</stdout>

     <stderr>/home/harrop/test/log/test.err</stderr>

  </task >

The <native> tag

This tag may be used to define raw batch system options that Rocoto will use when submitting jobs for this <task>. This is useful in cases where an option is required but is not (yet) implemented as a generic tag. For example, this can be used when requesting advanced reservations. Multiple instances of <native> are not allowed. If more than one native option is required, all of them must be specified within a single <native> tag.

Starting in version 1.3.0, <native> may be used in lieu of <cores> and <nodes> to request complex resources that require specifying the names of consumable resources.

  <task name="wrf" cycledefs="3hourly,6hrlyJanFeb" maxtries="3">

     <command><cyclestr>&SCRIPTS;/wrf.ksh -c @Y@m@d@H</cyclestr></command>

     <account>dtc</account>

     <queue>nserial</queue>

     <cores>1</cores>

     <walltime>00:00:10</walltime>

     <memory>512M</memory>

     <jobname>test</jobname>

     <join>/home/harrop/test/log/test.join</join>

     <native>-m e -M Christopher.W.Harrop@noaa.gov</native>

  </task >
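
For example, starting in version 1.3.0, a task could request consumable resources such as GPUs through <native> instead of <cores> or <nodes>. The following is an illustrative sketch using a Torque-style resource request; the exact syntax depends on the local batch system:

  <native>-l nodes=1:ppn=12:gpus=2</native>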

The <envar> tag

The <envar> tag is used inside <task> tags to define environment variables that must be passed to the task when it is executed. It consists of (name,value) pairs. All <envar> tags must contain a <name> tag. However, <value> tags can be absent in cases where a variable needs to be set, but does not need to be assigned a value.

  <task name="wrf" cycledefs="3hourly,6hrlyJanFeb" maxtries="3">

     <command><cyclestr>&SCRIPTS;/wrf.ksh -c @Y@m@d@H</cyclestr></command>

     <account>dtc</account>

     <queue>nserial</queue>

     <cores>1</cores>

     <walltime>00:00:10</walltime>

     <memory>512M</memory>

     <jobname>test</jobname>

     <join>/home/harrop/test/log/test.join</join>

     <native>-m e -M Christopher.W.Harrop@noaa.gov</native>

     <envar>
        <name>ANALYSIS_TIME</name> 
        <value><cyclestr>@Y@m@d@H@M</cyclestr></value> 
     </envar> 
     <envar> 
        <name>WRF_HOME</name>
        <value>&WRF_HOME;</value> 
     </envar> 

  </task >

In the above example, the ANALYSIS_TIME and WRF_HOME environment variables will be set to the values given (note the use of the <cyclestr> tags and ENTITIES) and passed to the task when it executes. This is an alternative method for passing parameters to the scripts when tasks are run.

The <rewind> tag

The <rewind> tag is optional. It is used to specify one or more rewind "actions" that are to be executed in the event a user runs the rocotorewind command to "undo" the task. The rewind actions are commands that undo the effects produced by a task when it runs. Typically, the <rewind> actions perform housekeeping chores such as removal of output files, restoration of input files, etc. Users should always strive to ensure that all tasks are idempotent. Idempotency can largely remove the need for <rewind> and will increase workflow reliability. However, sometimes a task will produce output that is used to trigger downstream tasks via their <datadep> tags. In those cases the old output must be removed in order to properly rerun those downstream tasks using the new output that is generated by the task that is rewound. The contents of the <rewind> tag must be one or more <sh> or <rb> tags that contain the commands required to perform the cleanup chores. See the descriptions of <sh> and <rb> below in the <dependency> section for details.

<task name="wrf">
...

  <rewind>
    <sh>rm -f /path/to/output/file</sh>
  </rewind>

...

</task>

The above illustrates how one can use <rewind> to ensure removal of output generated by previous runs in order to preserve workflow integrity when a task is rerun with the rocotorewind command.

The <dependency> tag

<dependency> tags are fundamental to every workflow. They are used inside <task> tags, and are used to describe the inter-dependencies of the tasks. It is this tag that is used to define things like "task B can't run until task A is complete", or "Task A can't run until this file shows up". Dependencies are defined as boolean expressions, which means that you have quite a lot of control over when tasks are eligible to run. There are three types of dependencies:

  • Task Dependencies
  • Data Dependencies
  • Time Dependencies

These dependencies can be combined in any combination in a boolean expression to form a task's overall dependency; a combined example is shown after the list of dependency tags below.

   <dependency> 

     Dependencies go here          

   </dependency>

The above example shows the basic format of the <dependency> tag. The <dependency> tag must contain exactly one tag, which must be one of the following:

  <taskdep>
  <datadep>
  <timedep>
  <streq>
  <strneq>
  <true>
  <false>
  <sh>
  <ruby>
  <cycleexistdep>  
  <and>
  <or>
  <not>
  <nand>
  <nor>
  <xor>
  <some>
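
For example, the boolean operator tags can be nested to combine several dependencies into a single expression. The following sketch (the task name, file name, and GFS_DIR ENTITY are illustrative) is satisfied when the real task has succeeded and either a GFS input file is available or three hours have passed since the cycle time:

   <dependency>
     <and>
       <taskdep task="real"/>
       <or>
         <datadep age="05:00"><cyclestr>&GFS_DIR;/gfs_@Y@m@d@H.grib2</cyclestr></datadep>
         <timedep><cyclestr offset="3:00:00">@Y@m@d@H@M@S</cyclestr></timedep>
       </or>
     </and>
   </dependency>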

The <taskdep> tag

Task dependencies are defined inside <dependency> tags with the <taskdep> tag as in the following example:

  <task name="wrf" cycledefs="3hourly,6hrlyJanFeb" maxtries="3">

     Other tags left out for clarity

     <dependency> 
        <taskdep task="real"/> 
     </dependency>

  </task>

The task attribute of the <taskdep> tag must be set to the name of the task which must complete in order for the dependency to be satisfied. The task attribute must refer to a task that has already been defined above in the XML document. That requirement ensures there are no circular dependencies.

The cycle_offset attribute

In some cases a task may have a dependency on a task from a different cycle. The cycle_offset attribute allows users to specify an offset from the current cycle to define inter cycle dependencies. The offset can be positive or negative. The following example illustrates a dependency on a task from the cycle 6 hours prior to the current cycle.

  <task name="ungrib" cycledefs="3hourly,6hrlyJanFeb" maxtries="3">

     Other tags left out for clarity

     <dependency> 
        <taskdep task="wrfpost_f006" cycle_offset="-6:00:00"/> 
     </dependency>

  </task>

The state attribute

The "state" attribute is optional. It allows you to specify whether you want the dependency to be satisfied when a task has completed successfully, or when a task has failed and exhausted retries. The state attribute may be set to either "Succeeded" or "Dead". The default value is Succeeded if the attribute is not specified. For example, the following is a dependency that will be satisfied when task completes successfully.

  <taskdep state="succeeded" task="X"/>

For example, the following is a dependency that will be satisfied when task X has failed and has exhausted its retries:

  <taskdep state="Dead" task="X"/>

The <metataskdep> tag (version 1.1 and higher)

Metatask dependencies are defined inside <dependency> tags with the <metataskdep> tag as in the following example:

  <task name="plots" cycledefs="3hourly,6hrlyJanFeb" maxtries="3">

     Other tags left out for clarity

     <dependency>
        <metataskdep metatask="posts"/>
     </dependency>

  </task>

The metatask attribute of the <metataskdep> tag must be set to the name of the metatask whose tasks must all complete in order for the dependency to be satisfied. The metatask attribute must refer to a metatask that has already been defined above in the XML document.

The <datadep> tag

Data dependencies are defined inside <dependency> tags with the <datadep> tag, as in the following example:

  <datadep age="120"><cyclestr>&WRF_HOME;/wrfprd/wrfout_d01_@Y-@m-@d_@H:@M:@S</cyclestr></datadep>

A data dependency is satisfied when the file in question exists, has not been modified for at least the amount of time specified by the age attribute, and is at least as large as the size specified by the minsize attribute.

The age attribute

The age attribute of the <datadep> tag is optional. It contains the time (in dd:hh:mm:ss format) that the file must not be modified before the file is considered to be available. This is useful for preventing partially written files from erroneously triggering the submission of a task. As with the offset attributes of the <cyclestr> and cycle component tags, leading 0's can be left off. The default for this attribute is 0. For examples, see below.

The minsize attribute

The minsize attribute of the <datadep> tag is optional. It contains the minimum size the file must reach before it is considered to be available. This is useful for preventing partially filled files from erroneously triggering the submission of a task. The default value for this attribute is 0. The units of the minsize attribute may optionally be specified by setting the last character of the value to one of the following:

  • B or b: Size in bytes
  • K or k: Size in kilobytes
  • M or m: Size in megabytes
  • G or g: Size in gigabytes

If no units are specified, the units are bytes.

  <datadep age="120" minsize="1024"><cyclestr>&WRF_HOME;/wrfprd/wrfout_d01_@Y-@m-@d_@H:@M:@S</cyclestr></datadep>

  <datadep age="00:00:02:00" minsize="1024B"><cyclestr>&WRF_HOME;/wrfprd/wrfout_d01_@Y-@m-@d_@H:@M:@S</cyclestr></datadep>

  <datadep age="02:00" minsize="1024b"><cyclestr>&WRF_HOME;/wrfprd/wrfout_d01_@Y-@m-@d_@H:@M:@S</cyclestr></datadep>

  <datadep age="00:02:00" minsize="1K"><cyclestr>&WRF_HOME;/wrfprd/wrfout_d01_@Y-@m-@d_@H:@M:@S</cyclestr></datadep>

All of the lines above are equivalent.

The <timedep> tag

Time dependencies are defined with the <timedep> tag. The value between the start and end of the <timedep> tag is a time in yyyymmddhhmmss format. A time dependency is satisfied when the wall clock time is equal to or greater than the time specified. All times are calculated in GMT.

  <timedep><cyclestr offset="&DEADLINE;">@Y@m@d@H@M@S</cyclestr></timedep>

The <streq> tag (version 1.2.2 and higher)

The string comparison tags can be used to create dependency switches in your XML that allow you to manually (but easily) turn parts of a workflow on/off. This can be useful in cases where there are multiple possible workflow configurations that are very similar to each other. Rather than creating and maintaining separate workflows for each mostly identical configuration, these switches can be used to define a single workflow that can then be configured for various scenarios. For example, you could turn the "GSI" task of your workflow on/off with a <streq> dependency defined in each task that runs only when the GSI is turned "on":

<?xml version="1.0"?>
<!DOCTYPE workflow
[
 ...
  <!ENTITY RUN_GSI "YES">
]>

...

<dependency>
  <streq><left>&RUN_GSI;</left><right>YES</right></streq>
</dependency>

The &RUN_GSI; ENTITY would be declared in the XML header and it acts as the switch; modifying its value controls whether the GSI component is included in the workflow configuration.

The <strneq> tag (version 1.2.2 and higher)

The string comparison tags can be used to create dependency switches in your XML that allow you to manually (but easily) turn parts of a workflow on/off. This can be useful in cases where there are multiple possible workflow configurations that are very similar to each other. Rather than creating and maintaining separate workflows for each mostly identical configuration, these switches can be used to define a single workflow that can then be configured for various scenarios. For example, you could use a <strneq> dependency to turn on the tasks that run only when the GSI is NOT turned "on":

<?xml version="1.0"?>
<!DOCTYPE workflow
[
 ...
  <!ENTITY RUN_GSI "YES">
]>

...

<dependency>
   <strneq><left>&RUN_GSI;</left><right>YES</right></strneq>
</dependency>

The &RUN_GSI; ENTITY would be declared in the XML header and it acts as the switch; modifying its value controls whether the non-GSI-only components are included in the workflow configuration.

The <true> tag (version 1.2.2 and higher)

The boolean constant tags function similarly to the string comparison tags. Their intended purpose is to act as switches that allow you to configure a workflow definition by turning portions of the workflow on/off using an ENTITY.

<?xml version="1.0"?>
<!DOCTYPE workflow
[
 ...
  <!ENTITY RUN_GSI "</true>">
]>

...

<dependency>
  &RUN_GSI;
</dependency>

The above illustrates how you could use an ENTITY assigned to <true> in a task dependency to turn a task on.

The <false> tag (version 1.2.2 and higher)

The boolean constant tags function similarly to the string comparison tags. Their intended purpose is to act as switches that allow you to configure a workflow definition by turning portions of the workflow on/off using an ENTITY.

<?xml version="1.0"?>
<!DOCTYPE workflow
[
 ...
  <!ENTITY RUN_GSI "<false></false>">
]>

...

<dependency>
  &RUN_GSI;
</dependency>

The above illustrates how you could use an ENTITY assigned to <false> in a task dependency to turn a task off.

The <sh> tag (version 1.2.2 and higher)

The <sh> tag allows you to create a dependency whose value is determined by the exit status of a UNIX shell command. If the exit status of the command is 0, the dependency evaluates to true. Otherwise, the dependency evaluates to false. This can simplify workflow dependencies for complex situations where a task's completion status may not convey enough information to decide whether or not to trigger downstream tasks. A common workaround was to augment the task such that it creates "flag" files that are then used in downstream <datadep> dependencies. The <sh> tag is a simpler alternative. By default, the contents of the <sh> tag are executed by a "/bin/sh -c" command.

<dependency>
  <sh>grep 'RUN_COUPLED=YES' /path/to/file/ocean_status</sh>
</dependency>

The above is an example of how you might use an <sh> tag in a dependency to trigger parts of a workflow whose execution depends on conditions, such as whether or not to run a model in coupled mode, that are only determined at runtime.

The shell attribute

The shell attribute of the <sh> tag is optional. It specifies the shell that is to be used to execute the command. For example, "/bin/sh", "/bin/csh", "/bin/ksh", etc. The default value is "/bin/sh".

<dependency>
  <sh shell="/bin/bash">grep 'RUN_COUPLED=YES' /path/to/file/ocean_status</sh>
</dependency>

The runopt attribute

The runopt attribute of the <sh> tag is optional. It specifies the options passed to the shell (given by the shell attribute) when it executes the command. For example, "-c", "-l", etc. The default value, which should rarely be overridden, is "-c".

<dependency>
  <sh shell="/bin/bash" runopt="-l">grep 'RUN_COUPLED=YES' /path/to/file/ocean_status</sh>
</dependency>

The <rb> tag (version 1.2.2 and higher)

Waiting for use case and example from developer
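
In the meantime, the following is a minimal, purely hypothetical sketch, assuming the <rb> tag behaves like the <sh> tag but evaluates a Ruby expression instead of a shell command, with a true result satisfying the dependency (the path is illustrative only):

<dependency>
  <rb>File.exist?("/path/to/file/ocean_status")</rb>
</dependency>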

 

The <cycleexistdep> tag (version 1.2.2 and higher)

The <cycleexistdep/> tag allows you to trigger tasks based on whether or not a particular cycle is defined (with <cycledef>) for the workflow. A common use for this dependency is to solve the problem of bootstrapping a series of cycled runs where each run depends on output from tasks of a previous cycle. For cycled runs, the first cycle will have dependencies on tasks from a cycle that doesn't exist. In previous versions of Rocoto, the only way to start such a cycled run was to use rocotoboot to manually force the first cycle to start. Now, the <cycleexistdep> tag allows Rocoto to start the runs automatically without manual intervention. The cycle_offset attribute, described below, is required.

<dependency>
  <not>
    <cycleexistdep cycle_offset="-6:00:00"/>
  </not>
</dependency>

The example above shows how you can use <cycleexistdep/> to trigger a task when the cycle six hours prior to the current one doesn't exist.

The cycle_offset attribute

The cycle_offset attribute of the <cycleexistdep/> tag is mandatory. It specifies a time offset from the current cycle in DD:HH:MM:SS format. The offset can be positive or negative.
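
For example, the dependency shown above could presumably also be written with the full DD:HH:MM:SS form of the offset (this assumes the short and full forms are interchangeable here, as they are for the age attribute of <datadep>):

<dependency>
  <not>
    <cycleexistdep cycle_offset="-00:06:00:00"/>
  </not>
</dependency>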

The boolean operator tags

There are several boolean operators that may be used to compose boolean expressions of dependencies. The operators and the operands (e.g. <taskdep>, <datadep>, <timedep>) can be combined without limit. The following lists the functions of the operators:

<and>  - Satisfied if all enclosed dependencies are satisfied
<or>   - Satisfied if at least one of the enclosed dependencies is satisfied
<not>  - Satisfied if the enclosed dependency is not satisfied
<nand> - Satisfied if at least one of the enclosed dependencies is not satisfied
<nor>  - Satisfied if none of the enclosed dependencies are satisfied
<xor>  - Satisfied if exactly one of the enclosed dependencies is satisfied
<some threshold="n">  - Satisfied if the fraction of enclosed dependencies that are satisfied exceeds the given threshold (0 <= threshold <= 1)

The example below illustrates how a complex dependency can be created. The example says: "Start this task if it is at least 25 minutes past the cycle time, and either (1) it is at least 50 minutes past the cycle time, or (2) the one hour forecast is available, or (3) it is at least 45 minutes past the cycle time and at least one of the 2, 3, 4, 5, or 6 hour forecasts is available."

  <dependency>
    <and>
        <timedep><cyclestr offset="25:00">@Y@m@d@H@M@S</cyclestr></timedep>
        <or>
          <timedep><cyclestr offset="50:00">@Y@m@d@H@M@S</cyclestr></timedep>
          <datadep age="120"><cyclestr offset="-1:00:00">&WRF_HOME;/postprd/@Y@m@d@H000001.grib</cyclestr></datadep>
          <and>
             <timedep><cyclestr offset="45:00">@Y@m@d@H@M@S</cyclestr></timedep>
             <or> 
                <datadep age="120"><cyclestr offset="-2:00:00">&WRF_HOME;/postprd/@Y@m@d@H000002.grib</cyclestr></datadep>
                <datadep age="120"><cyclestr offset="-3:00:00">&WRF_HOME;/postprd/@Y@m@d@H000003.grib</cyclestr></datadep>
                <datadep age="120"><cyclestr offset="-4:00:00">&WRF_HOME;/postprd/@Y@m@d@H000004.grib</cyclestr></datadep>
                <datadep age="120"><cyclestr offset="-5:00:00">&WRF_HOME;/postprd/@Y@m@d@H000005.grib</cyclestr></datadep>
                <datadep age="120"><cyclestr offset="-6:00:00">&WRF_HOME;/postprd/@Y@m@d@H000006.grib</cyclestr></datadep>
             </or>
          </and>
       </or> 
    </and> 
  </dependency>
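
The <some> operator does not appear in the example above. A minimal sketch (hypothetical, reusing the post-processing file names from the example) of a dependency that is satisfied once more than half, i.e. at least two, of three forecast files are available might look like this:

  <dependency>
    <some threshold="0.5">
      <datadep age="120"><cyclestr>&WRF_HOME;/postprd/@Y@m@d@H000001.grib</cyclestr></datadep>
      <datadep age="120"><cyclestr>&WRF_HOME;/postprd/@Y@m@d@H000002.grib</cyclestr></datadep>
      <datadep age="120"><cyclestr>&WRF_HOME;/postprd/@Y@m@d@H000003.grib</cyclestr></datadep>
    </some>
  </dependency>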

The <hangdependency> tag

The <hangdependency> tag is used to specify a condition which indicates that a task has hung. This is useful for situations where a system or programming error has caused a job to hang. This tag tells Rocoto how to distinguish between a job that is running normally and one that is hung. The <hangdependency> tag works the same way the <dependency> tag does. However, when a <hangdependency> is satisfied, the job associated with the task is killed. The task will be retried if the maximum retry count has not been exceeded. A common way to use this is a <datadep> with a relatively long age attribute on an output file, so that a job that has stopped writing output is detected.
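
For example, a minimal sketch along those lines (reusing the wrfout path from the <datadep> examples above; with a one-hour age, the job is declared hung if its output file has not been modified for an hour):

<hangdependency>
  <datadep age="01:00:00"><cyclestr>&WRF_HOME;/wrfprd/wrfout_d01_@Y-@m-@d_@H:@M:@S</cyclestr></datadep>
</hangdependency>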

The <metatask> tag

<Metatask> tags are used to define large groups of tasks that are very similar to each other. For example, it is often the case that the post-processing portion of a workflow contains many tasks which are nearly identical. In some cases, the only difference between these tasks is the name of the model output file that they process. Another example is ensembles. For an ensemble, the only difference between some tasks might be an environment variable that contains the ensemble member ID. Long lists of tasks that are nearly identical are difficult to maintain. The <metatask> tag allows one to create a template for a set of tasks. This template specifies the things that change, and the list of values for each of the items that vary. When Rocoto parses the XML, it expands the metatask, duplicating the task template for each value in the lists of items that change. This allows for a compact representation of a large number of nearly identical tasks, and facilitates maintenance of the workflow definition. A <metatask> tag can be used inside <workflow> tags or inside other <metatask> tags (yes, they can be nested).

The best way to explain <metatask> tags is to show an example (many details have been removed for clarity):

  <workflow>

    <metatask>

      <var name="var1">item1 item2 itemN</var>
      <var name="var2">itemA itemB itemZ</var>

      <task name="mytask_#var1#">

        <envar> 
          <name>MyVar</name>
          <value>MyValue_#var2#</value>
        </envar> 

      </task>

    </metatask>

  </workflow>

The lines above define a <metatask> that contains two metatask <var> tags and one <task> tag. Every <metatask> must contain at least one <var> tag. The <var> tags are used to define variables that can be referenced in the rest of the <metatask> using the #varname# syntax. In the example above, two metatask variables, var1 and var2, are defined. Each <var> tag contains a list of values, separated by spaces. All the <var> tags of a given <metatask> must contain the same number of values. The number of values in the <var> tags determines how many tasks are represented by the <metatask>. When Rocoto parses the XML, it expands <metatask> tags into the full set of tasks which they represent. This is done by iterating over the list of values specified in the <var> tags and creating one instance of the contents of the <metatask> for each value. For example, the <metatask> above will automatically be expanded to the following XML:

  <workflow>

     <task name="mytask_item1">

        <envar> 
           <name>MyVar</name>
           <value>MyValue_itemA</value>
        </envar> 

     </task>

     <task name="mytask_item2">

        <envar> 
           <name>MyVar</name>
           <value>MyValue_itemB</value>
        </envar> 

     </task>

     <task name="mytask_itemN">

         <envar> 
           <name>MyVar</name>
           <value>MyValue_itemZ</value>
        </envar> 

     </task>

  </workflow>

Since a <metatask> can contain other <metatask> tags, you can represent a set of tasks that vary along multiple dimensions. For example, if you wanted to represent the post-processing tasks for a 10-member ensemble, where there is one post task for each model output file, you could use something like this:

  <metatask>

      <var name="member">01 02 03 04 05 06 07 08 09 10</var>

      <metatask>

         <var name="forecast">00 03 06 09 12 15 18 21 24 27 30 33 36 39 42 45 48</var>

         <task name="post_#member#_#forecast#">

            <envar>
               <name>MEMBER_ID</name>
               <value>#member#</value>
            </envar>

            <envar>
               <name>FCST</name>
               <value>#forecast#</value>
            </envar>

         </task>

      </metatask>

  </metatask>

The above would create one task for each possible pair of values in the member and forecast <var> tags. With 10 members and 17 forecast lead times, that is 170 tasks represented by only one block of XML!

The name attribute (version 1.1 and higher)

The name attribute is optional and can be used to assign a name to a <metatask>. The name can then be referenced in <metataskdep> tags to declare dependencies on the entire contents of a <metatask>. All <metatask> name attributes must be unique.

The mode attribute (version 1.1 and higher)

The mode attribute is optional and is used to tell Rocoto if the contents of the <metatask> tag are to be run in parallel or in sequential order. The mode may either be "serial" or "parallel". The default mode is "parallel". If the mode is "serial" the <metatask> tag's immediate children will be run in sequential order. The task dependencies required to accomplish execution in sequential order will be inserted automatically. If a <metatask> contains another <metatask>, the contents of the enclosed <metatask> will not be serialized unless its mode attribute is also set to "serial".

Consider the following example, a slight variation of the one above, which illustrates the use of the name and mode attributes:

  <metatask name="posts">

      <var name="member">01 02 03 04 05 06 07 08 09 10</var>

      <metatask mode="serial">

         <var name="forecast">00 03 06 09 12 15 18 21 24 27 30 33 36 39 42 45 48</var>

         <task name="post_#member#_#forecast#">

            <envar>
               <name>MEMBER_ID</name>
               <value>#member#</value>
            </envar>

            <envar>
               <name>FCST</name>
               <value>#forecast#</value>
            </envar>

         </task>

      </metatask>

  </metatask>

  <task name="hurricane_track_plots" ..... >
   .
   .
   .
    <dependency>
      <metataskdep metatask="posts"/>
    </dependency>

  </task>

The above shows a <metatask> named "posts" which contains all the post-processing tasks for all members of an ensemble. Inside "posts" is an unnamed <metatask> that contains the post tasks for each forecast lead time for a given ensemble member. The "mode" of the inner <metatask> is set to "serial", meaning that its contents must be run sequentially instead of in parallel. The "mode" of the "posts" metatask was not specified, so its contents can run in parallel. What the above means is that, for a given ensemble member, the post tasks cannot run at the same time and must run one after the other, in order of forecast lead time. However, since there is no dependency between the post tasks of different ensemble members, the sequence of post tasks for a given member can run in parallel with the sequence of post tasks of any other member. Finally, at the bottom is a task with a <metataskdep> dependency. The <metataskdep> tag declares that the hurricane_track_plots task cannot run until all tasks contained in the "posts" <metatask> have completed successfully (i.e. all posts for all forecast lead times for all ensemble members are done successfully).