Recovery & Failover

From Obsidian Scheduler
Revision as of 18:14, 21 February 2011 by Craig

Espresso keeps your scheduled jobs running without interruption by providing several built-in recovery mechanisms:

  • Support for multiple concurrent hosts
  • Instance outage recovery
  • Configurable recovery options by job
  • Resubmission of abnormally terminating jobs

Multiple Concurrent Hosts

Out of the box, Espresso can run on as many hosts as you are licensed for. As long as at least one host is running and the jobs are not constrained to specific hosts, a server failure will not prevent jobs from running on schedule. Any jobs that were running when a server failed are recovered as described in Instance Outage Recovery. No special configuration is required to run multiple hosts: just start up a node and it joins the available service pool.
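
The rule above (a job can still run as long as some running host is allowed to execute it) can be illustrated with a minimal sketch. This is not Espresso's API: the `Job` class, `eligible_hosts`, and `can_run_on_schedule` are hypothetical names used only to show the reasoning, and the sketch assumes an empty host constraint means "any host".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Job:
    """Hypothetical job record; not part of Espresso."""
    name: str
    # Assumption: an empty set means the job is not constrained to specific hosts.
    allowed_hosts: frozenset = frozenset()


def eligible_hosts(job, running_hosts):
    """Return the running hosts that may execute the job, sorted by name."""
    if not job.allowed_hosts:
        return sorted(running_hosts)
    return sorted(set(running_hosts) & job.allowed_hosts)


def can_run_on_schedule(job, running_hosts):
    """A job can run if at least one running host is eligible for it."""
    return bool(eligible_hosts(job, running_hosts))
```

For example, an unconstrained job remains runnable when only one of several hosts is still up, whereas a job constrained to a single failed host is not runnable until that host returns.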

Instance Outage Recovery

When a server fails, any jobs it was running cannot complete normally. The remaining hosts detect that those jobs have had no activity and mark them as Died. This allows other hosts to run each affected job at its subsequent scheduled times. Because no special configuration is required to run multiple hosts, once the cause of the failure is resolved you can simply start the server again and it rejoins the pool. An instance reuses its licence as long as that licence has not been claimed by another host.
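
One way to picture the detection step is an inactivity check run by each surviving host. The sketch below is an assumption about how such a check could work, not a description of Espresso's internals: the `RunningJob` record, the `mark_died` function, and the `inactivity_limit` parameter are all hypothetical, and the page does not state how long a job must be inactive before it is marked as Died.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class RunningJob:
    """Hypothetical record of a job execution; not part of Espresso."""
    name: str
    host: str
    last_activity: datetime
    status: str = "Running"


def mark_died(jobs, now, inactivity_limit):
    """Mark running jobs with no recent activity as Died.

    Returns the names of the jobs whose status changed.
    The inactivity_limit threshold is an assumed, illustrative parameter.
    """
    died = []
    for job in jobs:
        inactive_for = now - job.last_activity
        if job.status == "Running" and inactive_for > inactivity_limit:
            job.status = "Died"
            died.append(job.name)
    return died
```

A surviving host could call `mark_died(jobs, datetime.now(), timedelta(minutes=5))` periodically; jobs already marked as Died are skipped on later passes, so the check is safe to repeat.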