
Auto-Restart Rules

Auto-Restart Rules enable automatic recovery from transient job failures. Configure pattern-based rules to restart jobs automatically when specific errors occur, with built-in protection against restart loops.


Overview

Why Auto-Restart?

Many job failures are transient and resolve themselves:

  • Network timeouts: Temporary connectivity issues

  • Deadlocks: Database contention that clears

  • Service unavailability: External services temporarily down

  • Resource constraints: Temporary memory or CPU pressure

Auto-restart rules recover from these failures without manual intervention.

How It Works

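  1. A job fails with an error message.

  2. The error message is matched against the patterns of the enabled rules.

  3. If a rule matches, a restart is scheduled after the configured delay.

  4. The process repeats until the job succeeds or the maximum number of restart attempts is reached.

  5. Once the maximum is reached, cooldown activates (if it is enabled for the rule).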
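The loop above can be sketched in a few lines of Python. This is an illustration only, not the product's implementation: the function name `simulate`, the plain substring match, and the choice to enter cooldown on the first failure after the last allowed restart are all assumptions.

```python
from typing import List, Optional


def simulate(outcomes: List[Optional[str]], pattern: str,
             max_attempts: int = 3, delay_s: int = 60) -> List[str]:
    """Replay a series of job runs and log what the rule would do.

    outcomes[i] is the error message of run i, or None if that run succeeded.
    """
    log: List[str] = []
    restarts = 0
    for error in outcomes:
        if error is None:
            log.append("success")
            break
        # Assumption: a simple case-insensitive "contains" test stands in
        # for the real pattern matching.
        if pattern.lower() not in error.lower():
            log.append("no matching rule")
            break
        if restarts >= max_attempts:
            # All allowed restarts are used up: stop and enter cooldown.
            log.append("cooldown")
            break
        restarts += 1
        log.append(f"restart {restarts} after {delay_s}s")
    return log


print(simulate(["timeout", "timeout", None], "timeout"))
print(simulate(["timeout"] * 4, "timeout"))
```

The first call ends in a success after two restarts; the second uses all three restarts and then goes into cooldown.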


Creating Restart Rules

Step 1: Open Restart Rules

  1. Search for Auto-Restart Rules

  2. Or from Job Queue Admin Setup: Actions > Navigate > Auto-Restart Rules


Step 2: Create New Rule

  1. Click New to create a new rule

  2. Fill in the required fields:

| Field | Value | Example |
|---|---|---|
| Rule Name | Descriptive name | "Network Timeout Recovery" |
| Error Message Pattern | Pattern to match | timeout |
| Enabled | Yes | Yes |
| Priority | Processing order | 100 |

Step 3: Configure Restart Behavior

Set how restarts should behave:

| Field | Description | Recommended |
|---|---|---|
| Maximum Restart Attempts | Maximum restarts before cooldown | 3 |
| Restart Delay (Seconds) | Initial delay before restart | 60 |
| Use Exponential Backoff | Increase delay each attempt | Yes |

Step 4: Configure Cooldown

Set cooldown behavior after max attempts:

| Field | Description | Recommended |
|---|---|---|
| Enable Cooldown After Repeated Failures | Enable cooldown protection | Yes |
| Cooldown Duration (Minutes) | How long to pause restarts | 30 |

Step 5: Set Active Schedule (Optional)

Limit when the rule is active:

| Field | Description |
|---|---|
| Active On Weekdays | Rule active Monday-Friday |
| Active On Weekends | Rule active Saturday-Sunday |
| Active From Time | Start of active window |
| Active To Time | End of active window |

Rule Fields Reference

Identification

| Field | Description | Required |
|---|---|---|
| Entry No. | Auto-assigned unique ID | Auto |
| Rule Name | Unique rule name | Yes |
| Enabled | Active/inactive toggle | Yes |
| Priority | Processing order (lower = first) | Yes |

Pattern Matching

| Field | Description | Required |
|---|---|---|
| Error Message Pattern | Pattern to match in error message | No |
| Do NOT Restart If Error Contains | Pattern to exclude from matching | No |
| Job Queue Category Code | Limit to specific job category | No |
| Company Filter | Limit to specific companies (comma-separated) | No |

Restart Settings

| Field | Description | Default |
|---|---|---|
| Maximum Restart Attempts | Maximum restart tries | 3 |
| Restart Delay (Seconds) | Initial delay in seconds | 60 |
| Use Exponential Backoff | Double delay each attempt | No |

Cooldown Settings

| Field | Description | Default |
|---|---|---|
| Enable Cooldown After Repeated Failures | Enable cooldown after max attempts | No |
| Cooldown Duration (Minutes) | Cooldown period length | 60 |
| Cooldown Active Until | Current cooldown end time (read-only) | System |

Schedule Settings

| Field | Description | Default |
|---|---|---|
| Active On Weekdays | Monday-Friday | Yes |
| Active On Weekends | Saturday-Sunday | Yes |
| Active From Time | Daily start time | 00:00 |
| Active To Time | Daily end time | 00:00 |
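For readers who think in code, the fields and defaults listed in this reference can be gathered into a single record. The sketch below is only a reading aid: the class name `RestartRule` and the Python attribute names are hypothetical, and the read-only fields (Entry No., Cooldown Active Until) are left out.

```python
from dataclasses import dataclass
from datetime import time


@dataclass
class RestartRule:
    # Required fields (no default is documented for these).
    rule_name: str
    priority: int                            # lower number = processed first
    enabled: bool
    # Pattern matching (all optional).
    error_message_pattern: str = ""
    do_not_restart_if_error_contains: str = ""
    job_queue_category_code: str = ""
    company_filter: str = ""                 # comma-separated company names
    # Restart settings.
    max_restart_attempts: int = 3
    restart_delay_seconds: int = 60
    use_exponential_backoff: bool = False
    # Cooldown settings.
    enable_cooldown: bool = False
    cooldown_duration_minutes: int = 60
    # Schedule settings.
    active_on_weekdays: bool = True
    active_on_weekends: bool = True
    active_from_time: time = time(0, 0)
    active_to_time: time = time(0, 0)


# The rule from the step-by-step guide, with its recommended values.
rule = RestartRule(
    rule_name="Network Timeout Recovery",
    priority=100,
    enabled=True,
    error_message_pattern="timeout",
    use_exponential_backoff=True,
    enable_cooldown=True,
    cooldown_duration_minutes=30,
)
print(rule)
```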

Pattern Matching

Wildcard Support

Error patterns support wildcards:

| Pattern | Matches |
|---|---|
| *timeout* | Any error containing "timeout" |
| *connection* | Any error containing "connection" |
| deadlock | Errors containing "deadlock" |
| * | All errors (use with caution) |

Case Sensitivity

  • Pattern matching is case-insensitive

  • *Timeout* matches "timeout", "TIMEOUT", "TimeOut"
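To try a pattern before enabling a rule, the matching behavior described above can be approximated in Python. This is a sketch under assumptions, not the product's matcher: the function name `pattern_matches` is hypothetical, `*` is taken to be the only wildcard, and a pattern without any `*` is treated as a "contains" test (as the "deadlock" row suggests).

```python
import re


def pattern_matches(pattern: str, message: str) -> bool:
    """Case-insensitive match where '*' stands for any run of characters."""
    if "*" not in pattern:
        # Assumption: a bare word behaves like *word*.
        pattern = f"*{pattern}*"
    # Escape every literal piece and join the pieces with ".*".
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.fullmatch(regex, message, flags=re.IGNORECASE | re.DOTALL) is not None


for pattern, message in [
    ("*Timeout*", "Connection TIMEOUT after 30s"),
    ("deadlock", "Transaction deadlock detected"),
    ("*connection*reset*", "Connection was reset by peer"),
    ("*timeout*", "Disk full"),
]:
    print(pattern, "|", message, "|", pattern_matches(pattern, message))
```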

Exclude Patterns

Use exclude patterns to prevent matching specific errors:

| Error Message Pattern | Do NOT Restart If Error Contains | Result |
|---|---|---|
| *timeout* | *permanent* | Matches "connection timeout" but not "permanent timeout" |
| *error* | *fatal* | Matches "recoverable error" but not "fatal error" |
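The include/exclude combination can be checked with a short Python sketch. The names `wildcard_match` and `should_restart` are hypothetical, and the standard-library `fnmatch` module is used as a stand-in for the product's matcher (it also gives `?` and `[...]` a special meaning, which the rules may not).

```python
from fnmatch import fnmatchcase


def wildcard_match(pattern: str, message: str) -> bool:
    # Lower-casing both sides makes the comparison case-insensitive.
    return fnmatchcase(message.lower(), pattern.lower())


def should_restart(message: str, include: str, exclude: str = "") -> bool:
    """True when the error matches the include pattern and not the exclude pattern."""
    if not wildcard_match(include, message):
        return False
    if exclude and wildcard_match(exclude, message):
        return False
    return True


print(should_restart("connection timeout", "*timeout*", "*permanent*"))
print(should_restart("permanent timeout", "*timeout*", "*permanent*"))
```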

Example Patterns

| Use Case | Error Message Pattern | Do NOT Restart If Error Contains |
|---|---|---|
| Network issues | *timeout* | (none) |
| Database locks | *deadlock* | (none) |
| Service unavailable | *503* | (none) |
| All except critical | * | *fatal* |
| Connection drops | *connection*reset* | (none) |


Exponential Backoff

How It Works

With exponential backoff enabled, each restart attempt waits longer:

| Attempt | Base Delay | Multiplier | Actual Delay |
|---|---|---|---|
| 1 | 60s | 2^0 = 1 | 60 seconds |
| 2 | 60s | 2^1 = 2 | 120 seconds |
| 3 | 60s | 2^2 = 4 | 240 seconds |
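The table follows the formula delay = base delay × 2^(attempt - 1). A minimal Python sketch of that calculation, with the hypothetical function name `restart_delay`, reproduces the three rows:

```python
def restart_delay(base_delay_s: int, attempt: int, exponential: bool = True) -> int:
    """Delay in seconds before the given restart attempt (attempts start at 1)."""
    if attempt < 1:
        raise ValueError("attempt must be 1 or greater")
    if not exponential:
        # Without backoff every attempt waits for the base delay.
        return base_delay_s
    return base_delay_s * 2 ** (attempt - 1)


for attempt in (1, 2, 3):
    print(attempt, restart_delay(60, attempt))
```

With the recommended settings (3 attempts, 60-second base delay) the three restarts add up to 420 seconds of waiting in total.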

When to Use

Enable for:

  • External service calls (give service time to recover)

  • Database operations (let contention clear)

  • Resource-intensive jobs (let system recover)

Disable for:

  • Quick retries needed

  • Time-sensitive jobs

  • Known short-duration issues


Cooldown Protection

Purpose

Cooldown prevents restart loops when errors persist:

  • Stops wasting resources on unrecoverable errors

  • Prevents notification storms

  • Gives time for manual investigation

Triggering Cooldown

Cooldown activates when a job reaches the rule's Maximum Restart Attempts, provided Enable Cooldown After Repeated Failures is switched on for that rule.

Cooldown Period

During cooldown:

  • No automatic restarts for matching errors

  • Notifications are still sent when jobs fail

  • Manual restart still possible

  • Cooldown ends automatically after duration

Clearing Cooldown

Automatic: Cooldown expires after configured duration

Manual: Use Clear Cooldown action for a selected rule, or Clear All Cooldowns to reset all rules
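The cooldown lifecycle (trigger, pause, automatic expiry, manual clear) can be modeled as a small state holder. This is a sketch under assumptions: the class `RuleCooldown` and its methods are hypothetical, and the cooldown is assumed to end exactly at the stored end time.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class RuleCooldown:
    max_attempts: int = 3
    cooldown_minutes: int = 60
    cooldown_enabled: bool = True
    active_until: Optional[datetime] = None   # mirrors "Cooldown Active Until"

    def in_cooldown(self, now: datetime) -> bool:
        return self.active_until is not None and now < self.active_until

    def allow_restart(self, restarts_done: int, now: datetime) -> bool:
        """Decide whether a failed job may be restarted automatically."""
        if self.in_cooldown(now):
            return False
        if restarts_done >= self.max_attempts:
            if self.cooldown_enabled:
                # Max attempts reached: start the cooldown period.
                self.active_until = now + timedelta(minutes=self.cooldown_minutes)
            return False
        return True

    def clear(self) -> None:
        """Equivalent of the manual Clear Cooldown action."""
        self.active_until = None


start = datetime(2026, 2, 9, 10, 0)
cooldown = RuleCooldown(max_attempts=3, cooldown_minutes=30)
print(cooldown.allow_restart(2, start))   # still below the limit
print(cooldown.allow_restart(3, start))   # limit reached, cooldown starts
print(cooldown.active_until)
```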



Rule Priority

How Priority Works

When multiple rules match an error:

  1. Rules are sorted by Priority in ascending order.

  2. The first matching rule is used; later matches are ignored.

  3. A lower number therefore means a higher priority.

Priority Strategy

| Priority | Rule Type | Example |
|---|---|---|
| 10-49 | Specific patterns | "Specific API timeout" |
| 50-99 | General patterns | "Any timeout" |
| 100+ | Catch-all rules | "All errors" |

Example Priority Setup

| Priority | Rule Name | Pattern |
|---|---|---|
| 10 | Salesforce API | *salesforce*timeout* |
| 20 | Payment Gateway | *payment*unavailable* |
| 50 | Any Timeout | *timeout* |
| 100 | All Errors | * |
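To see which rule wins for a given error, the first-match selection can be sketched in Python using the example setup above. The function name `select_rule` and the tuple layout are hypothetical, and `fnmatch` stands in for the product's wildcard matching.

```python
from fnmatch import fnmatchcase
from typing import List, Optional, Tuple

# (priority, rule name, error message pattern), deliberately out of order.
RULES: List[Tuple[int, str, str]] = [
    (100, "All Errors", "*"),
    (10, "Salesforce API", "*salesforce*timeout*"),
    (50, "Any Timeout", "*timeout*"),
    (20, "Payment Gateway", "*payment*unavailable*"),
]


def select_rule(rules: List[Tuple[int, str, str]], error: str) -> Optional[str]:
    """Return the name of the first matching rule in ascending priority order."""
    for _priority, name, pattern in sorted(rules, key=lambda rule: rule[0]):
        if fnmatchcase(error.lower(), pattern.lower()):
            return name
    return None


for error in ("Salesforce request timeout", "FTP read timeout", "Disk full"):
    print(error, "->", select_rule(RULES, error))
```

A Salesforce timeout also matches "Any Timeout" and "All Errors", but the rule with priority 10 is evaluated first and wins.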


Schedule Configuration

Active Hours

Limit when restarts occur:

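| Scenario | Weekdays | Weekends | From | To |
|---|---|---|---|---|
| Business hours only | Yes | No | 08:00 | 18:00 |
| 24/7 | Yes | Yes | 00:00 | 00:00 |
| Nights only | Yes | Yes | 22:00 | 06:00 |
| Weekends only | No | Yes | 00:00 | 00:00 |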

Overnight Windows

For overnight windows (e.g., 22:00 to 06:00):

  • Set Active From Time = 22:00

  • Set Active To Time = 06:00

  • System detects overnight span automatically
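One way to picture the overnight detection is a check that treats a window whose start is later than its end as wrapping past midnight. The sketch below rests on several assumptions: the function name `rule_is_active` is hypothetical, the weekday/weekend flag is checked against the calendar day of the current time, the end time is exclusive, and equal From and To times mean the whole day (as in the 24/7 row).

```python
from datetime import datetime, time


def rule_is_active(now: datetime, weekdays: bool, weekends: bool,
                   active_from: time, active_to: time) -> bool:
    """Return True if a rule with this schedule is active at `now`."""
    is_weekend = now.weekday() >= 5          # 5 = Saturday, 6 = Sunday
    if is_weekend and not weekends:
        return False
    if not is_weekend and not weekdays:
        return False
    current = now.time()
    if active_from == active_to:
        return True                          # e.g. 00:00 to 00:00 = all day
    if active_from < active_to:
        return active_from <= current < active_to
    # Start later than end: the window spans midnight.
    return current >= active_from or current < active_to


monday_night = datetime(2026, 2, 9, 23, 0)
monday_noon = datetime(2026, 2, 9, 12, 0)
print(rule_is_active(monday_night, True, True, time(22, 0), time(6, 0)))
print(rule_is_active(monday_noon, True, True, time(22, 0), time(6, 0)))
```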


Best Practices

Pattern Design

  1. Be specific: Narrow patterns reduce false matches

  2. Use exclude patterns: Prevent matching critical errors

  3. Test patterns: Verify matching before enabling

Restart Settings

  1. Start conservative: 3 attempts, 60-second delay

  2. Enable backoff: For external service calls

  3. Monitor results: Adjust based on success rate

Cooldown Configuration

  1. Enable for persistent issues: Prevents restart loops

  2. 60 minutes default: Usually sufficient for investigation

  3. Clear manually if needed: Use Clear Cooldown action

Rule Organization

  1. Specific before general: Use priority effectively

  2. Use clear rule names: Include the error type in the name

  3. Review periodically: Remove obsolete rules


Monitoring & Troubleshooting

View Active Cooldowns

  1. Open Auto-Restart Rules

  2. Check Cooldown Active Until column

  3. Rules in cooldown show a date and time in the future

View Restart History

  1. Open Notification Log

  2. Filter by Trigger Event = "OnAutoRestart"

  3. See all auto-restart notifications

Common Issues

| Issue | Cause | Solution |
|---|---|---|
| Rule not matching | Pattern too specific | Widen pattern or check spelling |
| Too many restarts | Pattern too broad | Narrow pattern or add exclude |
| Cooldown not clearing | Duration too long | Reduce duration or clear manually |
| Wrong rule matching | Priority incorrect | Adjust priorities |

Next Steps: Configure Maintenance Windows to suppress restarts during planned maintenance, or set up Digest Schedules for summary reports.