Version v1.0 · dotnet

Reliability & Execution Model

Lease-based claiming, run lifecycle, retries, and durability guarantees.

DurableStack executes runs through a durable store-backed lifecycle with lease-based ownership.

The goal is predictable behavior during normal operation, transient failures, and process interruptions.

Run lifecycle

A run typically moves through:

If a failure is retry-eligible, the run is rescheduled as pending with a future retry time.

On each pass of the processing loop:

1. The worker claims due runs under a lease (ClaimDueRunsAsync).
2. The worker emits claimed and started events.
3. The job executes through the configured runner.
4. On success, the run is marked succeeded.
5. On exception, retry eligibility is evaluated and the run is marked failed, either with a retry scheduled or as a final failure.
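
The steps above can be sketched as a single pass of a worker loop. This is a minimal illustration, not DurableStack's implementation: only the name ClaimDueRunsAsync comes from the text, while its parameters, the IRunStore interface, the ClaimedRun record, the in-memory store, and the retryDelayForAttempt callback are all assumptions made for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical shapes for illustration; not DurableStack's actual types.
public sealed record ClaimedRun(string RunId, int Attempt);

public interface IRunStore
{
    // Method name taken from the text; the parameter list is assumed.
    Task<IReadOnlyList<ClaimedRun>> ClaimDueRunsAsync(
        string workerId, TimeSpan leaseDuration, int maxRuns, CancellationToken ct);
    Task MarkSucceededAsync(string runId, CancellationToken ct);
    // retryAt == null means a final failure with no retry scheduled.
    Task MarkFailedAsync(string runId, Exception error, DateTimeOffset? retryAt, CancellationToken ct);
}

// In-memory stand-in so the sketch runs without a database.
public sealed class InMemoryRunStore : IRunStore
{
    private readonly Queue<ClaimedRun> _due = new();
    public List<string> Log { get; } = new();

    public void Enqueue(ClaimedRun run) => _due.Enqueue(run);

    public Task<IReadOnlyList<ClaimedRun>> ClaimDueRunsAsync(
        string workerId, TimeSpan leaseDuration, int maxRuns, CancellationToken ct)
    {
        var claimed = new List<ClaimedRun>();
        while (claimed.Count < maxRuns && _due.Count > 0) claimed.Add(_due.Dequeue());
        return Task.FromResult<IReadOnlyList<ClaimedRun>>(claimed);
    }

    public Task MarkSucceededAsync(string runId, CancellationToken ct)
    {
        Log.Add($"{runId}:succeeded");
        return Task.CompletedTask;
    }

    public Task MarkFailedAsync(string runId, Exception error, DateTimeOffset? retryAt, CancellationToken ct)
    {
        Log.Add(retryAt.HasValue ? $"{runId}:failed+retry" : $"{runId}:failed");
        return Task.CompletedTask;
    }
}

public static class WorkerLoop
{
    // One pass: claim, emit events, execute, then record the outcome.
    public static async Task<int> ProcessOnceAsync(
        IRunStore store,
        Func<ClaimedRun, CancellationToken, Task> runner,
        Func<int, TimeSpan?> retryDelayForAttempt, // returns null when not retry-eligible
        Action<string, ClaimedRun> emitEvent,
        CancellationToken ct)
    {
        // Worker id, lease duration, and batch size are illustrative values.
        var runs = await store.ClaimDueRunsAsync("worker-1", TimeSpan.FromSeconds(30), 10, ct);
        foreach (var run in runs)
        {
            emitEvent("claimed", run);
            emitEvent("started", run);
            try
            {
                await runner(run, ct);
                await store.MarkSucceededAsync(run.RunId, ct);
            }
            catch (Exception ex)
            {
                TimeSpan? delay = retryDelayForAttempt(run.Attempt);
                DateTimeOffset? retryAt = delay.HasValue
                    ? DateTimeOffset.UtcNow + delay.Value
                    : (DateTimeOffset?)null;
                await store.MarkFailedAsync(run.RunId, ex, retryAt, ct);
            }
        }
        return runs.Count;
    }
}
```

Passing the retry decision in as a callback keeps the loop independent of any particular eligibility rule or delay formula; a real worker would also run a lease heartbeat alongside the runner call.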

Retry eligibility is based on attempt count:

Delay calculation uses:

During execution, DurableStack extends the lease periodically.

Heartbeat extension interval is half the lease duration (minimum 250ms).
If a worker dies or stops extending the lease, the run becomes reclaimable once the lease expires.

This enables automatic recovery from worker interruption.
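
The heartbeat rule can be illustrated with a small sketch. The interval calculation (half the lease, with a 250 ms floor) follows the text; the LeaseHeartbeat class, the extendLease callback, and the IsReclaimable helper are hypothetical names introduced for the example and are not part of DurableStack's API.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class LeaseHeartbeat
{
    private static readonly TimeSpan MinInterval = TimeSpan.FromMilliseconds(250);

    // Half the lease duration, but never less than 250 ms.
    public static TimeSpan IntervalFor(TimeSpan leaseDuration)
    {
        var half = TimeSpan.FromTicks(leaseDuration.Ticks / 2);
        return half < MinInterval ? MinInterval : half;
    }

    // Hypothetical helper: calls extendLease once per interval until cancelled.
    public static async Task RunAsync(
        TimeSpan leaseDuration, Func<CancellationToken, Task> extendLease, CancellationToken ct)
    {
        var interval = IntervalFor(leaseDuration);
        try
        {
            while (true)
            {
                await Task.Delay(interval, ct);
                await extendLease(ct);
            }
        }
        catch (OperationCanceledException) when (ct.IsCancellationRequested)
        {
            // The job finished or the worker is shutting down; stop extending.
        }
    }

    // A lease that is no longer extended becomes reclaimable once it has expired.
    public static bool IsReclaimable(DateTimeOffset leaseExpiresAt, DateTimeOffset now)
        => now >= leaseExpiresAt;
}
```

With a 30-second lease this extends every 15 seconds, so a single missed heartbeat still leaves time for another attempt before the lease lapses. A worker would typically start RunAsync next to the job and cancel the token when the job completes.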

Claiming is implemented with provider-specific concurrency primitives:

DurableStack is designed for effectively-once processing in normal operation with durable retry behavior.

Because a run can be re-attempted after a failure or after its lease expires, a handler may execute more than once for the same run, so handlers should be idempotent.
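
One common way to make a handler idempotent is to record a key for each unit of work and skip the side effect when that key has already been seen. The sketch below shows the idea with an in-memory set; the IdempotentHandler class and the key format are assumptions for illustration, not something DurableStack provides.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;

// Hypothetical handler; the key scheme and in-memory set are assumptions.
public sealed class IdempotentHandler
{
    private readonly ConcurrentDictionary<string, bool> _processed = new();
    private int _sideEffects;

    // Number of times the side effect actually ran.
    public int SideEffects => Volatile.Read(ref _sideEffects);

    // Returns true if the work was performed, false if this was a repeat attempt.
    public bool Handle(string idempotencyKey)
    {
        // TryAdd succeeds only for the first caller with a given key.
        if (!_processed.TryAdd(idempotencyKey, true)) return false;

        // Stand-in for the real side effect (charge a card, send an email, ...).
        Interlocked.Increment(ref _sideEffects);
        return true;
    }
}
```

In a real handler the key would live in durable storage and be written in the same transaction as the side effect, so that a crash between the two cannot leave them out of step.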