---
id: wiki-2026-0508-failable-task-handling
title: Failable Task Handling
category: 10_Wiki/Topics
status: verified
canonical_id: self
aliases: [error handling, retry, circuit breaker, exponential backoff, idempotent, saga compensation]
duplicate_of: none
source_trust_level: A
confidence_score: 0.97
verification_status: applied
tags: [reliability, error-handling, retry, circuit-breaker, idempotency, distributed-systems]
raw_sources: []
last_reinforced: 2026-05-10
github_commit: pending
tech_stack:
  language: TypeScript / Python / Go
  framework: Temporal / Polly / tenacity
---

# Failable Task Handling

## One-line summary

> **"Design every task on the expectation that it can fail."** The toolkit is retry, idempotency, timeout, circuit breaker, DLQ, and compensation. These are the basics of any distributed system. The modern approach is durable execution with Temporal or Inngest.

## Core concepts

### Strategies
- **Retry** with backoff.
- **Timeout**.
- **Idempotency**.
- **Circuit breaker** (Hystrix-style).
- **Bulkhead** (resource isolation).
- **DLQ** (dead letter queue).
- **Compensation** (saga).
- **Fallback / cached default**.

### Retry patterns
- **Constant**: the same fixed delay before every retry.
- **Linear**: N × delay before the N-th retry.
- **Exponential**: 2^N × delay before the N-th retry.
- **+ Jitter**: randomizes the delay to prevent a thundering herd.

### Applications
1. **HTTP**: retry on 5xx responses.
2. **Distributed transaction**: saga.
3. **Job queue**: retry + DLQ.
4. **LLM API**: retry on rate limits.
5. **Workflow**: Temporal durable execution.

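
The retry patterns listed above (constant, linear, exponential, plus jitter) can be compared side by side with a small delay calculator. This is a minimal sketch, not a library API: the function name `backoff_delay`, its parameters, and the choice of "full jitter" (a uniform draw between zero and the computed delay) are assumptions made for illustration.

```python
import random


def backoff_delay(strategy, attempt, base=0.1, cap=10.0, jitter=False, rng=random):
    """Return the delay in seconds before retry number `attempt` (1-based).

    Hypothetical helper: names and defaults are illustrative assumptions.
    """
    if strategy == "constant":
        delay = base                  # same delay every time
    elif strategy == "linear":
        delay = base * attempt        # N x delay
    elif strategy == "exponential":
        delay = base * 2 ** attempt   # 2^N x delay
    else:
        raise ValueError(f"unknown strategy: {strategy}")

    delay = min(delay, cap)           # never wait longer than the cap
    if jitter:
        # Full jitter: spread clients uniformly over [0, delay] so they
        # do not all retry at the same instant.
        delay = rng.uniform(0, delay)
    return delay


# Un-jittered exponential schedule for the first five retries.
schedule = [backoff_delay("exponential", n, base=0.1) for n in range(1, 6)]
```

Keeping the cap separate from the growth rule makes it easy to see that jitter only ever shortens the wait, so the cap still bounds the worst case.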
## 💻 Patterns

### Exponential backoff with jitter
```python
import random
import time


class RetryableError(Exception):
    """Marks a failure that is safe to retry."""


def retry_exponential(fn, max_attempts=5, base=0.1, max_delay=10):
    for attempt in range(max_attempts):
        try:
            return fn()
        except RetryableError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            # Exponential growth plus up to `base` seconds of jitter, capped.
            delay = min(base * 2 ** attempt + random.random() * base, max_delay)
            time.sleep(delay)
```

### tenacity (Python)
```python
import httpx
from tenacity import retry, retry_if_exception_type, stop_after_attempt, wait_exponential


@retry(
    stop=stop_after_attempt(5),
    wait=wait_exponential(multiplier=1, min=1, max=10),
    retry=retry_if_exception_type(httpx.HTTPStatusError),
)
def fetch(url):
    response = httpx.get(url)
    # httpx raises HTTPStatusError only when asked to (on 4xx and 5xx).
    response.raise_for_status()
    return response
```

### Circuit breaker
```python
import time


class CircuitOpenError(Exception):
    """Raised when a call is rejected because the circuit is open."""


class CircuitBreaker:
    def __init__(self, fail_threshold=5, reset_timeout=60):
        self.failures = 0
        self.state = 'closed'
        self.opened_at = None
        self.fail_threshold = fail_threshold
        self.reset_timeout = reset_timeout

    def call(self, fn):
        if self.state == 'open':
            if time.time() - self.opened_at > self.reset_timeout:
                self.state = 'half_open'  # let one trial call through
            else:
                raise CircuitOpenError()
        try:
            result = fn()
        except Exception:
            self.failures += 1
            # A failed trial call re-opens the circuit immediately.
            if self.state == 'half_open' or self.failures >= self.fail_threshold:
                self.state = 'open'
                self.opened_at = time.time()
            raise
        # Success closes the circuit; the counter tracks consecutive failures.
        self.state = 'closed'
        self.failures = 0
        return result
```

### Idempotency key
```typescript
async function transfer(idempotencyKey: string, amount: number) {
  // Return the stored result if this key was already processed.
  const existing = await db.idempotency.find(idempotencyKey);
  if (existing) return existing.result;

  // Note: find-then-save is not atomic; concurrent calls with the same key
  // need a unique constraint or a lock on the key to be fully safe.
  const result = await actualTransfer(amount);
  await db.idempotency.save({ key: idempotencyKey, result });
  return result;
}
```

### Timeout (Promise.race)
```typescript
class TimeoutError extends Error {}

function withTimeout<T>(promise: Promise<T>, ms: number): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, rej) => {
    timer = setTimeout(() => rej(new TimeoutError(`timed out after ${ms} ms`)), ms);
  });
  // Whichever settles first wins. The losing work is not cancelled,
  // only ignored, so clear the timer to avoid a stray rejection.
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}
```

### Bulkhead (semaphore)
```python
import asyncio


class Bulkhead:
    def __init__(self, max_concurrent=10):
        self.sem = asyncio.Semaphore(max_concurrent)

    async def call(self, coro):
        # At most `max_concurrent` coroutines run past this point at once.
        async with self.sem:
            return await coro
```

### DLQ (dead letter queue)
```python
def consume_with_dlq(queue, dlq, handler, max_retries=3):
    for msg in queue:
        for attempt in range(max_retries):
            try:
                handler(msg)
                queue.ack(msg)
                break
            except Exception as e:
                if attempt == max_retries - 1:
                    # Retries exhausted: park the message in the DLQ, then
                    # ack it so it stops blocking the main queue.
                    dlq.publish(msg, error=str(e))
                    queue.ack(msg)
                    break
```

### Saga compensation
```python
class Saga:
    def __init__(self):
        self.compensations = []

    async def execute(self, steps):
        try:
            for step, compensation in steps:
                await step()
                self.compensations.append(compensation)
        except Exception:
            # Undo the completed steps in reverse order.
            for c in reversed(self.compensations):
                try:
                    await c()
                except Exception:
                    pass  # best-effort: keep compensating the remaining steps
            raise


# Usage (inside an async function)
saga = Saga()
await saga.execute([
    (reserve_inventory, lambda: release_inventory()),
    (charge_card, lambda: refund_card()),
    (schedule_shipping, lambda: cancel_shipping()),
])
```

### Fallback (graceful degrade)
```typescript
async function getRecommendations(userId: string) {
  try {
    return await mlService.recommend(userId);
  } catch (e) {
    log.warn('ML service down, using popular fallback');
    return await getPopularItems(); // served from cache
  }
}
```

### Temporal durable workflow
```typescript
import { proxyActivities, sleep } from '@temporalio/workflow';

const { reserveInventory, chargePayment } = proxyActivities({
  startToCloseTimeout: '1m',
  retry: { maximumAttempts: 5, initialInterval: '1s', backoffCoefficient: 2 },
});

export async function orderWorkflow(orderId: string) {
  await reserveInventory(orderId);
  await chargePayment(orderId);
  await sleep('5m'); // fulfilment delay
  return 'completed';
}
```

### Polly (.NET)
```csharp
var policy = Policy
    .Handle<HttpRequestException>() // exception type is an assumption
    .WaitAndRetryAsync(5, attempt => TimeSpan.FromSeconds(Math.Pow(2, attempt)))
    .WrapAsync(Policy
        .Handle<HttpRequestException>()
        .CircuitBreakerAsync(5, TimeSpan.FromMinutes(1)));

await policy.ExecuteAsync(() => httpClient.GetAsync(url));
```

### LLM API retry (rate limit aware)
```python
import time


def llm_call(prompt, max_retries=5):
    for attempt in range(max_retries):
        try:
            return openai_client.create(prompt=prompt)
        except RateLimitError as e:
            if attempt == max_retries - 1:
                raise  # out of retries: surface the rate-limit error
            # Prefer the server's Retry-After hint; otherwise back off exponentially.
            wait = e.retry_after if e.retry_after else 2 ** attempt
            time.sleep(wait)
```

### Health check + half-open
```python
def half_open_probe(circuit):
    try:
        result = light_health_check()
        if result.ok:
            circuit.state = 'closed'
            circuit.failures = 0
    except Exception:
        pass  # probe failed: leave the circuit as it is
```

### Idempotent HTTP (Stripe-style)
```bash
curl -X POST https://api/charges \
  -H "Idempotency-Key: $(uuidgen)" \
  -d "amount=2000"
```

### Observability (per attempt)
```python
from functools import wraps


def observed_retry(fn):
    @wraps(fn)
    def wrapper(*args, **kwargs):
        max_attempts = 5
        for attempt in range(max_attempts):
            metrics.increment('attempt', {'fn': fn.__name__, 'attempt': attempt})
            try:
                return fn(*args, **kwargs)
            except Exception as e:
                metrics.increment('failure', {'fn': fn.__name__, 'attempt': attempt, 'err': type(e).__name__})
                if attempt == max_attempts - 1:
                    raise  # re-raise only once every attempt has failed
    return wrapper
```

## Decision criteria

| Situation | Pattern |
|---|---|
| HTTP 5xx | Retry + backoff + jitter |
| External dep flaky | Circuit breaker |
| Distributed transaction | Saga + compensation |
| Long workflow | Temporal / Inngest |
| Unique side effect | Idempotency key |
| Rate-limit aware | Retry-After |
| User-visible | Fallback + cache |

**Default**: retry with exponential backoff and jitter + idempotency + timeout + circuit breaker + DLQ + observability.

## 🔗 Graph
- Parent: [[Distributed-Systems]] · [[Reliability]]
- Variant: [[Circuit-Breaker]]
- Application: [[Temporal]]
- Adjacent: [[Idempotency]] · [[Bulkhead]] · [[Event-Driven-Architecture]]

## 🤖 LLM usage
**When to use**: any distributed system, external API call, or long-running workflow.
**When not to use**: deterministic in-process code.

## ❌ Anti-patterns
- **Retry without backoff**: causes a thundering herd.
- **Retry non-idempotent**: produces duplicate side effects.
- **Infinite retry**: feeds cascading load.
- **No DLQ**: messages are lost.
- **No timeout**: calls hang.
- **No circuit**: failures cascade.

## 🧪 Verification / duplicates
- Verified (Release It! Nygard, Temporal docs, Polly docs).
- Trust level: A.

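
Two of the anti-patterns listed above, "Infinite retry" and "No timeout", can be avoided together by giving a retry loop both an attempt limit and an overall time budget. The following is a minimal sketch under stated assumptions: `retry_with_deadline`, `DeadlineExceeded`, and the injectable `sleep` and `clock` parameters are hypothetical names introduced here for illustration and testability, not part of any library.

```python
import random
import time


class DeadlineExceeded(Exception):
    """Raised when the overall time budget would be exceeded (hypothetical name)."""


def retry_with_deadline(fn, *, max_attempts=5, base=0.1, cap=2.0, deadline=5.0,
                        sleep=time.sleep, clock=time.monotonic):
    """Retry `fn` with jittered exponential backoff, bounded two ways:
    by the number of attempts and by a total time budget in seconds."""
    start = clock()
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except Exception as exc:
            if attempt == max_attempts:
                raise  # attempt limit reached: never retry forever
            # Full jitter over an exponentially growing, capped window.
            delay = random.uniform(0, min(cap, base * 2 ** attempt))
            if clock() - start + delay > deadline:
                # Sleeping again would blow the budget, so stop now.
                raise DeadlineExceeded(f"gave up after {attempt} attempts") from exc
            sleep(delay)
```

Checking the budget before sleeping, rather than after, means the caller never waits for a retry that was already doomed to miss the deadline.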
## 🕓 Changelog

| Date | Change |
|---|---|
| 2026-04-20 | Auto-reinforced |
| 2026-05-08 | Phase 1 |
| 2026-05-10 | Manual cleanup: patterns plus retry / circuit / saga / Temporal / DLQ code |