2nd/10_Wiki/Topics/Architecture/API Gateway.md

---
id: wiki-2026-0508-api-gateway
title: API Gateway
category: 10_Wiki/Topics
status: verified
canonical_id: self
aliases: [API GW, Gateway pattern, Edge service]
duplicate_of: none
source_trust_level: A
confidence_score: 0.9
verification_status: applied
tags: [architecture, microservices, api, gateway, edge]
raw_sources: []
last_reinforced: 2026-05-10
github_commit: pending
tech_stack:
  language: yaml
  framework: Kong, AWS API Gateway, Envoy
---

# API Gateway

## One-liner
> **"Single entry point: fan-out, auth, rate limit."** The client-facing facade for microservices. The lineage runs from Netflix Zuul (2013) through Kong (2015) and Envoy (2016) / Istio (2017) to AWS API Gateway HTTP API (2019). As of 2026, the modern stack is Envoy with an xDS control plane, plus edge AI inference gateways (LiteLLM, Portkey).

## Core

### Responsibilities
- **Routing**: path/host/header → upstream service.
- **Auth/AuthZ**: JWT validation, OAuth2 introspection, mTLS termination.
- **Rate limiting**: per-key, per-IP, sliding window.
- **Observability**: trace propagation (W3C Trace Context), metrics, access log.
- **Transformation**: request/response shaping, protocol translation (REST↔gRPC).
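To make the per-key sliding-window idea from the list above concrete, here is a minimal in-memory sketch. The class name `SlidingWindowLimiter`, the key `"api-key-1"`, and the limit values are illustrative assumptions; a real gateway would usually keep this state in a shared store rather than process memory.

```python
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most `limit` requests per key within the last `window_s` seconds."""

    def __init__(self, limit: int, window_s: float) -> None:
        self.limit = limit
        self.window_s = window_s
        self._hits = defaultdict(deque)  # key -> timestamps of accepted requests

    def allow(self, key: str, now: float) -> bool:
        hits = self._hits[key]
        # Drop timestamps that have fallen out of the window.
        while hits and hits[0] <= now - self.window_s:
            hits.popleft()
        if len(hits) >= self.limit:
            return False
        hits.append(now)
        return True


# Hypothetical usage: 2 requests per 60 s for one API key.
limiter = SlidingWindowLimiter(limit=2, window_s=60.0)
decisions = [limiter.allow("api-key-1", t) for t in (0.0, 1.0, 2.0, 61.5)]
print(decisions)
```

Passing `now` explicitly keeps the limiter deterministic under test; production code would likely read a monotonic clock instead.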

### Non-responsibilities
- Business logic: owned by the services.
- Data persistence: the edge stays stateless.
- Heavy aggregation: belongs in a BFF (Backend-for-Frontend) layer.

### Applications
1. **Public API edge**: Stripe- or Twilio-style SaaS APIs.
2. **BFF per client**: a different response shape for each of mobile, web, and CLI.
3. **LLM gateway**: multi-provider routing (Claude, GPT, local), fallback, cost cap.
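The fallback and cost-cap behaviour of an LLM gateway can be sketched without any provider SDK. Everything here is hypothetical: `Target`, `FallbackRouter`, the flat `cost_per_call`, and the stand-in provider functions are illustration only, not the API of any real gateway.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Target:
    name: str                    # hypothetical provider label
    cost_per_call: float         # assumed flat cost, for illustration only
    call: Callable[[str], str]   # stand-in for a real provider client


class BudgetExceeded(Exception):
    """Raised when no target can be tried without exceeding the cost cap."""


class FallbackRouter:
    def __init__(self, targets: List[Target], cost_cap: float) -> None:
        self.targets = targets
        self.cost_cap = cost_cap
        self.spent = 0.0

    def complete(self, prompt: str) -> Tuple[str, str]:
        """Try targets in order and return (target name, response)."""
        last_error: Optional[Exception] = None
        for target in self.targets:
            if self.spent + target.cost_per_call > self.cost_cap:
                continue  # skip targets that would exceed the cap
            try:
                response = target.call(prompt)
            except Exception as exc:  # a real gateway would match specific errors
                last_error = exc
                continue
            self.spent += target.cost_per_call  # only successful calls are charged
            return target.name, response
        if last_error is not None:
            raise last_error
        raise BudgetExceeded(f"cost cap {self.cost_cap} reached")


def failing(prompt: str) -> str:
    raise TimeoutError("primary provider timed out")


def echo(prompt: str) -> str:
    return f"echo: {prompt}"


router = FallbackRouter(
    [Target("primary", 0.03, failing), Target("secondary", 0.01, echo)],
    cost_cap=0.05,
)
result = router.complete("hi")
print(result, router.spent)
```

Charging only successful calls is a simplifying assumption; providers may bill for partial or failed requests.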

## 💻 Patterns

### Kong declarative config
```yaml
_format_version: "3.0"
services:
  - name: orders-api
    url: http://orders.svc.cluster.local:8080
    routes:
      - name: orders-route
        paths: ["/api/orders"]
        strip_path: false
    plugins:
      - name: rate-limiting
        # policy: redis also requires Redis connection settings (omitted here)
        config: { minute: 600, policy: redis }
      - name: jwt
        config: { key_claim_name: kid }
      - name: prometheus
```

### Envoy route config
```yaml
route_config:
  virtual_hosts:
    - name: api
      domains: ["api.example.com"]
      routes:
        - match: { prefix: "/v1/orders" }
          route:
            cluster: orders_cluster
            timeout: 5s
            retry_policy:
              retry_on: 5xx,reset,connect-failure
              num_retries: 2
              per_try_timeout: 1s
```
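The interplay of `timeout`, `per_try_timeout`, and `num_retries` is easier to reason about with numbers. The sketch below ignores retry backoff and assumes the route-level timeout covers all attempts; the helper names are made up for this note.

```python
def worst_case_latency_s(timeout_s: float, per_try_timeout_s: float, num_retries: int) -> float:
    """Upper bound on time spent on one request, ignoring retry backoff."""
    attempts = 1 + num_retries
    return min(timeout_s, attempts * per_try_timeout_s)


def retry_amplification(num_retries: int, layers: int) -> int:
    """Worst-case upstream attempts when every layer in a call chain retries."""
    return (1 + num_retries) ** layers


# Values taken from the route config above: 5s timeout, 1s per try, 2 retries.
latency = worst_case_latency_s(timeout_s=5.0, per_try_timeout_s=1.0, num_retries=2)
# Hypothetical chain of 3 hops that each retry twice.
amplification = retry_amplification(num_retries=2, layers=3)
print(latency, amplification)
```

With these values a request spends at most about 3 seconds in attempts, well under the 5 second route timeout. The amplification figure is the reason retries are usually configured at one layer only.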

### AWS API Gateway HTTP API + JWT authorizer
```yaml
# SAM template
HttpApi:
  Type: AWS::Serverless::HttpApi
  Properties:
    Auth:
      Authorizers:
        JwtAuth:
          IdentitySource: $request.header.Authorization
          JwtConfiguration:
            issuer: https://auth.example.com
            audience: [api.example.com]
      DefaultAuthorizer: JwtAuth
    RouteSettings:
      "POST /orders":
        ThrottlingBurstLimit: 100
        ThrottlingRateLimit: 50
```

### LLM gateway (Portkey-style fallback)
```python
from portkey_ai import Portkey

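client = Portkey(
    api_key="...",
    config={
        # Try targets in order; move to the next one when a call fails.
        # Per-target provider credentials are omitted in this sketch.
        "strategy": {"mode": "fallback"},
        "targets": [
            {"provider": "anthropic", "override_params": {"model": "claude-opus-4-7"}},
            {"provider": "openai",    "override_params": {"model": "gpt-5"}},
        ],
        "cache": {"mode": "semantic", "max_age": 3600},
    },
)
# The model comes from each target's override_params, so none is passed here.
resp = client.chat.completions.create(messages=[{"role": "user", "content": "hi"}])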
```

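### Rate limit (token bucket, Redis)
```lua
-- Token bucket as a Redis Lua script (run via EVAL). KEYS[1] = "rl:<consumer_id>"
-- ARGV[1] = capacity, ARGV[2] = refill rate (tokens/sec), ARGV[3] = now (sec)
local capacity, rate, now = tonumber(ARGV[1]), tonumber(ARGV[2]), tonumber(ARGV[3])
local state = redis.call("HMGET", KEYS[1], "tokens", "ts")
local tokens = tonumber(state[1]) or capacity  -- a new key starts with a full bucket
local ts = tonumber(state[2]) or now
-- Refill for the elapsed time, capped at capacity
tokens = math.min(capacity, tokens + math.max(0, now - ts) * rate)
local allowed = tokens >= 1
if allowed then tokens = tokens - 1 end
redis.call("HSET", KEYS[1], "tokens", tokens, "ts", now)
redis.call("EXPIRE", KEYS[1], math.ceil(capacity / rate) * 2)  -- drop idle buckets
return allowed and 200 or 429
```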
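For unit-testing limits without Redis, a token bucket can also be sketched in plain Python. This is an in-memory illustration with a caller-supplied clock; the `TokenBucket` class and its parameters are assumptions of this note, not part of Kong or Redis.

```python
class TokenBucket:
    """Bucket that holds up to `capacity` tokens and refills at `refill_per_s`."""

    def __init__(self, capacity: float, refill_per_s: float, now: float = 0.0) -> None:
        self.capacity = capacity
        self.refill_per_s = refill_per_s
        self.tokens = capacity      # start full so an initial burst is allowed
        self.updated_at = now

    def allow(self, now: float, cost: float = 1.0) -> bool:
        # Refill for the time elapsed since the last call, capped at capacity.
        elapsed = max(0.0, now - self.updated_at)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_s)
        self.updated_at = now
        if self.tokens < cost:
            return False
        self.tokens -= cost
        return True


# Burst of 2, then 1 request per second.
bucket = TokenBucket(capacity=2, refill_per_s=1.0)
results = [bucket.allow(t) for t in (0.0, 0.0, 0.0, 1.0, 1.0)]
print(results)
```

Unlike a fixed-window counter, the bucket permits short bursts up to `capacity` while holding the long-run rate at `refill_per_s`.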

### Header-based canary
```yaml
routes:
  - match:
      prefix: "/v1/checkout"
      headers: [{name: "x-canary", string_match: {exact: "true"}}]
    route: { cluster: checkout_v2 }
  - match: { prefix: "/v1/checkout" }
    route: { cluster: checkout_v1 }
```
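The canary config relies on first-match-wins ordering: the header-constrained route has to come before the catch-all for the same prefix. A small simulation of that rule, with a hypothetical `Route` type and header names assumed to be already lowercased:

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Route:
    prefix: str
    cluster: str
    headers: Dict[str, str] = field(default_factory=dict)  # exact-match constraints

    def matches(self, path: str, request_headers: Dict[str, str]) -> bool:
        if not path.startswith(self.prefix):
            return False
        return all(request_headers.get(k) == v for k, v in self.headers.items())


def pick_cluster(routes: List[Route], path: str, request_headers: Dict[str, str]) -> Optional[str]:
    """Return the cluster of the first matching route, or None."""
    for route in routes:
        if route.matches(path, request_headers):
            return route.cluster
    return None


routes = [
    Route("/v1/checkout", "checkout_v2", {"x-canary": "true"}),
    Route("/v1/checkout", "checkout_v1"),
]
canary = pick_cluster(routes, "/v1/checkout/cart", {"x-canary": "true"})
stable = pick_cluster(routes, "/v1/checkout/cart", {})
# Reversing the order makes the catch-all shadow the canary route.
shadowed = pick_cluster(list(reversed(routes)), "/v1/checkout/cart", {"x-canary": "true"})
print(canary, stable, shadowed)
```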

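### gRPC-Web and gRPC-JSON transcoding
```yaml
http_filters:
  - name: envoy.filters.http.grpc_web              # browser gRPC-Web clients
    typed_config:
      "@type": type.googleapis.com/envoy.extensions.filters.http.grpc_web.v3.GrpcWeb
  - name: envoy.filters.http.grpc_json_transcoder  # REST/JSON to gRPC
    typed_config:
      "@type": type.googleapis.com/envoy.extensions.filters.http.grpc_json_transcoder.v3.GrpcJsonTranscoder
      proto_descriptor: /etc/proto/api.pb
      services: ["api.OrderService"]
  - name: envoy.filters.http.router                # must be last in the chain
    typed_config:
      "@type": type.googleapis.com/envoy.extensions.filters.http.router.v3.Router
```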

## Decision criteria
| Situation | Approach |
|---|---|
| Public SaaS API, multi-tenant | Kong / AWS API Gateway |
| Service mesh edge ingress | Envoy + Istio Gateway |
| Single-team internal API | Skip gateway → direct service + library SDK |
| Multi-LLM provider | Portkey / LiteLLM gateway |
| Heterogeneous protocols (REST+gRPC+WS) | Envoy with transcoding filters |

**Default**: Envoy-based (Istio Gateway / Contour) in-cluster; AWS API Gateway for a fully managed edge.

## 🔗 Graph
- Parent: [[Microservices]] · [[Edge Computing]]
- Variants: [[Service Mesh]] · [[Reverse Proxy]]
- Applications: [[Rate Limiting]] · [[mTLS]]
- Adjacent: [[Load Balancer]] · [[CDN]]

## 🤖 LLM usage
**When**: a multi-service public API, centralizing cross-cutting concerns (auth/rate-limit/observability), multi-provider LLM routing.
**When not**: a single monolith, internal service-to-service traffic only (use a mesh sidecar), hot paths that require < 100 µs latency.

## ❌ Anti-patterns
- **Smart gateway**: stuffing business logic into the gateway couples gateway and service deployments.
- **Single gateway for all clients**: without a separate BFF per mobile/web/partner client, responses over-fetch.
- **No timeout/retry budget**: invites cascading failures.
- **Auth-only gateway, no rate limit**: leaves an abuse vector open.
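One way to read "retry budget" from the list above is a cap on retries as a fraction of observed requests, so retries cannot multiply load during an outage. The sketch below is a simplified counter with no time decay; `RetryBudget` and its `ratio` parameter are hypothetical names for illustration.

```python
class RetryBudget:
    """Permit retries only while they stay within `ratio` of observed requests."""

    def __init__(self, ratio: float) -> None:
        self.ratio = ratio
        self.requests = 0
        self.retries = 0

    def record_request(self) -> None:
        self.requests += 1

    def can_retry(self) -> bool:
        # Refuse the retry if it would push retries above the allowed share.
        if self.retries + 1 > self.ratio * self.requests:
            return False
        self.retries += 1
        return True


# With 4 requests seen and a 50% budget, only 2 retries are granted.
budget = RetryBudget(ratio=0.5)
for _ in range(4):
    budget.record_request()
granted = [budget.can_retry() for _ in range(3)]
print(granted)
```

A production version would typically track requests over a rolling window so that old traffic does not keep funding new retries.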

## 🧪 Verification / Duplicates
- Verified (Kong docs, Envoy docs, AWS API Gateway docs, Microsoft Azure Architecture Center "Gateway Aggregation" pattern).
- Trust level A.

## 🕓 Changelog
| Date | Change |
|---|---|
| 2026-05-08 | Phase 1 |
| 2026-05-10 | Manual cleanup — full content (Kong/Envoy/AWS/LLM gateway patterns) |