---
id: wiki-2026-0508-api-gateway
title: API Gateway
category: 10_Wiki/Topics
status: verified
canonical_id: self
aliases: [API GW, Gateway pattern, Edge service]
duplicate_of: none
source_trust_level: A
confidence_score: 0.9
verification_status: applied
tags: [architecture, microservices, api, gateway, edge]
raw_sources: []
last_reinforced: 2026-05-10
github_commit: pending
tech_stack:
  language: yaml
  framework: Kong, AWS API Gateway, Envoy
---

# API Gateway

## In one line

> **"A single entry point: fan-out, auth, rate limit."** The client-facing facade for microservices. The lineage runs from Netflix Zuul (2013) → Kong (2015) → Envoy/Istio (2017) → AWS API Gateway HTTP API (2019). The 2026 modern stack is Envoy with an xDS control plane, plus edge AI inference gateways (LiteLLM, Portkey).

## Core

### Responsibilities

- **Routing**: path/host/header → upstream service.
- **Auth/AuthZ**: JWT validation, OAuth2 introspection, mTLS termination.
- **Rate limiting**: per-key, per-IP, sliding window.
- **Observability**: trace propagation (W3C Trace Context), metrics, access log.
- **Transformation**: request/response shaping, protocol translation (REST↔gRPC).

### Not responsibilities

- Business logic: belongs to the services.
- Data persistence: the edge stays stateless.
- Heavy aggregation: belongs to the BFF (Backend-for-Frontend) layer.

### Applications

1. **Public API edge**: SaaS APIs in the style of Stripe or Twilio.
2. **BFF per client**: mobile, web, and CLI each get a different response shape.
3. **LLM gateway**: multi-provider routing (Claude, GPT, local), fallback, cost cap.
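To make the routing and rate-limiting responsibilities concrete, here is a minimal in-process sketch. It is not how Kong or Envoy implement these features. The `Gateway` and `SlidingWindowLimiter` classes, the route table, and the limits are hypothetical names chosen for illustration, and the prefix match is deliberately naive (plain `startswith`, longest prefix wins).

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most `limit` requests per key within the last `window_s` seconds."""

    def __init__(self, limit, window_s):
        self.limit = limit
        self.window_s = window_s
        self._hits = defaultdict(deque)  # key -> timestamps of accepted requests

    def allow(self, key, now=None):
        now = time.monotonic() if now is None else now
        hits = self._hits[key]
        # Drop timestamps that have slid out of the window.
        while hits and now - hits[0] >= self.window_s:
            hits.popleft()
        if len(hits) >= self.limit:
            return False
        hits.append(now)
        return True


class Gateway:
    """Hypothetical stand-in for a gateway: route lookup, then rate limit."""

    def __init__(self, routes, limiter):
        # Longest prefix first so a more specific route wins over a shorter one.
        self.routes = sorted(routes.items(), key=lambda kv: len(kv[0]), reverse=True)
        self.limiter = limiter

    def handle(self, path, api_key, now=None):
        """Return (status, upstream); upstream is None unless the request is forwarded."""
        upstream = next((u for prefix, u in self.routes if path.startswith(prefix)), None)
        if upstream is None:
            return 404, None
        if not self.limiter.allow(api_key, now):
            return 429, None
        return 200, upstream


# Assumed example values: one route, two requests per key per 60 seconds.
gateway = Gateway(
    routes={"/api/orders": "http://orders.svc.cluster.local:8080"},
    limiter=SlidingWindowLimiter(limit=2, window_s=60),
)
```

Calling `gateway.handle("/api/orders/42", "k1")` returns the upstream URL with status 200 until key `k1` exhausts its two requests in the window, after which the status is 429. A real gateway would keep the counters in shared storage such as Redis so that every replica enforces the same limit.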
## 💻 Patterns

### Kong declarative config

```yaml
_format_version: "3.0"
services:
  - name: orders-api
    url: http://orders.svc.cluster.local:8080
    routes:
      - name: orders-route
        paths: ["/api/orders"]
        strip_path: false
    plugins:
      - name: rate-limiting
        # policy: redis also needs the plugin's Redis connection settings.
        config: { minute: 600, policy: redis }
      - name: jwt
        config: { key_claim_name: kid }
      - name: prometheus
```

### Envoy route config

```yaml
route_config:
  virtual_hosts:
    - name: api
      domains: ["api.example.com"]
      routes:
        - match: { prefix: "/v1/orders" }
          route:
            cluster: orders_cluster
            timeout: 5s
            retry_policy:
              retry_on: 5xx,reset,connect-failure
              num_retries: 2
              per_try_timeout: 1s
```

### AWS API Gateway HTTP API + JWT authorizer

```yaml
# SAM template (fragment of the Resources section)
HttpApi:
  Type: AWS::Serverless::HttpApi
  Properties:
    Auth:
      Authorizers:
        JwtAuth:
          IdentitySource: $request.header.Authorization
          JwtConfiguration:
            issuer: https://auth.example.com
            audience: [api.example.com]
      DefaultAuthorizer: JwtAuth
    RouteSettings:
      "POST /orders":
        ThrottlingBurstLimit: 100
        ThrottlingRateLimit: 50
```

### LLM gateway (Portkey-style fallback)

```python
from portkey_ai import Portkey

client = Portkey(
    api_key="...",
    config={
        # Try targets in order; fall through to the next one on failure.
        "strategy": {"mode": "fallback"},
        "targets": [
            {"provider": "anthropic", "override_params": {"model": "claude-opus-4-7"}},
            {"provider": "openai", "override_params": {"model": "gpt-5"}},
        ],
        "cache": {"mode": "semantic", "max_age": 3600},
    },
)

resp = client.chat.completions.create(
    messages=[{"role": "user", "content": "hi"}],
)
```

### Rate limit (token bucket, Redis)

```lua
-- Kong-style Redis Lua, run via EVAL so the check-and-decrement is atomic.
-- Simplified bucket: the whole budget refills when the key expires.
-- KEYS[1] = "rl:" .. consumer_id, ARGV[1] = budget, ARGV[2] = window in seconds
local key = KEYS[1]
local limit = tonumber(ARGV[1] or "100")
local window = tonumber(ARGV[2] or "60")
-- Seed the budget once per window; NX leaves an existing counter and its TTL untouched.
redis.call("SET", key, limit, "EX", window, "NX")
local tokens = tonumber(redis.call("GET", key))
if tokens <= 0 then return 429 end
redis.call("DECR", key)
return 200
```

### Header-based canary

```yaml
# Envoy evaluates routes in order, so the canary match must come first.
routes:
  - match:
      prefix: "/v1/checkout"
      headers:
        - name: "x-canary"
          string_match: { exact: "true" }
    route: { cluster: checkout_v2 }
  - match: { prefix: "/v1/checkout" }
    route: { cluster: checkout_v1 }
```

### gRPC-Web and gRPC-JSON transcoding

```yaml
http_filters:
  - name: envoy.filters.http.grpc_web
    typed_config:
      "@type": type.googleapis.com/envoy.extensions.filters.http.grpc_web.v3.GrpcWeb
  - name: envoy.filters.http.grpc_json_transcoder
    typed_config:
      "@type": type.googleapis.com/envoy.extensions.filters.http.grpc_json_transcoder.v3.GrpcJsonTranscoder
      proto_descriptor: /etc/proto/api.pb
      services: ["api.OrderService"]
```

## Decision criteria

| Situation | Approach |
|---|---|
| Public SaaS API, multi-tenant | Kong / AWS API Gateway |
| Service mesh edge ingress | Envoy + Istio Gateway |
| Single-team internal API | Skip gateway → direct service + library SDK |
| Multi-LLM provider | Portkey / LiteLLM gateway |
| Heterogeneous protocols (REST+gRPC+WS) | Envoy with transcoding filters |

**Default**: Envoy-based (Istio Gateway / Contour) for in-cluster ingress; AWS API Gateway for a fully managed edge.

## 🔗 Graph

- Parent: [[Microservices]] · [[Edge Computing]]
- Variants: [[Service Mesh]] · [[Reverse Proxy]]
- Applications: [[Rate Limiting]] · [[mTLS]]
- Adjacent: [[Load Balancer]] · [[CDN]]

## 🤖 LLM usage

**When**: multi-service public APIs; centralizing cross-cutting concerns (auth, rate limiting, observability); multi-provider LLM routing.

**When not**: a single monolith; internal service-to-service traffic only (use a mesh sidecar); hot paths that require < 100 µs latency.

## ❌ Anti-patterns

- **Smart gateway**: stuffing business logic into the gateway couples gateway deployments to service changes.
- **Single gateway for all clients**: not splitting mobile/web/partner traffic into separate BFFs leads to over-fetching.
- **No timeout/retry budget**: invites cascading failures.
- **Auth-only gateway, no rate limit**: leaves an abuse vector open.

## 🧪 Verification / duplicates

- Verified (Kong docs, Envoy docs, AWS API Gateway docs, Microsoft Azure Architecture Center "Gateway Aggregation" pattern).
- Trust level A.
## 🕓 Changelog

| Date | Change |
|---|---|
| 2026-05-08 | Phase 1 |
| 2026-05-10 | Manual cleanup: full content (Kong/Envoy/AWS/LLM gateway patterns) |

## 🛠️ Applied cases (Applied in summary)

### 🔎 Codebase evidence (auto-extracted, E:\Wiki repo)

**Actual implementation/usage locations:**

- `connectai/src/features/secondBrainTrace.ts:256`: [Omitted long matching line]

_Auto-generated: code_grounding.mjs · refreshed on re-run_