---
id: messaging-kafka-patterns
title: Kafka - Topic / Partition / Consumer Group
category: Coding
status: draft
source_trust_level: B
verification_status: conceptual
created_at: 2026-05-09
updated_at: 2026-05-09
tags: [messaging, kafka, streaming, vibe-coding]
tech_stack: { language: "TS / Kafka", applicable_to: ["Backend"] }
applied_in: []
aliases: [Kafka, partitioning, consumer group, offsets, exactly-once, Confluent]
---

# Kafka

> The standard for event streams. **Topic = queue, Partition = unit of parallelism, Consumer group = distributed processing**. Ordering is guaranteed only within a partition. Exactly-once requires an idempotent producer plus transactions.

## 📖 Core concepts

- Topic: a named, categorized stream of messages.
- Partition: the unit of distribution within a topic, and the unit of ordering.
- Consumer group: the topic's partitions are divided among the consumers in the same group.
- Offset: how far a consumer has read within a partition.
- Retention: time- or size-based (a conventional queue deletes a message once it is acked).

## 💻 Code patterns

### Producer (KafkaJS)

```ts
import { Kafka, CompressionTypes } from 'kafkajs';

const kafka = new Kafka({ clientId: 'api', brokers: ['kafka:9092'] });
const producer = kafka.producer({
  idempotent: true,          // first step toward exactly-once
  maxInFlightRequests: 1,    // KafkaJS's transactional example pairs idempotent with 1 (the Java client allows up to 5)
});
await producer.connect();

// `order` and `eventId` come from the calling code
await producer.send({
  topic: 'orders',
  compression: CompressionTypes.GZIP,
  messages: [
    {
      key: order.userId,     // partition key: same user -> same partition (ordering)
      value: JSON.stringify(order),
      headers: { 'x-event-id': eventId, 'content-type': 'application/json' },
    },
  ],
});
```

### Consumer

```ts
const consumer = kafka.consumer({ groupId: 'order-projector' });
await consumer.connect();
await consumer.subscribe({ topic: 'orders', fromBeginning: false });

await consumer.run({
  eachMessage: async ({ topic, partition, message }) => {
    const body = JSON.parse(message.value!.toString());
    const eventId = message.headers!['x-event-id']!.toString();

    // Idempotency: skip events that were already handled
    if (await db.processed.exists(eventId)) return;

    await handleOrder(body);
    await db.processed.insert({ id: eventId });
    // The offset is committed automatically (autoCommit is on by default)
  },
});
```

### Manual offset commit (exactly-once style)

```ts
await consumer.run({
  autoCommit: false,
  eachBatch: async ({ batch, resolveOffset, heartbeat }) => {
    let lastOffset: string | undefined;
    for (const msg of batch.messages) {
      await db.transaction(async (tx) => {
        await handleOrderInTx(tx, JSON.parse(msg.value!.toString()));
        // The DB commit and the offset commit are separate steps; cover the gap with an outbox or idempotency
      });
      resolveOffset(msg.offset);
      lastOffset = msg.offset;
      await heartbeat();
    }
    // With autoCommit off, commit explicitly. The committed offset is the NEXT one to read.
    if (lastOffset !== undefined) {
      await consumer.commitOffsets([
        { topic: batch.topic, partition: batch.partition, offset: (Number(lastOffset) + 1).toString() },
      ]);
    }
  },
});
```

### Topic configuration

```bash
# retention.ms=604800000 is 7 days
kafka-topics --bootstrap-server kafka:9092 --create --topic orders \
  --partitions 12 --replication-factor 3 \
  --config min.insync.replicas=2 \
  --config retention.ms=604800000 \
  --config compression.type=gzip
```

### Schema Registry + Avro/Protobuf

```ts
import { SchemaRegistry } from '@kafkajs/confluent-schema-registry';

const registry = new SchemaRegistry({ host: 'http://schema-registry:8081' });
// Look up the id of the latest schema registered under the subject
const id = await registry.getLatestSchemaId('orders-value');

await producer.send({
  topic: 'orders',
  messages: [{ value: await registry.encode(id, order) }],
});
```

Compatibility policy: BACKWARD (consumers on the new schema can still read messages written with the previous schema, so upgrade consumers first). If old consumers must be able to read new messages, the policy that guarantees it is FORWARD.

### Dead-letter topic

```ts
import type { KafkaMessage } from 'kafkajs';

async function handleWithDLQ(msg: KafkaMessage) {
  try {
    await handle(msg);
  } catch (e) {
    // Forward the original message to the DLQ, tagging it with the error
    await producer.send({
      topic: 'orders.dlq',
      messages: [{ key: msg.key, value: msg.value, headers: { ...msg.headers, 'x-error': String(e) } }],
    });
  }
}
```

### Compaction (keep only the latest value per key)

```bash
--config cleanup.policy=compact
```

Keeps only the latest value for each key; suits use cases such as `user_state`.

### Streams (Kafka Streams, or Faust / KSQL)

- Topic-to-topic transformation and aggregation.
- Stateful (window aggregations).
- Java/Scala are first-class; TypeScript support is limited.

## 🤔 Decision criteria

| Situation | Recommendation |
|---|---|
| High throughput + durability | Kafka |
| Simple queue | RabbitMQ / SQS / NATS |
| Small team | NATS JetStream (lightweight) |
| Self-hosting is hard | Confluent Cloud / AWS MSK / Redpanda |
| Strict ordering | partition key |
| Replay needed | Kafka, natively (retention) |
| Schema evolution | Schema Registry |

## ❌ Anti-patterns

- **A single partition**: no parallelism.
- **Too many partitions (1000+)**: open file handles explode.
- **Random keys**: ordering breaks.
- **Relying on autoCommit alone when processing fails**: the offset is committed but the message was never processed.
- **Consumer without idempotency**: at-least-once delivery produces duplicates.
- **Replication factor 1**: when the node dies, the data is lost.
- **`min.insync.replicas` left unset**: it defaults to 1, so an acknowledged write may exist on the leader only and is lost on failover or split-brain.
- **Long processing in `eachMessage`**: heartbeats stop and the group rebalances.

## 🤖 LLM usage hints

- Partition key = ordering unit.
- Idempotency header + dedupe table.
- Schema registry + DLQ as the standard setup.

## 🔗 Related documents

- [[Messaging_NATS_RabbitMQ_Comparison]]
- [[Messaging_Exactly_Once]]
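
## 🧪 Sketch: key-to-partition routing

The rule that the partition key is the ordering unit can be checked with a small in-memory model. This is a sketch under stated assumptions, not Kafka's partitioner: `fnv1a`, `partitionFor`, `route`, and `OrderEvent` are hypothetical names, and FNV-1a stands in for the murmur2 hash that Kafka's Java client applies to key bytes. It only illustrates that a deterministic hash sends every event for one key to one partition, where append order is preserved.

```ts
// Hypothetical stand-in hash (32-bit FNV-1a). Real Kafka clients use murmur2;
// any deterministic hash is enough to show "same key -> same partition".
function fnv1a(key: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < key.length; i++) {
    hash ^= key.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash >>> 0;
}

// Map a key onto one of numPartitions partitions.
function partitionFor(key: string, numPartitions: number): number {
  return fnv1a(key) % numPartitions;
}

interface OrderEvent {
  key: string; // partition key, e.g. a user id
  seq: number; // per-key sequence number, used only to check ordering
}

// Append each event to the log of the partition chosen by its key.
function route(events: OrderEvent[], numPartitions: number): Map<number, OrderEvent[]> {
  const partitions = new Map<number, OrderEvent[]>();
  for (const e of events) {
    const p = partitionFor(e.key, numPartitions);
    const log = partitions.get(p) ?? [];
    log.push(e);
    partitions.set(p, log);
  }
  return partitions;
}

// Interleaved events for two users, in production order.
const events: OrderEvent[] = [
  { key: 'user-1', seq: 1 },
  { key: 'user-2', seq: 1 },
  { key: 'user-1', seq: 2 },
  { key: 'user-2', seq: 2 },
  { key: 'user-1', seq: 3 },
];
const partitions = route(events, 12);
```

Because the mapping is `hash % numPartitions`, changing the partition count of a live keyed topic moves keys to different partitions, so per-key ordering holds only while the count stays fixed.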
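
## 🧪 Sketch: deduping a redelivered message

The "idempotency header + dedupe table" hint exists because a crash between processing a message and committing its offset causes that message to be delivered again. The in-memory model below is a minimal sketch of that situation, not KafkaJS code: `Delivery`, `DedupingProjector`, and `consume` are hypothetical names, a `Set` stands in for the dedupe table, and `total` stands in for any side effect that must not be applied twice.

```ts
interface Delivery {
  offset: number;
  eventId: string; // value of the idempotency header
  amount: number;
}

class DedupingProjector {
  private processed = new Set<string>(); // stands in for the dedupe table
  total = 0;                             // side effect that must be applied once per event
  duplicates = 0;                        // how many redeliveries were skipped

  handle(d: Delivery): void {
    if (this.processed.has(d.eventId)) {
      this.duplicates += 1;
      return;
    }
    this.total += d.amount;
    this.processed.add(d.eventId);
  }
}

// Deliver log entries starting at `from`, committing after each message.
// If crashAtOffset is reached, the message is handled but its commit is lost.
// Returns the committed offset, i.e. where the next run resumes.
function consume(p: DedupingProjector, log: Delivery[], from: number, crashAtOffset = -1): number {
  let committed = from;
  for (const d of log.slice(from)) {
    p.handle(d);
    if (d.offset === crashAtOffset) return committed; // crash before the commit
    committed = d.offset + 1;
  }
  return committed;
}

const log: Delivery[] = [
  { offset: 0, eventId: 'evt-a', amount: 10 },
  { offset: 1, eventId: 'evt-b', amount: 20 },
  { offset: 2, eventId: 'evt-c', amount: 5 },
];

const projector = new DedupingProjector();
// First run: offsets 0 and 1 are handled, but only offset 0 is committed.
const afterCrash = consume(projector, log, 0, 1);
// Restart: resume from the committed offset, so offset 1 is redelivered.
const afterRestart = consume(projector, log, afterCrash);
```

Without the `processed` check, the redelivered `evt-b` would be added to `total` a second time. In a real consumer the side effect and the dedupe insert should share one database transaction, otherwise the same gap reappears between those two writes.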