id: wiki-2026-0508-multi-threaded-architecture
title: Multi-threaded Architecture
category: 10_Wiki/Topics
status: verified
canonical_id: self
aliases: Multithreading, Concurrent Architecture, MT Architecture
duplicate_of: none
source_trust_level: A
confidence_score: 0.9
verification_status: applied
tags: architecture, concurrency, threading, performance
raw_sources:
last_reinforced: 2026-05-10
github_commit: pending
tech_stack: language = cpp, framework = stdthread-tbb-rayon
Multi-threaded Architecture
In one line
"Distribute work across multiple threads to gain throughput and responsiveness at the same time." The approach dates back to the SMP era of the 1990s; as of 2026, manycore CPUs (Apple M4 Max 16-core, AMD Threadripper 96-core), GPU offload, and the async/await coroutine model are mainstream. Game engines, servers, ML inference, and browser engines all treat multi-threaded design as the default.
Core concepts
Thread model types
OS thread (1:1) : pthread, std::thread — kernel-scheduled, expensive context switch.
Green thread / fiber : Go goroutine, Java 21 virtual thread — userland scheduler, M:N mapping.
Coroutine / async task : C++20 coroutine, Rust async, Kotlin coroutine — stackless, await-resume.
Task-based : Intel TBB, .NET TPL, Apple GCD — work-stealing scheduler, no thread management.
Architectural patterns
Producer-consumer : a bounded queue provides backpressure.
Pipeline : one thread per stage, connected by ring buffers (LMAX Disruptor).
Fork-join : divide & conquer, work-stealing.
Actor : message passing (Akka, Erlang, Pony), no shared state.
Data parallelism : SIMD + thread pool — Rayon par_iter(), OpenMP #pragma omp parallel for.
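The producer-consumer pattern listed above can be sketched in C++ as follows. This is a minimal illustration, not a reference implementation: `BoundedQueue` and `produce_and_sum` are hypothetical names introduced here, and the sketch assumes a single mutex plus two condition variables is acceptable for the workload. The point it shows is that `push()` blocks while the queue is full, which is how a bounded queue applies backpressure to a fast producer.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>

// Hypothetical BoundedQueue: push() blocks while the queue is full,
// pop() blocks while it is empty.
template <typename T>
class BoundedQueue {
public:
    explicit BoundedQueue(std::size_t capacity) : capacity_(capacity) {
        assert(capacity > 0);
    }

    void push(T value) {
        std::unique_lock<std::mutex> lock(mutex_);
        // Backpressure: the producer sleeps here until a slot frees up.
        not_full_.wait(lock, [this] { return items_.size() < capacity_; });
        items_.push(std::move(value));
        not_empty_.notify_one();
    }

    T pop() {
        std::unique_lock<std::mutex> lock(mutex_);
        not_empty_.wait(lock, [this] { return !items_.empty(); });
        T value = std::move(items_.front());
        items_.pop();
        not_full_.notify_one();
        return value;
    }

private:
    std::size_t capacity_;
    std::queue<T> items_;
    std::mutex mutex_;
    std::condition_variable not_full_;
    std::condition_variable not_empty_;
};

// Illustrative driver: one producer sends 1..n, one consumer sums them.
long long produce_and_sum(int n, std::size_t capacity) {
    BoundedQueue<int> queue(capacity);
    long long sum = 0;  // written only by the consumer, read after join()
    std::thread producer([&] { for (int i = 1; i <= n; ++i) queue.push(i); });
    std::thread consumer([&] { for (int i = 0; i < n; ++i) sum += queue.pop(); });
    producer.join();
    consumer.join();
    return sum;
}
```

A small capacity (for example `produce_and_sum(1000, 4)`) forces the producer to wait on the consumer most of the time, which makes the backpressure effect easy to observe in a profiler.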
Applications
Game engine — render thread + game thread + audio thread + IO thread.
Browser — process-per-tab + GPU process + utility processes.
Database — connection pool + worker threads + background flush.
💻 Patterns
Thread pool with work queue (C++20)
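A minimal sketch of this pattern, under stated assumptions: `ThreadPool`, `submit`, and `sum_of_squares` are hypothetical names, the queue is unbounded, and there is no work stealing. The sketch uses `std::thread` with a manual stop flag so that it also builds under older standards; under C++20, `std::jthread` and `std::stop_token` could likely replace the explicit join and flag.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <future>
#include <memory>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

class ThreadPool {
public:
    explicit ThreadPool(std::size_t worker_count) {
        assert(worker_count > 0);
        for (std::size_t i = 0; i < worker_count; ++i)
            workers_.emplace_back([this] { worker_loop(); });
    }

    // Drains the remaining tasks, then joins every worker.
    ~ThreadPool() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            stopping_ = true;
        }
        work_available_.notify_all();
        for (auto& worker : workers_) worker.join();
    }

    ThreadPool(const ThreadPool&) = delete;
    ThreadPool& operator=(const ThreadPool&) = delete;

    // Enqueue a callable taking no arguments; the future carries its result.
    template <typename F>
    auto submit(F&& f) -> std::future<decltype(f())> {
        using R = decltype(f());
        auto task = std::make_shared<std::packaged_task<R()>>(std::forward<F>(f));
        std::future<R> result = task->get_future();
        {
            std::lock_guard<std::mutex> lock(mutex_);
            tasks_.emplace([task] { (*task)(); });
        }
        work_available_.notify_one();
        return result;
    }

private:
    void worker_loop() {
        for (;;) {
            std::function<void()> job;
            {
                std::unique_lock<std::mutex> lock(mutex_);
                work_available_.wait(lock, [this] { return stopping_ || !tasks_.empty(); });
                if (tasks_.empty()) return;  // stopping and queue drained
                job = std::move(tasks_.front());
                tasks_.pop();
            }
            job();  // run outside the lock so other workers can dequeue
        }
    }

    std::mutex mutex_;
    std::condition_variable work_available_;
    std::queue<std::function<void()>> tasks_;
    bool stopping_ = false;
    std::vector<std::thread> workers_;
};

// Illustrative use: square 1..n on the pool and add up the results.
long long sum_of_squares(int n, std::size_t workers) {
    ThreadPool pool(workers);
    std::vector<std::future<long long>> results;
    for (int i = 1; i <= n; ++i)
        results.push_back(pool.submit([i] { return static_cast<long long>(i) * i; }));
    long long total = 0;
    for (auto& r : results) total += r.get();
    return total;
}
```

Each task here is far too small to be worth a queue round trip; in practice the work items would be chunked so that per-task overhead stays negligible.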
Rust Rayon data parallelism
Go goroutine + channel (fan-out / fan-in)
Lock-free SPSC ring buffer
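One possible shape for a single-producer, single-consumer ring buffer is sketched below. `SpscRing`, `try_push`, `try_pop`, and `transfer_and_sum` are hypothetical names. The sketch assumes exactly one producer thread and one consumer thread, a power-of-two capacity, and a 64-byte cache line for the `alignas` padding; it is not safe with multiple producers or consumers.

```cpp
#include <array>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <thread>
#include <utility>

template <typename T, std::size_t Capacity>
class SpscRing {
    static_assert(Capacity >= 2 && (Capacity & (Capacity - 1)) == 0,
                  "Capacity must be a power of two");
public:
    // Call from the producer thread only.
    bool try_push(T value) {
        const std::size_t head = head_.load(std::memory_order_relaxed);
        const std::size_t tail = tail_.load(std::memory_order_acquire);
        if (head - tail == Capacity) return false;  // full
        slots_[head & (Capacity - 1)] = std::move(value);
        // Release: the slot write becomes visible before the new head.
        head_.store(head + 1, std::memory_order_release);
        return true;
    }

    // Call from the consumer thread only.
    bool try_pop(T& out) {
        const std::size_t tail = tail_.load(std::memory_order_relaxed);
        const std::size_t head = head_.load(std::memory_order_acquire);
        if (head == tail) return false;  // empty
        out = std::move(slots_[tail & (Capacity - 1)]);
        // Release: the slot read completes before the slot is handed back.
        tail_.store(tail + 1, std::memory_order_release);
        return true;
    }

private:
    std::array<T, Capacity> slots_{};
    // Assumed 64-byte cache lines: keep the two indices apart.
    alignas(64) std::atomic<std::size_t> head_{0};  // written by producer
    alignas(64) std::atomic<std::size_t> tail_{0};  // written by consumer
};

// Illustrative driver: move 1..n through the ring and sum on the far side.
// The yield loops are for the demo only; a real system would block or back off.
long long transfer_and_sum(int n) {
    assert(n >= 0);
    SpscRing<int, 1024> ring;
    long long sum = 0;  // written only by the consumer, read after join()
    std::thread producer([&] {
        for (int i = 1; i <= n; ++i)
            while (!ring.try_push(i)) std::this_thread::yield();
    });
    std::thread consumer([&] {
        int value = 0;
        for (int received = 0; received < n;) {
            if (ring.try_pop(value)) { sum += value; ++received; }
            else std::this_thread::yield();
        }
    });
    producer.join();
    consumer.join();
    return sum;
}
```

The indices grow monotonically and are masked on access, so "full" is simply `head - tail == Capacity` and no slot has to be sacrificed to distinguish full from empty.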
Game engine 3-thread architecture
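A deliberately simplified sketch of a three-thread split (game, render, audio) follows. Everything here is an assumption made for illustration: `FrameStats`, `run_engine`, the rule of emitting a sound event every 10th tick, and the single shared mutex. A production engine would more likely hand data over through lock-free queues or double-buffered snapshots, as the decision table in this note suggests.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <queue>
#include <thread>

struct FrameStats {
    int ticks_simulated = 0;     // written by the game thread
    int last_tick_rendered = 0;  // written by the render thread
    int sounds_played = 0;       // written by the audio thread
};

FrameStats run_engine(int ticks) {
    assert(ticks > 0);
    FrameStats stats;
    std::mutex mutex;
    std::condition_variable changed;
    int latest_tick = 0;           // snapshot published by the game thread
    std::queue<int> sound_events;  // events handed to the audio thread
    bool done = false;

    std::thread game([&] {
        for (int tick = 1; tick <= ticks; ++tick) {
            {
                std::lock_guard<std::mutex> lock(mutex);
                latest_tick = tick;                          // publish state
                if (tick % 10 == 0) sound_events.push(tick); // hypothetical rule
            }
            ++stats.ticks_simulated;
            changed.notify_all();
        }
        {
            std::lock_guard<std::mutex> lock(mutex);
            done = true;
        }
        changed.notify_all();
    });

    std::thread render([&] {
        int rendered = 0;
        std::unique_lock<std::mutex> lock(mutex);
        for (;;) {
            changed.wait(lock, [&] { return latest_tick != rendered || done; });
            if (latest_tick == rendered) break;  // done and up to date
            rendered = latest_tick;              // may skip intermediate ticks
            lock.unlock();
            // Drawing the snapshot would happen here, outside the lock.
            lock.lock();
        }
        stats.last_tick_rendered = rendered;
    });

    std::thread audio([&] {
        int played = 0;
        std::unique_lock<std::mutex> lock(mutex);
        for (;;) {
            changed.wait(lock, [&] { return !sound_events.empty() || done; });
            if (sound_events.empty()) break;  // done and drained
            sound_events.pop();
            ++played;  // a real engine would mix the sound here
        }
        stats.sounds_played = played;
    });

    game.join();
    render.join();
    audio.join();
    return stats;
}
```

The render thread only ever takes the latest snapshot, so it can drop frames when the game thread runs ahead, while the audio thread consumes every event in order. That asymmetry (latest-value mailbox versus queue) is the main design point of the sketch.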
Decision criteria
Situation | Approach
CPU-bound, divisible work | Rayon / OpenMP / TBB
IO-heavy (10k+ connections) | async/await (Tokio, asyncio, Node)
Real-time game loop | dedicated threads + lock-free queue
Mixed workload | task-based (TBB, GCD)
Simple parallel-for | thread pool + work queue
Distributed across machines | actor (Akka) or message queue
Default : a task-based scheduler (TBB/Rayon/Tokio), which avoids manual thread management.
🔗 Graph
🤖 LLM usage
When to use : throughput-critical workloads, multi-core utilization, real-time games and servers, ML inference batching.
When not to use : simple sequential scripts, IO-light short-lived tasks, single-core embedded targets; the threading overhead outweighs the benefit.
❌ Anti-patterns
Shared mutable state without sync : data race, undefined behavior (UB).
Coarse global lock : can end up slower than single-threaded code (lock contention).
Thread per request (10k+) : stack memory blows up; use async or a thread pool instead.
Busy-wait spin : burns 100% of a CPU core; use a condition variable or semaphore instead.
False sharing : separate atomics that sit on the same cache line; pad them apart with alignas(64).
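The false-sharing fix in the last item can be sketched as follows. `PaddedCounter` and `count_in_parallel` are hypothetical names, and the 64-byte figure is an assumption about the target's cache line size; where the toolchain provides it, `std::hardware_destructive_interference_size` from `<new>` may be a better source for that number.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Each counter occupies its own (assumed 64-byte) cache line, so two
// threads updating neighbouring counters do not invalidate each other's line.
struct alignas(64) PaddedCounter {
    std::atomic<long> value{0};
};

static_assert(alignof(PaddedCounter) == 64, "counter must be line-aligned");
static_assert(sizeof(PaddedCounter) == 64, "counter must fill one line");

// Two threads each increment their own counter per_thread times.
long count_in_parallel(long per_thread) {
    assert(per_thread >= 0);
    PaddedCounter counters[2];
    auto bump = [&counters, per_thread](int index) {
        for (long i = 0; i < per_thread; ++i)
            counters[index].value.fetch_add(1, std::memory_order_relaxed);
    };
    std::thread a(bump, 0);
    std::thread b(bump, 1);
    a.join();
    b.join();
    return counters[0].value.load() + counters[1].value.load();
}
```

Removing the `alignas(64)` leaves the result unchanged but typically makes the loop noticeably slower on multi-core hardware, since both atomics then share one line; the exact difference depends on the machine and is not measured here.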
🧪 Verification / duplicates
Verified (Herb Sutter "The Free Lunch Is Over" 2005, Intel TBB docs, Rust async book 2026).
Trust level : A.
🕓 Changelog
Date | Change
2026-05-08 | Phase 1
2026-05-10 | Manual cleanup: full content (thread models, patterns, decision matrix)