---
id: wiki-2026-0508-다수-팀-협업-환경
title: Multi-Team Collaboration Environment
category: 10_Wiki/Topics
status: verified
canonical_id: self
aliases: [Multi-team Collaboration, Multi-Team AI Workflow, Cross-team AI Coordination]
duplicate_of: none
source_trust_level: A
confidence_score: 0.9
verification_status: applied
tags: [collaboration, multi-team, ai-workflow, ml-platform]
raw_sources: []
last_reinforced: 2026-05-10
github_commit: pending
tech_stack:
  language: python
  framework: mlflow-langfuse-argo
---

# Multi-Team Collaboration Environment

## One-line summary

> **"A multi-team AI environment is a triad of shared model registry + isolated namespace + cross-team observability."** The single-ML-team era ended in 2024; in 2026 each product team in an enterprise owns its own LLM/agent. Team-isolated workspaces (Argo namespace, Langfuse project) are layered on top of shared infra (model registry, eval harness, prompt store).

## Core

### Conflict areas

- **Model versioning**: team B also uses the Llama-3.3 70B that team A fine-tuned, so an update by team A can cause a regression for team B.
- **Prompt drift**: the same base prompt is forked by each team until seven variants coexist.
- **Eval inconsistency**: each team uses a different eval set, so results cannot be compared.
- **GPU contention**: without fair-share scheduling on the H200 cluster, teams become noisy neighbors.

### Governance layer

- **Model Registry (MLflow / Weights & Biases)**: canonical model card, semver tag, deprecation policy.
- **Prompt Store (Langfuse / Humanloop)**: versioned prompts, A/B winner promotion.
- **Eval Harness (Inspect AI / promptfoo)**: shared regression suite, triggered automatically on every model bump.
- **Observability (Langfuse + OpenTelemetry)**: a separate project per team, plus a cross-team dashboard for leadership.

### Applications

1. The platform team provides the base infrastructure; product teams build the application layer on top.
2. AI Center of Excellence: quarterly review of cross-team incidents.
3. RACI matrix: name the model owner, prompt owner, and eval owner explicitly.

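The RACI item above can be checked mechanically. The following is a minimal sketch, assuming ownership is kept as plain data; `SharedAsset`, `missing_roles`, and the sample ownership assignments are hypothetical and not part of any tool named in this note.

```python
from dataclasses import dataclass, field

# The three owner roles named in the RACI item.
REQUIRED_ROLES = ("model_owner", "prompt_owner", "eval_owner")


@dataclass(frozen=True)
class SharedAsset:
    """A shared model or prompt plus its role -> team mapping (hypothetical)."""
    name: str
    owners: dict = field(default_factory=dict)


def missing_roles(asset: SharedAsset) -> list:
    """Return the required roles that have no owning team assigned."""
    return [role for role in REQUIRED_ROLES if not asset.owners.get(role)]


# Illustrative data only: the ownership assignments are assumptions.
assets = [
    SharedAsset(
        "llama-3.3-70b-instruct-finetuned",
        {
            "model_owner": "platform-team",
            "prompt_owner": "team-search",
            "eval_owner": "platform-team",
        },
    ),
    SharedAsset("search/intent-classifier", {"prompt_owner": "team-search"}),
]

# Map each incompletely owned asset to the roles it still lacks.
gaps = {a.name: missing_roles(a) for a in assets if missing_roles(a)}
```

A check like this could run in CI against a version-controlled ownership file, so an asset without all three owners is flagged before it is shared across teams.
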
## 💻 Patterns

### MLflow shared registry: team isolation via aliases

```python
import mlflow
from mlflow import MlflowClient

client = MlflowClient(tracking_uri="https://mlflow.corp/")

# Platform team registers the canonical model version.
# The registered model itself must already exist in the registry.
mv = client.create_model_version(
    name="llama-3.3-70b-instruct-finetuned",
    source="s3://corp-models/llama33-v4/",
    description="Q2 2026 finetune; eval set v3.2",
)

# One alias per consuming team, all pointing at the same version for now.
client.set_registered_model_alias(
    name="llama-3.3-70b-instruct-finetuned",
    alias="prod-team-search",
    version=mv.version,
)
client.set_registered_model_alias(
    name="llama-3.3-70b-instruct-finetuned",
    alias="prod-team-support",
    version=mv.version,
)
# Each team pins its own alias → independent rollout cadence
```

### Langfuse multi-project prompt versioning

```python
import os

from langfuse import Langfuse

# Keys are read from the environment instead of undefined PK / SK globals.
lf = Langfuse(
    public_key=os.environ["LANGFUSE_PUBLIC_KEY"],
    secret_key=os.environ["LANGFUSE_SECRET_KEY"],
    host="https://langfuse.corp",
)

# Team A creates a prompt
lf.create_prompt(
    name="search/intent-classifier",
    prompt="You classify user search intent. Categories: {{categories}}.",
    labels=["production"],  # auto-promoted version label
    config={"model": "claude-opus-4-7", "temperature": 0.0},
)

# Team B compiles the same prompt (linked, not copied).
# Assumes Team B's client uses keys for the project that owns the prompt.
prompt = lf.get_prompt("search/intent-classifier", label="production")
compiled = prompt.compile(categories="navigational, informational, transactional")
```

### Argo Workflows: team-namespaced GPU jobs with priority class

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: finetune-team-search-
  namespace: ai-team-search
spec:
  entrypoint: train
  podGC: { strategy: OnWorkflowSuccess }
  templates:
    - name: train
      priorityClassName: team-search-prod  # scheduling priority for this team's pods
      container:
        image: corp-registry/finetune:cuda12.6
        resources:
          limits: { nvidia.com/gpu: 8, memory: 1Ti }
        env:
          - name: MLFLOW_TRACKING_URI
            value: https://mlflow.corp
          - name: WANDB_PROJECT
            value: team-search
```

### Cross-team eval harness with Inspect AI

```python
import asyncio

from inspect_ai import Task, eval_async, task
from inspect_ai.dataset import json_dataset
from inspect_ai.scorer import model_graded_qa


@task
def shared_safety_suite():
    return Task(
        dataset=json_dataset("s3://corp-evals/safety-v3.2.jsonl"),
        # Grader model given as "provider/model"; the provider prefix is assumed.
        scorer=model_graded_qa(model="anthropic/claude-opus-4-7"),
    )


# Run across all team-owned models nightly.
# Placeholder identifiers: Inspect AI expects "provider/model" strings, so
# these assume a model provider is registered for each team prefix.
models = [
    "team-search/llama-3.3-70b@prod",
    "team-support/llama-3.3-70b@prod",
    "team-rec/llama-3.3-70b@prod",
]


async def main():
    # `await` is only valid inside a coroutine, hence the wrapper.
    results = await eval_async(shared_safety_suite(), model=models)
    # Publish to shared dashboard; alert if any team regresses >2% vs last week
    return results


asyncio.run(main())
```

### OPA policy gate for model promotion

```rego
package modelregistry.promotion

import rego.v1

# Deny promotion to prod when the safety pass rate is below 0.95.
deny contains msg if {
    input.action == "promote"
    input.target_alias == "prod"
    rate := input.eval_results.safety_pass_rate
    rate < 0.95
    msg := sprintf("safety_pass_rate=%.3f below 0.95", [rate])
}

# Deny when the pass rate is missing; otherwise the rule above is undefined
# and the promotion would pass silently.
deny contains msg if {
    input.action == "promote"
    input.target_alias == "prod"
    not input.eval_results.safety_pass_rate
    msg := "missing safety_pass_rate in eval_results"
}

deny contains msg if {
    input.action == "promote"
    not input.has_owner_approval
    msg := "missing model_owner approval"
}
```

### Cross-team incident postmortem template (YAML, version-controlled)

```yaml
incident_id: 2026-Q2-013
date: 2026-05-08
owning_team: team-search
affected_teams: [team-support, team-rec]
root_cause: |
  team-search rolled new finetune to alias=prod without notifying
  downstream consumers; intent-classifier prompt assumed older format.
detection: Langfuse anomaly (latency p95 spike), 14 min
resolution: rolled back model alias; published deprecation policy
action_items:
  - owner: platform-team
    due: 2026-05-22
    task: enforce 7-day deprecation notice via OPA
```

## Decision criteria

| Situation | Approach |
|---|---|
| 2-3 teams, single product | shared monorepo + single MLflow project |
| 5-15 teams, mixed maturity | platform team + per-team namespace |
| 15+ teams, enterprise | full governance layer + AI CoE + OPA gates |
| Regulated (finance/health) | add audit log + immutable model lineage |

**Default**: MLflow registry + Langfuse prompt store + Argo namespace per team + shared Inspect AI eval suite.

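The decision table above can be expressed as a small lookup. This is only a sketch: `recommend_approach` is a hypothetical helper, and the tier boundaries are assumptions where the table leaves gaps (4 teams is treated like the 2-3 tier, exactly 15 teams like the 5-15 tier, and a single team is rejected because the table does not cover it).

```python
def recommend_approach(team_count: int, regulated: bool = False) -> list:
    """Map a team count to the approach rows of the decision table.

    Boundary handling is an assumption of this sketch, not a rule from the
    table: 4 teams fall into the smallest tier and 15 into the middle tier.
    """
    if team_count < 2:
        raise ValueError("the table starts at 2 teams")

    if team_count <= 4:
        steps = ["shared monorepo + single MLflow project"]
    elif team_count <= 15:
        steps = ["platform team + per-team namespace"]
    else:
        steps = ["full governance layer + AI CoE + OPA gates"]

    # The regulated row is additive: it stacks on top of any tier.
    if regulated:
        steps.append("add audit log + immutable model lineage")
    return steps


small = recommend_approach(3)
large_regulated = recommend_approach(20, regulated=True)
```

Returning a list keeps the regulated row additive, which matches how the table phrases it ("add ...") instead of treating it as a separate tier.
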
## 🔗 Graph

- Applications: [[Model_Registry]] · [[Large_Frontend_Projects]]
- Adjacent: [[Iterative Prompting]] · [[Parameter]]

## 🤖 LLM usage

**When**: five or more teams in an enterprise ship LLM/agent products and no shared eval suite or registry exists yet.

**When not**: a single team or a single model. This setup is over-engineering there; Notion + GitHub is enough.

## ❌ Anti-patterns

- **Shadow IT model**: a team bypasses the platform and serves models with a personal HF token, creating a security/cost blind spot.
- **Prompt copy-paste**: prompts shared over Slack drift and have no versioning.
- **Eval set fragmentation**: every team keeps its own eval set, so cross-team comparison is impossible.
- **No deprecation policy**: silent breaking changes behind alias=prod.
- **Single GPU pool, no priority class**: a noisy neighbor starves production inference.

## 🧪 Verification / duplicates

- Verified (MLflow 2.x docs, Langfuse v3 multi-project, Argo Workflows fair-share scheduling, Inspect AI 0.3+).
- Trust level A: production-grade enterprise pattern.

## 🕓 Changelog

| Date | Change |
|---|---|
| 2026-05-08 | Phase 1 |
| 2026-05-10 | Manual cleanup: multi-team AI governance triad + 6 patterns |