---
id: wiki-2026-0508-sme
title: SME (Subject Matter Expert / Small-Medium Enterprise)
category: 10_Wiki/Topics
status: verified
canonical_id: self
aliases: [Subject Matter Expert, Small-Medium Enterprise, Domain Expert]
duplicate_of: none
source_trust_level: A
confidence_score: 0.9
verification_status: applied
tags: [sme, domain-expert, knowledge-elicitation, business]
raw_sources: []
last_reinforced: 2026-05-10
github_commit: pending
tech_stack:
  language: Python
  framework: Anthropic Claude API
---

# SME (Subject Matter Expert / Small-Medium Enterprise)

## In one line
> **"SME is context dependent: in an AI/data project it means a domain expert; in business and economics it means a small-medium enterprise (typically <250 employees)."** The two meanings collide under one acronym, so disambiguate by audience and surrounding context. Both are central to modern AI initiatives (knowledge capture, vertical SaaS, AI-native SME tooling).

## Core

### SME = Subject Matter Expert
- **Role**: deep domain knowledge (clinical, legal, mechanical, regulatory).
- **AI context**: data labeling, evaluation rubrics, RLHF preferences, prompt engineering, RAG curation.
- **Bottleneck**: SME time is the most expensive resource in vertical AI.
- **Modern shift**: the SME moves from rule author to AI trainer/auditor, via RLHF and eval design.

### SME = Small-Medium Enterprise
- **EU definition**: <250 staff, and ≤€50M turnover or ≤€43M balance sheet total.
- **US (SBA)**: varies by NAICS code, often <500 employees.
- **AI context**: the ICP (Ideal Customer Profile) for vertical SaaS, self-serve onboarding, low-code AI.
- **2026 trend**: AI-native SaaS lets the mid-market leapfrog to enterprise-grade capability.

### Applications
1. SME (expert): RLHF preference labeling, eval rubric authoring.
2. SME (expert): RAG document curation, golden Q&A creation.
3. SME (business): vertical SaaS targeting (legal, dental, HVAC).
4. SME (business): embedded finance, AI bookkeeping (Pilot, Bench).
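The EU thresholds listed under the Small-Medium Enterprise definition combine a headcount limit with an either/or financial test, which is easy to get wrong when written as a single condition. A minimal sketch, assuming a hypothetical helper named `is_eu_sme` (not part of any library) and ignoring any ownership or group-level rules the full definition may add:

```python
def is_eu_sme(staff: int, turnover_eur_m: float, balance_sheet_eur_m: float) -> bool:
    """Check figures against the EU SME thresholds quoted in this note.

    Hypothetical helper for illustration only: fewer than 250 staff, and
    either turnover <= EUR 50M or balance sheet total <= EUR 43M.
    """
    headcount_ok = staff < 250
    # Only one of the two financial ceilings has to be met.
    financial_ok = turnover_eur_m <= 50 or balance_sheet_eur_m <= 43
    return headcount_ok and financial_ok


print(is_eu_sme(120, 60.0, 40.0))  # True: turnover is over, balance sheet is under
print(is_eu_sme(250, 10.0, 5.0))   # False: headcount is not below 250
```

The headcount test is strict (`<`) while the financial tests are inclusive (`<=`), mirroring how the thresholds are stated above.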
## 💻 Patterns

### SME knowledge elicitation interview (Claude-driven)
```python
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

INTERVIEW_PROMPT = """You are conducting a structured knowledge elicitation with a {domain} SME.
Ask one question at a time. Build a decision tree of their reasoning.
After each answer, ask "what edge cases?" and "what would make you change the answer?".
Output progressive YAML knowledge graph."""

# Running transcript of the interview (placeholder content); must start with a user turn.
conversation_history = [
    {"role": "user", "content": "I'm ready to start the interview."},
]

resp = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=2048,
    system=INTERVIEW_PROMPT.format(domain="cardiology triage"),
    messages=conversation_history,
)
```

### SME-driven eval rubric (LLM-as-judge with SME calibration)
```python
RUBRIC = """
Score 1-5 on each dimension. SME-provided anchors:
- Clinical accuracy (5 = matches AHA guidelines, 1 = harmful)
- Citation quality (5 = primary source, 1 = none/hallucinated)
- Tone (5 = empathetic clinical, 1 = robotic or alarming)
"""

def sme_eval(question, answer, sme_anchors):
    # sme_anchors: SME-scored example answers used to calibrate the judge
    prompt = f"{RUBRIC}\n\nSME examples:\n{sme_anchors}\n\nQ: {question}\nA: {answer}"
    # claude_judge is an assumed helper, not defined here; it should send the
    # prompt to the judge model and return scores + rationale
    return claude_judge(prompt)
```

### Active learning loop with SME (cost-aware)
```python
import numpy as np

def select_for_sme(pool_unlabeled, model, budget=20):
    # Uncertainty sampling: SME time is expensive, so ask only on edge cases.
    # pool_unlabeled is expected to be a NumPy array; model needs predict_proba.
    probs = model.predict_proba(pool_unlabeled)
    entropy = -np.sum(probs * np.log(probs + 1e-9), axis=1)  # 1e-9 avoids log(0)
    top_k_idx = entropy.argsort()[-budget:]  # the `budget` highest-entropy rows
    return pool_unlabeled[top_k_idx]  # send these to the SME
```

### Vertical SaaS for SME (multi-tenant Postgres RLS)
```sql
ALTER TABLE invoices ENABLE ROW LEVEL SECURITY;

CREATE POLICY tenant_isolation ON invoices
  USING (tenant_id = current_setting('app.tenant_id')::uuid);

-- App sets per-request:
SET app.tenant_id = '123e4567-e89b-12d3-a456-426614174000';
```

### SME definition lookup (regulation-aware)
```python
SME_DEFINITIONS = {
    "EU": {"staff_max": 249, "turnover_max_m": 50, "currency": "EUR"},  # fewer than 250 staff
    "UK": {"staff_max": 250, "turnover_max_m": 36, "currency": "GBP"},
    "US_SBA": {"staff_max": 500},  # varies by NAICS
    "KR": {"staff_max": 300},  # Framework Act on Small and Medium Enterprises (중소기업기본법)
}

def is_sme(jurisdiction, staff, turnover_m):
    # turnover_m is in millions of the jurisdiction's own currency
    d = SME_DEFINITIONS[jurisdiction]
    # Jurisdictions without a turnover ceiling in the table pass that test automatically
    return staff <= d["staff_max"] and turnover_m <= d.get("turnover_max_m", float("inf"))
```

### AI bookkeeping for SME (embedded LLM agent)
```python
import json

import anthropic

client = anthropic.Anthropic()

# Placeholder list; replace with the chart of accounts actually in use.
ALLOWED_GAAP_CATEGORIES = ["Revenue", "Cost of Goods Sold", "Operating Expenses"]

def categorize_transaction(tx):
    resp = client.messages.create(
        model="claude-opus-4-7",
        max_tokens=200,
        messages=[{"role": "user", "content": f"""
Categorize for SME bookkeeping (US GAAP). Return JSON.
Tx: {tx}
Categories: {ALLOWED_GAAP_CATEGORIES}
"""}],
    )
    # Assumes the reply is bare JSON; json.loads raises ValueError otherwise.
    return json.loads(resp.content[0].text)
```

## Decision criteria
| Situation | Approach |
|---|---|
| AI eval design | SME (expert) authors rubrics and calibrates the LLM judge |
| RAG curation | SME (expert) curates the golden corpus and validates retrieval |
| Vertical SaaS GTM | Target SME (business) with self-serve, transparent pricing |
| Regulatory SME definition | Use a jurisdiction lookup (EU vs US SBA vs KR Framework Act on SMEs) |
| Active learning budget | SME (expert) only on high-uncertainty samples |

**Default**: clarify which SME meaning applies in each context; never assume.

## 🔗 Graph
- Parent: [[Business-Strategy]]
- Variants: [[Domain-Expert]] · [[Startup]]
- Applications: [[RLHF]] · [[Active-Learning]]
- Adjacent: [[LLM-as-Judge]] · [[RAG]] · [[SaaS]]

## 🤖 LLM usage
**When**: SME interview structuring, knowledge graph extraction, eval rubric drafting, SME-time amplification (ask 100 questions LLM-first, escalate to a human only on disagreement).
**When not**: replacing the SME entirely in regulated domains (medicine, law, finance). The LLM amplifies the expert but never takes over their liability.

## ❌ Anti-patterns
- **Acronym ambiguity**: "let's interview SMEs" in a mixed audience causes confusion (experts vs companies).
- **SME burnout**: dumping all labeling on one SME without active sampling.
- **No SME in the AI loop**: the ML team builds without domain validation and ships plausible-but-wrong output.
- **Enterprise-style GTM for SME (business)**: an enterprise-style sales cycle kills SME conversion.
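The LLM-first escalation mentioned under LLM usage (answer many questions automatically, send only disagreements to a human) can be sketched as below. This is a minimal illustration under stated assumptions: "disagreement" is taken to mean two independent answer passes giving different text, and `triage_for_sme`, `answer_a`, and `answer_b` are hypothetical names, with stand-in lambdas in place of real model calls.

```python
from typing import Callable, Iterable


def triage_for_sme(
    questions: Iterable[str],
    answer_a: Callable[[str], str],
    answer_b: Callable[[str], str],
) -> tuple[dict[str, str], list[str]]:
    """Split questions into auto-accepted answers and an SME escalation queue."""
    accepted: dict[str, str] = {}
    escalate: list[str] = []
    for q in questions:
        a, b = answer_a(q), answer_b(q)
        # Light normalisation so case and whitespace alone do not count as disagreement.
        if a.strip().lower() == b.strip().lower():
            accepted[q] = a
        else:
            escalate.append(q)  # the two passes disagree: spend SME time here
    return accepted, escalate


# Stand-in answerers (assumption) so the sketch runs without an API key.
first_pass = lambda q: {"q1": "Yes", "q2": "No"}[q]
second_pass = lambda q: {"q1": "yes ", "q2": "Maybe"}[q]

accepted, escalate = triage_for_sme(["q1", "q2"], first_pass, second_pass)
print(accepted)  # {'q1': 'Yes'}
print(escalate)  # ['q2']
```

Exact string comparison is a crude agreement test; free-text answers would likely need a semantic comparison or a structured answer format before this split is reliable.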
## 🧪 Verification / Duplicates
- Verified (EU SME definition 2003/361/EC, US SBA size standards, AIMA RLHF chapter).
- Trust level A.

## 🕓 Changelog
| Date | Change |
|---|---|
| 2026-05-08 | Phase 1 |
| 2026-05-10 | Manual cleanup: dual SME meanings, knowledge elicitation, vertical SaaS |