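koriweb d8a80f6272 chore(wiki): canonical normalization of dangling links (768 files / 1,200 links)

Replaced [[wikilinks]] that differed only in name (spelling variants) with the canonical title of the target page,
reconnecting 1,200 broken links. Only normalized title/filename matches were applied; alias matching was
excluded because of the risk of over-merging (ambiguity guard). Originals are backed up in _link_reconcile_backup/.
Tool: Datacollect/scripts/link_reconcile_apply.mjs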

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

2026-06-08 12:24:15 +09:00

6.3 KiB


id, title, category, status, canonical_id, aliases, duplicate_of, source_trust_level, confidence_score, verification_status, tags, raw_sources, last_reinforced, github_commit, tech_stack

title

SME (Subject Matter Expert / Small-Medium Enterprise)

In one line

SME is context dependent: in an AI/data project, SME means a domain expert (Subject Matter Expert); in business and economics, SME means a small or medium-sized enterprise (typically <250 employees). The two meanings collide under one acronym, so disambiguate by audience and surrounding context. Both are central to modern AI initiatives (knowledge capture, vertical SaaS, AI-native SME tooling).

Core concepts

SME = Subject Matter Expert

Role: deep domain knowledge (clinical, legal, mechanical, regulatory).
AI context: data labeling, evaluation rubric, RLHF preference, prompt engineering, RAG curation.
Bottleneck: SME time is the most expensive resource in vertical AI.
Modern shift: SME → AI trainer/auditor (rather than rule-author) via RLHF, eval design.

SME = Small-Medium Enterprise

EU definition: <250 staff, and either ≤€50M turnover or ≤€43M balance sheet total.
US (SBA): size standard varies by NAICS code, often <500 employees.
AI context: the ICP (Ideal Customer Profile) for vertical SaaS, self-serve onboarding, low-code AI.
2026 trend: AI-native SaaS lets the mid-market leapfrog to enterprise-grade capability.

Applications

SME (expert) — RLHF preference labeling, eval rubric authoring.
SME (expert) — RAG document curation, golden Q&A creation.
SME (business) — vertical SaaS targeting (legal, dental, HVAC).
SME (business) — embedded finance, AI bookkeeping (Pilot, Bench).

💻 Patterns

SME knowledge elicitation interview (Claude-driven)

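import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment
INTERVIEW_PROMPT = """You are conducting a structured knowledge elicitation
with a {domain} SME. Ask one question at a time. Build a decision tree of
their reasoning. After each answer, ask "what edge cases?" and "what would
make you change the answer?". Output a progressively updated YAML knowledge graph."""

# Running transcript of the interview. The Messages API expects a non-empty
# list that starts with a user turn; append each SME answer as it arrives.
conversation_history = [
    {"role": "user", "content": "I'm ready to begin the interview."},
]

resp = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=2048,
    system=INTERVIEW_PROMPT.format(domain="cardiology triage"),
    messages=conversation_history,
)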

SME-driven eval rubric (LLM-as-judge with SME calibration)

RUBRIC = """
Score 1-5 on each dimension. SME-provided anchors:
- Clinical accuracy (5 = matches AHA guidelines, 1 = harmful)
- Citation quality (5 = primary source, 1 = none/hallucinated)
- Tone (5 = empathetic clinical, 1 = robotic or alarming)
"""
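
def sme_eval(question, answer, sme_anchors):
    # claude_judge is assumed to be a helper (not shown here) that sends the
    # prompt to the judge model and returns scores plus a rationale.
    prompt = f"{RUBRIC}\n\nSME examples:\n{sme_anchors}\n\nQ: {question}\nA: {answer}"
    return claude_judge(prompt)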
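
Calibrating the judge against the SME means checking how closely its scores track the SME's own scores on a shared sample. The following is a minimal sketch of that check, assuming both sides score the same answers on the same 1-5 rubric dimensions; `judge_agreement`, the `tolerance` parameter, and the dimension names in the example are illustrative, not part of the original note.

```python
from statistics import mean

def judge_agreement(judge_scores, sme_scores, tolerance=1):
    """Compare LLM-judge scores with SME gold scores on the same answers.

    Both arguments are lists of dicts mapping rubric dimension -> 1-5 score.
    Returns, per dimension, the mean absolute error and the share of answers
    where the judge landed within `tolerance` points of the SME.
    """
    if len(judge_scores) != len(sme_scores) or not judge_scores:
        raise ValueError("need equally sized, non-empty score lists")
    report = {}
    for dim in sme_scores[0]:
        diffs = [abs(j[dim] - s[dim]) for j, s in zip(judge_scores, sme_scores)]
        report[dim] = {
            "mae": mean(diffs),
            "within_tolerance": sum(d <= tolerance for d in diffs) / len(diffs),
        }
    return report

# Toy data: two answers scored by the judge and by the SME.
judge = [{"accuracy": 5, "tone": 3}, {"accuracy": 2, "tone": 4}]
sme = [{"accuracy": 4, "tone": 3}, {"accuracy": 4, "tone": 5}]
report = judge_agreement(judge, sme)
```

A dimension with a high error or low within-tolerance rate is a signal to ask the SME for more anchor examples on that dimension before trusting the judge at scale.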

Active learning loop with SME (cost-aware)

import numpy as np

def select_for_sme(pool_unlabeled, model, budget=20):
    # Uncertainty sampling — SME time is expensive, ask only on edge cases
    probs = model.predict_proba(pool_unlabeled)
    entropy = -np.sum(probs * np.log(probs + 1e-9), axis=1)
    top_k_idx = entropy.argsort()[-budget:]
    return pool_unlabeled[top_k_idx]  # send these to SME

Vertical SaaS for SME (multi-tenant Postgres RLS)

ALTER TABLE invoices ENABLE ROW LEVEL SECURITY;
CREATE POLICY tenant_isolation ON invoices
USING (tenant_id = current_setting('app.tenant_id')::uuid);

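-- App sets this per request, inside the request's transaction.
-- SET LOCAL scopes the value to that transaction, so it does not
-- leak to the next request on a pooled connection.
BEGIN;
SET LOCAL app.tenant_id = '123e4567-e89b-12d3-a456-426614174000';
-- ... tenant-scoped queries ...
COMMIT;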

SME definition lookup (regulation-aware)

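SME_DEFINITIONS = {
    "EU": {"staff_max": 250, "turnover_max_eur_m": 50},   # EU threshold is strict: fewer than 250
    "UK": {"staff_max": 250, "turnover_max_gbp_m": 36},
    "US_SBA": {"staff_max": 500},  # varies by NAICS
    "KR": {"staff_max": 300},      # Framework Act on Small and Medium Enterprises
}

def is_sme(jurisdiction, staff, turnover_m):
    """turnover_m is in millions of the jurisdiction's own currency."""
    d = SME_DEFINITIONS[jurisdiction]
    # Use whichever turnover cap the jurisdiction defines; the key name carries
    # the currency, so a hard-coded EUR key would skip the UK cap entirely.
    turnover_max = next(
        (v for k, v in d.items() if k.startswith("turnover_max")), float("inf")
    )
    return staff <= d["staff_max"] and turnover_m <= turnover_max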
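
The lookup above checks only headcount and turnover, while the EU definition quoted earlier also allows the balance sheet total as an alternative financial ceiling. As a hedged sketch of that rule alone (the function name `is_eu_sme` is illustrative, and ownership or linked-enterprise rules are deliberately ignored):

```python
def is_eu_sme(staff, turnover_eur_m, balance_sheet_eur_m):
    """EU thresholds as stated in this note: fewer than 250 staff, and
    either turnover <= EUR 50M or balance sheet total <= EUR 43M."""
    # Headcount is mandatory; only one of the two financial ceilings must hold.
    return staff < 250 and (turnover_eur_m <= 50 or balance_sheet_eur_m <= 43)

# Turnover is above the cap, but the balance sheet is within it.
within_via_balance_sheet = is_eu_sme(120, 60, 40)
# Exactly 250 staff fails the strict headcount test.
too_many_staff = is_eu_sme(250, 10, 5)
# Both financial ceilings exceeded.
too_large = is_eu_sme(100, 60, 50)
```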

AI bookkeeping for SME (embedded LLM agent)

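import json

import anthropic

claude = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

# Illustrative placeholder; replace with your own chart of accounts.
ALLOWED_GAAP_CATEGORIES = ["Revenue", "Cost of Goods Sold", "Operating Expenses"]

def categorize_transaction(tx):
    resp = claude.messages.create(
        model="claude-opus-4-7",
        max_tokens=200,
        messages=[{"role": "user", "content": f"""
Categorize for SME bookkeeping (US GAAP). Return JSON.
Tx: {tx}
Categories: {ALLOWED_GAAP_CATEGORIES}
"""}],
    )
    # Raises json.JSONDecodeError if the model returns anything other than JSON.
    return json.loads(resp.content[0].text)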

Decision criteria

Situation	Approach
AI eval design	SME (expert) authoring rubrics, calibrating LLM judge
RAG curation	SME (expert) curates golden corpus, validates retrieval
Vertical SaaS GTM	Target SME (business) with self-serve, transparent pricing
Regulatory SME definition	Use jurisdiction lookup (EU vs US SBA vs KR Framework Act on SMEs)
Active learning budget	SME (expert) only on high-uncertainty samples

Default: clarify which SME meaning applies in each context; never assume.
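
One way to apply this default to incoming text is a cheap keyword check that refuses to guess when the context is thin. The sketch below is only a heuristic under stated assumptions: the cue lists and the `disambiguate_sme` name are hypothetical and would need tuning on a real corpus.

```python
import re

# Hypothetical cue words; adjust them for your own documents.
EXPERT_CUES = {"expert", "domain", "rubric", "labeling", "rlhf", "curation"}
BUSINESS_CUES = {"enterprise", "employees", "turnover", "saas", "pricing", "bookkeeping"}

def disambiguate_sme(text):
    """Return "expert", "business", or "ambiguous" for a mention of SME."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    expert_hits = len(words & EXPERT_CUES)
    business_hits = len(words & BUSINESS_CUES)
    if expert_hits > business_hits:
        return "expert"
    if business_hits > expert_hits:
        return "business"
    # No cues, or a tie: do not assume, ask the author instead.
    return "ambiguous"

expert_case = disambiguate_sme("Ask the SME to review the eval rubric and labeling guide")
business_case = disambiguate_sme("Our SaaS pricing targets SMEs with under 50 employees")
unclear_case = disambiguate_sme("let's interview SMEs")
```

Returning "ambiguous" rather than a best guess keeps the heuristic aligned with the rule of never assuming a meaning.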

🔗 Graph

Parent: Business-Strategy
Variants: Domain-Expert · Startup
Applications: RLHF · Active Learning
Adjacent: LLM-as-Judge · RAG · SaaS

🤖 LLM usage

When: SME interview structuring, knowledge graph extraction, eval rubric drafting, SME-time amplification (ask 100 questions LLM-first, escalate to a human only on disagreement). When not: replacing the SME entirely in regulated domains (medicine, law, finance). The LLM amplifies the expert; it never takes over liability.
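
The "LLM-first, escalate on disagreement" idea can be sketched as a simple vote over several sampled answers per question. This is an assumed interpretation of "disagreement" (agreement among repeated LLM samples), not something the note specifies; `triage_answers` and `min_agreement` are illustrative names, and how the samples are produced is out of scope.

```python
from collections import Counter

def triage_answers(samples_by_question, min_agreement=0.8):
    """Split questions into auto-accepted answers and an SME review queue.

    samples_by_question maps each question to a list of answers sampled
    from the LLM. A question is accepted only if its most common answer
    reaches the `min_agreement` share of samples.
    """
    accepted, escalate = {}, []
    for question, samples in samples_by_question.items():
        if not samples:
            escalate.append(question)  # nothing to vote on, send to the SME
            continue
        answer, votes = Counter(samples).most_common(1)[0]
        if votes / len(samples) >= min_agreement:
            accepted[question] = answer
        else:
            escalate.append(question)
    return accepted, escalate

samples = {
    "q1": ["yes", "yes", "yes", "yes", "yes"],           # unanimous
    "q2": ["10 mg", "20 mg", "10 mg", "15 mg", "10 mg"], # 3 of 5 agree
}
accepted, escalate = triage_answers(samples)
```

Only `q2` reaches the SME here, which is the point of the pattern: expert time is spent where the model is unsure of itself.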

❌ Anti-patterns

Acronym ambiguity: "let's interview SMEs" in mixed audience → confusion (experts vs companies).
SME burnout: dumping all labeling on one SME without active sampling.
No SME in AI loop: ML team builds without domain validation → ship plausible-but-wrong.
Enterprise-style sales for SME (business): a long, sales-led cycle kills SME conversion; these buyers expect self-serve onboarding.

🧪 Verification / duplicates

Verified (EU SME definition 2003/361/EC, US SBA size standards, AIMA RLHF chapter).
Confidence: A.

🕓 Changelog

Date	Change
2026-05-08	Phase 1
2026-05-10	Manual cleanup — dual SME meanings, knowledge elicitation, vertical SaaS
