2nd/10_Wiki/Topics/AI_and_ML/Foundation-Models.md
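koriweb d8a80f6272 chore(wiki): normalize dangling links to canonical titles (768 files / 1,200 links)
Replaced [[wikilinks]] that differed only in spelling variants with the canonical title of
the target document, reconnecting 1,200 broken links. Only title/filename normalization matches were applied; alias matching
was excluded because of over-merge risk (ambiguity guard). Originals are backed up in _link_reconcile_backup/.
Tool: Datacollect/scripts/link_reconcile_apply.mjs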

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-08 12:24:15 +09:00

8.6 KiB

id: wiki-2026-0508-foundation-models
title: Foundation Models
category: 10_Wiki/Topics
status: verified
canonical_id: self
aliases: foundation model, FM, large pretrained, LLM, VLM, multimodal foundation, GPT, Claude, Gemini, Llama
duplicate_of: none
source_trust_level: A
confidence_score: 0.98
verification_status: applied
tags: ai, foundation-model, llm, vlm, multimodal, scaling-laws, pretraining
raw_sources: (empty)
last_reinforced: 2026-05-10
github_commit: pending
tech_stack: language: Python; framework: PyTorch / Transformers / vLLM

Foundation Models

One-line summary

Models pretrained at large scale with self-supervision and then adapted to a wide range of tasks. The term was coined by Bommasani et al. (2021, Stanford). Examples: LLMs (GPT, Claude, Gemini, Llama), VLMs (GPT-4V, Claude 3, Gemini), audio (Whisper), protein (ESM, AlphaFold), and robotics (RT-2, π0).

Key points

Traits

  • Scale: billions of parameters.
  • Pretraining: vast amounts of unlabeled data.
  • Emergence (debated).
  • Adaptable: via prompting, fine-tuning, or RAG.
  • Foundation effect: one pretrained model lifts performance across downstream tasks.

Modalities

  • Text: GPT, Claude, Gemini, Llama, Mistral.
  • Vision-Language: GPT-4V, Claude 3, Gemini, LLaVA.
  • Audio: Whisper, AudioLM.
  • Code: Codex, CodeLlama.
  • Protein: ESM-2, AlphaFold-3.
  • Robot: RT-2, OpenVLA, π0.
  • Time series: TimeGPT.

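Scaling laws

  • Kaplan 2020: loss follows a power law in parameters, data, and compute.
  • Chinchilla (Hoffmann 2022): compute-optimal training uses D ≈ 20·N (about 20 tokens per parameter).
  • Modern (2024+): deliberate over-training past that point (Llama 3 was trained on 15T tokens).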
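
The rules of thumb above can be turned into quick arithmetic. The sketch below is illustrative only: the `6·N·D` training-FLOPs approximation is a commonly used estimate that this note does not state, and the 8B parameter count is an assumed example size, not a figure from the note.

```python
def chinchilla_tokens(n_params: float, tokens_per_param: float = 20.0) -> float:
    """Compute-optimal training tokens under the D ≈ 20·N rule of thumb."""
    return tokens_per_param * n_params


def train_flops(n_params: float, n_tokens: float) -> float:
    """Rough training compute, ASSUMING the common C ≈ 6·N·D approximation."""
    return 6.0 * n_params * n_tokens


def overtrain_ratio(n_params: float, n_tokens: float) -> float:
    """How many times past the Chinchilla-optimal token count a run goes."""
    return n_tokens / chinchilla_tokens(n_params)


n = 8e9                                # assumed example: an 8B-parameter model
optimal_tokens = chinchilla_tokens(n)  # 1.6e11, i.e. 160B tokens
ratio = overtrain_ratio(n, 15e12)      # 15T tokens is the figure quoted above
print(f"optimal tokens: {optimal_tokens:.2e}, over-train ratio: {ratio:.2f}x")
```

Under these assumptions, 15T tokens is roughly two orders of magnitude more data than the Chinchilla rule suggests for a model of that size. Over-training spends extra training compute to get a smaller model that is cheaper to serve.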

Current landscape (2025-2026)

  • Frontier: Claude Opus 4.7, GPT-5, Gemini 2 Ultra.
  • Open: Llama 3.x, Qwen 2.5, Mistral, DeepSeek-V3.
  • Multimodal: Gemini 1.5 (1M-token context), Claude 3.5.
  • MoE: Mixtral, DeepSeek MoE.
  • Reasoning: o1, o3, DeepSeek-R1.

Applications

  1. General assistant.
  2. Code.
  3. Domain expert (medical, legal).
  4. Multimodal analysis.
  5. Agent.
  6. Embedding (retrieval, clustering).

💻 Patterns

LLM call (Anthropic)

from anthropic import Anthropic
client = Anthropic()
response = client.messages.create(
    model='claude-opus-4-7',
    max_tokens=1024,
    messages=[{'role': 'user', 'content': 'Hi'}],
)

Streaming

with client.messages.stream(
    model='claude-opus-4-7',
    max_tokens=1024,
    messages=[{'role': 'user', 'content': prompt}],
) as stream:
    for text in stream.text_stream:
        print(text, end='', flush=True)

Tool use (function calling)

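tools = [{
    'name': 'get_weather',
    'description': 'Get weather for a location',
    'input_schema': {'type': 'object', 'properties': {'location': {'type': 'string'}}, 'required': ['location']},
}]
# max_tokens is required by the Messages API; [...] stands for the conversation so far.
r = client.messages.create(model='claude-opus-4-7', max_tokens=1024, tools=tools, messages=[...])
if r.stop_reason == 'tool_use':
    # The model asked for a tool: pick out the request block and run it.
    tool = next(b for b in r.content if b.type == 'tool_use')
    # execute_tool is a placeholder for your own dispatcher.
    result = execute_tool(tool.name, tool.input)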

Vision (multimodal)

import base64
img_b64 = base64.b64encode(open('img.jpg', 'rb').read()).decode()
client.messages.create(model='claude-opus-4-7', max_tokens=1024, messages=[{
    'role': 'user',
    'content': [
        {'type': 'image', 'source': {'type': 'base64', 'media_type': 'image/jpeg', 'data': img_b64}},
        {'type': 'text', 'text': 'What do you see?'},
    ],
}])

Open-source (Hugging Face)

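from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = 'meta-llama/Llama-3.1-70B-Instruct'
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype='bfloat16', device_map='auto')
tokenizer = AutoTokenizer.from_pretrained(model_id)
# add_generation_prompt appends the assistant header so the model answers instead of continuing the user turn.
inputs = tokenizer.apply_chat_template(
    [{'role': 'user', 'content': 'Hi'}],
    add_generation_prompt=True, return_dict=True, return_tensors='pt',
).to(model.device)
outputs = model.generate(**inputs, max_new_tokens=200)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs['input_ids'].shape[-1]:], skip_special_tokens=True))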

Embedding (text)

from sentence_transformers import SentenceTransformer
m = SentenceTransformer('all-mpnet-base-v2')
vecs = m.encode(['hello', 'world'])  # array of shape (2, 768)

# Hosted alternative for production use
from openai import OpenAI
openai_client = OpenAI()  # separate name so the Anthropic `client` is not overwritten
emb = openai_client.embeddings.create(input='hello', model='text-embedding-3-large').data[0].embedding
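
Embeddings like the ones above are usually compared with cosine similarity for retrieval. The following is a minimal pure-Python sketch of that step. The three-dimensional vectors are toy stand-ins for real embeddings, and `cosine` and `top_k` are hypothetical helper names, not part of any library used in this note.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(query_vec, doc_vecs, k=2):
    """Indices of the k documents most similar to the query, best first."""
    scored = [(cosine(query_vec, v), i) for i, v in enumerate(doc_vecs)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [i for _, i in scored[:k]]


# Toy vectors standing in for real embeddings.
docs = [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0], [0.0, 1.0, 0.0]]
query = [1.0, 0.0, 0.0]
ranked = top_k(query, docs, k=2)
print(ranked)
```

At production scale a vector index would replace the linear scan, but the scoring logic is the same.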

CLIP (vision-text)

from PIL import Image
from transformers import CLIPModel, CLIPProcessor
model = CLIPModel.from_pretrained('openai/clip-vit-large-patch14')
processor = CLIPProcessor.from_pretrained('openai/clip-vit-large-patch14')
image = Image.open('img.jpg')  # same sample image as the vision example
inputs = processor(text=['cat', 'dog'], images=image, return_tensors='pt', padding=True)
outputs = model(**inputs)
# Probability of each caption for the image
similarities = outputs.logits_per_image.softmax(dim=-1)

Foundation model for robotics (OpenVLA)

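# Conceptual example
import torch
from PIL import Image
from transformers import AutoProcessor, AutoModelForVision2Seq
# OpenVLA ships custom model code, so trust_remote_code=True is needed.
processor = AutoProcessor.from_pretrained('openvla/openvla-7b', trust_remote_code=True)
model = AutoModelForVision2Seq.from_pretrained('openvla/openvla-7b', torch_dtype=torch.bfloat16, trust_remote_code=True).to('cuda:0')
image = Image.open('img.jpg')  # current camera frame
inputs = processor('In: What action?\nOut:', image).to('cuda:0', dtype=torch.bfloat16)
# unnorm_key selects the dataset statistics used to un-normalize the predicted action.
action = model.predict_action(**inputs, unnorm_key='bridge_orig', do_sample=False)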

vLLM serving

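from vllm import LLM, SamplingParams
llm = LLM(model='meta-llama/Llama-3.1-8B-Instruct', tensor_parallel_size=2)  # shards the model across 2 GPUs
prompts = ['Explain foundation models in one sentence.']
outputs = llm.generate(prompts, SamplingParams(max_tokens=100, temperature=0.7))
for out in outputs:
    print(out.outputs[0].text)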

Fine-tune (LoRA)

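from peft import LoraConfig, get_peft_model
config = LoraConfig(r=16, lora_alpha=32, target_modules=['q_proj', 'v_proj'], task_type='CAUSAL_LM')
model = get_peft_model(model, config)  # `model` is a causal LM such as the Hugging Face one above
model.print_trainable_parameters()  # confirms only the adapter weights are trainable
# Then train on task data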
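
To see why LoRA is cheap, it helps to count the weights it adds. LoRA learns an update `B·A` to a frozen `d_out × d_in` matrix, with `A` of shape `r × d_in` and `B` of shape `d_out × r`, scaled by `lora_alpha / r`. The sketch below does that arithmetic for the `r=16`, `lora_alpha=32` config above. The 4096 × 4096 projection size is an assumed example dimension, and the helper names are hypothetical.

```python
def lora_params(d_in: int, d_out: int, r: int) -> int:
    """Trainable weights LoRA adds to one matrix: A is (r x d_in), B is (d_out x r)."""
    return r * (d_in + d_out)


def lora_fraction(d_in: int, d_out: int, r: int) -> float:
    """Added weights as a fraction of the frozen matrix's d_in * d_out weights."""
    return lora_params(d_in, d_out, r) / (d_in * d_out)


def lora_scaling(lora_alpha: float, r: int) -> float:
    """Factor applied to the low-rank update B @ A."""
    return lora_alpha / r


# Assumed example: a square 4096 x 4096 projection with the config above.
added = lora_params(4096, 4096, 16)
fraction = lora_fraction(4096, 4096, 16)
scale = lora_scaling(32, 16)
print(added, fraction, scale)
```

Under these assumptions each adapted matrix gains under 1% extra trainable weights, which is why adapter checkpoints are small compared with the base model.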

Adapter via prompt

def domain_assistant(question, system_prompt):
    return client.messages.create(
        model='claude-opus-4-7',
        max_tokens=1024,
        system=system_prompt,
        messages=[{'role': 'user', 'content': question}],
    )

medical_system = 'You are a medical expert. Always recommend consulting a physician.'

Caching (prompt cache)

# Anthropic prompt caching: blocks marked with cache_control are cached and reused across calls
client.messages.create(
    model='claude-opus-4-7',
    max_tokens=1024,
    system=[
        {'type': 'text', 'text': 'You are an expert.', 'cache_control': {'type': 'ephemeral'}},
        {'type': 'text', 'text': long_context, 'cache_control': {'type': 'ephemeral'}},
    ],
    messages=[{'role': 'user', 'content': question}],
)
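
A rough way to judge whether caching the prefix above pays off is to compare input cost with and without it. The sketch below is an estimate only. The write and read multipliers (1.25× and 0.10× of the base input price) and the $3 per million tokens price are assumptions to check against current provider pricing, and it assumes every follow-up call arrives before the cache entry expires.

```python
def uncached_cost(prefix_tokens: int, calls: int, price_per_mtok: float) -> float:
    """Input cost of resending the same prefix on every call."""
    return prefix_tokens * calls * price_per_mtok / 1_000_000


def cached_cost(prefix_tokens: int, calls: int, price_per_mtok: float,
                write_mult: float = 1.25, read_mult: float = 0.10) -> float:
    """Input cost with caching: one cache write, then (calls - 1) cache reads.

    write_mult and read_mult are ASSUMED price multipliers, not values from this note.
    """
    per_token = price_per_mtok / 1_000_000
    return prefix_tokens * per_token * (write_mult + read_mult * (calls - 1))


# Hypothetical workload: a 100k-token shared prefix reused across 20 calls.
plain = uncached_cost(100_000, 20, 3.0)
cached = cached_cost(100_000, 20, 3.0)
print(f"uncached ${plain:.2f} vs cached ${cached:.3f}")
```

With a single call, caching costs more than not caching because of the write premium. The saving only appears once the prefix is reused.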

Agent loop

def agent(goal, tools, max_steps=10):
    # `execute` is a caller-supplied helper that runs a tool and returns its result.
    history = [{'role': 'user', 'content': goal}]
    for _ in range(max_steps):
        r = client.messages.create(model='claude-opus-4-7', max_tokens=1024, tools=tools, messages=history)
        if r.stop_reason != 'tool_use':
            return r  # end_turn, max_tokens, etc.: nothing left to execute
        # Answer every tool_use block in the turn, not just the first one.
        results = [
            {'type': 'tool_result', 'tool_use_id': b.id, 'content': execute(b.name, b.input)}
            for b in r.content if b.type == 'tool_use'
        ]
        history.extend([{'role': 'assistant', 'content': r.content}, {'role': 'user', 'content': results}])
    return None  # step budget exhausted without a final answer
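
The loop above leaves `execute` undefined. One possible shape for it is a small registry that maps tool names to Python functions and returns errors as data, so the model can see the failure and retry. This is a sketch under assumptions: the `TOOLS` registry, the `tool` decorator, and the stubbed `get_weather` are hypothetical and not part of any SDK.

```python
import json

TOOLS = {}  # hypothetical registry: tool name -> Python callable


def tool(fn):
    """Register a function under its own name so the agent can call it."""
    TOOLS[fn.__name__] = fn
    return fn


@tool
def get_weather(location: str) -> dict:
    # Stub: a real tool would query a weather service here.
    return {"location": location, "forecast": "unknown"}


def execute(name: str, tool_input: dict) -> str:
    """Run a registered tool and return a JSON string for the tool_result block."""
    fn = TOOLS.get(name)
    if fn is None:
        return json.dumps({"error": f"unknown tool: {name}"})
    try:
        return json.dumps(fn(**tool_input))
    except Exception as exc:  # report the failure to the model instead of crashing the loop
        return json.dumps({"error": str(exc)})


print(execute("get_weather", {"location": "Seoul"}))
```

Returning a string keeps the result directly usable as the `content` of a `tool_result` block.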

Evaluate (LLM judge)

import json

def llm_judge(response, criteria):
    judge_prompt = f"""Rate this response on {criteria}.
Response: {response}
Output JSON: {{"score": 0-10, "rationale": "..."}}"""
    r = client.messages.create(model='claude-opus-4-7', max_tokens=200, messages=[{'role': 'user', 'content': judge_prompt}])
    # Raises json.JSONDecodeError if the model wraps the JSON in extra text.
    return json.loads(r.content[0].text)
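
Judge models do not always return bare JSON, so parsing the reply defensively can make the evaluation loop more robust. The helper below is a hypothetical sketch, not part of any SDK. It pulls the first `{...}` span out of the reply and validates the score range, returning `None` when the reply is unusable.

```python
import json
import re


def parse_judge_output(text: str):
    """Extract {"score", "rationale"} from a judge reply, or None if invalid."""
    match = re.search(r"\{.*\}", text, flags=re.DOTALL)
    if match is None:
        return None
    try:
        data = json.loads(match.group(0))
    except json.JSONDecodeError:
        return None
    score = data.get("score")
    # Reject booleans, non-numbers, and scores outside the 0-10 scale.
    if isinstance(score, bool) or not isinstance(score, (int, float)):
        return None
    if not 0 <= score <= 10:
        return None
    return {"score": score, "rationale": str(data.get("rationale", ""))}


parsed = parse_judge_output('Sure! {"score": 7, "rationale": "clear"} Hope that helps.')
print(parsed)
```

Treating an unparseable reply as a missing score, rather than raising, lets a batch evaluation continue and report how many judgments were dropped.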

Decision criteria

Situation | Model
Top quality | Claude Opus 4.7 / GPT-5
Cost-aware | Claude Sonnet / GPT-4o-mini
Open-source | Llama 3.x / Qwen 2.5
Code | Claude / DeepSeek-Coder
Vision | Claude 3.5 / Gemini
Embedding | text-embedding-3-large / mpnet
On-device | Llama 3.2 1B/3B / Phi-3
Reasoning | o1 / DeepSeek-R1

Default: a frontier API for quality, OSS models for cost and control, multimodal models where needed, plus RAG and prompt caching.

🔗 Graph

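🤖 Using LLMs

When to use: the default choice for modern AI work, including NLP, multimodal, agents, and code. When not to use: under strict latency, cost, or privacy constraints, where a smaller or on-device model fits better.

Anti-patterns

  • Always using the largest model: unnecessary cost.
  • Fine-tuning to inject facts: RAG works better.
  • No evaluation: quality problems stay invisible.
  • Single-API lock-in: no fallback when the provider fails.
  • No prompt caching: higher cost.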
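
The lock-in anti-pattern above can be softened with a thin fallback wrapper. The sketch below is provider-agnostic and makes no real API calls. `call_with_fallback` and the two stand-in providers are hypothetical; in practice each callable would wrap one vendor's SDK.

```python
def call_with_fallback(providers, prompt):
    """Try each (name, callable) in order; return (name, answer) from the first that succeeds."""
    errors = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except Exception as exc:  # any failure moves on to the next provider
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all providers failed: " + "; ".join(errors))


# Stand-in providers for illustration only.
def flaky(prompt):
    raise TimeoutError("primary timed out")


def backup(prompt):
    return f"echo: {prompt}"


used, answer = call_with_fallback([("primary", flaky), ("backup", backup)], "Hi")
print(used, answer)
```

Keeping the provider list as plain data makes it easy to reorder by cost or quality without touching the calling code.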

🧪 Verification / Duplicates

  • Verified (Bommasani 2021, Kaplan 2020, Hoffmann 2022, Anthropic / OpenAI / Google docs).
  • Trust level: A.

🕓 Changelog

Date | Change
2026-04-20 | Auto-reinforced
2026-05-08 | Phase 1
2026-05-10 | Manual cleanup: modalities plus Anthropic / HF / vLLM / agent / cache code