| id | title | category | status | canonical_id | aliases | duplicate_of | source_trust_level | confidence_score | verification_status | tags | raw_sources | last_reinforced | github_commit | tech_stack |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| wiki-2026-0508-generative-ai | Generative AI | 10_Wiki/Topics | verified | self | | none | A | 0.98 | applied | | | 2026-05-10 | pending | |
Generative AI
In one line
"Models that generate new content: text, image, audio, video, code, 3D." Modern examples: Claude, GPT, Gemini, Llama (text); Midjourney / DALL-E / SD (image); Suno (audio); Sora / Veo (video). Transformer and diffusion architectures dominate.
Key points
Modalities
- Text: GPT, Claude, Gemini, Llama.
- Image: Stable Diffusion, Midjourney, DALL-E 3, FLUX, Imagen 3.
- Video: Sora (OpenAI), Veo (Google), Runway, Pika.
- Audio / Music: Suno, Udio, MusicLM.
- Speech: ElevenLabs, OpenAI TTS, Whisper (STT).
- 3D: Meshy, Tripo, Luma Genie.
- Code: Codex, CodeLlama, Claude.
Architectures
- Transformer (text, code).
- Diffusion (image, video, audio).
- Latent diffusion (SD).
- DiT (Diffusion Transformer): SD3, Sora, FLUX.
- Mamba / SSM (emerging).
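To make the diffusion entry above more concrete, here is a minimal sketch of the forward (noising) half of a diffusion model, in plain Python. The linear schedule from 1e-4 to 0.02 over 1000 steps is a commonly used default and is an assumption here, as are all function names. Training a network to reverse the noising is not shown.

```python
import math
import random

def linear_betas(steps: int, beta_start: float = 1e-4, beta_end: float = 0.02):
    """Linear noise schedule, one beta per step (assumed default values)."""
    return [beta_start + (beta_end - beta_start) * t / (steps - 1) for t in range(steps)]

def alpha_bars(betas):
    """Cumulative product of (1 - beta_t): the share of signal variance left at step t."""
    out, prod = [], 1.0
    for b in betas:
        prod *= 1.0 - b
        out.append(prod)
    return out

def noise_sample(x0, t, abar, rng):
    """Closed-form forward step: x_t = sqrt(abar_t) * x0 + sqrt(1 - abar_t) * eps."""
    eps = [rng.gauss(0.0, 1.0) for _ in x0]
    a, s = math.sqrt(abar[t]), math.sqrt(1.0 - abar[t])
    return [a * x + s * e for x, e in zip(x0, eps)]

abar = alpha_bars(linear_betas(1000))
x0 = [1.0, -1.0, 0.5]
rng = random.Random(0)
print(noise_sample(x0, 10, abar, rng))   # early step: still close to x0
print(noise_sample(x0, 999, abar, rng))  # last step: almost pure noise
```

A generator is trained to undo these steps one at a time, which is why sampling cost scales with the number of inference steps.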
Modern (2025-2026)
- Frontier: Claude Opus 4.7, GPT-5, Gemini 2 Ultra.
- Open: Llama 3.x, Qwen 2.5, FLUX.
- Multimodal: Sora, Veo 2, Genie 2.
- Reasoning: o1, o3, DeepSeek-R1.
Applications
- Productivity: writing, coding.
- Creative: art, music, video.
- Customer service: chatbot.
- Education: tutor.
- Marketing: ad copy, image.
- Research: literature review.
- Game: NPC, content.
Risks
- Hallucination.
- Copyright (training, output).
- Misinformation (deepfake).
- Bias.
- Energy use.
- Job displacement.
💻 Patterns
Text generation (Claude)
from anthropic import Anthropic
client = Anthropic()  # reads ANTHROPIC_API_KEY from the environment
r = client.messages.create(
    model='claude-opus-4-7',
    max_tokens=1024,
    messages=[{'role': 'user', 'content': 'Write a haiku about AI'}],
)
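The reply from `messages.create` carries a list of content blocks rather than a bare string. A small helper for pulling out the text might look like the sketch below. `response_text` is a hypothetical name, and the stand-in object only mimics the `type` / `text` shape of the blocks so that the helper can be tried without an API key.

```python
from types import SimpleNamespace

def response_text(response) -> str:
    """Join the text of every content block whose type is 'text'."""
    return "".join(
        block.text for block in response.content
        if getattr(block, "type", None) == "text"
    )

# Stand-in object with the same shape as a reply, for offline experimentation.
fake = SimpleNamespace(content=[
    SimpleNamespace(type="text", text="Silicon dreams wake"),
    SimpleNamespace(type="tool_use", name="search", input={}),
])
print(response_text(fake))
```

Filtering on the block type keeps the helper safe when a reply mixes text with tool calls.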
Image (Stable Diffusion)
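import torch
from diffusers import StableDiffusionXLPipeline
# SDXL Turbo is distilled for 1-4 step sampling and is meant to run without classifier-free guidance.
pipe = StableDiffusionXLPipeline.from_pretrained('stabilityai/sdxl-turbo', torch_dtype=torch.float16).to('cuda')
img = pipe('a sunset over mountains', num_inference_steps=4, guidance_scale=0.0).images[0]
img.save('sunset.png')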
FLUX (modern)
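import torch
from diffusers import FluxPipeline
# FLUX.1-schnell is a timestep-distilled model: few steps, guidance disabled.
pipe = FluxPipeline.from_pretrained('black-forest-labs/FLUX.1-schnell', torch_dtype=torch.bfloat16).to('cuda')
img = pipe('photorealistic forest', num_inference_steps=4, guidance_scale=0.0).images[0]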
Video (Sora-like)
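# OpenAI Sora API (when available). Hypothetical sketch: `video_client`, `videos.generate`,
# and 'sora-1' are placeholders; check the provider's documentation for real method and model names.
video_client.videos.generate(model='sora-1', prompt='a cat playing piano', duration_s=10)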
Audio (Suno-like)
# Commercial APIs. `suno_client` and `generate_song` are hypothetical placeholders for a vendor SDK.
suno_client.generate_song(prompt='upbeat synth-pop', duration_s=180)
TTS (ElevenLabs)
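# The module-level generate() comes from pre-1.0 releases of the elevenlabs package;
# newer releases use a client object (ElevenLabs().text_to_speech.convert) instead.
import elevenlabs
audio = elevenlabs.generate(text='Hello world', voice='Adam', model='eleven_multilingual_v2')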
3D (Tripo / Meshy)
# Image → 3D model. `tripo_client` and `image_to_mesh` are hypothetical placeholders for a vendor SDK.
mesh = tripo_client.image_to_mesh('input.png')
mesh.save('output.glb')
Multimodal (Claude vision)
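import base64
# `client` is the Anthropic client created in the text example; 'photo.jpg' is a placeholder path.
with open('photo.jpg', 'rb') as f:
    img_b64 = base64.standard_b64encode(f.read()).decode('utf-8')
r = client.messages.create(model='claude-opus-4-7', max_tokens=1024,
    messages=[{'role': 'user', 'content': [
        {'type': 'image', 'source': {'type': 'base64', 'media_type': 'image/jpeg', 'data': img_b64}},
        {'type': 'text', 'text': 'Describe this image'},
    ]}])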
Agent (multi-step)
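def agent_loop(goal, tools, run_tool, max_steps=10):
    # run_tool(name, tool_input) -> str is a caller-supplied dispatcher (an assumption of this sketch).
    history = [{'role': 'user', 'content': goal}]
    for _ in range(max_steps):
        r = client.messages.create(model='claude-opus-4-7', max_tokens=1024, tools=tools, messages=history)
        if r.stop_reason != 'tool_use':
            return r  # finished (end_turn) or stopped for another reason
        # Execute each requested tool and append the results as the next user turn.
        history.append({'role': 'assistant', 'content': r.content})
        results = [{'type': 'tool_result', 'tool_use_id': b.id, 'content': run_tool(b.name, b.input)}
                   for b in r.content if b.type == 'tool_use']
        history.append({'role': 'user', 'content': results})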
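The "execute tool" step in the loop above needs two pieces: a tool description to pass as `tools`, and a way to map a requested tool name to local code. One way to sketch both is shown below. The `word_count` tool, the `REGISTRY` dict, and the `run_tool` name are all illustrative assumptions. The schema follows the `name` / `description` / `input_schema` shape used for tool definitions in the Messages API.

```python
import json

# Tool description in JSON-schema form, suitable for the `tools` argument.
TOOLS = [{
    "name": "word_count",
    "description": "Count the words in a piece of text.",
    "input_schema": {
        "type": "object",
        "properties": {"text": {"type": "string"}},
        "required": ["text"],
    },
}]

def word_count(text: str) -> int:
    return len(text.split())

# Hypothetical registry mapping tool names to local Python callables.
REGISTRY = {"word_count": word_count}

def run_tool(name: str, tool_input: dict) -> str:
    """Run one tool call and return a string suitable for a tool_result block."""
    fn = REGISTRY.get(name)
    if fn is None:
        return json.dumps({"error": f"unknown tool: {name}"})
    try:
        return json.dumps({"result": fn(**tool_input)})
    except Exception as exc:  # report failures to the model instead of crashing the loop
        return json.dumps({"error": str(exc)})

print(run_tool("word_count", {"text": "generative models write text"}))
```

Returning errors as data rather than raising lets the model see what went wrong and try a corrected call on the next step.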
Watermark (C2PA)
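# C2PA attaches signed provenance metadata (Content Credentials) to the file.
# Simplified sketch: classes and signatures in the c2pa package vary by version, so treat
# `Signer(cert).sign(...)` as illustrative. `cert` is an assumed signing credential.
from c2pa import Signer
Signer(cert).sign('output.png', claims={
    'generator': 'AI', 'model': 'flux-1-schnell',
})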
Prompt engineering
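def well_formed_prompt(task, context, examples=None, output_format='json'):
    # Avoid a mutable default argument; render each example as one bullet line.
    example_text = '\n'.join(f'- {e}' for e in (examples or []))
    return f"""## Context
{context}
## Examples
{example_text}
## Task
{task}
## Output format
{output_format}"""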
RAG-augmented gen
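def rag_generate(question, retriever, llm):
    # Assumes retriever.retrieve() returns objects with a .text attribute and llm.generate() returns a string.
    docs = retriever.retrieve(question, k=5)
    # Number each passage so the model can cite it as [1], [2], ...
    context = '\n'.join(f'[{i}] {d.text}' for i, d in enumerate(docs, 1))
    return llm.generate(f"Context:\n{context}\n\nQuestion: {question}\nAnswer with citations:")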
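The function above only needs a retriever with a `retrieve(question, k)` method that returns objects carrying a `.text` attribute. As a rough illustration of that interface, here is a toy in-memory retriever that ranks by shared keywords. `Doc` and `KeywordRetriever` are hypothetical names, and a real system would more likely use embeddings or BM25.

```python
import re
from dataclasses import dataclass

@dataclass
class Doc:
    text: str

class KeywordRetriever:
    """Toy retriever: ranks documents by how many query words they share."""

    def __init__(self, texts):
        self.docs = [Doc(t) for t in texts]

    @staticmethod
    def _tokens(s):
        return set(re.findall(r"[a-z0-9]+", s.lower()))

    def retrieve(self, question, k=5):
        q = self._tokens(question)
        scored = [(len(q & self._tokens(d.text)), d) for d in self.docs]
        # Keep only documents with at least one shared word, best match first.
        ranked = sorted((s for s in scored if s[0] > 0), key=lambda s: s[0], reverse=True)
        return [d for _, d in ranked[:k]]

retriever = KeywordRetriever([
    "Diffusion models generate images by iterative denoising.",
    "Transformers use attention over token sequences.",
    "LoRA fine-tunes a model with low-rank adapters.",
])
hits = retriever.retrieve("How do diffusion models generate images?", k=2)
print([d.text for d in hits])
```

Dropping zero-overlap documents means the generator receives less context instead of irrelevant context.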
Fine-tune (LoRA)
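from peft import LoraConfig, get_peft_model
# r is the adapter rank; lora_alpha / r scales the low-rank update. Module names match Llama-style attention layers.
config = LoraConfig(r=16, lora_alpha=32, target_modules=['q_proj', 'v_proj'], task_type='CAUSAL_LM')
model = get_peft_model(base_model, config)  # base_model: an already-loaded transformers model
model.print_trainable_parameters()
# Train on task data; only the adapter weights are updated.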
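To see why `r=16` keeps training cheap, it helps to count parameters. A LoRA adapter on a weight of shape d_out × d_in adds two matrices, B (d_out × r) and A (r × d_in). The sketch below works this through for an assumed hidden size of 4096; the helper names are illustrative.

```python
def lora_param_count(d_out: int, d_in: int, r: int) -> int:
    """Parameters added by one LoRA adapter: B is d_out x r, A is r x d_in."""
    return r * (d_out + d_in)

def lora_scaling(lora_alpha: float, r: int) -> float:
    """Factor applied to the low-rank update B @ A in standard LoRA."""
    return lora_alpha / r

d = 4096  # assumed hidden size, for illustration only
full = d * d
added = lora_param_count(d, d, r=16)
print(added, full, f"{added / full:.2%}", lora_scaling(32, 16))
# prints: 131072 16777216 0.78% 2.0
```

Under these assumptions each adapted projection trains under 1% of the parameters of the frozen weight it modifies.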
Generation cost monitoring
def cost_track(usage, model='claude-opus-4-7'):
    # USD per token. Rates are illustrative; check the provider's current price list.
    pricing = {'claude-opus-4-7': {'in': 15/1e6, 'out': 75/1e6}}
    rate = pricing[model]
    return usage.input_tokens * rate['in'] + usage.output_tokens * rate['out']
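Per-call cost is most useful when it is accumulated against a budget. A minimal sketch of that idea follows. The model names and per-million-token rates are placeholders, not real prices, and `CostTracker` is a hypothetical class.

```python
from dataclasses import dataclass, field

# Illustrative USD rates per million tokens (input, output); placeholders only.
RATES_PER_MTOK = {"model-large": (15.0, 75.0), "model-small": (3.0, 15.0)}

@dataclass
class CostTracker:
    budget_usd: float
    spent_usd: float = 0.0
    by_model: dict = field(default_factory=dict)

    def record(self, model: str, input_tokens: int, output_tokens: int) -> float:
        """Add one call's cost to the running totals and return it."""
        rate_in, rate_out = RATES_PER_MTOK[model]
        cost = (input_tokens * rate_in + output_tokens * rate_out) / 1_000_000
        self.spent_usd += cost
        self.by_model[model] = self.by_model.get(model, 0.0) + cost
        return cost

    def over_budget(self) -> bool:
        return self.spent_usd > self.budget_usd

tracker = CostTracker(budget_usd=0.10)
tracker.record("model-large", input_tokens=2_000, output_tokens=500)     # 0.03 + 0.0375
tracker.record("model-small", input_tokens=10_000, output_tokens=1_000)  # 0.03 + 0.015
print(round(tracker.spent_usd, 4), tracker.over_budget())
```

With these numbers the two calls total 0.1125 USD, so the 0.10 budget check trips after the second call.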
Eval (LLM-as-judge)
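import json
def llm_judge(output, criteria):
    # `judge` is an assumed LLM client whose generate(prompt) returns a JSON string.
    prompt = f'Rate the response on: {criteria}. Response: {output}. Reply with JSON only: {{"score": <integer 0-10>}}'
    return json.loads(judge.generate(prompt))['score']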
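Judge models do not always return clean JSON, so calling `json.loads` directly on the reply can raise or accept an out-of-range score. A more defensive parser could look like this sketch. `parse_judge_score` is a hypothetical helper, and it deliberately handles only a flat `{"score": n}` object.

```python
import json
import re

def parse_judge_score(raw: str, lo: int = 0, hi: int = 10):
    """Extract {"score": n} from a judge reply; return None if it cannot be trusted."""
    match = re.search(r"\{.*?\}", raw, flags=re.DOTALL)  # first {...} span, ignoring surrounding prose
    if match is None:
        return None
    try:
        score = json.loads(match.group(0)).get("score")
    except json.JSONDecodeError:
        return None
    # Reject booleans, strings, and missing values.
    if isinstance(score, bool) or not isinstance(score, (int, float)):
        return None
    return score if lo <= score <= hi else None

print(parse_judge_score('Sure! {"score": 7}'))
print(parse_judge_score('{"score": 42}'))
```

Returning `None` instead of raising lets the caller decide whether to retry the judge or drop the sample.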
Brand safety
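def brand_safe(output):
    # classify_toxicity, has_competitor, and has_brand_voice are project-specific helpers assumed to exist.
    return classify_toxicity(output) < 0.05 and not has_competitor(output) and has_brand_voice(output)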
Decision criteria
| Situation | Tool |
|---|---|
| Best text quality | Claude Opus 4.7 / GPT-5 |
| Cost-aware text | Claude Sonnet / GPT-4o-mini |
| Best image | FLUX / Midjourney v7 |
| Fast image | SDXL Turbo / FLUX schnell |
| Video | Sora / Veo 2 |
| Audio | Suno / ElevenLabs |
| Local | Llama 3.x + SDXL local |
| Code | Claude / Codex |
Default: frontier API + RAG + prompt engineering + LLM-judge eval + brand safety + cost tracking.
🔗 Graph
- Parent: AI · Foundation-Models
- Variants: Transformer_Architecture_and_LLM_Foundations · Diffusion-Models · Multimodal-LLM
- Applications: Generative-Adversarial-Networks · Stable-Diffusion
- Adjacent: RAG · Fine-tuning · Prompt_Engineering · Ethics & AI
🤖 LLM usage
When to use: any productivity, creative, or customer-facing work. When not to use: deterministic computation; IP-sensitive work (or only with care).
❌ Anti-patterns
- Shipping hallucinations: verify outputs before release.
- No watermark: feeds misinformation.
- No copyright check: legal risk.
- Single-model lock-in: an API outage becomes your outage.
- No cost monitoring: bill shock.
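The lock-in item above can be mitigated by trying providers in order and falling back on failure. The sketch below shows the control flow only. `generate_with_fallback` is a hypothetical helper, and the two stand-in callables would be replaced by thin wrappers around each vendor's SDK.

```python
def generate_with_fallback(prompt, providers):
    """Try each (name, callable) pair in order; return the first successful reply."""
    errors = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except Exception as exc:  # in real code, catch the provider's specific error types
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all providers failed: " + "; ".join(errors))

# Stand-in callables that simulate an outage on the first provider.
def primary(prompt):
    raise TimeoutError("service unavailable")

def secondary(prompt):
    return f"echo: {prompt}"

print(generate_with_fallback("hello", [("primary", primary), ("secondary", secondary)]))
```

Returning the provider name with the reply makes it easy to log how often the fallback path is taken.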
🧪 Verification / Duplicates
- Verified (Anthropic, OpenAI, Stability, Google docs).
- Trust level: A.
🕓 Changelog
| Date | Change |
|---|---|
| 2026-04-20 | Auto |
| 2026-05-08 | Phase 1 |
| 2026-05-10 | Manual cleanup: modalities + text / image / video / audio / 3D / agent code |