Antigravity Agent 504fd5fb42 [G1-Sync] Manual knowledge update

2026-05-10 22:08:15 +09:00

OpenAI API Integration

OpenAI platform integration: Chat Completions / Responses API, function calling, structured output (JSON Schema), vision/audio multimodal input, batch, embeddings, fine-tuning.

Core

Endpoints (2025-2026)

Responses API (newly recommended): unifies tools and state; suited to agentic workflows.
Chat Completions: legacy-compatible, broadly supported.
Embeddings: text-embedding-3-small/large.
Audio: gpt-4o-audio-preview, Whisper STT, TTS.
Images: gpt-image-1 (generation/editing).
Batch: 50% discount, processed within 24 h.
Realtime: real-time voice/text over WebSocket.

Models (representative)

gpt-5 / gpt-5-mini / gpt-5-nano (2025).
gpt-4.1, gpt-4o, gpt-4o-mini: multimodal.
o3, o4-mini: reasoning models.
text-embedding-3-large (3072 dim).

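Key features

Function calling / tools: structured function invocation.
Structured Outputs: response_format + JSON Schema; in strict mode the output is constrained to the schema, though refusals and truncated responses still need handling.
Vision: image URL or base64 input.
Audio in/out: gpt-4o-audio (speech input and output).
Streaming: SSE.
Prompt caching: automatic for prompts of 1024+ tokens; cached prefix tokens are billed at a discount (50% at launch; the rate varies by model).
Seed / temperature 0: deterministic-ish.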

Authentication / security

API key (OPENAI_API_KEY), project keys (RBAC).
Rate limits: by usage tier (Tier 1-5), measured in TPM/RPM.
Organization ID is sent as a separate header.

💻 Patterns

Python: basic chat

from openai import OpenAI
client = OpenAI()
resp = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Hello!"}],
    temperature=0.7,
)
print(resp.choices[0].message.content)

Structured Output (JSON Schema)

from pydantic import BaseModel
class Recipe(BaseModel):
    name: str
    ingredients: list[str]
    steps: list[str]

resp = client.beta.chat.completions.parse(
    model="gpt-4o-2024-08-06",
    messages=[{"role": "user", "content": "Pasta carbonara recipe"}],
    response_format=Recipe,
)
recipe: Recipe = resp.choices[0].message.parsed
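
The parse helper above derives the schema from the Pydantic model. Where Pydantic is not available, the same constraint can be written by hand. The sketch below assumes the Chat Completions `json_schema` form of `response_format` in strict mode, where every property is listed in `required` and `additionalProperties` is `False`. The function name `recipe_response_format` is illustrative, not part of the SDK.

```python
import json


def recipe_response_format() -> dict:
    """Build a strict response_format payload for a Recipe-shaped object."""
    schema = {
        "type": "object",
        "properties": {
            "name": {"type": "string"},
            "ingredients": {"type": "array", "items": {"type": "string"}},
            "steps": {"type": "array", "items": {"type": "string"}},
        },
        # Strict mode expects every property to be required
        # and no extra keys to be allowed.
        "required": ["name", "ingredients", "steps"],
        "additionalProperties": False,
    }
    return {
        "type": "json_schema",
        "json_schema": {"name": "recipe", "strict": True, "schema": schema},
    }


response_format = recipe_response_format()
print(json.dumps(response_format, indent=2))
```

The resulting dict would be passed as `response_format=response_format` to `client.chat.completions.create`, and the returned `message.content` parsed with `json.loads`.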

Function calling (tools)

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get current weather",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]
resp = client.chat.completions.create(
    model="gpt-4o", messages=[{"role":"user","content":"Weather in Seoul?"}],
    tools=tools, tool_choice="auto",
)
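message = resp.choices[0].message
# tool_calls is None when the model answers directly instead of calling a tool
call = message.tool_calls[0] if message.tool_calls else None
# dispatch → call.function.name with json.loads(call.function.arguments)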
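
The dispatch step mentioned in the comment above can be sketched as a registry lookup. This is a minimal illustration, not the SDK's own mechanism: `REGISTRY`, `dispatch`, and the stand-in `get_weather` body are hypothetical, and a `SimpleNamespace` imitates the tool-call object so the example runs without an API key.

```python
import json
from types import SimpleNamespace


def get_weather(city: str) -> dict:
    # Stand-in implementation; a real one would query a weather service.
    return {"city": city, "forecast": "unknown"}


# Map tool names (as declared in `tools`) to Python callables.
REGISTRY = {"get_weather": get_weather}


def dispatch(tool_call) -> dict:
    """Run one tool call and return the tool message to append to `messages`."""
    name = tool_call.function.name
    if name not in REGISTRY:
        raise KeyError(f"unknown tool: {name}")
    args = json.loads(tool_call.function.arguments)  # arguments arrive as a JSON string
    result = REGISTRY[name](**args)
    return {
        "role": "tool",
        "tool_call_id": tool_call.id,
        "content": json.dumps(result),
    }


# Fake tool call shaped like resp.choices[0].message.tool_calls[0].
fake_call = SimpleNamespace(
    id="call_1",
    function=SimpleNamespace(name="get_weather", arguments='{"city": "Seoul"}'),
)
tool_message = dispatch(fake_call)
print(tool_message)
```

In a real exchange the assistant message containing the tool call is appended to `messages` first, then this tool message, and the model is called again to produce the final answer.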

Vision (image input)

resp = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": [
        {"type": "text", "text": "What's in this image?"},
        {"type": "image_url", "image_url": {"url": "https://.../cat.jpg"}},
    ]}],
)
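
The snippet above passes a public URL. For local images, the base64 route can be sketched as a data URL placed in the same `image_url` field. The helper name `image_part` is an assumption for illustration; the three bytes used here are only the JPEG magic number, not a real image.

```python
import base64


def image_part(data: bytes, mime: str = "image/jpeg") -> dict:
    """Wrap raw image bytes as a base64 data URL content part."""
    encoded = base64.b64encode(data).decode("ascii")
    return {
        "type": "image_url",
        "image_url": {"url": f"data:{mime};base64,{encoded}"},
    }


# In practice: data = open("cat.jpg", "rb").read()
part = image_part(b"\xff\xd8\xff")
print(part["image_url"]["url"])
```

The returned dict slots into the `content` list next to the `{"type": "text", ...}` part.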

Streaming (SSE)

stream = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role":"user","content":"Write a haiku"}],
    stream=True,
)
for chunk in stream:
    delta = chunk.choices[0].delta.content or ""
    print(delta, end="", flush=True)
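
When the full text is needed after streaming, the deltas can be accumulated instead of only printed. The sketch below is one way to do it, assuming chunks shaped like the ones above; it also skips chunks with an empty `choices` list, which can appear for example when usage reporting is enabled on the stream. `collect_text` and `fake_chunk` are illustrative names, and the fake chunks stand in for a live stream.

```python
from types import SimpleNamespace


def collect_text(stream) -> str:
    """Concatenate the text deltas of a chat completion stream."""
    parts = []
    for chunk in stream:
        if not chunk.choices:  # e.g. a usage-only chunk
            continue
        delta = chunk.choices[0].delta.content
        if delta:  # content is None on role-only or final chunks
            parts.append(delta)
    return "".join(parts)


def fake_chunk(text=None, empty=False):
    """Build an object shaped like a streamed chunk (test double)."""
    if empty:
        return SimpleNamespace(choices=[])
    delta = SimpleNamespace(content=text)
    return SimpleNamespace(choices=[SimpleNamespace(delta=delta)])


fake_stream = [fake_chunk("Hel"), fake_chunk(None), fake_chunk("lo"), fake_chunk(empty=True)]
full_text = collect_text(fake_stream)
print(full_text)
```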

Embeddings + cosine search

import numpy as np
def embed(text):
    return np.array(client.embeddings.create(
        model="text-embedding-3-small", input=text
    ).data[0].embedding)

def cosine(a, b): return float(a @ b / (np.linalg.norm(a)*np.linalg.norm(b)))
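
The search half of the pattern can be sketched as a ranking over stored vectors. To stay runnable without an API key or NumPy, the example below uses plain lists and three-dimensional toy vectors in place of real embeddings; `top_k` and the `corpus` contents are hypothetical.

```python
import math


def cosine_sim(a, b) -> float:
    """Cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vec, corpus: dict, k: int = 2) -> list:
    """Return the k document ids most similar to query_vec, best first."""
    scored = [(cosine_sim(query_vec, vec), doc_id) for doc_id, vec in corpus.items()]
    scored.sort(reverse=True)
    return [(doc_id, score) for score, doc_id in scored[:k]]


# Toy vectors; in practice each would come from embed(text).
corpus = {
    "cats": [1.0, 0.0, 0.0],
    "dogs": [0.8, 0.6, 0.0],
    "tax": [0.0, 0.0, 1.0],
}
results = top_k([1.0, 0.1, 0.0], corpus, k=2)
print(results)
```

For more than a few thousand documents, a vector index would usually replace this linear scan.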

Batch API (50% discount)

# 1) JSONL upload
file = client.files.create(file=open("requests.jsonl","rb"), purpose="batch")
# 2) batch
batch = client.batches.create(
    input_file_id=file.id,
    endpoint="/v1/chat/completions",
    completion_window="24h",
)
# 3) poll status, download output_file_id when 'completed'
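
Step 1 assumes a `requests.jsonl` file already exists. A minimal sketch of producing it follows: one JSON object per line, each with a unique `custom_id`, the HTTP `method`, the target `url`, and the request `body`. The helper names and the sample prompts are illustrative.

```python
import json


def batch_line(custom_id: str, prompt: str, model: str = "gpt-4o-mini") -> str:
    """Serialize one batch request as a single JSONL line."""
    request = {
        "custom_id": custom_id,  # used to match outputs back to inputs
        "method": "POST",
        "url": "/v1/chat/completions",  # must match the batch `endpoint`
        "body": {
            "model": model,
            "messages": [{"role": "user", "content": prompt}],
        },
    }
    return json.dumps(request, ensure_ascii=False)


def build_jsonl(prompts: list) -> str:
    """Join requests into JSONL text with ids req-0, req-1, ..."""
    return "\n".join(batch_line(f"req-{i}", p) for i, p in enumerate(prompts)) + "\n"


jsonl_text = build_jsonl(["Classify: great product", "Classify: arrived broken"])
print(jsonl_text)
# To upload: write jsonl_text to requests.jsonl, then run step 1 above.
```

Output order is not guaranteed to match input order, so results are best joined on `custom_id`.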

TypeScript: Responses API

import OpenAI from "openai";
const client = new OpenAI();
const resp = await client.responses.create({
  model: "gpt-4.1",
  input: "Summarize Anthropic's mission.",
  tools: [{ type: "web_search" }],
});
console.log(resp.output_text);

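Retry + cost guard

import time
import openai

def safe_call(fn, *a, **kw):
    attempts = 5
    for i in range(attempts):
        try:
            return fn(*a, **kw)
        except openai.APIStatusError as e:
            # APIStatusError carries status_code; RateLimitError (429) is a subclass.
            retryable = e.status_code == 429 or e.status_code >= 500
            if not retryable or i == attempts - 1:
                raise  # client error, or retries exhausted: surface it
            time.sleep(2**i)  # exponential backoff: 1, 2, 4, 8 seconds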
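
The retry helper bounds attempts but not spend. One possible cost guard is a running token budget that is charged after each response. The `TokenBudget` class below is a hypothetical sketch, not an SDK feature; it assumes the caller feeds it `resp.usage.total_tokens` from each completed call.

```python
class TokenBudget:
    """Track cumulative token usage and stop once a ceiling is crossed."""

    def __init__(self, max_tokens: int):
        self.max_tokens = max_tokens
        self.used = 0

    def charge(self, total_tokens: int) -> int:
        """Record usage from one response; return the tokens remaining."""
        self.used += total_tokens
        if self.used > self.max_tokens:
            raise RuntimeError(
                f"token budget exceeded: {self.used} > {self.max_tokens}"
            )
        return self.max_tokens - self.used


budget = TokenBudget(max_tokens=1000)
remaining = budget.charge(400)  # e.g. budget.charge(resp.usage.total_tokens)
print(remaining)
```

Because the charge happens after the call, the guard stops the next request rather than the one that crossed the line; setting `max_tokens` on each request caps the per-call overshoot.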

Decision criteria

Goal	Recommended model / feature
General chat, low cost	gpt-4o-mini
Strong reasoning	o4-mini / o3
Multimodal	gpt-4o
High-volume, non-real-time	Batch + gpt-4o-mini
Embeddings, general	text-embedding-3-small
Embeddings, high quality	text-embedding-3-large
Real-time voice	Realtime API + gpt-4o-realtime
Schema guarantee	Structured Outputs (response_format)
Agents	Responses API + tools

Default: gpt-4o-mini + Structured Outputs + caching.

🔗 Graph

Parents: LLM APIs, Generative AI Platforms
Variants: Anthropic API, Google Gemini API, Azure OpenAI
Applications: RAG, Function Calling, Structured Output, Voice Agents
Adjacent: Prompt Caching, Embeddings, Whisper, Realtime API

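🤖 LLM usage

When to use: chat / extraction / summarization / RAG synthesis, function dispatcher, structured extraction, embedding search, bulk classification (Batch).
When not to use: raw PHI/PCI data (check the BAA and applicable regulations first), hard real-time under 50 ms (use an edge model), deterministic computation (use code or a calculator).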

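❌ Anti-patterns

Exposing the API key in a client bundle → a backend proxy is required.
Unbounded retries (without exponential backoff) → rate-limit storms.
Assuming temperature=0 is deterministic (it is only approximately so).
Trusting LLM JSON without schema validation → use Structured Outputs or Pydantic validation.
Rebuilding the system prompt on every call → keep a stable prefix so prefix caching applies (1024+ tokens).
Sending real-time requests to Batch (up to 24 h latency).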
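
The caching point in the list above comes down to message order: caching matches on an identical leading prefix, so stable content goes first and per-request content last. A small sketch under that assumption follows; `SYSTEM_PROMPT`, `build_messages`, and `shared_prefix_len` are hypothetical names, and the prompt here is far shorter than the 1024 tokens a cacheable prefix needs.

```python
# Hypothetical system prompt; a cacheable one would be much longer.
SYSTEM_PROMPT = "You are a support assistant. Answer briefly and cite the policy section."


def build_messages(user_text: str) -> list[dict]:
    """Stable content first, per-request content last."""
    return [
        {"role": "system", "content": SYSTEM_PROMPT},  # identical on every call
        {"role": "user", "content": user_text},  # varies per request
    ]


def shared_prefix_len(a: list[dict], b: list[dict]) -> int:
    """Count the leading messages that are identical in both requests."""
    count = 0
    for x, y in zip(a, b):
        if x != y:
            break
        count += 1
    return count


first = build_messages("Where is my order?")
second = build_messages("How do I return an item?")
prefix = shared_prefix_len(first, second)
print(prefix)
```

Putting a timestamp or request id inside the system prompt would make the first message differ on every call and defeat the shared prefix.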

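🧪 Verification / duplicates

Verification: golden eval set (precision/recall), token-usage monitoring, regression tests with the model version locked.
Duplicates: Function Calling / Structured Output / Embeddings each have a dedicated page ↔ this document is the consolidated index.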
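
The precision/recall check against a golden set can be sketched for an extraction task as a set comparison. The field names below are invented for illustration; a real golden set would hold many labeled examples and average the scores.

```python
def precision_recall(predicted: set, expected: set) -> tuple:
    """Precision and recall of predicted items against the golden items."""
    true_positives = len(predicted & expected)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(expected) if expected else 0.0
    return precision, recall


# Hypothetical golden labels versus what the model extracted.
golden = {"invoice_number", "total", "due_date"}
extracted = {"invoice_number", "total", "vendor"}

precision, recall = precision_recall(extracted, golden)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Running the same set against a pinned model snapshot before and after a prompt change gives the regression signal mentioned above.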

🕓 Changelog

2026-05-10: standardized format; added Responses API / gpt-5 / Structured Outputs / Batch.
