---
id: wiki-2026-0508-openai-api-integration
title: OpenAI API Integration
category: 10_Wiki/Topics
status: verified
canonical_id: self
aliases: [OpenAI API, GPT API, OpenAI SDK Integration]
duplicate_of: none
source_trust_level: A
confidence_score: 0.92
verification_status: applied
tags: [openai, api, gpt, function-calling, structured-output, vision, audio, batch, embeddings]
raw_sources: []
last_reinforced: 2026-05-10
github_commit: pending
tech_stack: { language: python/typescript, framework: openai-sdk }
---

# OpenAI API Integration

Integration with the OpenAI platform: Chat Completions / Responses API, function calling, structured output (JSON Schema), vision/audio multimodal, batch, embeddings, fine-tuning.

## Core

### Endpoints (2025-2026)
- **Responses API** (recommended for new work): unifies tools and state; suited to agentic workflows.
- **Chat Completions**: legacy-compatible, broadly supported.
- **Embeddings**: `text-embedding-3-small/large`.
- **Audio**: `gpt-4o-audio-preview`, Whisper STT, TTS.
- **Images**: `gpt-image-1` (생성/편집).
- **Batch**: 50% discount, processed within 24h.
- **Realtime**: real-time voice/text over WebSocket.

### Models (representative)
- **gpt-5 / gpt-5-mini / gpt-5-nano** (2025).
- **gpt-4.1, gpt-4o, gpt-4o-mini**: multimodal.
- **o3, o4-mini**: reasoning models.
- **text-embedding-3-large** (3072 dim).

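### Key features
- **Function calling / tools**: structured function calls.
- **Structured Outputs**: `response_format` with a JSON Schema and `strict: true` constrains the output to the schema; refusals and length-truncated responses still need handling.
- **Vision**: image URL or base64 input.
- **Audio in/out**: gpt-4o-audio (speech input and output).
- **Streaming**: SSE.
- **Prompt caching**: automatic for prompts of 1024+ tokens; cached prefix tokens are discounted (50% on gpt-4o-class models; the rate varies by model).
- **Seed / temperature 0**: approximately deterministic, not guaranteed.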
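
Prompt caching matches on an identical leading prefix, so a cache-friendly request puts the static content first and the per-request content last. The following is a minimal sketch of that layout; `STATIC_SYSTEM_PROMPT` and `build_messages` are hypothetical names introduced for illustration, and the prompt text is a placeholder.

```python
from typing import Optional

# Hypothetical static instructions. In practice this is the long (1024+ token)
# part of the prompt that stays byte-for-byte identical across requests.
STATIC_SYSTEM_PROMPT = "You are a support assistant. Follow the policy below."

def build_messages(user_input: str, history: Optional[list] = None) -> list:
    """Put the unchanging prefix first and the per-request content last."""
    messages = [{"role": "system", "content": STATIC_SYSTEM_PROMPT}]
    messages.extend(history or [])
    messages.append({"role": "user", "content": user_input})
    return messages

first = build_messages("Where is my order?")
second = build_messages("How do I reset my password?")
# Both requests start with the same message, which is what makes a cache hit possible.
same_prefix = first[0] == second[0]
```

Anything that changes per request (timestamps, user names, retrieved documents) belongs after the static block; placing it earlier changes the prefix and forfeits the cache hit.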

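### Authentication / security
- API key (`OPENAI_API_KEY`), Project keys (RBAC).
- Rate limits: set per usage tier (Tier 1-5), measured in TPM/RPM.
- Org ID is sent in its own header (`OpenAI-Organization`), separate from the API key.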
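
The SDK sets the authentication headers itself, but it can help to see what goes on the wire when calling the REST API directly or debugging a proxy. A minimal sketch, assuming a hypothetical helper `build_headers`; the key and IDs shown are placeholders, not real credentials.

```python
from typing import Optional

def build_headers(api_key: str, org_id: Optional[str] = None,
                  project_id: Optional[str] = None) -> dict:
    """Build request headers: the key goes in Authorization, org/project in their own headers."""
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    if org_id:
        headers["OpenAI-Organization"] = org_id
    if project_id:
        headers["OpenAI-Project"] = project_id
    return headers

# Placeholder values for illustration only
headers = build_headers("sk-placeholder", org_id="org-placeholder")
```

In real code the key would come from the `OPENAI_API_KEY` environment variable on the server side, never from source code or a client bundle.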

## 💻 Patterns

### Python: basic chat
```python
from openai import OpenAI
client = OpenAI()
resp = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Hello!"}],
    temperature=0.7,
)
print(resp.choices[0].message.content)
```

### Structured Output (JSON Schema)
```python
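from pydantic import BaseModel

class Recipe(BaseModel):
    name: str
    ingredients: list[str]
    steps: list[str]

# parse() converts the Pydantic model to a JSON Schema and validates the reply against it
resp = client.beta.chat.completions.parse(
    model="gpt-4o-2024-08-06",
    messages=[{"role": "user", "content": "Pasta carbonara recipe"}],
    response_format=Recipe,
)
# parsed is None when the model refuses; check message.refusal in that case
recipe = resp.choices[0].message.parsed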
```

### Function calling (tools)
```python
import json

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get current weather",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]
resp = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Weather in Seoul?"}],
    tools=tools,
    tool_choice="auto",
)
message = resp.choices[0].message
# With tool_choice="auto" the model may answer directly, leaving tool_calls as None
if message.tool_calls:
    call = message.tool_calls[0]
    args = json.loads(call.function.arguments)  # arguments arrive as a JSON string
    # dispatch on call.function.name, passing args
```
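
The dispatch step left as a comment above can be sketched with a small registry that maps tool names to Python callables. This is one possible shape under stated assumptions: `TOOL_REGISTRY`, `dispatch`, and the stubbed `get_weather` body are hypothetical and return made-up data.

```python
import json

def get_weather(city: str) -> dict:
    # Stub for illustration only; a real implementation would query a weather service.
    return {"city": city, "temp_c": 21}

# Maps the tool name declared in `tools` to the function that implements it
TOOL_REGISTRY = {"get_weather": get_weather}

def dispatch(name: str, arguments: str) -> str:
    """Run the named tool with JSON-encoded arguments; always return a JSON string."""
    if name not in TOOL_REGISTRY:
        return json.dumps({"error": f"unknown tool: {name}"})
    try:
        kwargs = json.loads(arguments)
    except json.JSONDecodeError as exc:
        return json.dumps({"error": f"invalid arguments: {exc}"})
    return json.dumps(TOOL_REGISTRY[name](**kwargs))

result = dispatch("get_weather", '{"city": "Seoul"}')
```

With Chat Completions, the returned string would then be appended to the conversation as a message with `role: "tool"` and the matching `tool_call_id`, followed by a second model call to produce the final answer. Returning errors as JSON rather than raising lets the model see the failure and recover.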

### Vision (image input)
```python
resp = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": [
        {"type": "text", "text": "What's in this image?"},
        {"type": "image_url", "image_url": {"url": "https://.../cat.jpg"}},
    ]}],
)
```
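
For local files, the same `image_url` field accepts a base64 data URL instead of an HTTP URL. A minimal sketch, assuming hypothetical helpers `to_data_url` and `image_message`; the bytes used below are a placeholder rather than a real image.

```python
import base64

def to_data_url(image_bytes: bytes, mime_type: str = "image/jpeg") -> str:
    """Encode raw image bytes as a data URL usable in the image_url field."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"

def image_message(prompt: str, image_bytes: bytes, mime_type: str = "image/jpeg") -> dict:
    """Build a user message that pairs a text prompt with an inline image."""
    return {"role": "user", "content": [
        {"type": "text", "text": prompt},
        {"type": "image_url", "image_url": {"url": to_data_url(image_bytes, mime_type)}},
    ]}

# Placeholder bytes; in practice read them with open(path, "rb").read()
msg = image_message("What's in this image?", b"abc", mime_type="image/png")
```

Base64 inflates the payload by roughly a third, so a hosted URL is usually preferable for large images that are already publicly reachable.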

### Streaming (SSE)
```python
stream = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role":"user","content":"Write a haiku"}],
    stream=True,
)
for chunk in stream:
    if not chunk.choices:  # some chunks (e.g. a final usage chunk) carry no choices
        continue
    delta = chunk.choices[0].delta.content or ""
    print(delta, end="", flush=True)
```

### Embeddings + cosine search
```python
import numpy as np
def embed(text):
    return np.array(client.embeddings.create(
        model="text-embedding-3-small", input=text
    ).data[0].embedding)

def cosine(a, b): return float(a @ b / (np.linalg.norm(a)*np.linalg.norm(b)))
```
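
Once documents are embedded, search reduces to ranking stored vectors by cosine similarity against the query vector. The sketch below uses plain Python and toy 2-dimensional vectors in place of real embeddings so that it runs without dependencies; `cosine_similarity` and `top_k` are hypothetical names.

```python
import math

def cosine_similarity(a, b) -> float:
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def top_k(query_vec, corpus, k=3):
    """corpus is a list of (doc_id, vector) pairs; returns (doc_id, score) pairs, best first."""
    scored = [(doc_id, cosine_similarity(query_vec, vec)) for doc_id, vec in corpus]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]

# Toy vectors standing in for embeddings returned by the API
corpus = [("cats", [1.0, 0.0]), ("dogs", [0.8, 0.6]), ("cars", [0.0, 1.0])]
hits = top_k([1.0, 0.1], corpus, k=2)
```

A linear scan like this is adequate for a few thousand documents; beyond that, a vector index is the usual next step.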

### Batch API (50% discount)
```python
# 1) Upload the JSONL request file (one request per line)
with open("requests.jsonl", "rb") as f:
    file = client.files.create(file=f, purpose="batch")
# 2) Create the batch job
batch = client.batches.create(
    input_file_id=file.id,
    endpoint="/v1/chat/completions",
    completion_window="24h",
)
# 3) Poll client.batches.retrieve(batch.id) until status == "completed",
#    then download the results via batch.output_file_id
```
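
The uploaded file holds one JSON object per line, each describing a single request with a `custom_id` used to match results back to inputs. A minimal sketch of generating that content; `batch_line` and `build_jsonl` are hypothetical helpers, and the prompts are invented examples.

```python
import json

def batch_line(custom_id: str, prompt: str, model: str = "gpt-4o-mini") -> str:
    """One line of the batch input file: a single /v1/chat/completions request."""
    return json.dumps({
        "custom_id": custom_id,
        "method": "POST",
        "url": "/v1/chat/completions",
        "body": {"model": model, "messages": [{"role": "user", "content": prompt}]},
    })

def build_jsonl(prompts: list) -> str:
    """Join one request per prompt into JSONL text, numbering custom_ids from req-0."""
    return "\n".join(batch_line(f"req-{i}", p) for i, p in enumerate(prompts)) + "\n"

jsonl = build_jsonl(["Classify: great product", "Classify: arrived broken"])
```

Writing `jsonl` to `requests.jsonl` yields a file in the shape the upload step expects. Output lines are not guaranteed to come back in input order, which is why each request carries its own `custom_id`.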

### TypeScript: Responses API
```ts
import OpenAI from "openai";
const client = new OpenAI();
const resp = await client.responses.create({
  model: "gpt-4.1",
  input: "Summarize Anthropic's mission.",
  tools: [{ type: "web_search" }],
});
console.log(resp.output_text);
```

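### Retry + cost guard
```python
import time
import openai

def safe_call(fn, *a, **kw):
    attempts = 5
    for i in range(attempts):
        try:
            return fn(*a, **kw)
        except (openai.RateLimitError, openai.APIConnectionError):
            if i == attempts - 1:
                raise  # out of attempts: re-raise instead of silently returning None
        except openai.APIStatusError as e:
            # status_code lives on APIStatusError; the APIError base class lacks it
            if e.status_code < 500 or i == attempts - 1:
                raise
        time.sleep(2**i)  # exponential backoff: 1s, 2s, 4s, 8s
```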
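
For the cost-guard half of this pattern, one simple approach is a running token budget that is checked before each call and updated from the response's usage figures afterwards. The sketch below is an assumption about how such a guard might look, not an SDK feature; `TokenBudget` and `BudgetExceeded` are hypothetical names.

```python
class BudgetExceeded(RuntimeError):
    """Raised when the cumulative token budget has been used up."""

class TokenBudget:
    """Tracks cumulative token usage and refuses further calls past a hard cap."""

    def __init__(self, max_total_tokens: int):
        self.max_total_tokens = max_total_tokens
        self.used = 0

    def allowed(self) -> bool:
        return self.used < self.max_total_tokens

    def check(self) -> None:
        """Call before each request; raises once the cap has been reached."""
        if not self.allowed():
            raise BudgetExceeded(f"used {self.used} of {self.max_total_tokens} tokens")

    def record(self, total_tokens: int) -> None:
        """Call after each request, e.g. with resp.usage.total_tokens."""
        self.used += total_tokens

budget = TokenBudget(max_total_tokens=1000)
budget.check()        # under budget, so no exception
budget.record(1200)   # usage reported by a completed call
```

Because usage is only known after a call returns, this guard can overshoot by one request; setting `max_tokens` on each request bounds how large that overshoot can be.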

## Decision criteria

| Goal | Recommended model / feature |
|---|---|
| General chat, low cost | gpt-4o-mini |
| Strong reasoning | o4-mini / o3 |
| Multimodal | gpt-4o |
| High-volume, non-real-time | Batch + gpt-4o-mini |
| Embeddings, general | text-embedding-3-small |
| Embeddings, high quality | text-embedding-3-large |
| Real-time voice | Realtime API + gpt-4o-realtime |
| Schema guarantee | Structured Outputs (response_format) |
| Agents | Responses API + tools |

Default: **gpt-4o-mini + Structured Outputs + caching**.

## 🔗 Graph

- Parent: [[LLM APIs]], [[Generative AI Platforms]]
- Variants: [[Anthropic API]], [[Google Gemini API]], [[Azure OpenAI]]
- Applications: [[RAG]], [[Function Calling]], [[Structured Output]], [[Voice Agents]]
- Adjacent: [[Prompt Caching]], [[Embeddings]], [[Whisper]], [[Realtime API]]

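## 🤖 LLM usage

- When to use: chat / extraction / summarization / RAG synthesis, function dispatcher, structured extraction, embedding search, bulk classification (Batch).
- When not to use: raw PHI/PCI data (check BAA and regulations first), hard real-time under 50ms (use an edge model), deterministic computation (use code or a calculator).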

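## ❌ Anti-patterns

- Exposing the API key in a client bundle → a backend proxy is required.
- Unbounded retries (without exponential backoff) → rate-limit storms.
- Assuming `temperature=0` is deterministic (it is only approximately so).
- Trusting LLM JSON without schema validation → use Structured Outputs or Pydantic validation.
- Rebuilding the system prompt on every call → keep a stable prefix so prompt caching applies (1024+ tokens).
- Sending real-time requests to Batch (up to 24h latency).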
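
As an illustration of the schema-validation point, the sketch below checks model output before anything downstream consumes it. It uses only the standard library to stay dependency-free; with Pydantic available, a model class would replace the hand-written checks. `REQUIRED_FIELDS` and `parse_recipe` are hypothetical names, and the field set mirrors the `Recipe` example earlier in this page.

```python
import json

# Expected top-level fields and their JSON types
REQUIRED_FIELDS = {"name": str, "ingredients": list, "steps": list}

def parse_recipe(raw: str) -> dict:
    """Parse model output and reject it unless every required field has the expected type."""
    data = json.loads(raw)  # raises json.JSONDecodeError (a ValueError) on malformed JSON
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    for field, expected in REQUIRED_FIELDS.items():
        if not isinstance(data.get(field), expected):
            raise ValueError(f"field {field!r} missing or not {expected.__name__}")
    return data

ok = parse_recipe('{"name": "Carbonara", "ingredients": ["egg"], "steps": ["mix"]}')
```

Both malformed JSON and a missing field surface as `ValueError`, so a caller can handle every bad output in one `except` clause and retry or fall back.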

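## 🧪 Verification / duplicates

- Verification: golden eval set (precision/recall), token usage monitoring, and regression tests with the model version pinned.
- Duplicates: [[Function Calling]] / [[Structured Output]] / [[Embeddings]] each have a dedicated page; this document serves as the consolidated index.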
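
For an extraction task, scoring against a golden set can be as simple as comparing the set of items the model produced with the labelled set. A minimal sketch with invented labels; `precision_recall` is a hypothetical helper, and the sets are made-up data.

```python
def precision_recall(predicted: set, expected: set) -> tuple:
    """Precision = correct / predicted; recall = correct / expected (0.0 when undefined)."""
    true_positives = len(predicted & expected)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(expected) if expected else 0.0
    return precision, recall

# Invented example: the model extracted three entities, the golden label lists four
precision, recall = precision_recall({"a", "b", "c"}, {"b", "c", "d", "e"})
```

Running the same golden set against a pinned model snapshot (for example `gpt-4o-2024-08-06` rather than `gpt-4o`) keeps score changes attributable to prompt or code changes instead of a silent model update.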

## 🕓 Changelog

- 2026-05-10: Standard format; added Responses API / gpt-5 / Structured Outputs / Batch.