---
id: wiki-2026-0508-nosql-databases-in-ai
title: NoSQL Databases in AI
category: 10_Wiki/Topics
status: verified
canonical_id: self
aliases: [NoSQL for AI, Vector DB, Document Store AI]
duplicate_of: none
source_trust_level: A
confidence_score: 0.9
verification_status: applied
tags: [nosql, ai, vector-db, mongodb, redis, rag]
raw_sources: []
last_reinforced: 2026-05-10
github_commit: pending
tech_stack:
  language: python
  framework: mongodb-redis-pinecone
---

# NoSQL Databases in AI

## In one line
> **"An AI workload = embeddings + metadata + cache, and a NoSQL family fits each layer."** The standard 2026 RAG / agent stack: vector DB (Pinecone/Qdrant/pgvector) + document store (MongoDB) + KV cache (Redis). Schema flexibility and horizontal scaling make NoSQL a natural fit for the LLM era.

## Core concepts

### NoSQL families and their AI roles
- **Vector**: embedding similarity (RAG, recommendation). Pinecone, Qdrant, Weaviate, Milvus.
- **Document**: chat history, agent state, structured output. MongoDB, CouchDB.
- **KV / Cache**: prompt cache, semantic cache, sessions. Redis, DragonflyDB.
- **Graph**: knowledge graphs, entity linking. Neo4j, ArangoDB.
- **Wide-column**: time-series telemetry, traces. Cassandra, ScyllaDB.

### Access patterns
- **Embed + ANN search**: HNSW / IVF index, top-k by cosine similarity.
- **Metadata filter + vector**: hybrid (filtered) search.
- **TTL cache**: cache prompt → response pairs for 24h.
- **Append-only chat log**: document store, sharded per user.

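The first two access patterns can be illustrated without any database. The following is a minimal brute-force sketch, not how HNSW or IVF work internally: it applies a metadata filter and then ranks the survivors by cosine similarity. The `Doc` type, `top_k_filtered`, and the toy 2-dimensional vectors are assumptions made for illustration; an ANN index approximates the same ranking without scanning every vector.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Doc:
    id: str
    vector: list[float]
    metadata: dict = field(default_factory=dict)


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def top_k_filtered(docs, query, k, where=None):
    """Keep docs whose metadata matches `where`, then rank by cosine similarity."""
    where = where or {}
    candidates = [
        d for d in docs
        if all(d.metadata.get(key) == value for key, value in where.items())
    ]
    ranked = sorted(candidates, key=lambda d: cosine(d.vector, query), reverse=True)
    return [(d.id, cosine(d.vector, query)) for d in ranked[:k]]


# Toy data (hypothetical): "c" is the closest vector but fails the filter.
docs = [
    Doc("a", [1.0, 0.0], {"source": "intro.md"}),
    Doc("b", [0.9, 0.1], {"source": "intro.md"}),
    Doc("c", [1.0, 0.0], {"source": "other.md"}),
    Doc("d", [0.0, 1.0], {"source": "intro.md"}),
]
hits = top_k_filtered(docs, query=[1.0, 0.0], k=2, where={"source": "intro.md"})
```

Filtering before ranking is what keeps `"c"` out of the result even though its vector matches the query exactly.
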
### Applications
1. RAG: vector + document hybrid.
2. Agent memory: document store (short-term) + vector store (long-term).
3. Personalization: KV (recent activity) + graph (relations).

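The agent-memory split in item 2 can be sketched in memory, under loud assumptions: `toy_embed` is a stand-in for a real embedding model, `VOCAB` is an invented vocabulary, and the `AgentMemory` class is hypothetical. A bounded window plays the document-store (short-term) role and a similarity lookup over all stored turns plays the vector-store (long-term) role.

```python
from collections import deque

VOCAB = ["refund", "invoice", "shipping", "password"]  # toy vocabulary (assumption)


def toy_embed(text: str) -> list[float]:
    """Stand-in for a real embedding model: counts vocabulary words."""
    words = text.lower().split()
    return [float(words.count(w)) for w in VOCAB]


def dot(a: list[float], b: list[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


class AgentMemory:
    def __init__(self, embed=toy_embed, short_window: int = 2):
        self.embed = embed
        self.short = deque(maxlen=short_window)  # recent turns: document-store role
        self.long: list[tuple[list[float], str]] = []  # all turns: vector-store role

    def add(self, text: str) -> None:
        self.short.append(text)
        self.long.append((self.embed(text), text))

    def context(self, query: str, k: int = 1) -> list[str]:
        """Return up to k recalled older turns, followed by the recent window."""
        qv = self.embed(query)
        recent = list(self.short)
        older = [(v, t) for v, t in self.long if t not in recent]
        older.sort(key=lambda pair: dot(pair[0], qv), reverse=True)
        recalled = [t for v, t in older[:k] if dot(v, qv) > 0]
        return recalled + recent


mem = AgentMemory()
for turn in ["my invoice is wrong", "shipping was slow", "reset my password", "thanks"]:
    mem.add(turn)
ctx = mem.context("question about the invoice")
```

Here the invoice turn has already fallen out of the two-turn window, and the similarity lookup brings it back into the context.
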
## 💻 Patterns

### Pinecone (managed vector)
```python
import os

from pinecone import Pinecone, ServerlessSpec

# `embed` is assumed to be your own helper returning a 1536-dim list[float].
pc = Pinecone(api_key=os.environ["PINECONE_API_KEY"])
pc.create_index(
    name="docs",
    dimension=1536,
    metric="cosine",
    spec=ServerlessSpec(cloud="aws", region="us-east-1"),
)
idx = pc.Index("docs")

# Store the vector together with filterable metadata.
idx.upsert(vectors=[
    {"id": "doc1", "values": embed("Hello world"),
     "metadata": {"source": "intro.md", "section": "overview"}},
])

# Metadata filter + vector similarity in one query.
res = idx.query(
    vector=embed("greeting example"),
    top_k=5,
    filter={"source": {"$eq": "intro.md"}},
    include_metadata=True,
)
```

### Qdrant (self-host)
```python
from qdrant_client import QdrantClient
from qdrant_client.models import (
    Distance, FieldCondition, Filter, MatchValue, PointStruct, VectorParams,
)

# `embed`, `text`, and `q` are assumed to be defined by the caller.
client = QdrantClient("localhost", port=6333)
# recreate_collection is deprecated in newer clients; this keeps an existing collection.
if not client.collection_exists("docs"):
    client.create_collection(
        collection_name="docs",
        vectors_config=VectorParams(size=1536, distance=Distance.COSINE),
    )
client.upsert("docs", points=[
    PointStruct(id=1, vector=embed(text), payload={"text": text, "source": "x.md"}),
])

# query_points replaces search, which newer clients deprecate.
hits = client.query_points(
    collection_name="docs",
    query=embed(q),
    query_filter=Filter(must=[FieldCondition(key="source", match=MatchValue(value="x.md"))]),
    limit=5,
).points
```

### pgvector (Postgres + vector)
```sql
CREATE EXTENSION IF NOT EXISTS vector;
CREATE TABLE docs (
    id BIGSERIAL PRIMARY KEY,
    text TEXT,
    source TEXT,
    embedding vector(1536)
);
CREATE INDEX ON docs USING hnsw (embedding vector_cosine_ops);

-- hybrid query: metadata filter + cosine-distance ordering ($1 is the query embedding)
SELECT id, text FROM docs
WHERE source = 'intro.md'
ORDER BY embedding <=> $1::vector
LIMIT 5;
```

### MongoDB Atlas Vector Search
```javascript
import { MongoClient } from "mongodb";

// `uri`, `q`, and `embed` are assumed to be defined elsewhere.
const col = new MongoClient(uri).db("ai").collection("docs");

const results = await col.aggregate([
  { $vectorSearch: {
      index: "doc_embedding",
      path: "embedding",
      queryVector: await embed(q),
      numCandidates: 100,              // ANN candidates considered before `limit`
      limit: 5,
      filter: { source: "intro.md" },  // `source` must be indexed as a filter field
  }},
  { $project: { text: 1, score: { $meta: "vectorSearchScore" } } },
]).toArray();
```

### Redis exact-match cache
```python
import hashlib
import json

import anthropic
import redis

r = redis.Redis()
llm = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment


def cached_completion(prompt: str, ttl: int = 3600) -> dict:
    # Exact-match key: any change to the prompt produces a different hash.
    key = "ai:" + hashlib.sha256(prompt.encode()).hexdigest()
    if v := r.get(key):
        return json.loads(v)
    out = llm.messages.create(
        model="claude-opus-4-7",
        messages=[{"role": "user", "content": prompt}],
        max_tokens=1024,
    )
    result = {"text": out.content[0].text}
    r.setex(key, ttl, json.dumps(result))  # expires after `ttl` seconds
    return result
```

### Redis Vector (semantic cache, near-match)
```python
# RediSearch + HNSW
import redis
from redis.commands.search.field import TextField, VectorField
# Some newer redis-py releases may expose this module as `index_definition`;
# adjust the import if this path is missing in your version.
from redis.commands.search.indexDefinition import IndexDefinition

r = redis.Redis()
r.ft("ai_cache").create_index(
    [TextField("prompt"),
     VectorField("v", "HNSW", {"TYPE": "FLOAT32", "DIM": 1536, "DISTANCE_METRIC": "COSINE"})],
    definition=IndexDefinition(prefix=["cache:"]),  # index only keys starting with "cache:"
)
```

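The index above only stores vectors; the near-match lookup itself is a nearest-neighbour search followed by a similarity threshold. A minimal in-memory sketch of that logic follows, with several assumptions: `toy_embed` is a bag-of-words stand-in for a real embedding model, `SemanticCache` is a hypothetical class, and the 0.8 threshold is arbitrary. With Redis the linear scan would be replaced by a KNN query against the HNSW index.

```python
import math
import re
from collections import Counter


def toy_embed(text: str) -> Counter:
    """Stand-in for a real embedding model: lowercase bag-of-words counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[token] * b[token] for token in a)
    na = math.sqrt(sum(c * c for c in a.values()))
    nb = math.sqrt(sum(c * c for c in b.values()))
    return dot / (na * nb) if na and nb else 0.0


class SemanticCache:
    def __init__(self, embed=toy_embed, threshold: float = 0.8):
        self.embed = embed
        self.threshold = threshold  # minimum similarity to count as a hit
        self.entries: list[tuple[Counter, str]] = []

    def put(self, prompt: str, response: str) -> None:
        self.entries.append((self.embed(prompt), response))

    def get(self, prompt: str):
        """Return the response of the most similar cached prompt, or None."""
        qv = self.embed(prompt)
        best_score, best_response = 0.0, None
        for vector, response in self.entries:
            score = cosine(vector, qv)
            if score > best_score:
                best_score, best_response = score, response
        return best_response if best_score >= self.threshold else None


cache = SemanticCache()
cache.put("What is the capital of France?", "Paris")
near = cache.get("what's the capital of France?")  # differs textually, still a hit
miss = cache.get("How do I reset my password?")    # unrelated prompt
```

An exact-hash key would miss the reworded prompt; the threshold is what trades hit rate against the risk of returning an answer to a different question.
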
### Agent state (MongoDB)
```python
from datetime import datetime, timezone

from pymongo import MongoClient

# `uri` is assumed to be your MongoDB connection string.
runs = MongoClient(uri).agents.runs

run_id = runs.insert_one({
    "user_id": "u1",
    "messages": [{"role": "system", "content": "..."}],
    "tool_calls": [],
    "status": "running",
    "created_at": datetime.now(timezone.utc),  # utcnow() is deprecated since Python 3.12
}).inserted_id

# Append each new turn to the run document; `new_msg` is a {"role", "content"} dict.
runs.update_one({"_id": run_id}, {"$push": {"messages": new_msg}})
```

### Knowledge graph (Neo4j)
```cypher
MERGE (a:Person {name: 'Alice'})
MERGE (c:Company {name: 'Acme'})
MERGE (a)-[:WORKS_AT {since: 2020}]->(c)
```

```python
# Answers "who works at the same company as Alice?"
# `session` is assumed to be an open neo4j driver session.
result = session.run("""
    MATCH (a:Person {name: $n})-[:WORKS_AT]->(c)<-[:WORKS_AT]-(p)
    RETURN p.name LIMIT 10
""", n="Alice")
```

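### Hybrid retrieval (BM25 + vector)
```python
# Assumes `es` (Elasticsearch), `idx` (Pinecone), `embed`, and `reciprocal_rank_fusion`
# exist, and that both stores use the same document ids.
resp = es.search(index="docs", query={"match": {"text": q}})
keyword_hits = [h["_id"] for h in resp["hits"]["hits"]]  # BM25-ranked ids
vector_hits = [m.id for m in idx.query(vector=embed(q), top_k=10).matches]
fused = reciprocal_rank_fusion([keyword_hits, vector_hits], k=60)
```
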
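The `reciprocal_rank_fusion` helper called above is not defined in this note. A minimal sketch of the usual formulation follows, in which each document scores the sum of `1 / (k + rank)` over the rankings it appears in. The function signature, the choice to return only ids, and the sample ids are assumptions for illustration.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse several best-first id rankings into one list, best first."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)  # ranks are 1-based
    return sorted(scores, key=lambda doc_id: scores[doc_id], reverse=True)


# Hypothetical ids from a keyword ranking and a vector ranking.
keyword_ids = ["d1", "d2", "d3"]
vector_ids = ["d3", "d1", "d4"]
fused_ids = reciprocal_rank_fusion([keyword_ids, vector_ids])
```

Because only ranks are used, the BM25 and cosine scores never need to be put on a common scale; documents that appear in both lists (`d1`, `d3`) rise above those that appear in one.
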
## Decision criteria
| Situation | Approach |
|---|---|
| Small RAG (<1M docs) | pgvector (single Postgres) |
| Medium RAG (1M-100M docs) | Qdrant / Weaviate self-host |
| Large / managed | Pinecone / MongoDB Atlas |
| Agent state + chat | MongoDB document store |
| Prompt cache | Redis (exact + semantic) |
| Entity reasoning | Neo4j |

**Default**: pgvector (start) → Qdrant (scale) + Redis cache + MongoDB for agent state.

## 🔗 Graph
- Applications: [[RAG]] · [[Semantic Search|Semantic-Search]] · [[Agent-Memory]]
- Adjacent: [[Embeddings]] · [[Hybrid Search]] · [[pgvector]]

## 🤖 LLM usage
**When**: proposing schema designs, generating query-construction boilerplate, tuning hybrid-search blends.
**When not**: capacity / cost projections and ANN index parameter (M, efConstruction) tuning; measure these on the real workload.

## ❌ Anti-patterns
- **Vector DB only**: ignoring metadata filters yields irrelevant top-k results.
- **No re-ranker**: feeding the top-50 vector hits straight into the LLM adds noise. Use Cohere Rerank or a cross-encoder.
- **Caching prompts verbatim**: a one-character difference is a cache miss. Use a semantic cache.
- **Mixing OLTP + vector**: running both on a single Postgres causes index bloat. Separate them.

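The re-ranking step named in the second anti-pattern is a second scoring pass over the first-stage candidates. The sketch below shows only that shape: `rerank` and `overlap_score` are hypothetical names, and the word-overlap scorer is a toy stand-in for a cross-encoder or a hosted reranker, which would be passed in as `score` instead.

```python
from typing import Callable


def rerank(query: str, candidates: list[str],
           score: Callable[[str, str], float], keep: int = 5) -> list[str]:
    """Second stage: re-score first-stage candidates and keep the best few."""
    return sorted(candidates, key=lambda c: score(query, c), reverse=True)[:keep]


def overlap_score(query: str, passage: str) -> float:
    """Toy relevance score: fraction of query words present in the passage."""
    q_words = set(query.lower().split())
    p_words = set(passage.lower().split())
    return len(q_words & p_words) / len(q_words) if q_words else 0.0


# Hypothetical first-stage hits, in vector-search order.
candidates = [
    "redis is an in-memory key value store",
    "how to set a ttl on a redis key",
    "neo4j stores graphs",
]
top = rerank("redis key ttl", candidates, overlap_score, keep=2)
```

Only the `keep` survivors go into the prompt, which is how the top-50 noise is cut down before the LLM sees it.
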
## 🧪 Verification / duplicates
- Verified (Pinecone docs, Qdrant docs, MongoDB Atlas Vector Search 2025, "Designing Data-Intensive Applications", LangChain RAG cookbook).
- Trust level: A.

## 🕓 Changelog
| Date | Change |
|---|---|
| 2026-05-08 | Phase 1 |
| 2026-05-10 | Manual cleanup: NoSQL families mapped to AI/RAG/agent workloads |