id: wiki-2026-0508-nosql-databases-in-ai
title: NoSQL Databases in AI
category: 10_Wiki/Topics
status: verified
canonical_id: self
aliases: NoSQL for AI, Vector DB, Document Store AI
duplicate_of: none
source_trust_level: A
confidence_score: 0.9
verification_status: applied
tags: nosql, ai, vector-db, mongodb, redis, rag
raw_sources:
last_reinforced: 2026-05-10
github_commit: pending
tech_stack: language: python; framework: mongodb-redis-pinecone

NoSQL Databases in AI

In one line

"AI workload = embedding + metadata + cache; each NoSQL family fits one of those layers." The standard 2026 RAG / agent stack: vector DB (Pinecone/Qdrant/pgvector) + document store (MongoDB) + KV cache (Redis). Schema flexibility and horizontal scaling make NoSQL a natural fit for the LLM era.

Key points

NoSQL families and their AI roles

  • Vector: embedding similarity (RAG, recommendation). Pinecone, Qdrant, Weaviate, Milvus.
  • Document: chat history, agent state, structured output. MongoDB, CouchDB.
  • KV / Cache: prompt cache, semantic cache, session. Redis, DragonflyDB.
  • Graph: knowledge graph, entity link. Neo4j, ArangoDB.
  • Wide-column: time-series telemetry, traces. Cassandra, ScyllaDB.

Access patterns

  • Embed + ANN search: HNSW / IVF index, top-k cosine.
  • Metadata filter + vector: hybrid search.
  • TTL cache: prompt → response entries cached for 24 h.
  • Append-only chat log: doc store + per-user shard.
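
The first two access patterns (top-k cosine search and metadata filter + vector) can be sketched with an exact, brute-force search. This is only a baseline for intuition: a real vector DB replaces the linear scan with an ANN index such as HNSW or IVF. The function names, the record layout, and the sample vectors are assumptions made for illustration.

```python
import math
from typing import Any, Dict, List, Tuple

Vector = List[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def filtered_top_k(
    query: Vector,
    docs: List[Dict[str, Any]],
    where: Dict[str, Any],
    k: int = 5,
) -> List[Tuple[str, float]]:
    """Apply the metadata filter first, then rank the survivors by cosine."""
    candidates = [
        d for d in docs
        if all(d["metadata"].get(field) == value for field, value in where.items())
    ]
    scored = [(d["id"], cosine(query, d["vector"])) for d in candidates]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical 2-dimensional embeddings, just to exercise the function.
docs = [
    {"id": "a", "vector": [1.0, 0.0], "metadata": {"source": "intro.md"}},
    {"id": "b", "vector": [0.0, 1.0], "metadata": {"source": "intro.md"}},
    {"id": "c", "vector": [1.0, 0.1], "metadata": {"source": "other.md"}},
]
hits = filtered_top_k([1.0, 0.0], docs, {"source": "intro.md"}, k=2)
```

Document "c" is close to the query but is excluded by the filter, which is the behavior a hybrid (filter + vector) query is meant to give.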

Applications

  1. RAG: vector + document hybrid.
  2. Agent memory: document (short) + vector (long-term).
  3. Personalization: KV (recent) + graph (relations).

💻 Patterns

Pinecone (managed vector)

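import os

from pinecone import Pinecone, ServerlessSpec

# embed(text) is assumed to be defined elsewhere and to return a 1536-dim list of floats.
pc = Pinecone(api_key=os.environ["PINECONE_API_KEY"])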
pc.create_index(
    name="docs",
    dimension=1536,
    metric="cosine",
    spec=ServerlessSpec(cloud="aws", region="us-east-1"),
)
idx = pc.Index("docs")

idx.upsert(vectors=[
    {"id": "doc1", "values": embed("Hello world"),
     "metadata": {"source": "intro.md", "section": "overview"}},
])

res = idx.query(
    vector=embed("greeting example"),
    top_k=5,
    filter={"source": {"$eq": "intro.md"}},
    include_metadata=True,
)

Qdrant (self-host)

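from qdrant_client import QdrantClient
from qdrant_client.models import (
    Distance, FieldCondition, Filter, MatchValue, PointStruct, VectorParams,
)

# embed(), text, and q are assumed to be defined elsewhere.
client = QdrantClient("localhost", port=6333)
# recreate_collection is deprecated; create_collection raises if "docs" already exists.
client.create_collection(
    "docs",
    vectors_config=VectorParams(size=1536, distance=Distance.COSINE),
)
client.upsert("docs", points=[
    PointStruct(id=1, vector=embed(text), payload={"text": text, "source": "x.md"}),
])

# query_points replaces the deprecated search() in recent qdrant-client releases.
hits = client.query_points(
    "docs",
    query=embed(q),
    query_filter=Filter(must=[FieldCondition(key="source", match=MatchValue(value="x.md"))]),
    limit=5,
).points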

pgvector (Postgres + vector)

CREATE EXTENSION IF NOT EXISTS vector;
CREATE TABLE docs (
  id BIGSERIAL PRIMARY KEY,
  text TEXT,
  source TEXT,
  embedding vector(1536)
);
CREATE INDEX ON docs USING hnsw (embedding vector_cosine_ops);

-- hybrid query
SELECT id, text FROM docs
WHERE source = 'intro.md'
ORDER BY embedding <=> $1::vector
LIMIT 5;

MongoDB Atlas Vector Search

// uri, embed(), and q are assumed to be defined elsewhere.
import { MongoClient } from "mongodb";
const col = new MongoClient(uri).db("ai").collection("docs");

await col.aggregate([
  { $vectorSearch: {
      index: "doc_embedding",
      path: "embedding",
      queryVector: await embed(q),
      numCandidates: 100,
      limit: 5,
      filter: { source: "intro.md" },
  }},
  { $project: { text: 1, score: { $meta: "vectorSearchScore" } } },
]).toArray();

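Redis exact-match cache

import hashlib
import json

import anthropic
import redis

r = redis.Redis()
client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

def cached_completion(prompt: str, ttl: int = 3600) -> dict:
    # Exact-match cache: the key is the SHA-256 of the prompt text.
    key = "ai:" + hashlib.sha256(prompt.encode()).hexdigest()
    if (v := r.get(key)) is not None:
        return json.loads(v)
    out = client.messages.create(
        model="claude-opus-4-7",
        messages=[{"role": "user", "content": prompt}],
        max_tokens=1024,
    )
    result = {"text": out.content[0].text}
    r.setex(key, ttl, json.dumps(result))  # expires after ttl seconds
    return result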

Redis Vector (semantic cache, near-match)

# RediSearch + HNSW
from redis.commands.search.field import VectorField, TextField
from redis.commands.search.indexDefinition import IndexDefinition

r.ft("ai_cache").create_index(
    [TextField("prompt"),
     VectorField("v", "HNSW", {"TYPE":"FLOAT32","DIM":1536,"DISTANCE_METRIC":"COSINE"})],
    definition=IndexDefinition(prefix=["cache:"]),
)
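
The index definition above only covers storage; the lookup side of a near-match cache can be sketched as follows. This is an in-memory stand-in, not the Redis query API: it scans all entries linearly, where a vector index would run a KNN query instead. The `SemanticCache` class, the `toy_embed` function, and the 0.95 threshold are all assumptions chosen for illustration, and a real deployment would use a proper embedding model and tune the threshold on real traffic.

```python
import math
from typing import Callable, List, Optional, Tuple

Vector = List[float]


class SemanticCache:
    """Near-match cache: return a stored response if a prior prompt is similar enough."""

    def __init__(self, embed: Callable[[str], Vector], threshold: float = 0.9):
        self.embed = embed
        self.threshold = threshold
        self.entries: List[Tuple[Vector, str]] = []

    @staticmethod
    def _cosine(a: Vector, b: Vector) -> float:
        dot = sum(x * y for x, y in zip(a, b))
        na = math.sqrt(sum(x * x for x in a))
        nb = math.sqrt(sum(y * y for y in b))
        return dot / (na * nb) if na and nb else 0.0

    def put(self, prompt: str, response: str) -> None:
        self.entries.append((self.embed(prompt), response))

    def get(self, prompt: str) -> Optional[str]:
        query = self.embed(prompt)
        best_score, best_response = 0.0, None
        for vector, response in self.entries:  # linear scan; an index would do KNN
            score = self._cosine(query, vector)
            if score > best_score:
                best_score, best_response = score, response
        return best_response if best_score >= self.threshold else None


def toy_embed(text: str) -> Vector:
    """Hypothetical embedding: case-insensitive letter-frequency vector."""
    counts = [0.0] * 26
    for ch in text.lower():
        if "a" <= ch <= "z":
            counts[ord(ch) - ord("a")] += 1.0
    return counts


cache = SemanticCache(toy_embed, threshold=0.95)
cache.put("What is a vector database?", "A store for embeddings.")
hit = cache.get("what is a vector database")  # differs only in case and punctuation
miss = cache.get("How do I bake bread?")      # unrelated prompt, below the threshold
```

An exact hash key would miss the first lookup because the strings differ; the similarity threshold is what lets it hit.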

Agent state (MongoDB)

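from datetime import datetime, timezone

from pymongo import MongoClient

# uri and new_msg are assumed to be defined elsewhere.
c = MongoClient(uri).agents.runs

run_id = c.insert_one({
    "user_id": "u1",
    "messages": [{"role": "system", "content": "..."}],
    "tool_calls": [],
    "status": "running",
    "created_at": datetime.now(timezone.utc),  # utcnow() is deprecated since Python 3.12
}).inserted_id

# Append each new message to the run document.
c.update_one({"_id": run_id}, {"$push": {"messages": new_msg}})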

Knowledge graph (Neo4j)

MERGE (a:Person {name: 'Alice'})
MERGE (c:Company {name: 'Acme'})
MERGE (a)-[:WORKS_AT {since: 2020}]->(c)
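
# Python driver: who works at the same company as Alice?
# uri, user, and password are placeholders for your connection settings.
from neo4j import GraphDatabase

with GraphDatabase.driver(uri, auth=(user, password)) as driver:
    with driver.session() as session:
        result = session.run("""
        MATCH (a:Person {name: $n})-[:WORKS_AT]->(c)<-[:WORKS_AT]-(p)
        RETURN p.name LIMIT 10
        """, n="Alice")
        names = [record["p.name"] for record in result]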

Hybrid retrieval (BM25 + vector)

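# es is an Elasticsearch client and idx the Pinecone index from above;
# reciprocal_rank_fusion is a helper you define, not a library function.
resp = es.search(index="docs", query={"match": {"text": q}}, size=10)
keyword_hits = [h["_id"] for h in resp["hits"]["hits"]]
vector_hits = [m.id for m in idx.query(vector=embed(q), top_k=10).matches]
fused = reciprocal_rank_fusion([keyword_hits, vector_hits], k=60)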
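
The snippet above calls `reciprocal_rank_fusion` without defining it. A minimal sketch is given here, assuming each input is a list of document IDs ordered best first and that both systems share the same ID space. Each document scores the sum of 1 / (k + rank) over the lists it appears in, with ranks starting at 1. The sample IDs are made up.

```python
from collections import defaultdict
from typing import Dict, Hashable, List, Sequence


def reciprocal_rank_fusion(
    rankings: Sequence[Sequence[Hashable]], k: int = 60
) -> List[Hashable]:
    """Fuse several ranked ID lists; a higher fused score means a better rank."""
    scores: Dict[Hashable, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: scores[doc_id], reverse=True)


# Hypothetical results from a keyword search and a vector search.
keyword_ids = ["d1", "d2", "d3"]
vector_ids = ["d3", "d1", "d4"]
fused = reciprocal_rank_fusion([keyword_ids, vector_ids], k=60)
```

Documents found by both retrievers ("d1", "d3") rise above those found by only one. Because only ranks are used, the BM25 and cosine scores never need to be put on a common scale.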

Decision criteria

| Situation | Approach |
| --- | --- |
| Small RAG (<1M docs) | pgvector (single Postgres) |
| Medium RAG (1M-100M) | Qdrant / Weaviate self-host |
| Large / managed | Pinecone / MongoDB Atlas |
| Agent state + chat | MongoDB document store |
| Prompt cache | Redis (exact + semantic) |
| Entity reasoning | Neo4j |

Default: pgvector (start) → Qdrant (scale) + Redis cache + MongoDB for agent state.

🔗 Graph

🤖 Using LLMs

When to use: proposing schema designs, writing boilerplate queries, tuning the hybrid-search blend. When not to: capacity / cost projection and ANN index parameter (M, efConstruction) tuning; measure those on the real workload.

Anti-patterns

  • Vector DB only: ignoring metadata filters yields irrelevant top-k results.
  • No re-ranker: feeding the top-50 vector hits straight to the LLM adds noise. Use Cohere Rerank or a cross-encoder.
  • Cache prompt verbatim: a one-character difference causes a cache miss. Use a semantic cache.
  • Mixing OLTP + vector: serving both from a single Postgres leads to index bloat. Separate them.
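
The re-ranker point can be sketched as a two-stage pipeline: retrieve a wide candidate set, score each (query, candidate) pair with a stronger model, and pass only the best few to the LLM. In the sketch below the scorer is pluggable; `token_overlap` is a toy stand-in for a cross-encoder or a hosted rerank API, and the function names and sample texts are assumptions for illustration.

```python
from typing import Callable, List, Tuple


def rerank(
    query: str,
    candidates: List[str],
    score: Callable[[str, str], float],
    top_n: int = 5,
) -> List[Tuple[str, float]]:
    """Score every candidate against the query and keep the top_n best."""
    scored = [(text, score(query, text)) for text in candidates]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]


def token_overlap(query: str, text: str) -> float:
    """Toy scorer: Jaccard overlap of lowercase whitespace tokens."""
    q, t = set(query.lower().split()), set(text.lower().split())
    union = q | t
    return len(q & t) / len(union) if union else 0.0


# Hypothetical first-stage results (in practice, the top-50 vector hits).
candidates = [
    "Redis supports TTL on keys",
    "HNSW index parameters for vector search",
    "Tuning HNSW M and efConstruction",
]
top = rerank("hnsw index parameters", candidates, token_overlap, top_n=2)
```

Swapping `token_overlap` for a real cross-encoder changes only the `score` argument; the retrieve-then-rerank shape stays the same.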

🧪 Verification / Duplicates

  • Verified (Pinecone docs, Qdrant docs, MongoDB Atlas Vector Search 2025, "Designing Data-Intensive Applications", LangChain RAG cookbook).
  • Trust level: A.

🕓 Changelog

| Date | Change |
| --- | --- |
| 2026-05-08 | Phase 1 |
| 2026-05-10 | Manual cleanup: NoSQL families mapped to AI/RAG/agent workloads |