---
id: wiki-2026-0508-nosql-databases-in-ai
title: NoSQL Databases in AI
category: 10_Wiki/Topics
status: verified
canonical_id: self
aliases: [NoSQL for AI, Vector DB, Document Store AI]
duplicate_of: none
source_trust_level: A
confidence_score: 0.9
verification_status: applied
tags: [nosql, ai, vector-db, mongodb, redis, rag]
raw_sources: []
last_reinforced: 2026-05-10
github_commit: pending
tech_stack:
  language: python
  framework: mongodb-redis-pinecone
---

# NoSQL Databases in AI

## In one line

> **"An AI workload = embeddings + metadata + cache, and each NoSQL family fits one of those layers."** The standard 2026 RAG / agent stack is a vector DB (Pinecone/Qdrant/pgvector) + a document store (MongoDB) + a KV cache (Redis). Schema flexibility and horizontal scaling make these stores a natural fit for the LLM era.

## Core

### NoSQL families and their AI roles

- **Vector**: embedding similarity (RAG, recommendation). Pinecone, Qdrant, Weaviate, Milvus.
- **Document**: chat history, agent state, structured output. MongoDB, CouchDB.
- **KV / Cache**: prompt cache, semantic cache, session. Redis, DragonflyDB.
- **Graph**: knowledge graph, entity linking. Neo4j, ArangoDB.
- **Wide-column**: time-series telemetry, traces. Cassandra, ScyllaDB.

### Access patterns

- **Embed + ANN search**: HNSW / IVF index, top-k by cosine similarity.
- **Metadata filter + vector**: hybrid search.
- **TTL cache**: cache prompt → response pairs for 24 h.
- **Append-only chat log**: document store, sharded per user.

### Applications

1. RAG: vector + document hybrid.
2. Agent memory: document store (short-term) + vector store (long-term).
3. Personalization: KV (recent) + graph (relations).

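
The "embed + ANN search" and "metadata filter + vector" access patterns listed earlier can be illustrated with an exact, brute-force search. This is only a sketch of the result that an ANN index such as HNSW or IVF approximates at scale; the function names, the record layout, and the toy 3-dimensional vectors are assumptions made for illustration and do not correspond to any particular database API.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k_cosine(query, records, k=5, where=None):
    """Exact top-k by cosine similarity with an optional equality filter.

    The metadata filter is applied before scoring (pre-filtering), so only
    matching records compete for the k slots.
    """
    where = where or {}
    candidates = [
        r for r in records
        if all(r["metadata"].get(field) == value for field, value in where.items())
    ]
    scored = [(r["id"], cosine(query, r["values"])) for r in candidates]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical toy data: real embeddings would have hundreds of dimensions.
records = [
    {"id": "a", "values": [1.0, 0.0, 0.0], "metadata": {"source": "intro.md"}},
    {"id": "b", "values": [0.9, 0.1, 0.0], "metadata": {"source": "intro.md"}},
    {"id": "c", "values": [1.0, 0.0, 0.0], "metadata": {"source": "other.md"}},
    {"id": "d", "values": [0.0, 1.0, 0.0], "metadata": {"source": "intro.md"}},
]
hits = top_k_cosine([1.0, 0.0, 0.0], records, k=2, where={"source": "intro.md"})
```

Record `c` is as close to the query as `a`, but the filter excludes it before scoring. Vector databases differ in whether they filter before, during, or after the ANN traversal, which affects both recall and latency.
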
## 💻 Patterns

### Pinecone (managed vector)

```python
import os

from pinecone import Pinecone, ServerlessSpec

# embed() is assumed to be defined elsewhere and to return a 1536-dim list of floats.
pc = Pinecone(api_key=os.environ["PINECONE_API_KEY"])
pc.create_index(
    name="docs",
    dimension=1536,
    metric="cosine",
    spec=ServerlessSpec(cloud="aws", region="us-east-1"),
)
idx = pc.Index("docs")

idx.upsert(vectors=[
    {"id": "doc1", "values": embed("Hello world"),
     "metadata": {"source": "intro.md", "section": "overview"}},
])

# Top-5 nearest neighbours, restricted to one source file.
res = idx.query(
    vector=embed("greeting example"),
    top_k=5,
    filter={"source": {"$eq": "intro.md"}},
    include_metadata=True,
)
```

### Qdrant (self-host)

```python
from qdrant_client import QdrantClient
from qdrant_client.models import (
    Distance, FieldCondition, Filter, MatchValue, PointStruct, VectorParams,
)

# embed(), text and q are assumed to be defined elsewhere.
client = QdrantClient("localhost", port=6333)

# recreate_collection drops any existing "docs" collection first.
# Newer qdrant-client releases deprecate recreate_collection and search
# in favour of create_collection and query_points.
client.recreate_collection(
    "docs",
    vectors_config=VectorParams(size=1536, distance=Distance.COSINE),
)
client.upsert("docs", points=[
    PointStruct(id=1, vector=embed(text), payload={"text": text, "source": "x.md"}),
])

# Typed filter models instead of a raw dict.
hits = client.search(
    "docs",
    query_vector=embed(q),
    limit=5,
    query_filter=Filter(
        must=[FieldCondition(key="source", match=MatchValue(value="x.md"))]
    ),
)
```

### pgvector (Postgres + vector)

```sql
CREATE EXTENSION IF NOT EXISTS vector;

CREATE TABLE docs (
    id        BIGSERIAL PRIMARY KEY,
    text      TEXT,
    source    TEXT,
    embedding vector(1536)
);

CREATE INDEX ON docs USING hnsw (embedding vector_cosine_ops);

-- hybrid query: metadata filter + cosine distance (<=>)
SELECT id, text
FROM docs
WHERE source = 'intro.md'
ORDER BY embedding <=> $1::vector
LIMIT 5;
```

### MongoDB Atlas Vector Search

```javascript
import { MongoClient } from "mongodb";

// uri, q and embed() are assumed to be defined elsewhere.
const col = new MongoClient(uri).db("ai").collection("docs");

const hits = await col.aggregate([
  { $vectorSearch: {
      index: "doc_embedding",
      path: "embedding",
      queryVector: await embed(q),
      numCandidates: 100,
      limit: 5,
      // "source" must be declared as a filter field in the vector index.
      filter: { source: "intro.md" },
  }},
  { $project: { text: 1, score: { $meta: "vectorSearchScore" } } },
]).toArray();
```

### Redis prompt cache (exact match)

```python
import hashlib
import json

import anthropic
import redis

r = redis.Redis()
client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment


def cached_completion(prompt: str, ttl: int = 3600) -> dict:
    # Exact-match key: identical prompts hit, any difference misses.
    key = "ai:" + hashlib.sha256(prompt.encode()).hexdigest()
    if v := r.get(key):
        return json.loads(v)
    out = client.messages.create(
        model="claude-opus-4-7",
        messages=[{"role": "user", "content": prompt}],
        max_tokens=1024,
    )
    result = {"text": out.content[0].text}
    r.setex(key, ttl, json.dumps(result))
    return result
```

### Redis Vector (semantic cache, near-match)

```python
# RediSearch + HNSW
import redis
from redis.commands.search.field import TextField, VectorField
from redis.commands.search.indexDefinition import IndexDefinition

r = redis.Redis()
r.ft("ai_cache").create_index(
    [
        TextField("prompt"),
        VectorField("v", "HNSW",
                    {"TYPE": "FLOAT32", "DIM": 1536, "DISTANCE_METRIC": "COSINE"}),
    ],
    # Index every hash whose key starts with "cache:".
    definition=IndexDefinition(prefix=["cache:"]),
)
```

### Agent state (MongoDB)

```python
from datetime import datetime, timezone

from pymongo import MongoClient

# uri is assumed to be defined elsewhere.
c = MongoClient(uri).agents.runs
run_id = c.insert_one({
    "user_id": "u1",
    "messages": [{"role": "system", "content": "..."}],
    "tool_calls": [],
    "status": "running",
    "created_at": datetime.now(timezone.utc),
}).inserted_id

# Append each new turn to the run document.
new_msg = {"role": "user", "content": "..."}
c.update_one({"_id": run_id}, {"$push": {"messages": new_msg}})
```

### Knowledge graph (Neo4j)

```cypher
MERGE (a:Person {name: 'Alice'})
MERGE (c:Company {name: 'Acme'})
MERGE (a)-[:WORKS_AT {since: 2020}]->(c)
```

```python
from neo4j import GraphDatabase

# uri, user and password are assumed to be defined elsewhere.
driver = GraphDatabase.driver(uri, auth=(user, password))
with driver.session() as session:
    # Answers "who works at the same company as Alice?"
    result = session.run("""
        MATCH (a:Person {name: $n})-[:WORKS_AT]->(c)<-[:WORKS_AT]-(p)
        RETURN p.name LIMIT 10
    """, n="Alice")
    names = [record["p.name"] for record in result]
```

### Hybrid retrieval (BM25 + vector)

```python
# es (Elasticsearch client), idx (Pinecone index), q and embed() come from elsewhere;
# reciprocal_rank_fusion is a helper that merges ranked lists of document IDs.
keyword_hits = es.search(index="docs", query={"match": {"text": q}})
keyword_ids = [h["_id"] for h in keyword_hits["hits"]["hits"]]
vector_ids = [m.id for m in idx.query(vector=embed(q), top_k=10).matches]
fused = reciprocal_rank_fusion([keyword_ids, vector_ids], k=60)
```

## Decision criteria

| Situation | Approach |
|---|---|
| Small RAG (<1M docs) | pgvector (single Postgres) |
| Medium RAG (1M-100M) | Qdrant / Weaviate self-host |
| Large / managed | Pinecone / MongoDB Atlas |
| Agent state + chat | MongoDB document store |
| Prompt cache | Redis (exact + semantic) |
| Entity reasoning | Neo4j |

**Default**: pgvector (start) → Qdrant (scale) + Redis cache + MongoDB for agent state.

## 🔗 Graph

- Applications: [[RAG]] · [[Semantic Search|Semantic-Search]] · [[Agent-Memory]]
- Adjacent: [[Embeddings]] · [[Hybrid-Search]] · [[pgvector]]

## 🤖 LLM usage

**When**: proposing schema designs, generating query-construction boilerplate, tuning the hybrid-search blend.

**When not**: capacity / cost projections and ANN index parameter tuning (M, efConstruction). Measure these on a real workload.

## ❌ Anti-patterns

- **Vector DB only**: ignoring metadata filters yields irrelevant top-k results.
- **No re-ranker**: feeding the top-50 vector hits straight to the LLM adds noise. Use Cohere Rerank or a cross-encoder.
- **Caching prompts verbatim**: a one-character difference causes a cache miss. Use a semantic cache.
- **Mixing OLTP + vector**: serving both from a single Postgres instance leads to index bloat. Separate them.

## 🧪 Verification / duplicates

- Verified against: Pinecone docs, Qdrant docs, MongoDB Atlas Vector Search 2025, "Designing Data-Intensive Applications", LangChain RAG cookbook.
- Trust level A.

## 🕓 Changelog

| Date | Change |
|---|---|
| 2026-05-08 | Phase 1 |
| 2026-05-10 | Manual cleanup: NoSQL families mapped to AI/RAG/agent workloads |
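
The hybrid retrieval pattern in this note calls `reciprocal_rank_fusion` without defining it. A minimal sketch of that helper follows, assuming each input is a best-first list of document IDs (extracting IDs from the Elasticsearch and vector-store responses is left to the caller). It uses the usual reciprocal rank fusion score, the sum of `1 / (k + rank)` over the input lists; the sample IDs are hypothetical.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse ranked ID lists: score(d) = sum over lists of 1 / (k + rank of d).

    Ranks start at 1. Documents missing from a list contribute nothing for it.
    Returns document IDs ordered by fused score, best first.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical ranked results from a keyword search and a vector search.
keyword_ids = ["d1", "d2", "d3"]
vector_ids = ["d3", "d1", "d4"]
fused = reciprocal_rank_fusion([keyword_ids, vector_ids], k=60)
```

Because only ranks enter the score, BM25 scores and cosine similarities never need to be normalized onto a common scale, which is the main reason this fusion is a common starting point for hybrid search.
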