--- id: P-REINFORCE-AUTO-SIME-001 category: "10_Wiki/๐Ÿ’ก Topics/AI" confidence_score: 0.96 tags: [auto-reinforced, mathematics, similarity-metrics, statistics, vector-space] last_reinforced: 2026-04-20 --- # [[Similarity-Metrics|Similarity-Metrics]] ## ๐Ÿ“Œ ํ•œ ์ค„ ํ†ต์ฐฐ (The Karpathy Summary) > "๋ฐ์ดํ„ฐ ์‚ฌ์ด์˜ ๊ฑฐ๋ฆฌ ์ธก์ •๋ฒ•: ์„œ๋กœ ๋‹ค๋ฅธ ๋‘ ์ •๋ณด๊ฐ€ ์–ผ๋งˆ๋‚˜ ๋‹ฎ์•˜๋Š”์ง€๋ฅผ ์ˆ˜ํ•™์ ์œผ๋กœ ์ •์˜ํ•˜์—ฌ, ์ถ”์ฒœ ์‹œ์Šคํ…œ๊ณผ ๊ฒ€์ƒ‰ ์—”์ง„์ด '๋น„์Šทํ•œ ๊ฒƒ'์„ ์ฐพ์•„๋‚ผ ์ˆ˜ ์žˆ๊ฒŒ ํ•˜๋Š” ์ง€๋Šฅ์˜ ์ฒ™๋„." ## ๐Ÿ“– ๊ตฌ์กฐํ™”๋œ ์ง€์‹ (Synthesized Content) ์œ ์‚ฌ๋„ ์ธก์ • ์ง€ํ‘œ(Similarity Metrics)๋Š” ๋ฒกํ„ฐ ๊ณต๊ฐ„์— ํ‘œํ˜„๋œ ๋ฐ์ดํ„ฐ ๊ฐ์ฒด ๊ฐ„์˜ ๊ฑฐ๋ฆฌ๋‚˜ ์ƒ๊ด€๊ด€๊ณ„๋ฅผ ์ •๋Ÿ‰ํ™”ํ•˜๋Š” ์ˆ˜ํ•™์  ๋ฐฉ๋ฒ•๋ก ์ž…๋‹ˆ๋‹ค. 1. **ํ•ต์‹ฌ ์ง€ํ‘œ (Core Metrics)**: * **Cosine Similarity**: ๋‘ ๋ฒกํ„ฐ ์‚ฌ์ด์˜ ๊ฐ๋„๋ฅผ ์ธก์ •. ํ…์ŠคํŠธ ๋ฐ์ดํ„ฐ์ฒ˜๋Ÿผ ํฌ๊ธฐ(Magnitude)๋ณด๋‹ค ๋ฐฉํ–ฅ์„ฑ์ด ์ค‘์š”ํ•  ๋•Œ ์ฃผ๋กœ ์‚ฌ์šฉ. * **Euclidean Distance**: ๊ณต๊ฐ„์ƒ์˜ ์ง์„ ๊ฑฐ๋ฆฌ. ๋ฐ์ดํ„ฐ์˜ ์ ˆ๋Œ€์ ์ธ ๊ฐ’์ด ์ค‘์š”ํ•  ๋•Œ ์‚ฌ์šฉ. * **Manhattan Distance**: ๊ฒฉ์ž ๋ชจ์–‘์˜ ๊ฒฝ๋กœ ๊ฑฐ๋ฆฌ (L1 Norm). * **Jaccard Similarity**: ์ง‘ํ•ฉ ๊ฐ„์˜ ๊ต์ง‘ํ•ฉ ๋น„์ค‘์„ ์ธก์ •. ๋ฒ”์ฃผํ˜• ๋ฐ์ดํ„ฐ ๋น„๊ต์— ์ ํ•ฉ. 2. **ํ™œ์šฉ ๋ถ„์•ผ**: * **RAG (๊ฒ€์ƒ‰ ์ฆ๊ฐ• ์ƒ์„ฑ)**: ์งˆ๋ฌธ๊ณผ ๊ฐ€์žฅ ์œ ์‚ฌํ•œ ์ง€์‹ ์กฐ๊ฐ์„ ๋ฒกํ„ฐ DB์—์„œ ์ฐพ๋Š” ํ•ต์‹ฌ ์•Œ๊ณ ๋ฆฌ์ฆ˜. * **Recommender Systems**: ๋‚ด๊ฐ€ ๋ณธ ์˜ํ™”์™€ ๊ฐ€์žฅ '์œ ์‚ฌํ•œ' ์ทจํ–ฅ์˜ ์˜ํ™” ์ถ”์ฒœ. * **Anomaly Detection**: ๋‹ค๋ฅธ ๋ฐ์ดํ„ฐ๋“ค๊ณผ์˜ ๊ฑฐ๋ฆฌ๊ฐ€ ๋„ˆ๋ฌด ๋จผ '์ด์ƒ์น˜' ์‹๋ณ„. 3. **์„ ํƒ ๊ธฐ์ค€**: * ๋ฐ์ดํ„ฐ์˜ ์ฐจ์›์ˆ˜, ์ •๊ทœํ™” ์—ฌ๋ถ€, ๋น„์ฆˆ๋‹ˆ์Šค ๋ชฉ์ ์— ๋”ฐ๋ผ ์ ์ ˆํ•œ ์ง€ํ‘œ ์„ ํƒ์ด ์‹œ์Šคํ…œ ์„ฑ๋Šฅ์„ ์ขŒ์šฐํ•จ. ## โš ๏ธ ๋ชจ์ˆœ ๋ฐ ์—…๋ฐ์ดํŠธ (Contradictions & RL Update) - **๊ณผ๊ฑฐ ๋ฐ์ดํ„ฐ์™€์˜ ์ถฉ๋Œ**: ๊ณผ๊ฑฐ์—๋Š” ๋‹จ์ˆœํ•œ ๊ฑฐ๋ฆฌ ์ธก์ •๋งŒ์œผ๋กœ ์ถฉ๋ถ„ํ–ˆ์œผ๋‚˜, ๊ณ ์ฐจ์› ๋ฐ์ดํ„ฐ๊ฐ€ ํญ์ฆํ•˜๋ฉฐ '์ฐจ์›์˜ ์ €์ฃผ' ๋ฌธ์ œ๊ฐ€ ๋ฐœ์ƒ. ์ด์— ๋”ฐ๋ผ ๋‹จ์ˆœํžˆ ๊ฐ€๊น๋‹ค๊ณ  ๋น„์Šทํ•œ ๊ฒƒ์ด ์•„๋‹ˆ๋ผ ์˜๋ฏธ์ ์œผ๋กœ ์œ ์‚ฌํ•œ์ง€๋ฅผ ํŒŒ์•…ํ•˜๋Š” '์ž„๋ฒ ๋”ฉ ๊ธฐ๋ฐ˜ ์œ ์‚ฌ๋„'๋กœ ์ •์ฑ…์ด ์ด๋™ํ•จ(RL Update). - **์ •์ฑ… ๋ณ€ํ™”(RL Update)**: ๊ฐœ์ธํ™” ์ถ”์ฒœ ์ •์ฑ… ์ˆ˜๋ฆฝ ์‹œ, ๋‹จ์ˆœํžˆ ๊ณผ๊ฑฐ ์œ ์‚ฌ๋„๋งŒ ๋”ฐ์ง€๋Š” ๊ฒƒ์ด ์•„๋‹ˆ๋ผ ์œ ์ €์˜ '์˜๋„ ๋ณ€ํ™”'๋ฅผ ์‹ค์‹œ๊ฐ„ ๋ฐ˜์˜ํ•˜๋Š” ๊ฐ€๋ณ€์  ์œ ์‚ฌ๋„ ๊ฐ€์ค‘์น˜ ์ •์ฑ…์ด ํ‘œ์ค€ํ™”๋จ. ## ๐Ÿ”— ์ง€์‹ ์—ฐ๊ฒฐ (Graph) - Vector Semantics, [[RAG (แ„€แ…ฅแ†ทแ„‰แ…ขแ†จ แ„Œแ…ณแ†ผแ„€แ…กแ†ผ แ„‰แ…ขแ†ผแ„‰แ…ฅแ†ผ)|RAG (๊ฒ€์ƒ‰ ์ฆ๊ฐ• ์ƒ์„ฑ)]], [[Statistics & Data Analysis|Statistics & Data Analysis]], Information Extraction (IE), [[Principles-of-Data-Connect|Principles-of-Data-Connect]] - **Modern Tech/Tools**: Faiss (Facebook AI Similarity Search), Scipy Spatial, Pinecone. ---