---
id: wiki-2026-0508-superficiality-metrics
title: Superficiality Metrics
category: 10_Wiki/Topics
status: verified
canonical_id: self
aliases: [Engagement Quality Metrics, Depth Metrics, Content Quality Signals]
duplicate_of: none
source_trust_level: B
confidence_score: 0.85
verification_status: applied
tags: [metrics, content-quality, engagement, evaluation]
raw_sources: []
last_reinforced: 2026-05-10
github_commit: pending
tech_stack:
  language: python
  framework: pandas
---

# Superficiality Metrics

## In one line

> **"Measuring the depth of engagement."** Looking only at surface metrics such as CTR or time-on-page rewards clickbait, so deeper signals (scroll completion, return visits, comment quality, downstream conversion) need to be measured as well. As of 2026, LLM-as-judge quality scoring is mainstream.

## Core concepts

### Surface vs depth metrics
- **Surface**: CTR, time-on-page, bounce rate, like count.
- **Depth**: scroll depth, dwell quality (focus events), return visit %, share-with-comment, subscription, downstream action (purchase, signup).

### LLM-judged content quality
- **Coherence**: logical flow.
- **Substantive density**: information measured per fact or claim.
- **Originality**: detection of generic LLM output.
- **Actionability**: how likely the reader is to come away with a usable takeaway.

### Applications
1. Content recommendation ranking (the newer signals used by YouTube and TikTok).
2. Knowledge-base quality gating (acceptance of wiki articles).
3. Measuring learning outcomes on education platforms.
4. ROI evaluation for newsletters and blogs.

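The surface vs depth distinction above can be made concrete with a small ranking comparison. The following is a minimal sketch, not a production scorer: the `ArticleStats` fields, the two sample articles, and the unweighted `depth_signal` are all hypothetical and exist only to show how the same articles rank differently under a surface metric and a depth metric.

```python
from dataclasses import dataclass


@dataclass
class ArticleStats:
    """Hypothetical per-article aggregates; field names are illustrative."""
    article_id: str
    ctr: float           # surface metric: clicks / impressions
    scroll_depth: float  # depth metric: mean max-scroll fraction, 0..1
    return_rate: float   # depth metric: fraction of readers who return, 0..1


def depth_signal(a: ArticleStats) -> float:
    # Unweighted mean of two depth metrics (an assumption for illustration;
    # real weights are domain-dependent).
    return (a.scroll_depth + a.return_rate) / 2


# Made-up numbers, chosen so the two rankings disagree.
articles = [
    ArticleStats("clickbait-headline", ctr=0.18, scroll_depth=0.20, return_rate=0.02),
    ArticleStats("in-depth-guide", ctr=0.06, scroll_depth=0.85, return_rate=0.30),
]

by_surface = sorted(articles, key=lambda a: a.ctr, reverse=True)
by_depth = sorted(articles, key=depth_signal, reverse=True)

print([a.article_id for a in by_surface])  # clickbait-headline ranks first
print([a.article_id for a in by_depth])    # in-depth-guide ranks first
```

Ranking by CTR alone puts the clickbait article on top, while the depth signal reverses the order. That reversal is the failure mode the depth metrics are meant to catch.
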
## 💻 Patterns

### Scroll depth tracking
```typescript
let maxScroll = 0;

window.addEventListener('scroll', () => {
  // Scrollable distance; 0 when the page fits inside the viewport.
  const scrollable = document.body.scrollHeight - window.innerHeight;
  // Guard against division by zero and clamp to [0, 1] (overscroll can exceed it).
  const scrollPct = scrollable > 0
    ? Math.min(1, Math.max(0, window.scrollY / scrollable))
    : 0;
  if (scrollPct > maxScroll) maxScroll = scrollPct;
}, { passive: true });

// pagehide tends to fire more reliably than beforeunload, especially on mobile.
window.addEventListener('pagehide', () => {
  navigator.sendBeacon('/analytics', JSON.stringify({
    page: location.pathname,
    maxScroll,
    duration: performance.now(), // ms since navigation start
  }));
});
```

### Dwell quality (focus + scroll)
```typescript
let focusedTime = 0;
// Start the clock only if the page is visible at load time.
let lastFocusStart: number | null = document.hidden ? null : performance.now();

document.addEventListener('visibilitychange', () => {
  if (document.hidden && lastFocusStart != null) {
    // Page went to the background: close the current visible interval.
    focusedTime += performance.now() - lastFocusStart;
    lastFocusStart = null;
  } else if (!document.hidden && lastFocusStart == null) {
    // Page is visible again: open a new interval.
    lastFocusStart = performance.now();
  }
});
```

### LLM-as-judge quality score
```python
import json

from anthropic import Anthropic

client = Anthropic()

def score_content(text: str) -> dict:
    resp = client.messages.create(
        model="claude-opus-4-7",  # keep the model pinned to limit judge drift
        max_tokens=512,
        temperature=0,  # deterministic scoring, per the anti-pattern note
        messages=[{
            "role": "user",
            "content": f"""Rate the following article on 4 axes (0-10 each):
- coherence (logical flow)
- density (info per paragraph)
- originality (vs generic LLM output)
- actionability (reader takeaway)

Return strict JSON: {{"coherence": N, "density": N, "originality": N, "actionability": N, "rationale": "..."}}

ARTICLE:
{text[:8000]}"""
        }]
    )
    # Assumes the reply is bare JSON; json.loads raises ValueError otherwise.
    return json.loads(resp.content[0].text)
```

### Composite depth score
```python
def depth_score(metrics: dict) -> float:
    # weights tuned on labeled training set
    w = {
        'scroll_completion': 0.15,
        'focused_dwell_ratio': 0.25,
        'return_within_7d': 0.20,
        'downstream_action': 0.25,
        'share_with_comment': 0.15,
    }
    # Missing metrics count as 0; inputs are expected to be in [0, 1].
    return sum(w[k] * metrics.get(k, 0) for k in w)
```

### Clickbait detector heuristic
```python
def clickbait_signal(row):
    # high CTR + low depth = clickbait
    if row['ctr'] > 0.10 and row['depth_score'] < 0.3:
        return 1.0
    return 0.0
```

### Pandas pipeline
```python
import pandas as pd

df = pd.read_parquet('events.parquet')

agg = (
    df.groupby('article_id')
    .agg(
        clicks=('clicks', 'sum'),
        impressions=('impressions', 'sum'),
        avg_scroll=('max_scroll', 'mean'),
        return_rate=('returned_7d', 'mean'),
    )
    .assign(
        # CTR is derived after aggregation; named-aggregation tuples cannot be divided.
        ctr=lambda d: d['clicks'] / d['impressions'],
        depth_score=lambda d: 0.4 * d['avg_scroll'] + 0.6 * d['return_rate'],
    )
)
```

## Decision criteria

| Situation | Metric |
|---|---|
| Ad-supported (need clicks) | CTR + minimal depth floor |
| Subscription / paid | depth_score primary |
| Education / learning | actionability + post-test outcome |
| Knowledge wiki | LLM coherence + density |
| Social platform | share-with-comment, return visit |

**Default**: composite depth score (50% behavioral + 50% LLM-judged).

## 🔗 Graph
- Parent: [[Evaluation]]
- Adjacent: [[LLM-as-Judge]] · [[Goodhart_s-Law]]

## 🤖 LLM usage

**When to use**: as a reranking signal for content recommendation, as a KB article quality gate, or as a secondary metric in A/B tests.

**When not to use**: small samples (variance is too high) and acquisition-stage funnels (CTR is the primary metric there).

## ❌ Anti-patterns
- **Single metric optimization**: Goodhart's law applies; optimizing CTR alone produces clickbait.
- **LLM judge prompt drift**: a pinned model, temperature 0, and a version log are required.
- **Depth metric latency**: a 7-day return-visit window means delayed feedback. Track a faster surrogate (focused dwell) alongside it.

## 🧪 Verification / duplicates
- Verified (Goodhart 1975; Zheng et al. 2023 LLM-as-judge; YouTube Watch Time → "Valued Watch Time" pivot ~2017).
- Trust level B (the weighting is domain-dependent).

## 🕓 Changelog

| Date | Change |
|---|---|
| 2026-05-08 | Phase 1 |
| 2026-05-10 | Manual cleanup: surface vs depth + LLM judge + composite scoring |