---
id: wiki-2026-0508-excessive-agency
title: Excessive Agency (LLM)
category: 10_Wiki/Topics
status: verified
canonical_id: self
aliases: [excessive agency, OWASP LLM06, agent over-permission, tool abuse, autonomous risk]
duplicate_of: none
source_trust_level: A
confidence_score: 0.92
verification_status: applied
tags: [llm-security, owasp, excessive-agency, agent-safety, tool-use, permission]
raw_sources: []
last_reinforced: 2026-05-10
github_commit: pending
tech_stack:
  language: Python / TypeScript
  framework: LangChain / LlamaIndex / Custom Agent
---

# Excessive Agency

## One-Line Summary
> **"An LLM agent granted too much permission, functionality, or autonomy."** Listed in the OWASP LLM Top 10 as LLM06. Once manipulated, such an agent can carry out destructive actions. Mitigation: least privilege + human-in-the-loop (HITL) + sandboxed tools.

## Core Concepts

### Sub-types
- **Excessive functionality**: the agent has more tools than the task needs.
- **Excessive permission**: tools run with broader access than required.
- **Excessive autonomy**: high-impact actions proceed with no HITL check.

### Attack scenarios
- **Prompt injection**: injected instructions drive the agent to abuse its tools.
- **Indirect injection** (web page): instructions hidden in fetched content enter the agent's context.
- **Cross-tool**: data read from a DB is sent out by email.
- **Agent escalation**: the agent grants itself more capability.

### Applied risks
1. **Email agent**: can send to anyone.
2. **DB agent**: can run `DROP TABLE`.
3. **Browser agent**: can visit sensitive sites.
4. **Code agent**: can run `git push --force`.
5. **Multi-agent**: a manager agent prompt-injects a worker agent.

### Mitigation
- **Least privilege**.
- **Read-only by default**.
- **HITL for destructive actions**.
- **Sandbox / capability**.
- **Rate limit**.
- **Audit log**.
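The cross-tool scenario above (read from a DB, then send an email) can be blocked with a session-level rule: once the agent has touched a sensitive source, any outbound tool call needs approval. The following is a minimal sketch under stated assumptions. `SessionPolicy`, `SENSITIVE_SOURCES`, `OUTBOUND_SINKS`, and the tool names are hypothetical and not part of any framework.

```python
from dataclasses import dataclass, field

# Hypothetical tool classification; the names are illustrative only.
SENSITIVE_SOURCES = {"read_db", "read_doc"}
OUTBOUND_SINKS = {"send_email", "http_post"}


@dataclass
class SessionPolicy:
    """Tracks whether this agent session has read sensitive data."""
    tainted: bool = False
    history: list = field(default_factory=list)

    def decide(self, tool_name: str) -> str:
        """Return 'allow' or 'needs_approval' for the next tool call."""
        if tool_name in OUTBOUND_SINKS and self.tainted:
            decision = "needs_approval"
        else:
            decision = "allow"
        # Reading a sensitive source taints the rest of the session.
        if tool_name in SENSITIVE_SOURCES:
            self.tainted = True
        self.history.append((tool_name, decision))
        return decision


policy = SessionPolicy()
first_send = policy.decide("send_email")   # nothing sensitive read yet
db_read = policy.decide("read_db")         # taints the session
second_send = policy.decide("send_email")  # now gated behind approval
```

The rule is deliberately coarse: it does not track which data flows where, only that a sensitive read happened earlier in the session. That costs some extra approvals but avoids relying on the model to report its own data flow.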
## 💻 Patterns

### Tool whitelist (least privilege)
```python
def safe_agent_tools(user_role):
    """Return the tool names a role may use; unknown roles get read-only tools."""
    base = ['search', 'read_doc']
    # super_admin inherits the admin tools (the original granted it only delete_doc).
    if user_role in ('admin', 'super_admin'):
        base += ['write_doc', 'send_email']
    if user_role == 'super_admin':
        base += ['delete_doc']
    return base

# create_agent and current_user are placeholders for the surrounding framework.
agent = create_agent(tools=safe_agent_tools(current_user.role))
```

### HITL approval
```python
def execute_tool(tool, args):
    # Destructive tools run only after an explicit human approval.
    if tool.is_destructive:
        approval = request_human_approval(f"Approve {tool.name} with {args}?")
        if not approval:
            return {'status': 'denied'}
    return tool.run(**args)
```

### Sandboxed file write
```python
import os

# Resolve symlinks once so the containment check compares real paths.
SANDBOX = os.path.realpath('/tmp/agent-sandbox')
os.makedirs(SANDBOX, exist_ok=True)

class SecurityError(Exception):
    pass

def safe_write(filename, content):
    full = os.path.realpath(os.path.join(SANDBOX, filename))
    # commonpath avoids the prefix bug where '/tmp/agent-sandbox-evil' passes startswith().
    if os.path.commonpath([SANDBOX, full]) != SANDBOX:
        raise SecurityError('Outside sandbox')
    with open(full, 'w') as f:
        f.write(content)
```

### Read-only DB role
```sql
CREATE ROLE agent_readonly;
GRANT SELECT ON ALL TABLES IN SCHEMA public TO agent_readonly;
-- No INSERT / UPDATE / DELETE is granted.
```

### Capability token
```python
import time
import jwt  # PyJWT

# KEY is the signing secret, assumed to be loaded elsewhere.
class Capability:
    def __init__(self, action, resource, expires_in=300):
        claims = {'action': action, 'resource': resource,
                  'exp': int(time.time()) + expires_in}
        self.token = jwt.encode(claims, KEY, algorithm='HS256')

    def use(self, action, resource):
        # decode() verifies the signature and expiry and raises on failure.
        claims = jwt.decode(self.token, KEY, algorithms=['HS256'])
        if claims['action'] != action or claims['resource'] != resource:
            raise PermissionError('Capability does not cover this request')
        return True
# The agent can use only the capability it holds, nothing broader.
```

### Rate limit + budget
```python
class LimitError(Exception):
    pass

class AgentBudget:
    def __init__(self, max_calls=50, max_cost_usd=1.0):
        self.calls = 0
        self.cost = 0.0
        self.max_calls = max_calls
        self.max_cost = max_cost_usd

    def check(self, estimated_cost):
        # Call before each tool invocation.
        if self.calls >= self.max_calls:
            raise LimitError('Call limit')
        if self.cost + estimated_cost > self.max_cost:
            raise LimitError('Cost limit')

    def record(self, cost):
        # Call after each tool invocation with the actual cost.
        self.calls += 1
        self.cost += cost
```

### Tool taint analysis
```python
def detect_tainted_input(tool, tool_args):
    """Escalate to approval when marked user input flows into a high-impact tool."""
    # TAINT_MARKERS and require_approval are assumed to be defined elsewhere.
    if any(t in str(tool_args) for t in TAINT_MARKERS):
        return require_approval(tool, tool_args)
    return False
```

### Indirect injection check
```python
import re
from bs4 import BeautifulSoup

def sanitize_external_content(html):
    """Strip active tags from fetched HTML and flag likely injected instructions."""
    soup = BeautifulSoup(html, 'lxml')
    for tag in soup.find_all(['script', 'iframe']):
        tag.decompose()
    text = soup.get_text()
    # Common prompt-injection phrasings; a heuristic, not a complete filter.
    if re.search(r'(ignore (previous|all) instructions|new task|system:)', text, re.I):
        return text + "\n\n[NOTE: Suspicious content detected]"
    return text
```

### Multi-agent isolation
```python
class IsolatedAgent:
    def __init__(self, role, tools):
        self.role = role
        self.tools = tools  # role-specific tool set

    def receive(self, msg, sender):
        # Do not trust tool requests that come from other agents.
        if sender != 'human' and msg.requests_tool_use:
            return 'Agent-to-agent tool requests not allowed'
        # Normal message handling would follow here.
```

### Audit log
```python
def audit_tool_use(tool, args, result, user):
    # log, now, current_session, and hash_sensitive are assumed helpers.
    log({
        'timestamp': now(),
        'user': user,
        'agent_session': current_session.id,
        'tool': tool.name,
        'args': hash_sensitive(args),  # never store raw sensitive arguments
        'result_status': result.status,
        'cost_usd': result.cost,
    })
```

### Dry-run mode
```python
def dry_run_tool(tool, args):
    """Simulate a destructive action without executing it."""
    plan = tool.plan(**args)
    return f"DRY RUN: would {plan.summary}"
```

### Reversibility check
```python
def assess_reversibility(tool, args):
    if tool.action == 'delete' and not args.get('soft'):
        return 'irreversible'
    if tool.action == 'send_message':
        return 'visible_to_others'
    if tool.action == 'transfer_money':
        return 'irreversible'
    return 'safe'
# Anything classified 'irreversible' should go through HITL.
```

## Decision Criteria

| Situation | Mitigation |
|---|---|
| Read-only research | Whitelist + audit |
| Write actions | HITL approval |
| Destructive | Reversibility + HITL |
| Multi-agent | Isolation |
| Public-facing | Sandbox + budget |
| Sensitive data | Capability token |

**Default**: least-privilege tools + HITL for destructive actions + sandbox + audit + budget cap.
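The decision table above can be encoded as a lookup so that an agent's configuration is checked against it at startup. This is only a sketch: the situation keys and mitigation labels are illustrative names chosen here, and `required_mitigations` is a hypothetical helper, not an API of LangChain or LlamaIndex.

```python
# Rows of the decision table, keyed by situation (labels are illustrative).
MITIGATIONS = {
    "read_only_research": {"whitelist", "audit"},
    "write_actions": {"hitl"},
    "destructive": {"reversibility_check", "hitl"},
    "multi_agent": {"isolation"},
    "public_facing": {"sandbox", "budget"},
    "sensitive_data": {"capability_token"},
}

# The default row: always applied, whatever the situation.
DEFAULTS = {"whitelist", "hitl_destructive", "sandbox", "audit", "budget"}


def required_mitigations(situations):
    """Return the union of the defaults and every matching table row."""
    required = set(DEFAULTS)
    for situation in situations:
        if situation not in MITIGATIONS:
            raise ValueError(f"Unknown situation: {situation}")
        required |= MITIGATIONS[situation]
    return required


def missing_mitigations(situations, enabled):
    """Return the mitigations the table requires that are not enabled."""
    return required_mitigations(situations) - set(enabled)


gaps = missing_mitigations(
    ["destructive", "public_facing"],
    enabled=["whitelist", "sandbox", "audit", "budget", "hitl_destructive"],
)
```

Rejecting unknown situations, rather than silently ignoring them, keeps a typo in the configuration from weakening the check.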
## 🔗 Graph
- Applications: [[Tool-Use]] · [[HITL]]
- Adjacent: [[Prompt-Injection]] · [[AI_Safety_and_Alignment|Constitutional-AI]]

## 🤖 LLM Usage
**When**: every agent, any tool-using LLM, any production deployment.

**When not**: sandboxed, development-only setups.

## ❌ Anti-patterns
- **All-tools agent**: broad attack surface.
- **No HITL on destructive actions**: a single mistake is enough to cause damage.
- **Implicit trust of fetched content**: opens the door to indirect injection.
- **No budget**: runaway cost.
- **No audit**: forensics become impossible.

## 🧪 Verification / Duplicates
- Verified (OWASP LLM Top 10 2024, Anthropic agent safety).
- Trust level: A.

## 🕓 Changelog
| Date | Change |
|---|---|
| 2026-05-08 | Phase 1 |
| 2026-05-10 | Manual cleanup: agency types + whitelist / HITL / sandbox / budget code |