---
id: wiki-2026-0508-excessive-agency
title: Excessive Agency (LLM)
category: 10_Wiki/Topics
status: verified
canonical_id: self
aliases: [excessive agency, OWASP LLM06, agent over-permission, tool abuse, autonomous risk]
duplicate_of: none
source_trust_level: A
confidence_score: 0.92
verification_status: applied
tags: [llm-security, owasp, excessive-agency, agent-safety, tool-use, permission]
raw_sources: []
last_reinforced: 2026-05-10
github_commit: pending
tech_stack:
  language: Python / TypeScript
  framework: LangChain / LlamaIndex / Custom Agent
---

# Excessive Agency

## One-line summary

> **"An LLM agent that holds more permissions, functionality, or autonomy than its task requires."** OWASP LLM Top 10 (LLM06). Once the agent is manipulated, that excess turns into destructive actions. Mitigation: least privilege + human-in-the-loop (HITL) approval + sandboxed tools.

## Core concepts

### Sub-types

- **Excessive functionality**: the agent has more tools than the task needs.
- **Excessive permission**: a tool has broader access than the task needs.
- **Excessive autonomy**: high-impact actions run with no human in the loop (HITL).
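
The three sub-types can be checked mechanically against a description of what the task actually needs. The following is a minimal sketch under stated assumptions: `ToolGrant`, `audit_agency`, the tool names, and the scope strings are all hypothetical and not part of any agent framework.

```python
from dataclasses import dataclass

@dataclass
class ToolGrant:
    """One tool handed to the agent (hypothetical structure)."""
    name: str
    scopes: set[str]              # access the tool actually holds
    destructive: bool = False
    needs_approval: bool = False  # True when a human must confirm each call

def audit_agency(grants, needed_tools, needed_scopes):
    """Group findings by sub-type: functionality, permission, autonomy."""
    findings = {'functionality': [], 'permission': [], 'autonomy': []}
    for g in grants:
        if g.name not in needed_tools:
            # The task does not need this tool at all.
            findings['functionality'].append(g.name)
            continue
        extra = g.scopes - needed_scopes.get(g.name, set())
        if extra:
            # The tool is needed, but it holds more access than required.
            findings['permission'].append((g.name, sorted(extra)))
        if g.destructive and not g.needs_approval:
            # A destructive tool would run without a human in the loop.
            findings['autonomy'].append(g.name)
    return findings

grants = [
    ToolGrant('read_doc', {'docs:read'}),
    ToolGrant('send_email', {'mail:send'}, destructive=True),
    ToolGrant('delete_doc', {'docs:read', 'docs:delete', 'docs:admin'}, destructive=True),
]
report = audit_agency(
    grants,
    needed_tools={'read_doc', 'delete_doc'},
    needed_scopes={'read_doc': {'docs:read'}, 'delete_doc': {'docs:delete'}},
)
```

Here `send_email` is reported as excess functionality, while `delete_doc` is reported for both excess permission and excess autonomy.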
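
### Attack scenarios

- **Prompt injection**: crafted input steers the agent into abusing a tool.
- **Indirect injection**: instructions hidden in fetched content (for example, a web page) are injected into the agent's context.
- **Cross-tool chaining**: output from one tool feeds another, such as reading a database and then sending the result by email.
- **Agent escalation**: the agent uses its tools to grant itself more capability.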
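
The cross-tool scenario (read DB, then send email) can be blunted by tracking what a session has already touched. The sketch below is one possible approach, not a standard API: `FlowGuard` and the tool names in both sets are hypothetical.

```python
SENSITIVE_SOURCES = {'read_db', 'read_file'}   # hypothetical tool names
EGRESS_TOOLS = {'send_email', 'http_post'}     # hypothetical tool names

class FlowGuard:
    """Tracks one agent session and blocks egress after a sensitive read."""

    def __init__(self):
        self.tainted = False

    def allow(self, tool_name):
        if tool_name in EGRESS_TOOLS and self.tainted:
            # Sensitive data may be in context: refuse, or escalate to a human.
            return False
        if tool_name in SENSITIVE_SOURCES:
            self.tainted = True
        return True

guard = FlowGuard()
decisions = [guard.allow(t) for t in ['send_email', 'read_db', 'search', 'send_email']]
```

The first `send_email` is allowed because nothing sensitive has been read yet; the second is refused because `read_db` ran in between.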

### Risks by application

1. **Email agent**: can send mail to any recipient.
2. **DB agent**: can run `DROP TABLE`.
3. **Browser agent**: can visit sensitive sites.
4. **Code agent**: can run `git push --force`.
5. **Multi-agent**: a manager agent can prompt-inject a worker.
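
Two of the risks above (`DROP TABLE` and `git push --force`) can be caught by a pattern check before the command reaches the tool. This is a heuristic backstop under stated assumptions, not a SQL or shell parser: `needs_approval`, the `kind` labels, and the patterns are illustrative only.

```python
import re

# Statements that remove whole objects or all rows.
DESTRUCTIVE_SQL = re.compile(r'\b(drop\s+(table|database|schema)|truncate)\b', re.I)
# DELETE with no WHERE clause.
DELETE_NO_WHERE = re.compile(r'\bdelete\s+from\s+\S+\s*(;|$)', re.I)
# git push with --force (including --force-with-lease) or -f.
FORCE_PUSH = re.compile(r'^\s*git\s+push\b.*(\s--force\b|\s-f\b)')

def needs_approval(kind, command):
    """Return True when the command should be routed to a human first."""
    if kind == 'sql':
        return bool(DESTRUCTIVE_SQL.search(command) or DELETE_NO_WHERE.search(command))
    if kind == 'shell':
        return bool(FORCE_PUSH.search(command))
    return True  # unknown command kinds fail closed

checks = {
    'drop': needs_approval('sql', 'DROP TABLE users;'),
    'select': needs_approval('sql', 'SELECT * FROM users'),
    'force_push': needs_approval('shell', 'git push --force origin main'),
    'plain_push': needs_approval('shell', 'git push origin main'),
}
```

Pattern matching like this is easy to evade, so it complements database roles and branch protection rather than replacing them.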

### Mitigations

- **Least privilege**.
- **Read-only by default**.
- **HITL for destructive actions**.
- **Sandboxing / capability tokens**.
- **Rate limits**.
- **Audit logs**.

## 💻 Patterns

### Tool whitelist (least privilege)

```python
def safe_agent_tools(user_role):
    """Return the tools a role may use; unknown roles get the read-only base set."""
    tools = ['search', 'read_doc']
    if user_role in ('admin', 'super_admin'):  # super_admin inherits the admin tools
        tools += ['write_doc', 'send_email']
    if user_role == 'super_admin':
        tools += ['delete_doc']
    return tools

# create_agent and current_user come from the host application (not defined here).
agent = create_agent(tools=safe_agent_tools(current_user.role))
```

### HITL approval

```python
def execute_tool(tool, args):
    """Run a tool, pausing for human approval when it is destructive."""
    if tool.is_destructive:
        # request_human_approval is an application-provided hook returning True or False.
        approved = request_human_approval(f"Approve {tool.name} with {args}?")
        if not approved:
            return {'status': 'denied'}
    return tool.run(**args)
```

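### Sandboxed file write

```python
import os

# Resolve symlinks once so the comparison below uses canonical paths.
SANDBOX = os.path.realpath('/tmp/agent-sandbox')
os.makedirs(SANDBOX, exist_ok=True)

class SecurityError(Exception):
    """Raised when a path escapes the sandbox."""

def safe_write(filename, content):
    full = os.path.realpath(os.path.join(SANDBOX, filename))
    # Compare whole path components; a bare startswith() would also accept
    # sibling directories such as /tmp/agent-sandbox-evil.
    if os.path.commonpath([SANDBOX, full]) != SANDBOX:
        raise SecurityError('Outside sandbox')
    with open(full, 'w') as f:
        f.write(content)
```
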
### Read-only DB role

```sql
CREATE ROLE agent_readonly;
GRANT SELECT ON ALL TABLES IN SCHEMA public TO agent_readonly;
-- No INSERT / UPDATE / DELETE privileges are granted
```

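### Capability token

```python
import time
import jwt  # PyJWT

KEY = 'replace-with-a-real-secret'  # placeholder; load from a secret store in practice

class Capability:
    def __init__(self, action, resource, expires_in=300):
        claims = {'action': action, 'resource': resource, 'exp': int(time.time()) + expires_in}
        self.token = jwt.encode(claims, KEY, algorithm='HS256')

    def use(self, action, resource):
        # decode() verifies the signature and rejects expired tokens.
        claims = jwt.decode(self.token, KEY, algorithms=['HS256'])
        if claims['action'] != action or claims['resource'] != resource:
            raise PermissionError('Capability does not cover this action')
        return True

# The agent can use only the capability it was issued, nothing broader.
```
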
### Rate limit + budget

```python
class LimitError(Exception):
    """Raised when the agent exhausts its call or cost budget."""

class AgentBudget:
    def __init__(self, max_calls=50, max_cost_usd=1.0):
        self.calls = 0
        self.cost = 0.0
        self.max_calls = max_calls
        self.max_cost = max_cost_usd

    def check(self, estimated_cost):
        # Call before each tool invocation.
        if self.calls >= self.max_calls:
            raise LimitError('Call limit')
        if self.cost + estimated_cost > self.max_cost:
            raise LimitError('Cost limit')

    def record(self, cost):
        # Call after each tool invocation with the actual cost.
        self.calls += 1
        self.cost += cost
```

### Tool taint analysis

```python
# TAINT_MARKERS (strings that tag untrusted input) and require_approval
# are application-defined; neither is shown here.
def detect_tainted_input(tool, tool_args):
    """If marked untrusted input flows into a high-impact tool call, escalate."""
    if any(marker in str(tool_args) for marker in TAINT_MARKERS):
        return require_approval(tool, tool_args)
    return False
```

### Indirect injection check

```python
import re
from bs4 import BeautifulSoup  # the 'lxml' parser also requires the lxml package

def sanitize_external_content(html):
    """Strip active content from fetched HTML and flag likely injected instructions."""
    soup = BeautifulSoup(html, 'lxml')
    for tag in soup.find_all(['script', 'iframe']):
        tag.decompose()
    text = soup.get_text()
    # Heuristic prompt-injection patterns; a match flags the text, it does not remove anything.
    if re.search(r'(ignore (previous|all) instructions|new task|system:)', text, re.I):
        return text + "\n\n[NOTE: Suspicious content detected]"
    return text
```

### Multi-agent isolation

```python
class IsolatedAgent:
    def __init__(self, role, tools):
        self.role = role
        self.tools = tools  # role-specific tool set

    def receive(self, msg, sender):
        # Do not trust tool requests that come from other agents.
        if sender != 'human' and msg.requests_tool_use:
            return 'Agent-to-agent tool requests not allowed'
        # Normal message handling is omitted in this sketch.
```

### Audit log

```python
# log, now, hash_sensitive, and current_session are application-provided helpers.
def audit_tool_use(tool, args, result, user):
    log({
        'timestamp': now(),
        'user': user,
        'agent_session': current_session.id,
        'tool': tool.name,
        'args': hash_sensitive(args),  # avoid storing raw sensitive arguments
        'result_status': result.status,
        'cost_usd': result.cost,
    })
```

### Dry-run mode

```python
def dry_run_tool(tool, args):
    """Simulate a destructive action instead of executing it."""
    plan = tool.plan(**args)
    return f"DRY RUN: would {plan.summary}"
```

### Reversibility check

```python
def assess_reversibility(tool, args):
    if tool.action == 'delete' and not args.get('soft'):
        return 'irreversible'
    if tool.action == 'send_message':
        return 'visible_to_others'
    if tool.action == 'transfer_money':
        return 'irreversible'
    return 'safe'

# Anything irreversible goes to HITL approval.
```

## Decision criteria

| Situation | Mitigation |
|---|---|
| Read-only research | Whitelist + audit |
| Write actions | HITL approval |
| Destructive | Reversibility + HITL |
| Multi-agent | Isolation |
| Public-facing | Sandbox + budget |
| Sensitive data | Capability token |

**Default**: least-privilege tools + HITL for destructive actions + sandbox + audit log + budget cap.
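
The table and the default can be encoded as a small lookup so that agent setup code selects controls consistently. This is only a sketch: the situation keys, the short control names, and `mitigations_for` are hypothetical labels chosen for this example.

```python
# Default controls applied to every agent.
BASELINE = ['whitelist', 'hitl', 'sandbox', 'audit', 'budget']

# One entry per table row; keys and values are illustrative labels.
BY_SITUATION = {
    'read_only_research': ['whitelist', 'audit'],
    'write_actions': ['hitl'],
    'destructive': ['reversibility', 'hitl'],
    'multi_agent': ['isolation'],
    'public_facing': ['sandbox', 'budget'],
    'sensitive_data': ['capability_token'],
}

def mitigations_for(situations):
    """Baseline plus the table rows for each situation, without duplicates."""
    selected = list(BASELINE)
    for situation in situations:
        # A KeyError on an unknown situation is intentional: fail loudly.
        for control in BY_SITUATION[situation]:
            if control not in selected:
                selected.append(control)
    return selected

controls = mitigations_for(['destructive', 'sensitive_data'])
```

For a destructive agent handling sensitive data, this adds the reversibility check and capability tokens on top of the baseline.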

## 🔗 Graph

- Applications: [[Tool-Use]] · [[HITL]]
- Adjacent: [[Prompt-Injection]] · [[AI_Safety_and_Alignment|Constitutional-AI]]

## 🤖 LLM usage

**When to apply**: every agent, every tool-using LLM, every production deployment.
**When to skip**: development that stays entirely inside a sandbox.

## ❌ Anti-patterns

- **All-tools agent**: broad attack surface.
- **No HITL on destructive actions**: a single mistake is enough to cause damage.
- **Implicit trust in fetched content**: opens the door to indirect injection.
- **No budget**: runaway cost.
- **No audit log**: forensics become impossible.

## 🧪 Verification / duplicates

- Verified (OWASP LLM Top 10 2024, Anthropic agent safety).
- Trust level: A.

## 🕓 Changelog

| Date | Change |
|---|---|
| 2026-05-08 | Phase 1 |
| 2026-05-10 | Manual cleanup: agency types + whitelist / HITL / sandbox / budget code |