2nd/10_Wiki/Topics/AI_and_ML/Domain-Specific-Languages.md
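koriweb d8a80f6272 chore(wiki): canonical normalization of dangling links (768 files / 1,200 links)
Replaced [[wikilinks]] that differed only in name (spelling variants) with the canonical title of the target document,
reconnecting 1,200 broken links. Only title/filename normalization matches were applied; alias matching was
excluded because of the risk of over-merging (ambiguity guard). Originals are backed up in _link_reconcile_backup/.
Tool: Datacollect/scripts/link_reconcile_apply.mjs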

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-08 12:24:15 +09:00


id: wiki-2026-0508-dsl
title: Domain-Specific Languages (DSL)
category: 10_Wiki/Topics
status: verified
canonical_id: self
aliases: DSL, internal DSL, external DSL, fluent API, embedded DSL, NL-to-DSL
duplicate_of: none
source_trust_level: A
confidence_score: 0.9
verification_status: applied
tags: dsl, programming-languages, abstraction, fluent-api, ddd, sql, regex, llm-dsl
raw_sources:
last_reinforced: 2026-05-10
github_commit: pending
tech_stack: language: any; framework: ANTLR / PEG / Tree-sitter

DSL (Domain-Specific Language)

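One-line summary

"A language optimized for a specific domain." SQL, regex, HTML, and CSS are ubiquitous examples. A DSL is either internal (embedded in a host language) or external (with its own grammar). In the LLM era, natural language → DSL is one of the most effective interfaces for putting a model in front of a system. Infrastructure as code has also driven a boom in configuration DSLs such as Terraform's HCL.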

Core concepts

Types

External DSL

  • Has its own grammar.
  • Requires a parser or compiler.
  • Examples: SQL, regex, CSS, GraphQL, Gherkin, HCL (Terraform).

Internal DSL (Embedded)

  • Lives inside the host language's syntax.
  • Usually exposed as a fluent API.
  • Examples: jQuery, RSpec, Gradle (Kotlin DSL), zod schemas.
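
To make "inside the host language's syntax" concrete, here is a minimal sketch of an internal DSL in Python that reuses the language's own comparison and bitwise operators to build a parameterized SQL condition. The `Field` and `Expr` names are hypothetical and not taken from any library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Expr:
    """A SQL fragment plus its bound parameters."""
    sql: str
    params: tuple = ()

    def __and__(self, other: "Expr") -> "Expr":
        return Expr(f"({self.sql} AND {other.sql})", self.params + other.params)

    def __or__(self, other: "Expr") -> "Expr":
        return Expr(f"({self.sql} OR {other.sql})", self.params + other.params)


class Field:
    """A column reference; comparisons build Expr objects instead of booleans."""

    def __init__(self, name: str) -> None:
        self.name = name

    def __gt__(self, value) -> Expr:
        return Expr(f"{self.name} > ?", (value,))

    def __lt__(self, value) -> Expr:
        return Expr(f"{self.name} < ?", (value,))

    def __eq__(self, value) -> Expr:  # type: ignore[override]
        return Expr(f"{self.name} = ?", (value,))


age, country = Field("age"), Field("country")
adults_in_us = (age > 18) & (country == "US")
print(adults_in_us.sql)     # (age > ? AND country = ?)
print(adults_in_us.params)  # (18, 'US')
```

No parser is involved: Python's own grammar does the parsing, which is the main trade-off of an internal DSL. The parentheses around each comparison are required because `&` binds tighter than `>` and `==` in Python.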

Design principles

  1. Domain-aligned terminology.
  2. Declarative over imperative ("what" not "how").
  3. Composable.
  4. Readable by domain expert.
  5. Constrained (the less expressive power, ideally short of Turing-complete, the easier the DSL is to verify).
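
Principles 2, 3, and 5 can be illustrated together. The sketch below is an assumption-laden toy, not a reference design: rules are plain data (declarative), `all_of` combines them (composable), and only three operators exist, so every rule can be checked exhaustively (constrained). `Rule`, `holds`, and `all_of` are hypothetical names.

```python
import operator
from dataclasses import dataclass

# The only comparisons the DSL allows; nothing else can be expressed.
OPS = {">": operator.gt, "<": operator.lt, "==": operator.eq}


@dataclass(frozen=True)
class Rule:
    """Declarative rule: states what must hold, not how to check it."""
    field: str
    op: str
    value: float


def holds(rule: Rule, record: dict) -> bool:
    """Evaluate one rule against a record, rejecting unknown operators."""
    if rule.op not in OPS:
        raise ValueError(f"unsupported operator: {rule.op!r}")
    return OPS[rule.op](record[rule.field], rule.value)


def all_of(*rules: Rule):
    """Compose rules; the result is itself a check over a record."""
    return lambda record: all(holds(r, record) for r in rules)


honor_roll = all_of(Rule("score", ">", 90), Rule("attendance", ">", 70))
print(honor_roll({"score": 95, "attendance": 88}))  # True
print(honor_roll({"score": 95, "attendance": 60}))  # False
```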

Well-known DSLs

  • SQL: relational queries.
  • Regex: text patterns.
  • CSS: styling.
  • HTML / JSX: markup.
  • GraphQL: API queries.
  • Cypher (Neo4j): graph queries.
  • HCL (Terraform): infrastructure as code.
  • Gherkin: BDD scenarios.
  • Mermaid / PlantUML: diagrams.
  • dbt SQL: data transformations.

Modern usage (LLM era)

NL-to-DSL

  • Natural language → SQL, regex, and similar DSLs.
  • Example: GitHub Copilot used as a regex helper.
  • Text-to-SQL is the most mature case.

LLM-driven DSL

  • LLMs work better when they target a specific, well-defined DSL.
  • Especially valuable for internal company DSLs.
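
As one possible shape for an internal company DSL aimed at LLM output, the sketch below assumes the model is asked to emit a JSON array of filters and checks each one against an allow-list before anything runs. The field names, operator names, and `parse_filters` are hypothetical.

```python
import json

ALLOWED_FIELDS = {"age", "country", "created_at"}  # hypothetical schema
ALLOWED_OPS = {"eq", "gt", "lt"}


def parse_filters(text: str) -> list[tuple[str, str, object]]:
    """Parse LLM output shaped like [{"field": ..., "op": ..., "value": ...}]."""
    try:
        data = json.loads(text)
    except json.JSONDecodeError as exc:
        raise ValueError(f"not valid JSON: {exc}") from exc
    if not isinstance(data, list):
        raise ValueError("expected a JSON array of filters")

    filters = []
    for item in data:
        if not isinstance(item, dict) or set(item) != {"field", "op", "value"}:
            raise ValueError(f"malformed filter: {item!r}")
        field, op = item["field"], item["op"]
        if not isinstance(field, str) or field not in ALLOWED_FIELDS:
            raise ValueError(f"unknown field: {field!r}")
        if not isinstance(op, str) or op not in ALLOWED_OPS:
            raise ValueError(f"unknown operator: {op!r}")
        filters.append((field, op, item["value"]))
    return filters


print(parse_filters('[{"field": "age", "op": "gt", "value": 18}]'))
# [('age', 'gt', 18)]
```

Because the vocabulary is closed, a rejected output can be reported back to the model with a precise message instead of being executed.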

Tooling

  • ANTLR: parser generator.
  • PEG.js / Peggy: PEG-based parser generators.
  • Tree-sitter: modern incremental parser.
  • Lark (Python).
  • MPS (JetBrains): projectional editor.

Applications

  1. Configuration (Helm, Terraform).
  2. Workflow (Airflow, GitHub Actions YAML).
  3. Test specification (Gherkin, RSpec).
  4. Build (Gradle, Bazel).
  5. Domain modeling (DDD ubiquitous language).
  6. Hardware (Verilog, VHDL).
  7. Math / AI (TLA+, Lean).

💻 Patterns

Internal DSL (TypeScript fluent)

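// Minimal query builder. The helper types and the table-name constructor
// argument are illustrative additions so the sketch compiles on its own.
type Op = '=' | '!=' | '>' | '<' | '>=' | '<=';
type Dir = 'asc' | 'desc';
interface Filter<T> { field: keyof T; op: Op; value: unknown }
interface Sort<T> { field: keyof T; dir: Dir }
interface User { age: number; country: string; createdAt: Date }

class QueryBuilder<T> {
  private filters: Filter<T>[] = [];
  private sort: Sort<T>[] = [];
  private limitN = 100;

  // `table` must be a trusted identifier; it is interpolated as-is.
  constructor(private readonly table: string) {}

  where(field: keyof T, op: Op, value: unknown): this {
    this.filters.push({ field, op, value });
    return this;  // returning `this` enables chaining
  }

  orderBy(field: keyof T, dir: Dir): this {
    this.sort.push({ field, dir });
    return this;
  }

  limit(n: number): this {
    this.limitN = n;
    return this;
  }

  // Emit parameterized SQL so values are never spliced into the text.
  build(): { text: string; params: unknown[] } {
    const where = this.filters
      .map((f, i) => `${String(f.field)} ${f.op} $${i + 1}`)
      .join(' AND ');
    const order = this.sort.map(s => `${String(s.field)} ${s.dir}`).join(', ');
    const text = [
      `SELECT * FROM ${this.table}`,
      where && `WHERE ${where}`,
      order && `ORDER BY ${order}`,
      `LIMIT ${this.limitN}`,
    ].filter(Boolean).join(' ');
    return { text, params: this.filters.map(f => f.value) };
  }
}

// Usage
const query = new QueryBuilder<User>('users')
  .where('age', '>', 18)
  .where('country', '=', 'US')
  .orderBy('createdAt', 'desc')
  .limit(50)
  .build();
// query.text:   SELECT * FROM users WHERE age > $1 AND country = $2 ORDER BY createdAt desc LIMIT 50
// query.params: [18, 'US']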

Zod schema (DSL-like)

import { z } from 'zod';

const UserSchema = z.object({
  id: z.string().uuid(),
  email: z.string().email(),
  age: z.number().int().min(0).max(150),
  role: z.enum(['admin', 'user']),
  preferences: z.object({
    theme: z.enum(['light', 'dark']).default('light'),
    notifications: z.boolean().default(true),
  }).optional(),
});

type User = z.infer<typeof UserSchema>;

External DSL (Lark in Python)

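from lark import Lark, Transformer

# Raw string so the regex backslash reaches Lark unchanged.
# COMPARISON is a named terminal so the operator is kept in the parse tree
# (anonymous string literals such as "if" and ":" are filtered out).
grammar = r"""
start: rule+
rule: CNAME ":" condition "→" action
condition: "if" CNAME COMPARISON NUMBER
action: CNAME "(" CNAME ")"
COMPARISON: ">" | "<" | "=="
NUMBER: /\d+/
%import common.CNAME
%import common.WS
%ignore WS
"""

dsl = """
high_score: if score > 90 → notify(student)
warn: if attendance < 70 → email(student)
"""

parser = Lark(grammar)
tree = parser.parse(dsl)

class RuleTransformer(Transformer):
    """Turn the parse tree into plain dictionaries, bottom-up."""

    def condition(self, items):
        field, op, value = items
        return {'field': str(field), 'op': str(op), 'value': int(value)}

    def action(self, items):
        func, arg = items
        return {'call': str(func), 'arg': str(arg)}

    def rule(self, items):
        name, cond, action = items
        return {'name': str(name), 'condition': cond, 'action': action}

    def start(self, items):
        return list(items)

print(RuleTransformer().transform(tree))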

NL → SQL (LLM-driven)

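import re

# Crude keyword blocklist; a real deployment should also use a read-only
# database role, since string checks alone are easy to get wrong.
DESTRUCTIVE = re.compile(
    r'\b(DROP|DELETE|TRUNCATE|ALTER|UPDATE|INSERT|GRANT)\b', re.IGNORECASE)

def nl_to_sql(query, schema, generate):
    """Translate a question to SQL.

    `generate` is any callable that sends a prompt to an LLM and returns text.
    """
    prompt = f"""You are a SQL expert.

Schema:
{schema}

Convert this natural language to SQL:
"{query}"

Return ONLY the SQL, no explanation."""
    sql = generate(prompt).strip()

    # Validate: accept a single SELECT statement and nothing else.
    statement = sql.rstrip(';').strip()
    if (not statement.upper().startswith('SELECT')
            or ';' in statement
            or DESTRUCTIVE.search(statement)):
        raise ValueError('Unsafe SQL')

    return sql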

Gherkin (BDD)

Feature: User login

  Scenario: Successful login with valid credentials
    Given a registered user with email "test@example.com"
    And the user's password is "password123"
    When the user submits the login form
    Then they are redirected to "/dashboard"
    And a session cookie is set
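
# Step definitions for the scenario above (behave).
# create_user and client are app-specific helpers assumed to exist.
from behave import given, when, then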

@given('a registered user with email "{email}"')
def step_user(context, email):
    context.user = create_user(email)

@when('the user submits the login form')
def step_submit(context):
    context.response = client.post('/login', {...})

@then('they are redirected to "{path}"')
def step_redirect(context, path):
    assert context.response.url.endswith(path)

Terraform HCL

resource "aws_instance" "web" {
  ami           = "ami-0c55b159cbfafe1f0"
  instance_type = var.environment == "prod" ? "t3.large" : "t3.micro"
  
  tags = merge(var.common_tags, {
    Name = "web-${var.environment}"
  })
  
  lifecycle {
    create_before_destroy = true
    # lifecycle arguments accept literal values only; an expression such as
    # var.environment == "prod" is rejected here.
    prevent_destroy = true
  }
}

Custom config DSL (Python)

class PipelineDSL:
    def __init__(self):
        self.steps = []
    
    def fetch(self, source: str):
        self.steps.append(('fetch', source))
        return self
    
    def transform(self, fn):
        self.steps.append(('transform', fn))
        return self
    
    def store(self, dest: str):
        self.steps.append(('store', dest))
        return self
    
    def run(self):
        # Execute the recorded steps in order (left unimplemented in this sketch)
        ...

# Usage
pipeline = (PipelineDSL()
    .fetch('s3://my-bucket/data.csv')
    .transform(lambda df: df.dropna())
    .store('postgres://localhost/clean'))

Tree-sitter (parse)

const Parser = require('tree-sitter');
const SQL = require('tree-sitter-sql');

const parser = new Parser();
parser.setLanguage(SQL);

const tree = parser.parse('SELECT * FROM users WHERE id = 1');
console.log(tree.rootNode.toString());

Validate DSL (LLM-friendly)

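from lark import Lark
from lark.exceptions import LarkError

def validate_dsl(text, grammar, generate=None):
    """Validate LLM-generated DSL text against a Lark grammar.

    `generate` is an optional callable (prompt -> str) used to request a repair.
    """
    try:
        Lark(grammar).parse(text)
        return {'valid': True}
    except LarkError as e:
        # Feed the parse error back to the LLM and ask for a corrected version.
        fixed = generate(f'Fix this DSL parse error:\n{e}\n{text}') if generate else None
        return {'valid': False, 'error': str(e), 'fixed_attempt': fixed}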
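
A single repair attempt can itself be invalid, so a bounded generate-validate-retry loop is a common next step. The sketch below is one way to write it under stated assumptions: `generate` and `validate` are caller-supplied callables, and JSON stands in for the DSL so the example needs only the standard library. All names are hypothetical.

```python
import json


def json_error(text: str):
    """Return None if `text` parses as JSON, else the error message."""
    try:
        json.loads(text)
        return None
    except json.JSONDecodeError as exc:
        return str(exc)


def generate_valid(request: str, generate, validate, max_attempts: int = 3):
    """Call `generate` until `validate` accepts the output or attempts run out."""
    prompt = request
    for attempt in range(1, max_attempts + 1):
        candidate = generate(prompt)
        error = validate(candidate)
        if error is None:
            return candidate, attempt
        # Show the model its own output and the exact error on the next try.
        prompt = (f"{request}\n\nPrevious attempt:\n{candidate}\n"
                  f"Error: {error}\nReturn a corrected version only.")
    raise ValueError(f"no valid output after {max_attempts} attempts")


# Stub LLM: the first reply is truncated JSON, the second is valid.
replies = iter(['{"field": "age"', '{"field": "age"}'])
result, attempts = generate_valid("Filter adults", lambda prompt: next(replies), json_error)
print(result, attempts)  # {"field": "age"} 2
```

Capping the attempts keeps cost predictable and turns a persistently invalid output into an explicit error.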

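Evaluating a DSL

import operator

# Operators the evaluator understands; extend as the DSL grows.
OPS = {'+': operator.add, '-': operator.sub, '*': operator.mul,
       '/': operator.truediv, '>': operator.gt, '<': operator.lt,
       '==': operator.eq}

class CompiledDSL:
    """Tree-walking evaluator.

    Nodes are assumed to expose .type plus .value, .name, or .op/.left/.right.
    """

    def __init__(self, ast):
        self.ast = ast

    def execute(self, context):
        return self._eval(self.ast, context)

    def _eval(self, node, ctx):
        if node.type == 'literal':
            return node.value
        if node.type == 'variable':
            return ctx[node.name]
        if node.type == 'binary_op':
            left = self._eval(node.left, ctx)
            right = self._eval(node.right, ctx)
            return OPS[node.op](left, right)
        # Fail loudly instead of silently returning None.
        raise ValueError(f'Unknown node type: {node.type}')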

Decision criteria

| Situation | DSL type |
|---|---|
| Read by domain experts | External DSL |
| Programmer power | Internal (fluent) |
| Configuration | YAML / HCL / TOML |
| Build | Make / Gradle / Bazel |
| Test | BDD Gherkin |
| LLM interface | NL → DSL |
| Constrained safety | External with verifier |

Default: an internal DSL (TypeScript / Python fluent API) for code-heavy work; an external DSL when the audience is non-developers.

🔗 Graph

🤖 Using with LLMs

When to use: internal API design, NL-to-DSL systems, IaC, BDD. When not to use: simple imperative tasks.

Anti-patterns

  • Forcing Turing-completeness into the DSL: verifiability is lost.
  • No documentation or examples: adoption fails.
  • Premature DSL on a small project: over-engineering.
  • Unvalidated NL → DSL output: destructive statements (e.g., DROP TABLE) can reach execution.
  • Host language leaking through the DSL: the abstraction becomes leaky.
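
For the unvalidated NL → DSL anti-pattern, keyword checks on the SQL text are easy to bypass, so it can help to enforce read-only access in the database layer as well. The sketch below uses SQLite's authorizer hook from the Python standard library as an illustration; the `users` table and the helper names are hypothetical, and other databases would rely on a read-only role instead.

```python
import sqlite3

# Actions a plain SELECT needs; anything else is denied.
READ_ONLY_ACTIONS = {sqlite3.SQLITE_SELECT, sqlite3.SQLITE_READ}


def read_only_authorizer(action, arg1, arg2, db_name, source):
    """Allow plain reads; deny every other action (DROP, DELETE, INSERT, ...)."""
    return sqlite3.SQLITE_OK if action in READ_ONLY_ACTIONS else sqlite3.SQLITE_DENY


def run_generated_sql(conn: sqlite3.Connection, sql: str) -> list:
    """Run LLM-generated SQL on a connection that has the authorizer installed."""
    try:
        return conn.execute(sql).fetchall()
    except sqlite3.Error as exc:
        raise ValueError(f"rejected SQL: {exc}") from exc


# Set up sample data first, then lock the connection down.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, name TEXT)")
conn.execute("INSERT INTO users VALUES (1, 'ada')")
conn.commit()
conn.set_authorizer(read_only_authorizer)

print(run_generated_sql(conn, "SELECT name FROM users"))  # [('ada',)]
try:
    run_generated_sql(conn, "DROP TABLE users")
except ValueError as err:
    print("blocked:", err)
```

The denial happens when the statement is compiled, so the destructive statement never executes. Queries that call SQL functions would need further actions added to the allow-list.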

🧪 Verification / Duplicates

🕓 Changelog

| Date | Change |
|---|---|
| 2026-05-08 | Phase 1 |
| 2026-05-10 | Manual cleanup: internal/external + zod / Lark / Gherkin / NL-to-SQL / Terraform code |