[G1-Sync] Manual knowledge update
---
id: devops-k8s-operators
title: Kubernetes Operators — CRD + Controller
category: Coding
status: draft
source_trust_level: B
verification_status: conceptual
created_at: 2026-05-09
updated_at: 2026-05-09
tags: [devops, kubernetes, operator, vibe-coding]
tech_stack: { language: "Go / YAML", applicable_to: ["DevOps"] }
applied_in: []
aliases: [K8s operator, CRD, custom resource, controller, kubebuilder, operator-sdk, reconcile]
---

# Kubernetes Operators

> "Make the application lifecycle Kubernetes-native." An operator is a **CRD (Custom Resource Definition) + Controller (reconcile loop)**. Postgres, Kafka, and Redis each have their own operators. Common scaffolding tools: kubebuilder and operator-sdk.

## 📖 Core Concepts
- CRD: defines a new Kubernetes resource type.
- Controller: drives the actual state toward the desired state.
- Reconcile loop: watches for changes and reacts to each one.
- Level-triggered (not edge-triggered): the controller acts on the current state, not on individual events.
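
Level-triggered behavior is easiest to see without any Kubernetes machinery. The sketch below is plain Go under stated assumptions: `State`, `reconcile`, and `converge` are hypothetical names invented for illustration, not part of any library. The point is that `reconcile` only looks at the current levels, so missed or repeated triggers are harmless.

```go
package main

import "fmt"

// State is a hypothetical stand-in for a resource's replica count.
type State struct {
	Replicas int
}

// reconcile compares the observed state with the desired state and takes one
// step toward it. It never asks which event triggered it, only what the
// current levels are, so calling it repeatedly is safe (idempotent).
func reconcile(desired State, actual *State) (changed bool) {
	switch {
	case actual.Replicas < desired.Replicas:
		actual.Replicas++
		return true
	case actual.Replicas > desired.Replicas:
		actual.Replicas--
		return true
	}
	return false
}

// converge runs reconcile until nothing changes and reports the step count.
func converge(desired State, actual *State) int {
	steps := 0
	for reconcile(desired, actual) {
		steps++
	}
	return steps
}

func main() {
	desired := State{Replicas: 3}
	actual := &State{Replicas: 0}
	fmt.Println(converge(desired, actual), actual.Replicas) // 3 3
	// A second run finds no drift and does nothing.
	fmt.Println(converge(desired, actual), actual.Replicas) // 0 3
}
```

An edge-triggered design would instead apply "+1 replica" per event and drift permanently if a single event were lost.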

## 💻 Code Patterns

### Plain Kubernetes
```yaml
# Built-in resources: Deployment, Service, Pod, ConfigMap, ...
apiVersion: apps/v1
kind: Deployment
metadata: { name: app }
spec:
  replicas: 3
  template: ...
```

### CRD (Custom Resource Definition)
```yaml
# Illustrative names. The real Zalando operator registers the kind
# `postgresql` under this group; PostgresCluster is used here as an example.
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: postgresclusters.acid.zalan.do
spec:
  group: acid.zalan.do
  versions:
    - name: v1
      served: true
      storage: true
      schema:
        openAPIV3Schema:
          type: object
          properties:
            spec:
              type: object
              properties:
                version: { type: string }
                replicas: { type: integer }
                volumeSize: { type: string }
  scope: Namespaced
  names:
    plural: postgresclusters
    singular: postgrescluster
    kind: PostgresCluster
```

### Custom Resource (CR)
```yaml
apiVersion: acid.zalan.do/v1
kind: PostgresCluster
metadata:
  name: my-db
spec:
  version: "16"
  replicas: 3
  volumeSize: 50Gi
```

→ Used like any built-in Kubernetes resource; the operator reconciles it.
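
### Operator (Go)
```go
// The struct and wiring are scaffolded by kubebuilder; the Reconcile body is yours.
package controller

import (
	"context"
	"time"

	appsv1 "k8s.io/api/apps/v1"
	"k8s.io/apimachinery/pkg/api/equality"
	apierrors "k8s.io/apimachinery/pkg/api/errors"
	"k8s.io/apimachinery/pkg/types"
	ctrl "sigs.k8s.io/controller-runtime"
	"sigs.k8s.io/controller-runtime/pkg/client"

	acidv1 "example.com/postgres-operator/api/v1" // hypothetical module path
)

type PostgresClusterReconciler struct {
	client.Client
}

func (r *PostgresClusterReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
	var cluster acidv1.PostgresCluster
	if err := r.Get(ctx, req.NamespacedName, &cluster); err != nil {
		// The CR is gone: nothing left to reconcile.
		return ctrl.Result{}, client.IgnoreNotFound(err)
	}

	// Desired state (buildStatefulSet is a helper you write yourself).
	desired := buildStatefulSet(&cluster)

	// Actual state
	var actual appsv1.StatefulSet
	err := r.Get(ctx, types.NamespacedName{Name: desired.Name, Namespace: desired.Namespace}, &actual)
	if apierrors.IsNotFound(err) {
		// Create
		return ctrl.Result{}, r.Create(ctx, desired)
	}
	if err != nil {
		// Any other read error: return it so the request is retried.
		return ctrl.Result{}, err
	}

	// Update on drift. Server-side defaults can make a whole-spec comparison
	// report drift every time; real code usually compares managed fields only.
	if !equality.Semantic.DeepEqual(actual.Spec, desired.Spec) {
		actual.Spec = desired.Spec
		return ctrl.Result{}, r.Update(ctx, &actual)
	}

	return ctrl.Result{RequeueAfter: time.Minute}, nil
}
```

→ On every change, the actual state is brought back in line with the desired state.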
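
Drift detection is the subtle part of the loop above: the API server fills in defaults, so comparing whole specs can report drift on every pass. A minimal sketch of comparing only operator-managed fields follows. `StatefulSetSpec`, `ServerDefault`, and `needsUpdate` are hypothetical, trimmed-down names for illustration and not the real `appsv1` types.

```go
package main

import "fmt"

// StatefulSetSpec is a hypothetical, simplified stand-in for the real spec.
type StatefulSetSpec struct {
	Replicas int32
	Image    string
	// ServerDefault models a field the cluster fills in, not the operator.
	ServerDefault string
}

// needsUpdate reports whether any operator-managed field has drifted.
// Fields the operator does not set are ignored so that server-side
// defaults do not trigger an update on every reconcile.
func needsUpdate(actual, desired StatefulSetSpec) bool {
	return actual.Replicas != desired.Replicas || actual.Image != desired.Image
}

func main() {
	desired := StatefulSetSpec{Replicas: 3, Image: "postgres:16"}
	actual := StatefulSetSpec{Replicas: 3, Image: "postgres:16", ServerDefault: "filled-by-api-server"}

	fmt.Println(actual == desired)            // false: naive compare sees drift
	fmt.Println(needsUpdate(actual, desired)) // false: managed fields match

	actual.Replicas = 1 // someone scaled the StatefulSet by hand
	fmt.Println(needsUpdate(actual, desired)) // true
}
```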

### kubebuilder
```bash
kubebuilder init --domain example.com
kubebuilder create api --group apps --version v1alpha1 --kind App
# → generates the CRD types and a controller skeleton

# Edit, build, deploy
make manifests   # regenerate CRD / RBAC manifests
make install     # install the CRDs into the current cluster
make run         # run the controller locally against that cluster
```

### operator-sdk
```bash
operator-sdk init --domain example.com
operator-sdk create api --group apps --version v1alpha1 --kind App
```

→ Much like kubebuilder; operator-sdk's Go scaffolding builds on it.

### Reconcile pattern
```go
// Sketch: AppReconciler, handleDeletion, the reconcile* helpers, and the
// finalizer constant are assumed to be defined elsewhere in the operator.
func (r *AppReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
	obj := &appsv1alpha1.App{}
	if err := r.Get(ctx, req.NamespacedName, obj); err != nil {
		return ctrl.Result{}, client.IgnoreNotFound(err)
	}

	// Deletion handling: run cleanup, then remove the finalizer
	if !obj.DeletionTimestamp.IsZero() {
		return r.handleDeletion(ctx, obj)
	}

	// Add the finalizer the first time the object is seen
	if !containsString(obj.Finalizers, finalizer) {
		obj.Finalizers = append(obj.Finalizers, finalizer)
		return ctrl.Result{}, r.Update(ctx, obj)
	}

	// Reconcile children
	if err := r.reconcileService(ctx, obj); err != nil {
		return ctrl.Result{}, err
	}
	if err := r.reconcileDeployment(ctx, obj); err != nil {
		return ctrl.Result{}, err
	}
	if err := r.reconcileConfigMap(ctx, obj); err != nil {
		return ctrl.Result{}, err
	}

	// Update status
	obj.Status.Phase = "Running"
	return ctrl.Result{}, r.Status().Update(ctx, obj)
}
```
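
The pattern above leans on a `containsString` helper and, on the deletion path, on its counterpart for removing the finalizer. A minimal stdlib-only sketch of both follows; the finalizer name is a hypothetical example. In a real controller-runtime project, the `controllerutil` package's `ContainsFinalizer`, `AddFinalizer`, and `RemoveFinalizer` cover the same ground.

```go
package main

import "fmt"

// finalizer is a hypothetical name; finalizers are conventionally
// domain-qualified strings like this one.
const finalizer = "apps.example.com/cleanup"

// containsString reports whether s is present in list.
func containsString(list []string, s string) bool {
	for _, item := range list {
		if item == s {
			return true
		}
	}
	return false
}

// removeString returns a copy of list without any occurrence of s.
func removeString(list []string, s string) []string {
	out := make([]string, 0, len(list))
	for _, item := range list {
		if item != s {
			out = append(out, item)
		}
	}
	return out
}

func main() {
	finalizers := []string{"other.example.com/keep"}

	// Add the finalizer once; repeated reconciles must not duplicate it.
	for i := 0; i < 2; i++ {
		if !containsString(finalizers, finalizer) {
			finalizers = append(finalizers, finalizer)
		}
	}
	fmt.Println(len(finalizers)) // 2

	// After cleanup succeeds, drop only our own finalizer.
	finalizers = removeString(finalizers, finalizer)
	fmt.Println(finalizers) // [other.example.com/keep]
}
```

Kubernetes only finishes deleting the object once its finalizer list is empty, which is why the cleanup step removes its own entry and leaves the others alone.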
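
### Status subresource
```go
type AppStatus struct {
	Phase      string             `json:"phase,omitempty"`
	Replicas   int32              `json:"replicas,omitempty"`
	Conditions []metav1.Condition `json:"conditions,omitempty"` // metav1 = k8s.io/apimachinery/pkg/apis/meta/v1
}

// Spec   = desired state, written by the user.
// Status = observed state, written only by the controller.
// r.Status().Update needs the status subresource enabled on the CRD
// (in kubebuilder: the subresource:status marker on the root type).
```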

### Owner reference (cleanup)
```go
// ptr is k8s.io/utils/ptr. controllerutil.SetControllerReference(app, deployment, scheme)
// from controller-runtime builds the same reference for you.
deployment.OwnerReferences = []metav1.OwnerReference{
	{APIVersion: "apps.example.com/v1alpha1", Kind: "App", Name: app.Name, UID: app.UID, Controller: ptr.To(true)},
}
```

→ Deleting the App garbage-collects the Deployment automatically (cascading delete).
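
To make the cascade concrete, here is a toy model of owner-based garbage collection in plain Go. It is only a sketch: `Object` and `deleteCascade` are hypothetical names, and the real garbage collector runs asynchronously inside the cluster rather than as one recursive call.

```go
package main

import (
	"fmt"
	"sort"
)

// Object is a hypothetical, minimal model of a Kubernetes object:
// its UID plus the UID of its owner ("" means no owner).
type Object struct {
	UID      string
	OwnerUID string
}

// deleteCascade removes the object with the given UID and, recursively,
// everything that names it as owner. It returns the UIDs it removed.
func deleteCascade(store map[string]Object, uid string) []string {
	if _, ok := store[uid]; !ok {
		return nil
	}
	delete(store, uid)
	removed := []string{uid}

	// Collect dependents first, then recurse.
	var children []string
	for _, o := range store {
		if o.OwnerUID == uid {
			children = append(children, o.UID)
		}
	}
	sort.Strings(children) // deterministic order for the example
	for _, c := range children {
		removed = append(removed, deleteCascade(store, c)...)
	}
	return removed
}

func main() {
	store := map[string]Object{
		"app":    {UID: "app"},
		"deploy": {UID: "deploy", OwnerUID: "app"},
		"rs":     {UID: "rs", OwnerUID: "deploy"},
		"orphan": {UID: "orphan"}, // no owner reference: survives
	}
	fmt.Println(deleteCascade(store, "app"), len(store)) // [app deploy rs] 1
}
```

The object without an owner reference is exactly the "orphan resource" case: nothing ties it to the App, so nothing cleans it up.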
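
### Webhook (validation)
```go
// Signature from older controller-runtime releases; newer releases also
// return admission warnings alongside the error.
func (r *App) ValidateCreate() error {
	if r.Spec.Replicas < 1 {
		return fmt.Errorf("spec.replicas must be >= 1, got %d", r.Spec.Replicas)
	}
	return nil
}
```

→ Runs at admission time (for example on `kubectl create`); invalid objects are rejected before they are stored.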
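
When a spec has several rules, it helps to report every violation in one response instead of making the user fix them one at a time. The sketch below shows that idea with only the standard library. `AppSpec`, its `Version` field, and `validate` are hypothetical names for illustration, and `errors.Join` assumes Go 1.20 or newer.

```go
package main

import (
	"errors"
	"fmt"
)

// AppSpec is a hypothetical spec; Version is invented for this example.
type AppSpec struct {
	Replicas int32
	Version  string
}

// validate returns nil for a valid spec, otherwise a single error that
// wraps every problem found.
func validate(spec AppSpec) error {
	var errs []error
	if spec.Replicas < 1 {
		errs = append(errs, fmt.Errorf("spec.replicas must be >= 1, got %d", spec.Replicas))
	}
	if spec.Version == "" {
		errs = append(errs, errors.New("spec.version must not be empty"))
	}
	// errors.Join returns nil when there is nothing to join.
	return errors.Join(errs...)
}

func main() {
	fmt.Println(validate(AppSpec{Replicas: 3, Version: "16"}) == nil) // true
	// Both problems are reported together:
	fmt.Println(validate(AppSpec{}))
}
```

A webhook's `ValidateCreate` could simply return the result of such a function.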

### Mutation webhook
```go
func (r *App) Default() {
	if r.Spec.Image == "" {
		r.Spec.Image = "default:latest"
	}
}
```

→ Fills in default values automatically.

### Watch
```go
func (r *AppReconciler) SetupWithManager(mgr ctrl.Manager) error {
	return ctrl.NewControllerManagedBy(mgr).
		For(&appsv1alpha1.App{}).
		Owns(&appsv1.Deployment{}).
		Owns(&corev1.Service{}).
		Complete(r)
}
```

→ Reconcile runs whenever the App, or a Deployment / Service it owns, changes.

### Real-world operators
```
- prometheus-operator: deploys and configures Prometheus automatically
- cert-manager: automates TLS certificates (Let's Encrypt)
- postgres-operator (Zalando, Crunchy)
- strimzi-kafka-operator
- istio-operator
- argocd-operator
- velero (backup)
- external-secrets-operator
```

### Operator vs Helm
```
Helm:
- Templating (deploy once, then static)
- Complex changes are manual

Operator:
- Continuous reconcile
- Self-healing
- Domain-specific logic

→ Stateful (DB, message queue) = operator.
  Stateless app = Helm.
```

### Helm + Operator together
```
The operator itself is installed from a Helm chart.
- helm install postgres-operator ...
- After that, users only write PostgresCluster CRs.
```

### Capability levels
```
Level 1: Basic install
Level 2: Seamless upgrades
Level 3: Full lifecycle (backup, restore)
Level 4: Deep insights (metrics, alerts)
Level 5: Auto pilot (auto-heal, auto-scale, auto-tune)
```

→ Mature operators reach Level 4-5.

### OperatorHub
```
operatorhub.io
- Catalog of operators
- 1-click install
- Managed by OLM (Operator Lifecycle Manager)
```

### Crossplane (operator-style IaC)
```yaml
apiVersion: database.aws.crossplane.io/v1beta1
kind: RDSInstance
metadata:
  name: my-db
spec:
  forProvider:
    region: us-east-1
    dbInstanceClass: db.t3.micro
    engine: postgres
    masterUsername: admin
```

→ AWS resources are represented as Kubernetes CRs; an alternative to Terraform.

### KEDA (event-driven autoscaling)
```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: kafka-consumer
spec:
  scaleTargetRef:
    name: consumer-deployment
  minReplicaCount: 0
  maxReplicaCount: 100
  triggers:
    - type: kafka
      metadata:
        # The Kafka trigger also needs broker connection settings
        # (bootstrapServers); omitted here.
        topic: my-topic
        consumerGroup: my-group
        lagThreshold: "10"
```

→ Consumer replicas scale with Kafka consumer lag.
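
As a rough intuition for how lag turns into a replica count, the sketch below applies "one replica per `lagThreshold` messages of lag, clamped to the min/max counts". This is a simplified approximation and not KEDA's actual implementation: the real scaler has more inputs (the topic's partition count, for example), and `desiredReplicas` is a hypothetical function written for this note.

```go
package main

import "fmt"

// desiredReplicas approximates lag-based scaling: one replica per
// lagThreshold messages of total consumer lag, clamped to [minR, maxR].
func desiredReplicas(totalLag, lagThreshold int64, minR, maxR int32) int32 {
	if lagThreshold <= 0 || totalLag <= 0 {
		return minR
	}
	n := (totalLag + lagThreshold - 1) / lagThreshold // ceiling division
	if n < int64(minR) {
		return minR
	}
	if n > int64(maxR) {
		return maxR
	}
	return int32(n)
}

func main() {
	// Values mirror the ScaledObject above: threshold 10, min 0, max 100.
	fmt.Println(desiredReplicas(0, 10, 0, 100))    // 0   (idle: scale to zero)
	fmt.Println(desiredReplicas(95, 10, 0, 100))   // 10  (ceil(95/10))
	fmt.Println(desiredReplicas(5000, 10, 0, 100)) // 100 (capped at max)
}
```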

### When to write your own operator?
```
✓ Domain-specific needs (your own product made Kubernetes-native).
✓ Complex lifecycle (DB, message queue).
✓ Auto-heal / auto-scale.

✗ Simple deploys (Helm is enough).
✗ An existing operator already covers it (cert-manager, etc.).
```

### Test
```go
// envtest (part of controller-runtime; used by kubebuilder scaffolds)
func TestReconcile(t *testing.T) {
	env := envtest.Environment{...}
	cfg, err := env.Start() // starts a local API server and etcd
	if err != nil {
		t.Fatal(err)
	}
	defer env.Stop()

	// Build a client from cfg, create a CR, then check its children.
	_ = cfg
}
```

### Production tips
```
- Idempotent reconcile (safe to retry).
- Status.Conditions that users can read.
- Finalizers for cleanup.
- Owner references for cascading delete.
- Resource limits (for the operator itself).
- Leader election (HA).
- Metrics / logs.
```

## 🤔 Decision Criteria
| Situation | Recommendation |
|---|---|
| Stateful complex (DB) | Operator |
| Simple deploy | Helm |
| Cloud resource | Crossplane |
| Event-driven scale | KEDA |
| Off-the-shelf | OperatorHub |
| Custom domain | Build own (kubebuilder) |
| Backup / restore | Velero (operator) |

## ❌ Anti-patterns
- **Non-idempotent reconcile**: state corruption.
- **No finalizer**: external cleanup never runs.
- **No owner reference**: orphaned resources.
- **Status never updated**: users cannot tell what is happening.
- **Webhook failure blocks creates**: risky; run the webhook in HA.
- **No leader election**: replicas race each other.
- **Everything as an operator**: simple cases belong in Helm.

## 🤖 LLM Usage Hints
- Operator = CRD + controller (reconcile loop).
- kubebuilder / operator-sdk generate the boilerplate.
- Stateful workloads (DB) are the sweet spot.
- Crossplane and KEDA are modern applications of the same pattern.

## 🔗 Related Documents
- [[DevOps_Kubernetes_Basics]]
- [[DevOps_Helm_Deep]]
- [[DevOps_ArgoCD_GitOps]]