| id | title | category | status | source_trust_level | verification_status | created_at | updated_at | tags | tech_stack | applied_in | aliases |
|---|---|---|---|---|---|---|---|---|---|---|---|
| devops-k8s-operators | Kubernetes Operators — CRD + Controller | Coding | draft | B | conceptual | 2026-05-09 | 2026-05-09 |
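Kubernetes Operators
"Make an application's lifecycle Kubernetes-native." An operator is a CRD (CustomResourceDefinition) plus a controller that runs a reconcile loop. Postgres, Kafka, and Redis each have dedicated operators. kubebuilder and operator-sdk are the usual scaffolding tools.
📖 Core concepts
- CRD: defines a new K8s resource type.
- Controller: drives actual state toward desired state.
- Reconcile loop: watches for changes and reacts to each one.
- Level-triggered (not edge-triggered): each reconcile acts on the current observed state, not on the specific event that fired, so missed or duplicate events are harmless.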
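The level-triggered idea in the list above can be sketched without any Kubernetes libraries. This is only an illustration, not controller-runtime code: `State`, `reconcile`, and the replica-count model are hypothetical names chosen for the example. The point is that `reconcile` compares current state with desired state and never inspects the triggering event, so coalesced or repeated triggers converge to the same result.

```go
package main

import "fmt"

// State is a hypothetical stand-in for a resource: just a replica count.
type State struct{ Replicas int }

// reconcile compares desired with actual and returns the corrected actual
// state plus a description of the action taken. It never looks at which
// event triggered it (level-triggered), so calling it repeatedly is safe.
func reconcile(desired, actual State) (State, string) {
	switch {
	case actual.Replicas < desired.Replicas:
		return desired, fmt.Sprintf("scale up by %d", desired.Replicas-actual.Replicas)
	case actual.Replicas > desired.Replicas:
		return desired, fmt.Sprintf("scale down by %d", actual.Replicas-desired.Replicas)
	default:
		return actual, "no-op"
	}
}

func main() {
	desired := State{Replicas: 3}
	actual := State{Replicas: 1}

	// Several events may have been coalesced or lost; one pass still converges.
	actual, action := reconcile(desired, actual)
	fmt.Println(action, actual.Replicas)

	// A redundant second trigger changes nothing.
	actual, action = reconcile(desired, actual)
	fmt.Println(action, actual.Replicas)
}
```

An edge-triggered design would instead have to replay every individual "replica added" or "replica removed" event in order, which breaks as soon as one event is dropped.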
💻 Code patterns
Built-in K8s resources
# Deployment, Service, Pod, ConfigMap, ...
apiVersion: apps/v1
kind: Deployment
metadata: { name: app }
spec:
  replicas: 3
  template: ...
CRD (Custom Resource Definition)
# Illustrative schema. The upstream Zalando operator registers postgresqls.acid.zalan.do (kind: postgresql).
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: postgresclusters.acid.zalan.do  # must be <plural>.<group>
spec:
  group: acid.zalan.do
  versions:
    - name: v1
      served: true
      storage: true
      schema:
        openAPIV3Schema:
          type: object
          properties:
            spec:
              type: object
              properties:
                version: { type: string }
                replicas: { type: integer }
                volumeSize: { type: string }
  scope: Namespaced
  names:
    plural: postgresclusters
    singular: postgrescluster
    kind: PostgresCluster
Custom Resource (CR)
apiVersion: acid.zalan.do/v1
kind: PostgresCluster
metadata:
  name: my-db
spec:
  version: "16"
  replicas: 3
  volumeSize: 50Gi
→ Used like any built-in K8s resource; the operator reconciles it.
Operator (Go)
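// Scaffolded by kubebuilder. The acidv1 module path, buildStatefulSet, and equal are hypothetical.
import (
	"context"
	"time"

	appsv1 "k8s.io/api/apps/v1"
	apierrors "k8s.io/apimachinery/pkg/api/errors"
	ctrl "sigs.k8s.io/controller-runtime"
	"sigs.k8s.io/controller-runtime/pkg/client"

	acidv1 "example.com/postgres-operator/api/v1"
)

type PostgresClusterReconciler struct {
	client.Client
}

func (r *PostgresClusterReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
	var cluster acidv1.PostgresCluster
	if err := r.Get(ctx, req.NamespacedName, &cluster); err != nil {
		// CR already deleted: nothing to do
		return ctrl.Result{}, client.IgnoreNotFound(err)
	}
	// Desired state, derived from the CR spec
	desired := buildStatefulSet(&cluster)
	// Actual state, read from the cluster
	var actual appsv1.StatefulSet
	err := r.Get(ctx, client.ObjectKeyFromObject(desired), &actual)
	if apierrors.IsNotFound(err) {
		// Create
		return ctrl.Result{}, r.Create(ctx, desired)
	}
	if err != nil {
		// Any other read error: return it so the request is retried
		return ctrl.Result{}, err
	}
	// Update on drift. equal should compare only operator-managed fields;
	// a plain DeepEqual on Spec always differs after API-server defaulting.
	if !equal(&actual, desired) {
		actual.Spec = desired.Spec
		return ctrl.Result{}, r.Update(ctx, &actual)
	}
	// Periodic resync as a safety net
	return ctrl.Result{RequeueAfter: time.Minute}, nil
}
→ On every change, actual state is driven back to desired state.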
kubebuilder
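kubebuilder init --domain example.com
kubebuilder create api --group apps --version v1alpha1 --kind App
# → generates the CRD types and the controller
# Edit, build, deploy
make manifests   # regenerate CRD and RBAC manifests
make install     # install the CRDs into the current cluster
make run         # run the controller locally against that cluster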
operator-sdk
operator-sdk init --domain example.com
operator-sdk create api --group apps --version v1alpha1 --kind App
→ Much like kubebuilder; operator-sdk's Go scaffolding is built on kubebuilder's.
Reconcile pattern
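// Sketch: AppReconciler, appFinalizer, handleDeletion, and the reconcile* helpers are hypothetical.
// controllerutil = sigs.k8s.io/controller-runtime/pkg/controller/controllerutil
func (r *AppReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
	var obj appsv1alpha1.App
	if err := r.Get(ctx, req.NamespacedName, &obj); err != nil {
		return ctrl.Result{}, client.IgnoreNotFound(err)
	}
	// Finalizer (deletion handling): clean up, then remove the finalizer
	if !obj.DeletionTimestamp.IsZero() {
		return r.handleDeletion(ctx, &obj)
	}
	// Add the finalizer before creating anything that needs cleanup
	if !controllerutil.ContainsFinalizer(&obj, appFinalizer) {
		controllerutil.AddFinalizer(&obj, appFinalizer)
		return ctrl.Result{}, r.Update(ctx, &obj)
	}
	// Reconcile children
	if err := r.reconcileService(ctx, &obj); err != nil {
		return ctrl.Result{}, err
	}
	if err := r.reconcileDeployment(ctx, &obj); err != nil {
		return ctrl.Result{}, err
	}
	if err := r.reconcileConfigMap(ctx, &obj); err != nil {
		return ctrl.Result{}, err
	}
	// Update status
	obj.Status.Phase = "Running"
	return ctrl.Result{}, r.Status().Update(ctx, &obj)
}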
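The finalizer steps in the reconcile pattern above are easier to follow in a toy model. The sketch below uses only the standard library; `Store`, `Object`, the `Deleting` flag (standing in for a non-zero `DeletionTimestamp`), and the finalizer name are all hypothetical. It shows why deletion is handled first: the object stays visible until the controller has cleaned up and removed its finalizer.

```go
package main

import "fmt"

const finalizer = "example.com/cleanup" // hypothetical finalizer name

// Object is a simplified stand-in for a custom resource.
type Object struct {
	Name       string
	Finalizers []string
	Deleting   bool // models a non-zero DeletionTimestamp
}

// Store is a toy API server: deletion only completes once no finalizers remain.
type Store struct {
	objects  map[string]*Object
	external map[string]bool // external resources the operator must clean up
}

// Delete marks the object for deletion; it is removed only if nothing blocks it.
func (s *Store) Delete(name string) {
	if o, ok := s.objects[name]; ok {
		o.Deleting = true
		s.gc(name)
	}
}

func (s *Store) gc(name string) {
	if o, ok := s.objects[name]; ok && o.Deleting && len(o.Finalizers) == 0 {
		delete(s.objects, name)
	}
}

func contains(list []string, v string) bool {
	for _, x := range list {
		if x == v {
			return true
		}
	}
	return false
}

func remove(list []string, v string) []string {
	var out []string
	for _, x := range list {
		if x != v {
			out = append(out, x)
		}
	}
	return out
}

// Reconcile handles deletion first, then ensures the finalizer, then does the normal work.
func (s *Store) Reconcile(name string) string {
	o, ok := s.objects[name]
	if !ok {
		return "gone"
	}
	if o.Deleting {
		delete(s.external, name) // cleanup finishes before the finalizer is removed
		o.Finalizers = remove(o.Finalizers, finalizer)
		s.gc(name)
		return "cleaned up"
	}
	if !contains(o.Finalizers, finalizer) {
		o.Finalizers = append(o.Finalizers, finalizer)
		return "finalizer added"
	}
	s.external[name] = true
	return "reconciled"
}

func main() {
	s := &Store{
		objects:  map[string]*Object{"my-db": {Name: "my-db"}},
		external: map[string]bool{},
	}
	fmt.Println(s.Reconcile("my-db"))
	fmt.Println(s.Reconcile("my-db"))
	s.Delete("my-db") // blocked by the finalizer until cleanup runs
	fmt.Println(s.Reconcile("my-db"))
	fmt.Println(s.Reconcile("my-db"))
}
```

Without the finalizer, `Delete` would remove the object immediately and the entry in `external` would be orphaned.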
Status subresource
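type AppStatus struct {
	Phase      string             `json:"phase"`
	Replicas   int32              `json:"replicas"`
	Conditions []metav1.Condition `json:"conditions"` // metav1 = k8s.io/apimachinery/pkg/apis/meta/v1
}
// Spec: the user's desired state; the user writes it.
// Status: the observed state; only the controller writes it.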
Owner reference (cleanup)
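// ptr is a hypothetical helper returning *bool; k8s.io/utils/ptr offers ptr.To(true).
// controllerutil.SetControllerReference(app, deployment, r.Scheme) builds the same reference.
deployment.OwnerReferences = []metav1.OwnerReference{
	{APIVersion: "apps.example.com/v1alpha1", Kind: "App", Name: app.Name, UID: app.UID, Controller: ptr(true)},
}
→ Deleting the App garbage-collects the Deployment automatically (cascading delete).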
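The cascade described above can be modeled in a few lines. This is a toy garbage collector, not the Kubernetes one: `Object`, `OwnerUID`, and `deleteCascade` are hypothetical names, and each object has at most one owner for simplicity. It shows that anything reachable through owner links disappears with the root, while an object with no owner is left behind.

```go
package main

import "fmt"

// Object is a hypothetical resource with at most one owner.
type Object struct {
	Kind     string
	UID      string
	OwnerUID string // empty means no owner reference
}

// deleteCascade removes the object with the given UID and, transitively,
// every object whose OwnerUID points at something that was removed.
func deleteCascade(objs []Object, uid string) []Object {
	removed := map[string]bool{uid: true}
	for changed := true; changed; {
		changed = false
		for _, o := range objs {
			if !removed[o.UID] && removed[o.OwnerUID] {
				removed[o.UID] = true
				changed = true
			}
		}
	}
	var kept []Object
	for _, o := range objs {
		if !removed[o.UID] {
			kept = append(kept, o)
		}
	}
	return kept
}

func main() {
	objs := []Object{
		{Kind: "App", UID: "a1"},
		{Kind: "Deployment", UID: "d1", OwnerUID: "a1"},
		{Kind: "ReplicaSet", UID: "r1", OwnerUID: "d1"},
		{Kind: "ConfigMap", UID: "c1"}, // no owner reference: survives as an orphan
	}
	for _, o := range deleteCascade(objs, "a1") {
		fmt.Println("kept:", o.Kind)
	}
}
```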
Webhook (validation)
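// Signature from older controller-runtime releases; newer ones return
// (admission.Warnings, error) and favor the CustomValidator interface.
func (r *App) ValidateCreate() error {
	if r.Spec.Replicas < 1 {
		return fmt.Errorf("replicas must be >= 1")
	}
	return nil
}
→ Runs at admission time, so an invalid object is rejected on kubectl create.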
Mutation webhook
func (r *App) Default() {
	if r.Spec.Image == "" {
		r.Spec.Image = "default:latest"
	}
}
→ Fills in default values automatically.
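The two webhooks above interact: Kubernetes runs mutating admission before validating admission, so validation sees the defaulted object. The standard-library sketch below mirrors that order. `AppSpec`, `applyDefaults`, `validate`, and `admit` are hypothetical names for illustration, not controller-runtime APIs.

```go
package main

import (
	"errors"
	"fmt"
)

// AppSpec is a hypothetical spec with the two fields used in the webhooks above.
type AppSpec struct {
	Image    string
	Replicas int
}

// applyDefaults plays the role of the mutating webhook.
func applyDefaults(s *AppSpec) {
	if s.Image == "" {
		s.Image = "default:latest"
	}
}

// validate plays the role of the validating webhook.
func validate(s AppSpec) error {
	if s.Replicas < 1 {
		return errors.New("replicas must be >= 1")
	}
	return nil
}

// admit mimics the admission order: mutate first, then validate the mutated object.
func admit(s AppSpec) (AppSpec, error) {
	applyDefaults(&s)
	if err := validate(s); err != nil {
		return AppSpec{}, err
	}
	return s, nil
}

func main() {
	ok, err := admit(AppSpec{Replicas: 2})
	fmt.Println(ok.Image, ok.Replicas, err)

	_, err = admit(AppSpec{Image: "app:1.0", Replicas: 0})
	fmt.Println(err)
}
```

Because defaulting runs first, a validation rule can assume that defaulted fields such as the image are always populated.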
Watch
return ctrl.NewControllerManagedBy(mgr).
	For(&appsv1alpha1.App{}).
	Owns(&appsv1.Deployment{}).
	Owns(&corev1.Service{}).
	Complete(r)
→ Reconcile runs when the App, or a Deployment or Service it owns, changes.
Real-world operators
- prometheus-operator: deploys and configures Prometheus automatically
- cert-manager: automated TLS certificates (Let's Encrypt)
- postgres-operator (Zalando, Crunchy)
- strimzi-kafka-operator
- istio-operator
- argocd-operator
- velero (backup)
- external-secrets-operator
Operator vs Helm
Helm:
- Templating (one deploy, then static)
- Complex changes are manual
Operator:
- Continuous reconcile
- Self-healing
- Domain-specific logic
→ Stateful workloads (DB, message queue): operator.
Stateless apps: Helm.
Helm + Operator together
The operator itself is installed as a Helm chart.
- helm install postgres-operator ...
- After that, users only write PostgresCluster CRs.
Levels (capability)
Level 1: Basic install
Level 2: Seamless upgrade
Level 3: Full lifecycle (backup, restore)
Level 4: Deep insights (metric, alert)
Level 5: Auto pilot (auto-heal, auto-scale, auto-tune)
→ Mature operators reach Level 4-5.
OperatorHub
operatorhub.io
- Catalog of operators
- One-click install
- Managed by OLM (Operator Lifecycle Manager)
Crossplane (operator-style IaC)
apiVersion: database.aws.crossplane.io/v1beta1
kind: RDSInstance
metadata:
  name: my-db
spec:
  forProvider:
    region: us-east-1
    dbInstanceClass: db.t3.micro
    engine: postgres
    masterUsername: admin
→ AWS resources become K8s CRs. An alternative to Terraform.
KEDA (event-driven autoscaling)
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: kafka-consumer
spec:
  scaleTargetRef:
    name: consumer-deployment
  minReplicaCount: 0
  maxReplicaCount: 100
  triggers:
    - type: kafka
      metadata:
        bootstrapServers: kafka:9092  # required by the Kafka scaler; placeholder address
        topic: my-topic
        consumerGroup: my-group
        lagThreshold: "10"
→ Kafka consumer lag drives the consumer's replica count.
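As a rough illustration of how lag maps to replicas in the ScaledObject above, the sketch below divides total lag by the threshold, rounds up, and clamps to the configured bounds. This is a simplified approximation for intuition only: `desiredReplicas` is a hypothetical function, and the real KEDA/HPA behavior involves further factors (for example partition counts, activation thresholds, and stabilization windows) that are deliberately left out.

```go
package main

import "fmt"

// desiredReplicas approximates lag-based scaling: ceil(totalLag / lagThreshold),
// clamped to [minReplicas, maxReplicas]. It is a simplified model, not KEDA's code.
func desiredReplicas(totalLag, lagThreshold, minReplicas, maxReplicas int64) int64 {
	if totalLag < 0 {
		totalLag = 0
	}
	if lagThreshold <= 0 {
		return minReplicas // invalid threshold: fall back to the minimum
	}
	n := (totalLag + lagThreshold - 1) / lagThreshold // ceiling division
	if n < minReplicas {
		n = minReplicas
	}
	if n > maxReplicas {
		n = maxReplicas
	}
	return n
}

func main() {
	// Values from the ScaledObject above: lagThreshold 10, min 0, max 100.
	for _, lag := range []int64{0, 5, 95, 5000} {
		fmt.Printf("lag=%d replicas=%d\n", lag, desiredReplicas(lag, 10, 0, 100))
	}
}
```

With `minReplicaCount: 0`, zero lag maps to zero replicas in this model, which is the scale-to-zero behavior that the minimum of 0 makes possible.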
When to write your own operator?
✓ Domain-specific needs (your own product becomes K8s-native).
✓ Complex lifecycle (DB, message queue).
✓ Auto-heal / auto-scale.
✗ Simple deploys (Helm is enough).
✗ A pre-built operator already covers it (cert-manager, etc.).
Test
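// envtest (kubebuilder) starts a local API server and etcd for integration tests
func TestReconcile(t *testing.T) {
	env := &envtest.Environment{} // set CRDDirectoryPaths etc. for your project
	cfg, err := env.Start()
	if err != nil {
		t.Fatal(err)
	}
	defer env.Stop()
	_ = cfg // build a client from cfg, create a CR, then check its children
}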
Production tips
- Idempotent reconcile (safe to retry).
- User-friendly Status.Conditions.
- Finalizers for cleanup.
- Owner references for cascading delete.
- Resource limits (for the operator itself).
- Leader election (HA).
- Metrics / logs.
🤔 Decision criteria
| Situation | Recommendation |
|---|---|
| Stateful complex (DB) | Operator |
| Simple deploy | Helm |
| Cloud resource | Crossplane |
| Event-driven scale | KEDA |
| Off-the-shelf | OperatorHub |
| Custom domain | Build own (kubebuilder) |
| Backup / restore | Velero (operator) |
❌ Anti-patterns
- Non-idempotent reconcile: state corruption.
- No finalizer: cleanup never runs.
- No owner reference: orphaned resources.
- Status never updated: users cannot see what is happening.
- Webhook failure blocks creates: risky, so run the webhook in HA.
- No leader election: replicas race each other.
- Everything as an operator: simple cases belong in Helm.
🤖 LLM usage hints
- Operator = CRD + controller (reconcile loop).
- kubebuilder / operator-sdk generate the boilerplate.
- Stateful workloads (DB) are the sweet spot.
- Crossplane / KEDA are the modern options.