2nd/10_Wiki/Topics/Coding/MLOps_Feature_Store.md
2026-05-09 22:47:42 +09:00

id: mlops-feature-store
title: Feature Store — Feast / Tecton / online & offline
category: Coding
status: draft
source_trust_level: B
verification_status: conceptual
created_at: 2026-05-09
updated_at: 2026-05-09
tags: mlops, feature-store, vibe-coding
tech_stack: language: Python; applicable_to: AI, Backend
applied_in:
aliases: feature store, Feast, Tecton, online store, offline store, feature reuse

Feature Store

A central registry for ML features. It provides train/serve consistency, low-latency online lookups, and point-in-time-correct offline retrieval. Feast (open source) / Tecton (managed).

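📖 Key concepts

  • Online store: fast lookups at serving time (Redis / DynamoDB).
  • Offline store: historical data for training (Parquet / Snowflake).
  • Time-travel: feature values as of a past point in time.
  • Reuse: define once, use across multiple models.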

💻 Code patterns

Feast definition

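# features.py
# Uses the Field / schema API of recent Feast releases
# (the older Feature / ValueType style is deprecated).
from datetime import timedelta

from feast import Entity, FeatureView, Field, FileSource
from feast.types import Float32, Int32

# Offline source; the path and timestamp column are placeholders.
parquet_source = FileSource(
    path='data/user_features.parquet',
    timestamp_field='event_timestamp',
)

user = Entity(name='user', join_keys=['user_id'])

user_features = FeatureView(
    name='user_features',
    entities=[user],
    ttl=timedelta(days=1),
    schema=[
        Field(name='age', dtype=Int32),
        Field(name='total_spent', dtype=Float32),
        Field(name='days_active', dtype=Int32),
    ],
    source=parquet_source,
)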

Registration

feast apply
# → Registers the definitions and creates the online + offline schema

Materialize (offline → online)

feast materialize-incremental $(date -u +"%Y-%m-%dT%H:%M:%S")
# → Loads the latest feature values into the online store (Redis)

→ Run daily from cron / Airflow.
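
What an incremental materialization run does can be sketched without Feast. This is a conceptual toy, not Feast's implementation: `toy_materialize`, the row layout, and the dict used as an online store are all hypothetical. Each run copies the newest value per entity from the time window since the previous run.

```python
from datetime import datetime

def toy_materialize(offline_rows, online_store, last_run, end):
    """Copy the newest value per entity from (last_run, end] into online_store."""
    for row in sorted(offline_rows, key=lambda r: r["event_timestamp"]):
        if last_run < row["event_timestamp"] <= end:
            # Rows are visited oldest first, so the newest value wins.
            online_store[row["user_id"]] = {"total_spent": row["total_spent"]}
    return end  # becomes last_run for the next scheduled run

# Hypothetical offline rows (what would live in Parquet).
offline_rows = [
    {"user_id": 123, "event_timestamp": datetime(2026, 5, 1), "total_spent": 80.0},
    {"user_id": 123, "event_timestamp": datetime(2026, 5, 3), "total_spent": 100.5},
    {"user_id": 456, "event_timestamp": datetime(2026, 5, 2), "total_spent": 12.0},
    {"user_id": 456, "event_timestamp": datetime(2026, 5, 9), "total_spent": 30.0},
]

online_store = {}
last_run = toy_materialize(
    offline_rows, online_store,
    last_run=datetime(2026, 4, 30), end=datetime(2026, 5, 5),
)
```

Calling it again with the returned `last_run` and a later `end` picks up only the rows that arrived in between, which is why the job can run on a daily schedule.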

Online get (serving)

from feast import FeatureStore
store = FeatureStore(repo_path='.')

features = store.get_online_features(
    features=['user_features:age', 'user_features:total_spent'],
    entity_rows=[{'user_id': 123}],
).to_dict()
# e.g. {'user_id': [123], 'age': [25], 'total_spent': [100.5]}

→ With Redis as the backend, lookups take milliseconds.

Historical get (training)

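import pandas as pd

# Example event times; each row is joined to the feature values valid at that time.
t1, t2, t3 = pd.to_datetime(['2026-05-01', '2026-05-02', '2026-05-03'], utc=True)
entity_df = pd.DataFrame({
    'user_id': [123, 456, 789],
    'event_timestamp': [t1, t2, t3],
})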

train_df = store.get_historical_features(
    entity_df=entity_df,
    features=['user_features:age', 'user_features:total_spent'],
).to_df()

→ Point-in-time correct: user 123 gets its feature values as of t1.

Train / serve consistency

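# Train (offline)
df = store.get_historical_features(...).to_df()
model.fit(df[feature_cols], df[label_col])  # feature_cols / label_col: your column names

# Serve (online)
features = store.get_online_features(...).to_dict()
pred = model.predict(pd.DataFrame(features)[feature_cols])  # same columns, same order

# → Same transformation, same schema = consistent.

→ This is the biggest benefit.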
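
The idea behind train/serve consistency can be shown with one shared transformation. This is a minimal sketch, assuming a hypothetical `build_features` function and a made-up `log_total_spent` feature; neither comes from Feast. Both paths call the same code, so the model sees identical values and column order.

```python
import math

FEATURE_ORDER = ["age", "log_total_spent"]  # hypothetical model schema

def build_features(raw):
    """Single definition of the transformation, shared by training and serving."""
    return {
        "age": int(raw["age"]),
        "log_total_spent": math.log1p(float(raw["total_spent"])),
    }

def to_vector(features):
    """Fix the column order so both paths feed the model the same schema."""
    return [features[name] for name in FEATURE_ORDER]

# Training path: rows as they would come from the offline store.
train_rows = [{"age": 25, "total_spent": 100.5}, {"age": 40, "total_spent": 0.0}]
X_train = [to_vector(build_features(r)) for r in train_rows]

# Serving path: one row as it would come from the online store.
online_row = {"age": 25, "total_spent": 100.5}
x_serve = to_vector(build_features(online_row))
```

If training and serving each had their own copy of `build_features`, a change to one copy would silently skew predictions.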

Time-travel join

Feature: user_total_spent (changes over time)
Event: 2026-05-01 user 123 click

→ get historical = "user 123's total spent as of 2026-05-01" (later changes are excluded)

→ Prevents data leakage.
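
The as-of lookup behind a time-travel join can be sketched with the standard library. The history values and the `value_as_of` helper are hypothetical; a real feature store does this join in bulk inside the offline store.

```python
from bisect import bisect_right
from datetime import datetime

# Hypothetical history of user 123's total_spent: (valid_from, value), sorted by time.
history = [
    (datetime(2026, 4, 20), 50.0),
    (datetime(2026, 4, 28), 80.0),
    (datetime(2026, 5, 3), 100.5),  # recorded after the click below
]

def value_as_of(history, ts):
    """Return the latest value whose timestamp is <= ts, or None if none exists."""
    timestamps = [t for t, _ in history]
    idx = bisect_right(timestamps, ts)
    return history[idx - 1][1] if idx else None

click_time = datetime(2026, 5, 1)
spent_at_click = value_as_of(history, click_time)
```

Here `spent_at_click` is the value from 2026-04-28, not the later 2026-05-03 one, so the training row never sees information from after the event.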

Tecton (managed)

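from datetime import timedelta
from tecton import Aggregation, stream_feature_view

# kafka_source and user are defined elsewhere in the Tecton repo.
@stream_feature_view(
    source=kafka_source,
    entities=[user],
    mode='spark_sql',
    aggregations=[
        Aggregation(column='amount', function='sum', time_window=timedelta(days=1)),
    ],
)
def user_daily_spend(events):
    return f"SELECT user_id, amount, ts FROM {events}"

→ Supports streaming + windowed aggregations.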

Real-time aggregation

# Streaming feature
@stream_feature_view(
    source=kafka,
    aggregations=[
        Aggregation(column='clicks', function='count', time_window=timedelta(hours=1)),
        Aggregation(column='clicks', function='count', time_window=timedelta(days=1)),
    ],
)
def user_clicks(events): ...

→ "Clicks in the last hour" is maintained automatically.
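
What a streaming engine maintains for such a window can be sketched in plain Python. This is a toy, not how Tecton implements it: the `WindowCounter` class is hypothetical and assumes events arrive in timestamp order.

```python
from collections import deque
from datetime import datetime, timedelta

class WindowCounter:
    """Count events inside a sliding time window (now - window, now]."""

    def __init__(self, window):
        self.window = window
        self.events = deque()  # event timestamps, oldest first

    def add(self, ts):
        self.events.append(ts)

    def count(self, now):
        # Evict events that have fallen out of the window.
        while self.events and self.events[0] <= now - self.window:
            self.events.popleft()
        return len(self.events)

clicks_1h = WindowCounter(timedelta(hours=1))
t0 = datetime(2026, 5, 1, 12, 0)
for minutes in (0, 10, 50):
    clicks_1h.add(t0 + timedelta(minutes=minutes))

count_at_1255 = clicks_1h.count(t0 + timedelta(minutes=55))
count_at_1305 = clicks_1h.count(t0 + timedelta(minutes=65))
```

At 12:55 all three clicks are inside the window; by 13:05 the 12:00 click has expired and only two remain.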

Composition

# Combine
@feature_view(...)
def user_combined(user_features, item_features):
    return user_features.join(item_features, on='user_id')

Feature versioning

@feature_view(version='v2')
def user_features(...): ...

# v1 and v2 coexist; each model uses the version it needs.

Push (real-time)

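# Right after the event occurs (Feast's push takes a DataFrame;
# 'user_clicks' must be the name of a registered PushSource)
row = {'user_id': 123, 'clicks': 5, 'event_timestamp': pd.Timestamp.now(tz='UTC')}
store.push('user_clicks', pd.DataFrame([row]))

→ The online store is updated immediately.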

Drift (data validation)

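# Great Expectations + Feast
# Pseudocode: illustrates the idea only, this is not an actual Feast API.
# Feast's experimental validation support lives under feast.dqm; check the docs for your version.
@feature_view(...)
class UserFeatures:
    age = Feature(
        dtype=ValueType.INT32,
        expectations=[expect_column_values_to_be_between('age', 0, 120)],
    )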
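
The range expectation itself is simple enough to sketch without any library. The `out_of_range` helper and the sample batch are hypothetical; the 0 to 120 bounds for `age` are the ones used above.

```python
def out_of_range(rows, column, low, high):
    """Return the rows whose `column` is missing or outside [low, high]."""
    return [
        row for row in rows
        if row.get(column) is None or not (low <= row[column] <= high)
    ]

# Hypothetical batch about to be written to the feature store.
batch = [
    {"user_id": 123, "age": 25},
    {"user_id": 456, "age": 131},   # outside the 0-120 range
    {"user_id": 789, "age": None},  # missing value
]

bad_rows = out_of_range(batch, "age", 0, 120)
```

A pipeline could refuse to materialize the batch, or alert, whenever `bad_rows` is not empty.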

Cost

Online: Redis / DynamoDB, cost grows with read volume.
Offline: Parquet on S3, cheap.

Tecton: managed, expensive, suited to large teams.
Feast: open source, you run the infrastructure yourself.

Hopsworks (alternative)

- Free + open
- Streaming + batch
- Built-in model registry

Vertex AI Feature Store

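from google.cloud import aiplatform_v1

# Online serving needs the regional endpoint, e.g.
# client_options={'api_endpoint': '<region>-aiplatform.googleapis.com'}.
client = aiplatform_v1.FeaturestoreOnlineServingServiceClient()

# entity_id and feature_selector are request fields, so build a request object.
request = aiplatform_v1.ReadFeatureValuesRequest(
    entity_type='projects/.../entityTypes/user',
    entity_id='123',
    feature_selector=aiplatform_v1.FeatureSelector(
        id_matcher=aiplatform_v1.IdMatcher(ids=['age', 'total_spent']),
    ),
)
response = client.read_feature_values(request=request)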

SageMaker Feature Store

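import boto3
import sagemaker
from sagemaker.feature_store.feature_group import FeatureGroup

session = sagemaker.Session()
fg = FeatureGroup(name='user-features', sagemaker_session=session)
# ... stands for the remaining arguments (e.g. s3_uri, role_arn)
fg.create(record_identifier_name='user_id', event_time_feature_name='ts', ...)

# Online get (runtime client, separate from the FeatureGroup object)
client = boto3.client('sagemaker-featurestore-runtime')
record = client.get_record(
    FeatureGroupName='user-features',
    RecordIdentifierValueAsString='123',
)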

Direct DB (no Feast)

-- The materialized view is the single source of truth.
CREATE MATERIALIZED VIEW user_features AS
SELECT
  user_id,
  age,
  COUNT(orders.user_id) AS order_count,  -- counts matched orders; 0 for users with none
  SUM(amount) AS total_spent
FROM users LEFT JOIN orders USING (user_id)
GROUP BY user_id, age;

-- Train: SELECT * FROM user_features WHERE ts < ?  (assumes a ts column is added to the view)
-- Serve: SELECT * FROM user_features WHERE user_id = ?

→ Sufficient for a small ML system.

Feature reuse

Three models use the same 'user_total_spent'.
- Defined once
- Each model references it

→ Change it in one place and every model picks it up.

Naming convention

{entity}_{aggregation}_{time}

user_clicks_1h
user_avg_session_7d
item_views_30d
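
A convention like this can be enforced in CI with a small check. The sketch below is one possible reading of the pattern: the regex, the `parse_feature_name` helper, and the allowed window units (m, h, d) are assumptions, not part of any feature store API.

```python
import re

# {entity}_{aggregation}_{time}; the aggregation part may contain underscores.
NAME_PATTERN = re.compile(
    r"^(?P<entity>[a-z]+)_"
    r"(?P<aggregation>[a-z]+(?:_[a-z]+)*)_"
    r"(?P<window>\d+[mhd])$"
)

def parse_feature_name(name):
    """Split a feature name into its parts, or return None if it breaks the convention."""
    match = NAME_PATTERN.match(name)
    return match.groupdict() if match else None

parsed = {
    name: parse_feature_name(name)
    for name in ["user_clicks_1h", "user_avg_session_7d", "item_views_30d", "UserClicks"]
}
```

Names without a time window, such as `user_total_spent`, do not match this pattern, so non-windowed features would need either an exception or a looser rule.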

Consistency checks

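# Compare the distributions of training data and production data
import pandas as pd
from scipy.stats import ks_2samp

train_age = pd.read_parquet('train.parquet')['age']
prod_age = client.fetch_recent_features('age', n=10000)  # client: hypothetical serving-log accessor

# Fail if the two samples differ significantly (KS test at the 1% level)
assert ks_2samp(train_age, prod_age).pvalue > 0.01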
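
For intuition, the two-sample KS statistic is just the largest gap between the two empirical CDFs. The sketch below computes it with the standard library and uses the usual large-sample critical value instead of an exact p-value, so it is an approximation rather than a replacement for `scipy.stats.ks_2samp`. The sample data and helper names are hypothetical.

```python
import math
from bisect import bisect_right

def ks_statistic(a, b):
    """Two-sample KS statistic: the maximum gap between the two empirical CDFs."""
    a, b = sorted(a), sorted(b)
    return max(
        abs(bisect_right(a, x) / len(a) - bisect_right(b, x) / len(b))
        for x in a + b
    )

def distributions_differ(a, b, alpha=0.01):
    """Large-sample test: reject when D > c(alpha) * sqrt((n + m) / (n * m))."""
    n, m = len(a), len(b)
    c_alpha = math.sqrt(-math.log(alpha / 2) / 2)
    return ks_statistic(a, b) > c_alpha * math.sqrt((n + m) / (n * m))

# Hypothetical samples: ages 20-59, and the same ages shifted up by 15 years.
train_age = list(range(20, 60)) * 5
prod_age_shifted = [age + 15 for age in train_age]

drift_detected = distributions_differ(train_age, prod_age_shifted)
```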

When you don't need one

- 1 model + 1 simple feature
- POC / small demo
- Only real-time stateless features (input → pred)

🤔 Decision criteria

| Situation | Recommendation |
|---|---|
| Small / 1-2 models | Direct DB / materialized view |
| Open source / self-hosted | Feast |
| Streaming + windowed | Tecton / Hopsworks |
| GCP | Vertex AI |
| AWS | SageMaker |
| Minute-level real-time | Streaming (Tecton / Hopsworks) |
| Daily batch | Feast + cron |

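Anti-patterns

  • Train / serve schema mismatch: silent errors.
  • No time-travel: data leakage.
  • No online TTL: stale features get served.
  • No materialization: high serving latency.
  • Scattered feature definitions: drift.
  • Push and batch paths with different logic: unintended values.
  • Ignoring privacy: PII ends up in the store.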
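
The TTL point can be made concrete with a toy read path. The `read_with_ttl` helper and the dict-based store are hypothetical; the sketch only shows the rule that a record older than the TTL is treated as missing rather than served.

```python
from datetime import datetime, timedelta

def read_with_ttl(online_store, user_id, now, ttl):
    """Return the stored features, or None if missing or older than ttl."""
    record = online_store.get(user_id)
    if record is None or now - record["event_timestamp"] > ttl:
        return None  # caller falls back to a default instead of a stale value
    return record["features"]

# Hypothetical online store contents.
online_store = {
    123: {"event_timestamp": datetime(2026, 5, 9, 8, 0), "features": {"total_spent": 100.5}},
    456: {"event_timestamp": datetime(2026, 5, 1, 8, 0), "features": {"total_spent": 12.0}},
}

now = datetime(2026, 5, 9, 20, 0)
fresh = read_with_ttl(online_store, 123, now, ttl=timedelta(days=1))
stale = read_with_ttl(online_store, 456, now, ttl=timedelta(days=1))
```

User 123 was updated 12 hours ago and is returned; user 456 was last updated 8 days ago and is dropped.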

🤖 Hints for LLM use

  • A feature store is the answer to train/serve consistency.
  • Time-travel = data leakage prevention.
  • For a small system, a materialized view is enough.
  • Use Tecton when streaming + windowed aggregation is needed.

🔗 Related documents