What this post covers

How to layer Feast on top of Feature Store-lite and, with an S3 Offline + Redis Online + Feature Server setup, extend a "pipeline that stores" into a "queryable feature platform".

Prerequisites


The problem this stage solves

The previous post (Feature Store-lite) locked down the contract (schema/metadata), versioned storage, and reproducibility.

In practice, though, the work does not end at "storing". What ultimately matters is reaching a queryable (serving-ready) state.

This post layers Feast on top of Feature Store-lite to complete the following:

  • Offline Source: latest/features.parquet on S3 (the fixed pointer that Feast reads)
  • Registry: registry.pb stored on S3 (separated per environment)
  • Online Store: features loaded into Redis (materialize) so they can be looked up online
  • Feature Server: an always-on service that runs feast apply at startup

In short, this is the stage that extends a "pipeline that stores" into a "queryable feature platform".


🎯 Completion criteria

  • Deploy the Feast contract (repo.py/feature_store.yaml) via GitOps
  • Have Airflow store to S3, keeping each version and overwriting latest
  • Run feast materialize -> load into the Redis online store
  • Confirm with the Feast SDK that an online lookup returns actual values rather than None

1️⃣ Overall architecture

[Figure: overall architecture diagram (mermaid-feast-01.png)]


2️⃣ Code/resource tree

(A) GitOps (mlops-infra-gitops)

charts/
  feast/
    templates/
      feast-repo-configmap.yaml
      feast-server-deployment.yaml
      feast-server-service.yaml
      redis-deployment.yaml
      redis-service.yaml
    values/
      base.yaml
      dev.yaml
      prod.yaml

apps/
  feast-dev.yaml
  feast-prod.yaml
# ... (sealed-secrets omitted)

(B) Airflow (airflow-dags-dev)

dags/
  dag_data_pipeline_daily_dev_v5.py
  mlops_lib/
    dp/
      build.py
      store.py
  .airflowignore

3️⃣ Design points (operations perspective)

(1) Treat the Feast repo itself as the "contract"

In Feast, the effective contract is repo.py + feature_store.yaml. Deploying these two files as a ConfigMap through GitOps means:

  • dev and prod keep the same contract
  • deploys and rollbacks need no code changes
  • the contract can be tracked as an "operational resource"

(2) The S3 storage policy comes down to "version + latest pointer"

  • .../<feature_set>/<version>/ : reproducibility
  • .../<feature_set>/latest/ : operational convenience

Because the Feast offline source only ever reads latest, consumers do not need to know the version. Versions are pulled out only when a run has to be reproduced, which keeps the layout robust.
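
The "version + latest pointer" rule can be sketched as a small path resolver. This is a minimal illustration, not the author's code: the function name feature_prefix and the example bucket and version strings are hypothetical.

```python
from typing import Optional


def feature_prefix(base: str, feature_set: str, version: Optional[str] = None) -> str:
    """Return the S3 prefix for a feature set.

    With no version, resolve to the mutable `latest/` pointer;
    with a version, resolve to the immutable versioned directory.
    """
    leaf = version if version is not None else "latest"
    return f"{base.rstrip('/')}/{feature_set}/{leaf}/"


# Hypothetical values, for illustration only.
BASE = "s3://example-bucket/feature-store"
print(feature_prefix(BASE, "user_features"))                # serving path
print(feature_prefix(BASE, "user_features", "2026-01-01"))  # reproduction path
```

Everyday readers call it without a version; only a reproduction job needs to pass one.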

(3) Airflow only generates and stores; Feast handles online loading

Once feast/pyarrow/s3fs start piling up in the Airflow image, operational difficulty rises sharply. So the roles are split:

  • Airflow: feature generation + S3 storage (version/latest)
  • Feast: materialize + loading into the online store (Redis) + lookup

4️⃣ GitOps: Feast deployment (Redis + Feature Server + Repo ConfigMap)

4-1) values (base/dev/prod)

# charts/feast/values/base.yaml
aws:
  region: ap-northeast-2
  credentialsSecretName: aws-credentials

s3:
  bucket: datapipeline-raw-data-keonho
  featurePrefix: feature-store/user_features
  registryPrefix: feast

feastServer:
  image: hoizz/feast-server:0.40.1-s3fs
  port: 6566

redis:
  image: redis:7.2-alpine
  persistence: true

4-2) Feast repo ConfigMap

apiVersion: v1
kind: ConfigMap
metadata:
  name: feast-repo
  namespace: {{ .Release.Namespace }}
data:
  feature_store.yaml: |
    project: feature_store_{{ .Values.env }}
    registry:
      path: s3://{{ .Values.s3.bucket }}/{{ .Values.s3.registryPrefix }}/{{ .Values.env }}/registry.pb
    provider: local
    offline_store:
      type: file
    online_store:
      type: redis
      connection_string: redis.{{ .Release.Namespace }}.svc.cluster.local:6379
    entity_key_serialization_version: 2

  repo.py: |
    from datetime import timedelta
    from feast import Entity, FeatureView, Field
    from feast.infra.offline_stores.file_source import FileSource
    from feast.types import Int64, Float64

    user_features_source = FileSource(
        path="s3://{{ .Values.s3.bucket }}/{{ .Values.s3.featurePrefix }}/latest/features.parquet",
        timestamp_field="event_timestamp",
    )
    # ... (Entity, FeatureView definitions omitted)

4-3) Feast server Deployment (subPath + sh compatibility + startup apply)

apiVersion: apps/v1
kind: Deployment
metadata:
  name: feast-server
  namespace: {{ .Release.Namespace }}
spec:
  replicas: 1
  selector:
    matchLabels:
      app: feast-server
  template:
    metadata:
      labels:
        app: feast-server  # must match spec.selector.matchLabels
    spec:
      containers:
        - name: feast-server
          image: {{ .Values.feastServer.image }}
          volumeMounts:
            - name: feast-repo
              mountPath: /feast-repo/feature_store.yaml
              subPath: feature_store.yaml
              readOnly: true
            - name: feast-repo
              mountPath: /feast-repo/repo.py
              subPath: repo.py
              readOnly: true
            # ... (aws-credentials mount omitted)
          command: ["/bin/sh","-c"]
          args:
            - |
              set -eux
              cd /feast-repo
              feast apply
              feast serve --host 0.0.0.0 --port 6566
      # ... (volumes omitted)

5️⃣ Prepare a Feast image that includes s3fs (required)

feast materialize needs s3fs in order to read the parquet file on S3.

# Dockerfile
FROM feastdev/feature-server:0.40.1
RUN pip install --no-cache-dir s3fs fsspec

# Build and push (shell)
docker build -t hoizz/feast-server:0.40.1-s3fs .
docker push hoizz/feast-server:0.40.1-s3fs

6️⃣ Airflow: "versioned storage + latest overwrite" (parquet-based)

Because Feast reads its offline source from the fixed path latest/features.parquet, Airflow has to refresh latest on every run.

(1) DAG: responsible for the flow only

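# dags/dag_data_pipeline_daily_dev_v5.py (summary)
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator

# build_features, store_features, and DEFAULT_ARGS are defined elsewhere
# (not shown in this summary).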
with DAG(
    dag_id="data_pipeline_daily_dev_v5",
    default_args=DEFAULT_ARGS,
    start_date=datetime(2026, 1, 1),
    schedule="0 0 * * *",
    catchup=False,
    tags=["dp", "feature-store"],
) as dag:
    t_build = PythonOperator(task_id="build_features", python_callable=build_features)
    t_store = PythonOperator(
        task_id="store_features",
        python_callable=store_features,
        # ... (op_kwargs omitted)
    )

    t_build >> t_store

(2) build: parquet/csv + guaranteed event_timestamp

  • Feast requirement: event_timestamp is mandatory
  • Meeting Feast's requirements in the build step keeps the rest of the pipeline stable.
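
One way to guarantee the column at build time is sketched below. It is a standard-library illustration over plain dict rows (the real build step presumably works on a DataFrame). The helper name ensure_event_timestamp, the run-timestamp fallback, and the "naive means UTC" rule are all assumptions rather than Feast requirements or the author's implementation.

```python
from datetime import datetime, timezone
from typing import Any, Dict, List


def ensure_event_timestamp(
    rows: List[Dict[str, Any]], run_ts: datetime
) -> List[Dict[str, Any]]:
    """Return copies of rows that all carry a timezone-aware event_timestamp."""
    if run_ts.tzinfo is None:
        raise ValueError("run_ts must be timezone-aware")
    out = []
    for row in rows:
        ts = row.get("event_timestamp")
        if ts is None:
            ts = run_ts  # assumption: fall back to the pipeline run timestamp
        elif ts.tzinfo is None:
            ts = ts.replace(tzinfo=timezone.utc)  # assumption: naive values are UTC
        out.append({**row, "event_timestamp": ts})
    return out


# Illustrative row; the feature value is made up.
rows = [{"user_id": 1001, "f_total_events_7d": 12}]
fixed = ensure_event_timestamp(rows, datetime(2026, 1, 1, tzinfo=timezone.utc))
print(fixed[0]["event_timestamp"].isoformat())
```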

(3) store: write version + latest together (4-file set)

  • Write to .../<feature_set>/<version>/ (reproducibility)
  • Overwrite .../<feature_set>/latest/ (operational convenience)
  • Store parquet/csv/schema/metadata together under the same prefix
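
The store step above can be sketched as follows. This is a hedged illustration only: the put callable stands in for whatever S3 client the real store.py uses, and every file name except features.parquet is hypothetical.

```python
from typing import Callable, Dict, List


def store_feature_set(
    put: Callable[[str, bytes], None],
    prefix: str,
    version: str,
    files: Dict[str, bytes],
) -> List[str]:
    """Write every file under <prefix>/<version>/, then overwrite <prefix>/latest/."""
    prefix = prefix.rstrip("/")
    written = []
    # Write the versioned copy first so latest never holds data
    # that has no versioned counterpart.
    for target in (version, "latest"):
        for name, body in files.items():
            key = f"{prefix}/{target}/{name}"
            put(key, body)
            written.append(key)
    return written


# In-memory stand-in for S3, for illustration only.
fake_s3: Dict[str, bytes] = {}
files = {
    "features.parquet": b"...",
    "features.csv": b"...",   # hypothetical file name
    "schema.json": b"...",    # hypothetical file name
    "metadata.json": b"...",  # hypothetical file name
}
keys = store_feature_set(fake_s3.__setitem__, "feature-store/user_features", "v1", files)
print(len(keys), "objects written")
```

Writing the versioned directory before latest is a design choice of this sketch: if the run fails midway, latest still holds the previous complete set.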

7️⃣ Feast operations/verification routine (success checks)

7-1) Check that s3fs is installed

POD=$(kubectl -n feature-store-dev get pod -l app=feast-server \
  -o jsonpath='{.items[0].metadata.name}')
kubectl -n feature-store-dev exec -it "$POD" -- sh -lc '
python - << "PY"
import s3fs, fsspec
print("s3fs OK:", s3fs.__version__)
PY
'

7-2) Run materialize (example: the last 2 days)

kubectl -n feature-store-dev exec -it "$POD" -- sh -lc '
cd /feast-repo
feast materialize \
  "$(date -u -d "2 days ago" +%Y-%m-%dT%H:%M:%S)" \
  "$(date -u +%Y-%m-%dT%H:%M:%S)"
'
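
The shell snippet above relies on GNU date's -d option, which may be missing in minimal images. As a sketch under that assumption, the same window can be computed in Python; the helper name materialize_command is hypothetical.

```python
from datetime import datetime, timedelta, timezone
from typing import List, Optional


def materialize_command(days: int, now: Optional[datetime] = None) -> List[str]:
    """Build `feast materialize <start> <end>` for the last `days` days, in UTC."""
    # `now` should be timezone-aware; a naive value is treated as local time.
    end = (now or datetime.now(timezone.utc)).astimezone(timezone.utc)
    start = end - timedelta(days=days)
    fmt = "%Y-%m-%dT%H:%M:%S"
    return ["feast", "materialize", start.strftime(fmt), end.strftime(fmt)]


print(materialize_command(2, datetime(2026, 1, 3, tzinfo=timezone.utc)))
# → ['feast', 'materialize', '2026-01-01T00:00:00', '2026-01-03T00:00:00']
```

The resulting list can be passed to subprocess.run(cmd, cwd="/feast-repo", check=True) inside the pod.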

7-3) Verify the online lookup

kubectl -n feature-store-dev exec -it "$POD" -- sh -lc '
python - << "PY"
from feast import FeatureStore
store = FeatureStore("/feast-repo")
resp = store.get_online_features(
  features=["user_features:f_total_events_7d"],
  entity_rows=[{"user_id": 1001}],
)
print(resp.to_df())
PY
'
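
To turn the "not None" completion criterion into an explicit check, one option is to inspect the dictionary form of the response (resp.to_dict() maps each name to a list of values). The helper below is a hypothetical sketch that works on such a dict; the sample values are made up.

```python
from typing import Any, Dict, List


def missing_features(online: Dict[str, List[Any]]) -> List[str]:
    """Return the names of columns that contain at least one None value."""
    return sorted(
        name for name, values in online.items() if any(v is None for v in values)
    )


# Shapes mimic resp.to_dict(); values are illustrative only.
not_loaded = {"user_id": [1001], "f_total_events_7d": [None]}
loaded = {"user_id": [1001], "f_total_events_7d": [12]}
print(missing_features(not_loaded))
print(missing_features(loaded))
```

A non-empty result usually means materialize has not covered the requested entity or time window yet.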

8️⃣ Troubleshooting

  • Mounting the whole ConfigMap tangles the repo directory -> mount only the individual files with subPath
  • pipefail fails under /bin/sh -> standardize on set -eux
  • "Install s3fs" error during materialize -> switch to the custom image that includes s3fs
  • ErrImagePull caused by a typo in the image repo name -> correct the image repo in values/base.yaml

Design rationale (Why This Way?)

With schema and versioning already in place, Feast was added only as the lookup layer, which kept the adoption scope minimal. The dual layout of a fixed latest path plus version directories provides both access to the newest features and reproducibility without any change to the Feast configuration. Separating the roles of Airflow and Feast isolates image dependency conflicts and limits the blast radius of failures.


Read next

→ MLflow Model Registry: alias-based model version management (promotion/shadow alias strategy)