Goals
- Set up an MLflow Tracking Server
- Log experiments, parameters, metrics, and artifacts
- Register the model → move it to a Stage → integrate it with an API
Lab code: GitHub (Mlflow - Tracking + FastAPI)
Lab flow at a glance
[Step 1] Set up the MLflow Tracking Server (run locally)
[Step 2] Run an experiment (train.py) → train and log the model
[Step 3] Register the model and set its Stage (move to Production)
[Step 4] Integrate FastAPI → serve a prediction API
Example lab directory
mlops-mlflow/
├── app/
│   ├── train.py        # Model training and experiment logging
│   └── model.pkl       # Saved model
├── mlruns/             # Experiment data, created automatically
├── fastapi_app/
│   └── app.py          # FastAPI prediction API
├── Dockerfile          (optional)
└── README.md
[Step 1] Install and run MLflow
Virtual environment setup
# 1. Install venv
sudo apt install python3-venv -y
# 2. Create the virtual environment
python3 -m venv .venv
# 3. Activate the virtual environment
source .venv/bin/activate
# 4. Install the packages
pip install mlflow scikit-learn pandas fastapi uvicorn
# 5. Leave the virtual environment (only when you are done working)
deactivate
Run the MLflow server
mlflow ui --port 5000 # Open the UI at http://localhost:5000
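Before running the later steps it can help to confirm that the server is actually listening. The following is a minimal sketch using only the standard library; it assumes the server exposes the `/health` route that recent MLflow versions provide, and `tracking_server_is_up` is a hypothetical helper name, not part of MLflow.

```python
import urllib.error
import urllib.request


def tracking_server_is_up(base_url: str = "http://localhost:5000",
                          timeout: float = 2.0) -> bool:
    """Return True if GET <base_url>/health answers with HTTP 200."""
    url = base_url.rstrip("/") + "/health"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, or an HTTP error status.
        return False


if __name__ == "__main__":
    print("MLflow server reachable:", tracking_server_is_up())
```

If this prints `False`, check that `mlflow ui` is still running in another terminal and that the port matches.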
[Step 2] Run an experiment (train.py)
# app/train.py
import mlflow
import mlflow.sklearn
from sklearn.ensemble import RandomForestClassifier
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

# MLflow settings: tracking server address and experiment name
mlflow.set_tracking_uri("http://localhost:5000")
mlflow.set_experiment("iris-rf-exp")

with mlflow.start_run():
    # Load the data and hold out 20% of it for evaluation
    iris = load_iris()
    X_train, X_test, y_train, y_test = train_test_split(iris.data, iris.target, test_size=0.2)

    # Train the model and measure accuracy on the held-out set
    clf = RandomForestClassifier(n_estimators=100, max_depth=3)
    clf.fit(X_train, y_train)
    acc = clf.score(X_test, y_test)

    # Log the parameters, the metric, and the model artifact to this run
    mlflow.log_param("n_estimators", 100)
    mlflow.log_param("max_depth", 3)
    mlflow.log_metric("accuracy", acc)
    mlflow.sklearn.log_model(clf, "model")

# Run the experiment
python app/train.py
When the run finishes, the experiment records and the model are saved under the mlruns/ folder.
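The `accuracy` metric logged by train.py is what `clf.score` returns for a classifier: the fraction of test samples whose predicted label matches the true label. A minimal pure-Python sketch of that calculation follows; the `accuracy` helper and the labels are made up for illustration and are not results from this lab.

```python
def accuracy(y_true, y_pred):
    """Fraction of positions where the predicted label equals the true label."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one sample is required")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


# Made-up labels for illustration only (not output from train.py).
y_true = [0, 0, 1, 1, 2, 2, 2, 1]
y_pred = [0, 0, 1, 2, 2, 2, 2, 1]
print(accuracy(y_true, y_pred))  # 7 of 8 correct -> 0.875
```

Because train.py does not fix a random seed for the split, the logged accuracy can differ from run to run.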
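[Step 3] Register the model and set its Stage
- Open http://localhost:5000 in a browser
- In the experiment, open the Run that was created → click "Register Model"
- Model Registry → register the model as iris-rf
- Set the model's Stage (Production)
- Load it with a URI of the form models:/iris-rf@production. The @production suffix resolves a registry alias named production, so that alias must be assigned to the model version; a Stage is addressed as models:/iris-rf/Production instead.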
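MLflow model URIs use different separators for aliases and for stages or versions, which is easy to mix up. The sketch below builds each form for the `iris-rf` model; `model_uri` is a hypothetical helper written for this illustration, not an MLflow function.

```python
from typing import Optional


def model_uri(name: str, *, alias: Optional[str] = None,
              stage: Optional[str] = None,
              version: Optional[int] = None) -> str:
    """Build a models:/ URI from exactly one of alias, stage, or version."""
    chosen = [v for v in (alias, stage, version) if v is not None]
    if len(chosen) != 1:
        raise ValueError("pass exactly one of alias, stage, or version")
    if alias is not None:
        # Aliases are attached with "@".
        return f"models:/{name}@{alias}"
    # Stages and version numbers share the slash-separated form.
    return f"models:/{name}/{stage if stage is not None else version}"


print(model_uri("iris-rf", alias="production"))  # models:/iris-rf@production
print(model_uri("iris-rf", stage="Production"))  # models:/iris-rf/Production
print(model_uri("iris-rf", version=1))           # models:/iris-rf/1
```

Whichever form is used, the string passed to `mlflow.pyfunc.load_model` has to match how the model version was actually labeled in the registry.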
[Step 4] Serve a prediction API with FastAPI
# fastapi_app/app.py
from typing import List

import mlflow
import mlflow.pyfunc
import pandas as pd
from fastapi import FastAPI
from pydantic import BaseModel

# Point the MLflow client at the tracking server
mlflow.set_tracking_uri("http://localhost:5000")

# Create the FastAPI instance
app = FastAPI()

# Load the MLflow model through its registry alias
model = mlflow.pyfunc.load_model("models:/iris-rf@production")


# Define the input data structure
class InputData(BaseModel):
    features: List[float]  # takes the four feature values


# Prediction API endpoint
@app.post("/predict")
def predict(data: InputData):
    # Wrap the single sample in a one-row DataFrame for the pyfunc model
    input_df = pd.DataFrame([data.features], columns=["sepal_length", "sepal_width", "petal_length", "petal_width"])
    pred = model.predict(input_df)
    return {"prediction": int(pred[0])}

# Run the FastAPI server (from the mlops-mlflow/ directory)
uvicorn fastapi_app.app:app --reload --port 8000
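The endpoint above accepts a list of any length, and a list that does not hold exactly four values would only fail later, when the DataFrame is built. The following is a minimal sketch of an explicit check, written in plain Python so it runs without FastAPI; `validate_features` and `FEATURE_NAMES` are hypothetical names introduced for this illustration.

```python
import math

# Column order used when building the DataFrame in app.py.
FEATURE_NAMES = ["sepal_length", "sepal_width", "petal_length", "petal_width"]


def validate_features(features):
    """Return the features as a list of floats, or raise ValueError."""
    if not isinstance(features, (list, tuple)):
        raise ValueError("features must be a list")
    if len(features) != len(FEATURE_NAMES):
        raise ValueError(
            f"expected {len(FEATURE_NAMES)} values, got {len(features)}"
        )
    cleaned = []
    for name, value in zip(FEATURE_NAMES, features):
        # bool is a subclass of int, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            raise ValueError(f"{name} must be a number")
        if not math.isfinite(value):
            raise ValueError(f"{name} must be finite")
        cleaned.append(float(value))
    return cleaned


print(validate_features([5.1, 3.5, 1.4, 0.2]))  # [5.1, 3.5, 1.4, 0.2]
```

One way to use it would be to call it at the top of `predict` and convert a `ValueError` into a FastAPI `HTTPException` with status code 422, so that clients get a clear error instead of a server-side failure.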
Test the prediction API
curl -X POST http://localhost:8000/predict \
-H "Content-Type: application/json" \
-d '{"features": [5.1, 3.5, 1.4, 0.2]}'
Prediction result: { "prediction": 0 } → the setosa species
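The same request can be built from Python, and the numeric prediction can be mapped back to a species name. This is a sketch using only the standard library; `build_predict_request`, `species_from_response`, and `SPECIES` are hypothetical names, and the class order is assumed to be the one scikit-learn's `load_iris()` uses (setosa, versicolor, virginica).

```python
import json
import urllib.request

# Class order of scikit-learn's load_iris() target labels.
SPECIES = ["setosa", "versicolor", "virginica"]


def build_predict_request(features, base_url="http://localhost:8000"):
    """Build the POST request that the curl command above sends."""
    body = json.dumps({"features": list(features)}).encode("utf-8")
    return urllib.request.Request(
        base_url.rstrip("/") + "/predict",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def species_from_response(raw: bytes) -> str:
    """Translate the API's JSON response into a species name."""
    index = json.loads(raw)["prediction"]
    return SPECIES[index]


req = build_predict_request([5.1, 3.5, 1.4, 0.2])
print(req.get_method(), req.full_url)  # POST http://localhost:8000/predict

# To actually send it (requires the API from step 4 to be running):
#   with urllib.request.urlopen(req) as resp:
#       print(species_from_response(resp.read()))
print(species_from_response(b'{"prediction": 0}'))  # setosa
```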
Key takeaways
| Item | Description |
|---|---|
| MLflow | Experiment logging, model management, parameter/metric logging |
| FastAPI | Web framework that serves the model as an API |
| Model Registry | Model version management and Stage settings |
| Experiment records | Saved automatically to the mlruns/ directory |
Practical tips
- MLflow adds model versioning and Stage settings, which makes deployment easier
- FastAPI covers everything from quick prototyping to deployment, so together they can form a complete MLOps pipeline
Connecting to real-world MLOps
| Practical situation | How it is handled |
|---|---|
| Model tracking & management | Track models with MLflow; can be integrated with GitOps |
| API integration | Deploy the service with FastAPI and provide a real-time prediction API |
| Model version management | Separate deployment environments through the Model Registry |