Deploying YOLOv12 with Docker: A Comprehensive Guide

Docker is a good fit for deploying deep learning models such as YOLOv12, a recent model in the YOLO object detection family that is available through the Ultralytics package. Instead of fighting CUDA version conflicts, clashing dependencies, or a polluted system environment, you can package the entire inference stack into an isolated, reusable container.

1. Preparing the system

Minimum requirements:

Operating system: Ubuntu 20.04+, Rocky Linux 8+, or Windows 11 (via WSL2)
RAM: 8 GB (16 GB or more recommended for multi-image processing)
Free disk space: 25 GB or more
NVIDIA GPU (optional, but required for accelerated inference)

Install Docker and the GPU tooling:

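# On Ubuntu/Debian
curl -fsSL https://get.docker.com | sh
sudo usermod -aG docker "$USER"
newgrp docker   # opens a new shell with the docker group; logging out and back in also works

# Install the NVIDIA Container Toolkit
# (the older nvidia-docker repository and apt-key are deprecated)
curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | \
  sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg
curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
  sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
  sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
sudo apt-get update && sudo apt-get install -y nvidia-container-toolkit
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker

# Verify that containers can see the GPU
docker run --rm --gpus all nvidia/cuda:12.1.1-runtime-ubuntu22.04 nvidia-smi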

2. Laying out the project

A clear structure makes the project easier to manage:

yolo-v12-deploy/
├── Dockerfile
├── requirements.txt
├── src/
│   ├── inference.py     # Simple inference script
│   └── api_server.py    # FastAPI service
├── weights/             # Model weights
├── samples/             # Sample input images
└── results/             # Output results

Create the directories:

mkdir -p yolo-v12-deploy/{src,weights,samples,results}
cd yolo-v12-deploy
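
Before building, it can help to confirm that nothing in the layout is missing. The following is a minimal sketch of such a check; the helper name `missing_paths` and the `EXPECTED` list are illustrative and simply mirror the tree shown above.

```python
from pathlib import Path

# Entries copied from the project tree above; the check itself is an
# illustrative helper, not part of the original project.
EXPECTED = [
    "Dockerfile",
    "requirements.txt",
    "src/inference.py",
    "src/api_server.py",
    "weights",
    "samples",
    "results",
]


def missing_paths(root: str) -> list[str]:
    """Return the expected entries that do not exist under root."""
    base = Path(root)
    return [rel for rel in EXPECTED if not (base / rel).exists()]
```

Calling `missing_paths(".")` from inside yolo-v12-deploy returns an empty list once every file and directory is in place.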

3. Writing an optimized Dockerfile

Use the official PyTorch runtime image with CUDA 12.1 and cuDNN 8:

FROM pytorch/pytorch:2.1.0-cuda12.1-cudnn8-runtime

WORKDIR /workspace

# Install the system libraries OpenCV needs
RUN apt-get update && \
    apt-get install -y libsm6 libxext6 libglib2.0-0 libgtk-3-0 && \
    rm -rf /var/lib/apt/lists/*

# Copy and install the Python dependencies
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy the source code
COPY src/ ./src/

# Create the data directories
RUN mkdir -p weights samples results

# Set environment variables
ENV PYTHONPATH=/workspace
ENV WEIGHTS_DIR=/workspace/weights
ENV SAMPLES_DIR=/workspace/samples

# Expose the API port
EXPOSE 8000

# Serve the API by default; api_server.py only defines `app`, so it is started through uvicorn
CMD ["python", "-m", "uvicorn", "src.api_server:app", "--host", "0.0.0.0", "--port", "8000"]

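4. Managing dependencies

The requirements.txt file:

ultralytics==8.3.78   # 8.1.0 predates YOLO12; use a release that ships it
opencv-python-headless==4.8.1.78
numpy==1.26.2
torchvision==0.16.0
pillow==10.1.0
fastapi==0.109.2
uvicorn[standard]==0.25.0
python-multipart==0.0.9   # FastAPI needs this to parse UploadFile form data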
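
Exact pins only help if the running environment actually matches them. As a rough sketch, the helpers below parse `name==version` lines and compare them with what is installed, using only the standard library; the function names `parse_pins` and `mismatches` are hypothetical and not part of any of the packages listed.

```python
import re
from importlib import metadata

# Matches "name==version" and "name[extra]==version"; anything else is skipped.
PIN = re.compile(r"^([A-Za-z0-9._-]+)(\[[^\]]*\])?==([^\s#]+)")


def parse_pins(text: str) -> dict[str, str]:
    """Map package name to pinned version for every exact pin in text."""
    pins = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        match = PIN.match(line)
        if match:
            pins[match.group(1).lower()] = match.group(3)
    return pins


def mismatches(pins: dict[str, str]) -> dict[str, tuple]:
    """Return {name: (wanted, installed_or_None)} for packages that differ."""
    result = {}
    for name, wanted in pins.items():
        try:
            found = metadata.version(name)
        except metadata.PackageNotFoundError:
            found = None
        if found != wanted:
            result[name] = (wanted, found)
    return result
```

Inside the container, `mismatches(parse_pins(open("requirements.txt").read()))` should come back empty if the image was built from the same file.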

5. Running basic inference

The src/inference.py file:

import argparse

import torch
from ultralytics import YOLO


def run_detection(input_path: str, output_path: str, weight_path: str):
    model = YOLO(weight_path)

    # Use the GPU with FP16 when one is available; otherwise fall back to the CPU in FP32
    use_gpu = torch.cuda.is_available()

    # Run inference and take the first (and only) result
    result = model(
        source=input_path,
        imgsz=640,
        conf=0.45,
        iou=0.5,
        device="cuda:0" if use_gpu else "cpu",
        half=use_gpu,
        verbose=False
    )[0]

    # Save the annotated image
    result.save(filename=output_path)

    # Print each detection
    for box in result.boxes:
        cls_id = int(box.cls.item())
        score = float(box.conf.item())
        label = result.names[cls_id]
        print(f"[{label}] confidence: {score:.3f}")


if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument("--input", required=True)
    parser.add_argument("--output", required=True)
    parser.add_argument("--weight", default="/workspace/weights/yolo12n.pt")
    args = parser.parse_args()

    run_detection(args.input, args.output, args.weight)
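
The script above handles one image per call. To process a whole folder, one possible approach is to pair each input image with an output path first. The sketch below does only that pairing; `plan_batch`, the `_det` suffix, and the list of image extensions are assumptions made for illustration.

```python
from pathlib import Path

# Extensions treated as images here; adjust to the files you actually have.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".bmp"}


def plan_batch(samples_dir: str, results_dir: str) -> list[tuple[Path, Path]]:
    """Pair every image in samples_dir with an output path in results_dir."""
    out_dir = Path(results_dir)
    pairs = []
    for src in sorted(Path(samples_dir).iterdir()):
        if src.is_file() and src.suffix.lower() in IMAGE_SUFFIXES:
            # Annotated results are written as JPEG regardless of the input type.
            pairs.append((src, out_dir / f"{src.stem}_det.jpg"))
    return pairs
```

Each pair can then be passed to `run_detection(str(src), str(dst), weight_path)`. Note that this reloads the weights for every image, so for large folders it is likely better to load the model once and loop inside the function.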

6. Building and running the container

Build the image:

docker build -t yolo12-prod .

Run the container with GPU access and mounted volumes (the weights file, yolo12n.pt by default, is expected in ./weights):

docker run -it --rm \
  --gpus all \
  --shm-size=8g \
  -v "$(pwd)/weights:/workspace/weights" \
  -v "$(pwd)/samples:/workspace/samples" \
  -v "$(pwd)/results:/workspace/results" \
  yolo12-prod \
  python src/inference.py \
    --input /workspace/samples/test.jpg \
    --output /workspace/results/out.jpg
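
If the same command has to run on hosts with and without a GPU, the argument list can be assembled in code instead of being typed by hand. This is a minimal sketch under that assumption; `docker_run_args` is a hypothetical helper that only builds the list and does not start anything.

```python
from pathlib import Path


def docker_run_args(project_dir, image="yolo12-prod", use_gpu=True, command=()):
    """Build the argv for `docker run` with the three project volumes mounted."""
    root = Path(project_dir).resolve()
    args = ["docker", "run", "--rm", "--shm-size=8g"]
    if use_gpu:
        # Omitted on CPU-only hosts, where --gpus would make docker fail.
        args += ["--gpus", "all"]
    for name in ("weights", "samples", "results"):
        args += ["-v", f"{root / name}:/workspace/{name}"]
    args.append(image)
    args.extend(command)
    return args
```

The result can be handed to `subprocess.run(args, check=True)`; passing a list rather than a shell string avoids quoting problems with paths that contain spaces.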

7. Serving a REST API

The src/api_server.py file:

from fastapi import FastAPI, File, HTTPException, UploadFile
from fastapi.responses import StreamingResponse
from ultralytics import YOLO
import numpy as np
import cv2
import io
import os

app = FastAPI(title="YOLOv12 Detection Service")

# The model is loaded once, when the application starts
model = None

@app.on_event("startup")
async def load_model():
    global model
    weight_path = os.path.join(os.getenv("WEIGHTS_DIR", "/workspace/weights"), "yolo12n.pt")
    if not os.path.exists(weight_path):
        raise RuntimeError(f"Model file not found at {weight_path}")
    model = YOLO(weight_path)

@app.post("/predict")
async def predict_image(file: UploadFile = File(...)):
    content = await file.read()
    if not content:
        raise HTTPException(status_code=400, detail="Empty upload")
    nparr = np.frombuffer(content, np.uint8)
    img = cv2.imdecode(nparr, cv2.IMREAD_COLOR)
    if img is None:
        # imdecode returns None when the bytes are not a decodable image
        raise HTTPException(status_code=400, detail="Could not decode the uploaded image")

    # Run inference
    result = model(img, conf=0.5, iou=0.45, verbose=False)[0]

    # Draw the bounding boxes on the image
    annotated = result.plot()

    # Return the annotated image as a JPEG stream
    ok, buffer = cv2.imencode(".jpg", annotated)
    if not ok:
        raise HTTPException(status_code=500, detail="Could not encode the result image")
    return StreamingResponse(
        io.BytesIO(buffer.tobytes()),
        media_type="image/jpeg"
    )

@app.get("/health")
def health_check():
    return {"status": "healthy", "model_loaded": model is not None}

Run the API service:

docker run -d \
  --name yolo12-api \
  --gpus all \
  -p 8000:8000 \
  -v "$(pwd)/weights:/workspace/weights" \
  yolo12-prod

Open http://localhost:8000/docs to browse the Swagger UI and try the endpoints.
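
The /predict endpoint can also be called from a script. The following sketch posts an image as multipart form data using only the standard library. The field name `file` matches the endpoint parameter above; the helper names, the `upload.jpg` filename, and the 30-second timeout are assumptions for illustration.

```python
import urllib.request
import uuid


def encode_multipart(field, filename, data, content_type="image/jpeg"):
    """Encode one file as a multipart/form-data body; return (body, header value)."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        f"Content-Type: {content_type}\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + data + tail, f"multipart/form-data; boundary={boundary}"


def predict(image_path, out_path, url="http://localhost:8000/predict"):
    """Upload image_path and write the annotated JPEG response to out_path."""
    with open(image_path, "rb") as fh:
        body, ctype = encode_multipart("file", "upload.jpg", fh.read())
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": ctype}, method="POST"
    )
    with urllib.request.urlopen(request, timeout=30) as resp, open(out_path, "wb") as out:
        out.write(resp.read())
```

With the container running, `predict("samples/test.jpg", "results/api_out.jpg")` would save the annotated image locally.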

8. Practical deployment tips

Shared memory: pass --shm-size=8g so that PyTorch worker processes do not run out of /dev/shm when handling large batches
Smaller input size: set imgsz=320 for edge devices or when low latency matters, typically at some cost in accuracy on small objects
Model variant: yolo12n.pt (fastest), yolo12m.pt (balanced), yolo12x.pt (most accurate)
Logging and monitoring: mount a volume for a logs directory, or follow the output with docker logs -f yolo12-api
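
To make the imgsz tip concrete, the short sketch below works out how much input a smaller size removes and snaps an arbitrary size to a stride multiple. The stride of 32 is an assumption (it is the usual maximum stride for YOLO detectors), and both function names are hypothetical.

```python
import math


def snap_to_stride(size: int, stride: int = 32) -> int:
    """Round size up to the nearest multiple of stride (at least one stride)."""
    return max(stride, math.ceil(size / stride) * stride)


def pixel_ratio(small: int, large: int) -> float:
    """Fraction of pixels a square small x small input has versus large x large."""
    return (small * small) / (large * large)
```

For example, `pixel_ratio(320, 640)` is 0.25: a 320-pixel input carries a quarter of the pixels of a 640-pixel one, which is where most of the speedup comes from, although latency does not necessarily scale linearly with pixel count.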

9. Scaling up with Docker Compose

The docker-compose.yml file:

services:
  detector:
    image: yolo12-prod
    ports: ["8000:8000"]
    volumes:
      - ./weights:/workspace/weights
      - ./samples:/workspace/samples
      - ./results:/workspace/results
    deploy:
      resources:
        limits:
          memory: 6G
        # GPU access is requested under reservations, not limits
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]
    restart: unless-stopped

Start the service:

docker compose up -d
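
Loading the weights takes a moment, so a deployment script may want to wait until the /health endpoint reports that the model is ready before sending traffic. This is a minimal polling sketch; the function names, the default timeout, and the polling interval are assumptions, while the `model_loaded` key comes from the health endpoint defined earlier.

```python
import json
import time
import urllib.error
import urllib.request


def fetch_health(url):
    """GET the health endpoint and return its JSON body as a dict."""
    with urllib.request.urlopen(url, timeout=5) as resp:
        return json.loads(resp.read().decode("utf-8"))


def wait_until_healthy(url="http://localhost:8000/health",
                       timeout=120.0, interval=2.0, fetch=fetch_health):
    """Poll until the service reports model_loaded, or give up after timeout seconds."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            if fetch(url).get("model_loaded"):
                return True
        except (urllib.error.URLError, ConnectionError, ValueError):
            # The service is not accepting connections yet, or replied with non-JSON.
            pass
        time.sleep(interval)
    return False
```

The `fetch` parameter exists so the loop can be exercised without a running server; in normal use the default is enough.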

Tags: docker yolov12 Ultralytics FastAPI PyTorch

Posted on June 25 at 19:30

Thành phố Cuồng loạn
