Deploying YOLO11s on Raspberry Pi 5 + Hailo-8 (Full Step-by-Step Guide)
In this guide we’ll go end-to-end: from a YOLO11s PyTorch model on an x86 workstation to a Hailo-8 HEF running in real time on a Raspberry Pi 5. The full pipeline is:
yolo11s.pt → yolo11s.onnx → yolov11s.hef → Raspberry Pi 5 + Hailo-8
We’ll use Ultralytics to export ONNX, MiniImageNet for calibration, the Hailo AI Software Suite & Model Zoo for compilation, and finally HailoRT on the Raspberry Pi for inference.
1. Workstation Setup (Ubuntu 22 + NVIDIA GPU)
On the x86 workstation (for example, a Dell Precision with a Quadro RTX 6000) install the NVIDIA driver and verify that the GPU is visible:
nvidia-smi
The GPU is not required for the Hailo compiler, but it’s useful for other deep learning tasks on the same machine.
2. Download YOLO11s and MiniImageNet
Create a working directory and place the model and dataset there:
mkdir -p ~/yolo11_train
cd ~/yolo11_train
# Files after download / extraction
yolo11s.pt
archive/ # MiniImageNet extracted here
The archive/ folder contains many class subdirectories with images.
We will use them for calibration.
3. Build a Calibration Image Folder
Hailo’s compiler needs a folder of unlabeled RGB images for INT8 quantization
(around 500–1000 is typical). We flatten MiniImageNet into a single
calib/ directory:
cd ~/yolo11_train
mkdir calib
find archive -type f \( -iname '*.jpg' -o -iname '*.jpeg' -o -iname '*.png' \) \
| shuf | head -n 1000 \
| xargs -I{} cp "{}" calib/
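Before launching a long compilation, it’s worth sanity-checking that calib/ actually contains enough readable images. A small helper sketch (pure standard library; the ~500-image floor is the rule of thumb above, not a hard Hailo requirement):

```python
from pathlib import Path

def count_calib_images(calib_dir, exts=(".jpg", ".jpeg", ".png")):
    """Count calibration images in a flat directory, matching
    extensions case-insensitively (mirrors the find command above)."""
    calib = Path(calib_dir)
    return sum(1 for p in calib.iterdir()
               if p.is_file() and p.suffix.lower() in exts)

# Usage (on the workstation):
#   n = count_calib_images("calib")
#   assert n >= 500, "fewer than ~500 images may hurt INT8 quantization quality"
```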
4. Export YOLO11s to ONNX
Create a Python virtual environment and install Ultralytics:
cd ~/yolo11_train
python3 -m venv .venv
source .venv/bin/activate
pip install -U pip ultralytics onnx onnxruntime
Export yolo11s.pt to ONNX (640×640 input size in this example):
python - << 'EOF'
from ultralytics import YOLO

model = YOLO("yolo11s.pt")
model.export(
    format="onnx",
    imgsz=640,
    opset=13,
    dynamic=False,
)
EOF
After export you should see:
~/yolo11_train/
├── yolo11s.pt
├── yolo11s.onnx
└── calib/
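The exported graph has a fixed 640×640 input, so every frame must be brought to that shape at inference time. A plain resize changes the aspect ratio; Ultralytics-style letterboxing (scale, then pad with gray) preserves it. A NumPy-only sketch, assuming RGB uint8 frames (nearest-neighbour scaling for self-containment; in practice cv2.resize plus padding does the same job faster):

```python
import numpy as np

def letterbox(img, size=640):
    """Scale `img` to fit in a size x size canvas, preserving aspect
    ratio, and center it on a gray (114) background."""
    h, w = img.shape[:2]
    scale = size / max(h, w)
    nh, nw = int(round(h * scale)), int(round(w * scale))
    # Nearest-neighbour resize via index maps (output pixel -> source pixel)
    ys = (np.arange(nh) / scale).astype(int).clip(0, h - 1)
    xs = (np.arange(nw) / scale).astype(int).clip(0, w - 1)
    resized = img[ys][:, xs]
    canvas = np.full((size, size, 3), 114, dtype=img.dtype)
    top, left = (size - nh) // 2, (size - nw) // 2
    canvas[top:top + nh, left:left + nw] = resized
    return canvas
```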
5. Install Hailo AI Software Suite (Docker, x86)
On the Hailo Download Center choose: AI Software Suite → Linux → x86 → Docker.
Unzip and run the Docker environment:
unzip hailo8_ai_sw_suite_2025-10_docker.zip
cd hailo8_ai_sw_suite_2025-10_docker
./hailo_docker_run.sh
Inside the container, your prompt will look similar to:
(hailo_virtualenv) hailo@...:/local/workspace$
6. Share ONNX & Calibration Data with the Container
The Hailo Docker image exposes /local/shared_with_docker/ as a
convenient shared folder. Copy your ONNX and calibration images there
(from the host or directly in Docker):
/local/shared_with_docker/
├── yolo11s.onnx
└── calib/
7. Clone and Install the Hailo Model Zoo
cd /local/workspace
git clone https://github.com/hailo-ai/hailo_model_zoo.git
cd hailo_model_zoo
pip install -r requirements.txt
pip install .
Confirm that the CLI is available:
hailomz --help
8. Use the Built-In YOLO11s Configuration
Recent versions of the Hailo Model Zoo already include YOLO11 configs. You can list them with:
find . -maxdepth 5 -type f -iname "yolov11*.yaml"
You should see, among others:
hailo_model_zoo/cfg/networks/yolov11s.yaml.
This file describes the network structure and post-processing for YOLO11s,
so we don’t need to create a custom YAML file.
9. Compile YOLO11s ONNX → HEF
Now run the compilation. We pass the model name (yolov11s),
our ONNX checkpoint and calibration folder, and the target architecture
(hailo8).
cd /local/workspace/hailo_model_zoo
hailomz compile yolov11s \
--ckpt /local/shared_with_docker/yolo11s.onnx \
--calib-path /local/shared_with_docker/calib \
--hw-arch hailo8
During this step you might see messages like “No GPU chosen and no suitable GPU found, falling back to CPU”. That’s expected: the Hailo Dataflow Compiler runs on the CPU and does not require your NVIDIA GPU.
10. Locate and Export the .hef File
When compilation finishes, search for the generated HEF file:
find . -maxdepth 7 -type f -iname "yolov11s*.hef"
You’ll typically find it in a path similar to:
./hailomz_compilation/yolov11s/hailo8/yolov11s.hef.
Copy it back to the shared directory:
cp ./hailomz_compilation/yolov11s/hailo8/yolov11s.hef \
/local/shared_with_docker/
You now have a Hailo Executable File that can be deployed on any Hailo-8 device.
11. Copy the HEF to Raspberry Pi 5
On the host workstation, from the directory mapped to /local/shared_with_docker/, copy the HEF to your Raspberry Pi:
scp yolov11s.hef pi@<raspberry-pi-ip>:/home/pi/
12. Install HailoRT on Raspberry Pi 5
On the Raspberry Pi (running the official Raspberry Pi OS image), install the Hailo runtime. On current Raspberry Pi OS releases the hailo-all meta-package pulls in HailoRT together with the kernel driver and firmware:
sudo apt update
sudo apt install hailo-all
Confirm that the Hailo-8 module is detected:
hailortcli device-info
13. Minimal HailoRT Inference Skeleton
The following is a very small example of how you might start using
yolov11s.hef on the Raspberry Pi with HailoRT. It omits the full
YOLO post-processing for brevity, but shows the structure.
import cv2
import numpy as np
from hailo_platform import (HEF, VDevice, ConfigureParams,
                            HailoStreamInterface, InputVStreamParams,
                            OutputVStreamParams, InferVStreams, FormatType)

hef = HEF("yolov11s.hef")
with VDevice() as device:
    # Configure the network group (the Hailo-8 is attached over PCIe on the Pi 5);
    # configure() returns a list, one entry per network group in the HEF
    configure_params = ConfigureParams.create_from_hef(hef, interface=HailoStreamInterface.PCIe)
    network_group = device.configure(hef, configure_params)[0]
    input_info = hef.get_input_vstream_infos()[0]  # Hailo vstreams are NHWC

    in_params = InputVStreamParams.make(network_group, format_type=FormatType.UINT8)
    out_params = OutputVStreamParams.make(network_group, format_type=FormatType.FLOAT32)

    cap = cv2.VideoCapture(0)
    with network_group.activate(network_group.create_params()):
        with InferVStreams(network_group, in_params, out_params) as pipeline:
            while True:
                ok, frame = cap.read()
                if not ok:
                    break
                # Resize to the model input resolution (640x640 here)
                resized = cv2.resize(frame, (640, 640)).astype(np.uint8)
                batch = np.expand_dims(resized, axis=0)  # (1, 640, 640, 3), NHWC
                outputs = pipeline.infer({input_info.name: batch})
                # TODO: apply YOLO11 decoding / NMS on `outputs`
                # and draw bounding boxes on `frame`
                cv2.imshow("YOLO11s - Hailo-8", frame)
                if cv2.waitKey(1) & 0xFF == ord("q"):
                    break
    cap.release()
    cv2.destroyAllWindows()
In a real project you would plug this logic into an existing Flask + Picamera2 streaming app by replacing the PyTorch-based inference function with a Hailo-backed one.
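For the TODO in the skeleton, the decoding step depends on whether the compiled HEF already performs post-processing on-chip, but the final non-maximum suppression stage is generic. A NumPy sketch of greedy IoU-based NMS (the 0.45 threshold is a common default, not something mandated by Hailo):

```python
import numpy as np

def nms(boxes, scores, iou_thresh=0.45):
    """Greedy NMS. boxes: (N, 4) as x1,y1,x2,y2; scores: (N,).
    Returns indices of kept boxes, highest score first."""
    order = scores.argsort()[::-1]
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(int(i))
        if order.size == 1:
            break
        rest = order[1:]
        # Intersection of the top box with all remaining boxes
        x1 = np.maximum(boxes[i, 0], boxes[rest, 0])
        y1 = np.maximum(boxes[i, 1], boxes[rest, 1])
        x2 = np.minimum(boxes[i, 2], boxes[rest, 2])
        y2 = np.minimum(boxes[i, 3], boxes[rest, 3])
        inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
        area_i = (boxes[i, 2] - boxes[i, 0]) * (boxes[i, 3] - boxes[i, 1])
        area_r = (boxes[rest, 2] - boxes[rest, 0]) * (boxes[rest, 3] - boxes[rest, 1])
        iou = inter / (area_i + area_r - inter + 1e-9)
        order = rest[iou <= iou_thresh]  # drop boxes overlapping the kept one
    return keep
```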
14. Integrating with a Flask Streaming App (High-Level)
Conceptually, you replace:
# Old: PyTorch/Ultralytics path
frame = run_yolo_on_frame(frame)
with:
# New: Hailo-8 path
frame = run_yolo11_hailo(frame)
where run_yolo11_hailo() sends the frame to the Hailo-8 via
HailoRT, decodes the YOLO11 outputs, draws the boxes and returns the
annotated frame.
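One detail run_yolo11_hailo() has to handle: detections come back in the model’s 640×640 input space, so before drawing they must be mapped to the original frame. A helper sketch, assuming frames were scaled and center-padded (letterboxed) to 640×640 before inference (this function is illustrative, not part of any Hailo API):

```python
import numpy as np

def unletterbox_boxes(boxes, orig_h, orig_w, size=640):
    """Map (N, 4) x1,y1,x2,y2 boxes from the size x size letterboxed
    input back to original frame coordinates (center padding assumed)."""
    scale = size / max(orig_h, orig_w)
    pad_x = (size - orig_w * scale) / 2
    pad_y = (size - orig_h * scale) / 2
    out = boxes.astype(np.float32).copy()
    out[:, [0, 2]] = (out[:, [0, 2]] - pad_x) / scale  # undo x padding + scale
    out[:, [1, 3]] = (out[:, [1, 3]] - pad_y) / scale  # undo y padding + scale
    out[:, [0, 2]] = out[:, [0, 2]].clip(0, orig_w)
    out[:, [1, 3]] = out[:, [1, 3]].clip(0, orig_h)
    return out
```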
15. Conclusion
You’ve gone from a yolo11s.pt PyTorch checkpoint to an
optimized yolov11s.hef running on a Raspberry Pi 5 with Hailo-8.
The heavy lifting happens once on the x86 workstation, and the deployed HEF
runs efficiently on the low-power edge device.
This pipeline can be reused for other models supported by the Hailo Model Zoo—just swap the YOLO11 configuration and follow the same steps.