
4.1.4 RDK S Model Zoo Usage Guide

Branch and System Requirements

RDK S series (S100 / S600) uses the rdk_s branch as the main delivery branch. Recommended system version: RDK OS >= 4.0.5. Python samples in this branch uniformly use the hbm_runtime inference interface.

git clone https://github.com/D-Robotics/rdk_model_zoo.git
cd rdk_model_zoo
git checkout rdk_s
Tip: Legacy demos for the RDK S series are retained in the RDK Model Zoo S repository; the rdk_s branch is the reorganized new version.

Repository Directory Structure

The rdk_s branch uses a standardized directory structure, organized by domain and model:

rdk_model_zoo/
|-- samples/
|   |-- vision/                # Vision model examples
|   |   |-- lanenet/           # Lane detection
|   |   |-- mobilenetv2/       # Image classification
|   |   |-- paddle_ocr/        # OCR text recognition
|   |   |-- resnet18/          # Image classification
|   |   |-- unetmobilenet/     # Semantic segmentation
|   |   |-- yolo11/            # YOLO11 detection
|   |   |-- yolo11_pose/       # YOLO11 pose estimation
|   |   |-- yolo11_seg/        # YOLO11 instance segmentation
|   |   |-- yoloe11_seg/       # YOLOE11 instance segmentation
|   |   |-- yolov5/            # YOLOv5 detection
|   |   `-- ...
|   `-- speech/                # Speech model examples
|       `-- asr/               # Speech recognition
|-- datasets/                  # Public datasets and sample data
|-- docs/                      # Project specs and reference documentation
|-- tools/                     # Conversion/build/utility tools
|-- tros/                      # TROS integration guides and examples
`-- utils/                     # Shared Python / C++ utilities

Individual Sample Structure

Each RDK S sample contains the following standardized directories:

sample_name/
|-- README.md        # English documentation
|-- README_cn.md     # Chinese documentation
|-- conversion/      # ONNX → HBM conversion configs
|-- evaluator/       # Accuracy and performance evaluation
|-- model/           # Pre-compiled .hbm models + download scripts
|-- runtime/
|   |-- python/      # Python inference (main.py, <model>.py, run.sh)
|   `-- cpp/         # C++ inference (src/main.cc, CMakeLists.txt, run.sh)
`-- test_data/       # Test images and inference results

Inference Interface

The RDK S series Python samples uniformly use the hbm_runtime inference interface. It shares its interface name with RDK X5's hbm_runtime but has different underlying dependencies: the RDK S series is built on libhbucp, while RDK X5 is built on libdnn.

For the complete interface reference 👉 RDK S hbm_runtime Python API Documentation

C/C++ inference interface documentation: UCP (hb_ucp) Interface Documentation 👉 UCP Overview

hbm_runtime Basic Call Flow

Load Model

import hbm_runtime

model = hbm_runtime.HB_HBMRuntime("../../model/yolov5x_672x672_nv12.hbm")
model_name = model.model_names[0]
input_names = model.input_names[model_name]
output_names = model.output_names[model_name]
input_shapes = model.input_shapes[model_name]

Configure Scheduling Parameters

hbm_runtime supports specifying inference priority and BPU core:

model.set_scheduling_params(
    priority={model_name: 0},
    bpu_cores={model_name: [0]},
)

Command-line parameter equivalents:

--priority 0 --bpu-cores 0

Prepare Inputs

RDK S vision samples commonly use a separated NV12 format (the Y plane and UV plane are passed as two separate inputs), unlike RDK X5's single packed NV12 input:

inputs = {
    model_name: {
        input_names[0]: y_plane,   # Y plane
        input_names[1]: uv_plane,  # UV plane
    }
}
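The samples' own preprocessing produces these planes from a BGR image; as an illustration, a minimal pure-NumPy sketch of the conversion might look like the following (the function name `bgr_to_nv12_planes` and the BT.601 coefficients are assumptions for this sketch; the shipped utils/py_utils/preprocess module may use different conversion code and coefficients):

```python
import numpy as np

def bgr_to_nv12_planes(bgr):
    """Hypothetical BGR -> separated NV12 conversion (BT.601-style coefficients).

    Returns the Y plane (h, w) and the interleaved UV plane (h/2, w),
    matching the two-input layout the RDK S samples expect.
    """
    bgr = bgr.astype(np.float32)
    b, g, r = bgr[..., 0], bgr[..., 1], bgr[..., 2]
    y = 0.299 * r + 0.587 * g + 0.114 * b
    u = -0.169 * r - 0.331 * g + 0.5 * b + 128.0
    v = 0.5 * r - 0.419 * g - 0.081 * b + 128.0
    h, w = y.shape
    # 4:2:0 chroma subsampling: average each 2x2 block of U and V
    u_sub = u.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))
    v_sub = v.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))
    y_plane = np.clip(y, 0, 255).astype(np.uint8)
    uv_plane = np.empty((h // 2, w), dtype=np.uint8)
    uv_plane[:, 0::2] = np.clip(u_sub, 0, 255).astype(np.uint8)  # U samples
    uv_plane[:, 1::2] = np.clip(v_sub, 0, 255).astype(np.uint8)  # V samples
    return y_plane, uv_plane
```

Image height and width must be even for 4:2:0 subsampling, which the samples' resize step normally guarantees.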

Run Inference

outputs = model.run(inputs)
raw_outputs = outputs[model_name]
output_tensor = raw_outputs[output_names[0]]

Model Zoo Wrapper Flow

RDK S samples follow the Config + Model + predict() pattern:

config = YOLOv5Config(
    model_path="../../model/yolov5x_672x672_nv12.hbm",
    classes_num=80,
    score_thres=0.25,
    nms_thres=0.45,
)

model = YoloV5X(config)
model.set_scheduling_params(priority=0, bpu_cores=[0])
results = model.predict(image)

The wrapper executes in the following order:

  1. pre_process(): Generate model inputs (resize, BGR-to-NV12 with separated Y/UV planes)
  2. forward(): Call hbm_runtime.run()
  3. post_process(): Parse detection boxes, classification results, segmentation masks, or pose keypoints
  4. predict(): Chain the complete flow
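The four-stage pattern above can be sketched as a minimal skeleton (all names here, `DemoConfig` and `DemoModel`, are hypothetical stand-ins for the repository's actual classes such as YOLOv5Config and YoloV5X, and the stage bodies are placeholders):

```python
from dataclasses import dataclass

@dataclass
class DemoConfig:
    """Hypothetical stand-in for a sample's Config class."""
    model_path: str
    score_thres: float = 0.25

class DemoModel:
    """Hypothetical stand-in for a sample's Model class."""

    def __init__(self, config: DemoConfig):
        self.config = config

    def pre_process(self, image):
        # Real samples resize and convert BGR to separated Y/UV planes here
        return {"input": image}

    def forward(self, inputs):
        # Real samples call hbm_runtime's run() here
        return {"output": inputs["input"]}

    def post_process(self, outputs):
        # Real samples decode boxes / masks / keypoints here
        return outputs["output"]

    def predict(self, image):
        # predict() chains the three stages in order
        return self.post_process(self.forward(self.pre_process(image)))
```

The value of the pattern is that per-model differences live almost entirely in pre_process() and post_process(), while forward() and predict() stay uniform across samples.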

Quick Start

Run the YOLOv5 Detection Sample

# Download model
cd samples/vision/yolov5/model
bash download_model.sh

# Run inference
cd ../runtime/python
python3 main.py \
    --model-path ../../model/yolov5x_672x672_nv12.hbm \
    --test-img ../../test_data/kite.jpg \
    --label-file ../../test_data/coco_classes.names \
    --img-save-path result.jpg

Using run.sh for One-Click Execution

Each sample provides a run.sh script in its runtime/python/ and runtime/cpp/ directories for one-click environment setup, model download, and inference:

# Python inference
cd samples/vision/yolov5/runtime/python
bash run.sh

# C++ inference
cd samples/vision/yolov5/runtime/cpp
bash run.sh

Model Coverage

Vision

| Category | Model | Sample Directory | Supported Platforms |
| --- | --- | --- | --- |
| Object Detection | YOLOv5x | samples/vision/yolov5 | S100 / S600 |
| Object Detection | YOLO11 | samples/vision/yolo11 | S100 / S600 |
| Instance Segmentation | YOLO11-Seg | samples/vision/yolo11_seg | S100 / S600 |
| Instance Segmentation | YOLOe11-Seg | samples/vision/yoloe11_seg | S100 |
| Pose Estimation | YOLO11-Pose | samples/vision/yolo11_pose | S100 / S600 |
| Image Classification | ResNet18 | samples/vision/resnet18 | S100 / S600 |
| Image Classification | MobileNetV2 | samples/vision/mobilenetv2 | S100 / S600 |
| Semantic Segmentation | UnetMobileNet | samples/vision/unetmobilenet | S100 / S600 |
| Lane Detection | LaneNet | samples/vision/lanenet | S100 |
| Text Recognition | PaddleOCR | samples/vision/paddle_ocr | S100 |

Speech

| Category | Model | Sample Directory | Supported Platforms |
| --- | --- | --- | --- |
| Speech Recognition | ASR | samples/speech/asr | S100 / S600 |

Shared Utilities (utils/)

The rdk_s branch provides the following shared Python utilities (utils/py_utils/):

| Utility Module | Function |
| --- | --- |
| file_io | Model download, image loading, class name loading |
| preprocess | BGR to NV12 (separated Y/UV planes), resize (direct/letterbox) |
| postprocess | NMS, YOLO box/mask/keypoint decoding, coordinate scaling |
| visualize | Rendering of detection boxes, segmentation masks, pose keypoints, classification results |
| inspect | SoC name detection, model info printing |
| nn_math | Sigmoid, z-score normalization |
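For illustration, the two operations listed under nn_math are commonly implemented along these lines (a minimal NumPy sketch; the function names and signatures here are assumptions, not necessarily those of the actual utils/py_utils/nn_math module):

```python
import numpy as np

def sigmoid(x):
    """Element-wise logistic sigmoid: 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + np.exp(-np.asarray(x, dtype=np.float64)))

def zscore(x, axis=None):
    """Z-score normalization: subtract the mean and divide by the std."""
    x = np.asarray(x, dtype=np.float64)
    return (x - x.mean(axis=axis, keepdims=True)) / x.std(axis=axis, keepdims=True)
```

In YOLO-style post-processing, sigmoid typically maps raw objectness/class logits to scores, and z-score normalization is a common preprocessing step for float-input models.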