Skip to main content

人手关键点及手势识别(mediapipe)

功能介绍

人手关键点检测算法示例订阅图片和包含人手框信息的智能msg,利用BPU进行算法推理,发布包含人手关键点和手势信息的算法msg。

人手关键点索引如下图:

代码仓库:

(https://github.com/D-Robotics/palm_detection_mediapipe)

(https://github.com/D-Robotics/hand_landmarks_mediapipe)

算法支持的手势识别类别,以及手势类别在算法msg(Attribute成员,type为"gesture")中对应的数值如下:

手势说明数值
ThumbUp竖起大拇指2
Victory“V”手势3
Mute“嘘”手势4
Palm手掌5
OkayOK手势11
ThumbLeft大拇指向左12
ThumbRight大拇指向右13
Awesome666手势14

应用场景:手势识别算法集成了人手关键点检测,手势分析等技术,使得计算机能够将人的手势解读为对应指令,可实现手势控制以及手语翻译等功能,主要应用于智能家居,智能座舱、智能穿戴设备等领域。

小车手势控制案例:小车手势控制

支持平台

平台运行方式示例功能
RDK S100, RDK S100PUbuntu 22.04 (Humble)启动MIPI/USB摄像头,并通过web展示推理渲染结果

算法信息

模型平台输入尺寸推理帧率(fps)
mediapipeS100224x2241114

准备工作

RDK平台

  1. RDK已烧录好Ubuntu 22.04系统镜像。

  2. RDK已成功安装TogetheROS.Bot。

  3. RDK已安装MIPI或者USB摄像头。

  4. 确认PC机能够通过网络访问RDK。

使用介绍

人手关键点检测(hand_landmarks_mediapipe)package订阅sensor package发布的图片以及人体检测和跟踪package发布的人手框检测结果,经过推理后发布算法msg,通过websocket package实现在PC端浏览器上渲染显示发布的图片和对应的算法结果。

使用MIPI摄像头发布图片

# 配置tros.b环境
source /opt/tros/humble/setup.bash
# 从tros.b的安装路径中拷贝出运行示例需要的配置文件。
cp -r /opt/tros/${TROS_DISTRO}/lib/palm_detection_mediapipe/config/ .
cp -r /opt/tros/${TROS_DISTRO}/lib/hand_landmarks_mediapipe/config/ .

# 配置MIPI摄像头
export CAM_TYPE=mipi

# 启动launch文件
ros2 launch hand_landmarks_mediapipe hand_landmarks.launch.py

使用USB摄像头发布图片

# 配置tros.b环境
source /opt/tros/humble/setup.bash
# 从tros.b的安装路径中拷贝出运行示例需要的配置文件。
cp -r /opt/tros/${TROS_DISTRO}/lib/palm_detection_mediapipe/config/ .
cp -r /opt/tros/${TROS_DISTRO}/lib/hand_landmarks_mediapipe/config/ .

# 配置USB摄像头
export CAM_TYPE=usb

# 启动launch文件
ros2 launch hand_landmarks_mediapipe hand_landmarks.launch.py

使用本地图片回灌

# 配置tros.b环境
source /opt/tros/humble/setup.bash
# 从tros.b的安装路径中拷贝出运行示例需要的配置文件。
cp -r /opt/tros/${TROS_DISTRO}/lib/palm_detection_mediapipe/config/ .
cp -r /opt/tros/${TROS_DISTRO}/lib/hand_landmarks_mediapipe/config/ .

# 配置本地回灌图片
export CAM_TYPE=fb

# 启动launch文件
ros2 launch hand_landmarks_mediapipe hand_landmarks.launch.py publish_image_source:=config/example.jpg publish_image_format:=jpg publish_output_image_w:=640 publish_output_image_h:=480

结果分析

在运行终端输出如下信息:

[palm_detection_mediapipe-4] [DNN]: 3.7.3_(4.2.11 HBRT)
[hand_landmarks_mediapipe-3] [WARN] [1757389272.651945922] [mono2d_hand_lmk]: Get model name: hand_224_224 from load model.
[palm_detection_mediapipe-4] [WARN] [1757389272.653466536] [mono2d_palm_det]: Get model name: palm_det_192_192 from load model.
[palm_detection_mediapipe-4] [WARN] [1757389272.657688231] [mono2d_palm_det]: Enabling zero-copy
[palm_detection_mediapipe-4] [WARN] [1757389272.657755005] [mono2d_palm_det]: Create hbmem_subscription with topic_name: /hbmem_img
[hand_landmarks_mediapipe-3] [WARN] [1757389272.658734823] [mono2d_hand_lmk]: Enabling zero-copy
[hand_landmarks_mediapipe-3] [WARN] [1757389272.658829973] [mono2d_hand_lmk]: Create hbmem_subscription with topic_name: /hbmem_img
[hand_landmarks_mediapipe-3] [WARN] [1757389272.679073504] [mono2d_hand_lmk]: Loaned messages are only safe with const ref subscription callbacks. If you are using any other kind of subscriptions, set the ROS_DISABLE_LOANED_MESSAGES environment variable to 1 (the default).
[palm_detection_mediapipe-4] [WARN] [1757389272.679083479] [mono2d_palm_det]: Loaned messages are only safe with const ref subscription callbacks. If you are using any other kind of subscriptions, set the ROS_DISABLE_LOANED_MESSAGES environment variable to 1 (the default).
[hand_landmarks_mediapipe-3] [WARN] [1757389272.679384552] [mono2d_hand_lmk]: SharedMemImgProcess Recved img encoding: nv12, h: 480, w: 640, step: 640, index: 0, stamp: 1757389272_411007134, data size: 460800, comm delay [268.3575]ms
[palm_detection_mediapipe-4] [WARN] [1757389272.679384452] [mono2d_palm_det]: SharedMemImgProcess Recved img encoding: nv12, h: 480, w: 640, step: 640, index: 0, stamp: 1757389272_411007134, data size: 460800, comm delay [268.3576]ms
[hand_landmarks_mediapipe-3] [WARN] [1757389273.715343396] [mono2d_hand_lmk]: input fps: 13.58, out fps: 13.94, infer time ms: 71, post process time ms: 1
[palm_detection_mediapipe-4] [WARN] [1757389273.723452363] [mono2d_palm_det]: input fps: 13.59, out fps: 13.94, infer time ms: 71, post process time ms: 0
[palm_detection_mediapipe-4] [WARN] [1757389275.711869066] [mono2d_palm_det]: SharedMemImgProcess Recved img encoding: nv12, h: 480, w: 640, step: 640, index: 33, stamp: 1757389275_710984298, data size: 460800, comm delay [0.8785]ms
[hand_landmarks_mediapipe-3] [WARN] [1757389275.711873416] [mono2d_hand_lmk]: SharedMemImgProcess Recved img encoding: nv12, h: 480, w: 640, step: 640, index: 33, stamp: 1757389275_710984298, data size: 460800, comm delay [0.8835]ms
[hobot_codec_republish-2] [WARN] [1757389277.211724002] [hobot_codec_decoder]: Pub img fps [9.66]
[hand_landmarks_mediapipe-3] [WARN] [1757389278.811834846] [mono2d_hand_lmk]: SharedMemImgProcess Recved img encoding: nv12, h: 480, w: 640, step: 640, index: 64, stamp: 1757389278_810957877, data size: 460800, comm delay [0.8710]ms

输出log显示,程序运行成功,初始化完成之后单次的推理耗时0.87ms。

在PC端的浏览器输入http://IP:8000 即可查看图像和算法渲染效果(IP为RDK的IP地址):