7.4.2.1 RDK S100 LLM Toolchain

RDK S100 1.0.0 Large Model Toolchain

On the RDK S100/S100P platform, D-Robotics_LLM_S100 currently supports the following models and features:

LLM

DeepSeek-R1-Distill-Qwen: Supports DeepSeek-R1-Distill-Qwen-1.5B and DeepSeek-R1-Distill-Qwen-7B, with model quantization, simple conversation, multi-turn dialogue, and perplexity (PPL) evaluation.
InternLM2: Supports InternLM2-1.8B, with model quantization, simple conversation, and PPL evaluation.
Qwen2.5: Supports Qwen2.5-1.5B, Qwen2.5-7B, Qwen2.5-1.5B-Instruct, and Qwen2.5-7B-Instruct, with model quantization, simple conversation, multi-turn dialogue (Instruct models only), and PPL evaluation.

Multimodal

Qwen2.5-Omni: Supports Qwen2.5-Omni-3B, with model quantization, offline operation, and online operation.

D-Robotics_LLM_S100 Development Toolkit

wget https://d-robotics-aitoolchain.oss-cn-beijing.aliyuncs.com/llm_s100/1.0.0/D-Robotics_LLM_S100_1.0.0_SDK.tar.gz

D-Robotics_LLM_S100 User Manual

wget https://d-robotics-aitoolchain.oss-cn-beijing.aliyuncs.com/llm_s100/1.0.0/D-Robotics_LLM_S100_1.0.0_Doc.zip

D-Robotics_LLM_S100 Compiled Models

After downloading and extracting the development toolkit, see the oellm_runtime/model/resolve_model_nash-m.txt file for the download links to the compiled models.
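The format of resolve_model_nash-m.txt is not described here, so the following is only a minimal sketch. It assumes the file is plain text containing ordinary http(s) URLs. The helper name `find_download_links` is hypothetical and not part of the toolkit.

```python
import re
from pathlib import Path

# Assumption: the links appear as plain http(s) URLs somewhere in the text file.
URL_PATTERN = re.compile(r"https?://[^\s\"'<>]+")


def find_download_links(text_file):
    """Return the unique URLs found in a text file, in order of first appearance."""
    text = Path(text_file).read_text(encoding="utf-8", errors="replace")
    links = []
    for url in URL_PATTERN.findall(text):
        if url not in links:
            links.append(url)
    return links


if __name__ == "__main__":
    # Path taken from the text above, relative to the extracted toolkit directory.
    path = Path("oellm_runtime/model/resolve_model_nash-m.txt")
    if path.exists():
        for link in find_download_links(path):
            print(link)
    else:
        print(f"{path} not found; run this from the extracted toolkit directory")
```

Each printed link can then be fetched with wget in the same way as the toolkit archive.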

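The performance figures below were collected under the following conditions:

Test Development Board: S100P.
Performance Data Acquisition: Each measurement runs a single prompt and records TTFT (time to first token, in milliseconds) and TPS (tokens per second).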
Python Version: Python 3.10.
Runtime Environment: Linux.
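To make the two metrics concrete, here is a minimal sketch of how TTFT and TPS can be derived from any streaming token source. It is not the toolchain's own measurement code. The names `measure_stream` and `fake_token_stream` are hypothetical, and the sketch assumes that TPS counts only the tokens after the first one, divided by the time between the first and last token. The toolchain may define TPS differently.

```python
import time
from typing import Callable, Iterable


def measure_stream(tokens: Iterable[str],
                   clock: Callable[[], float] = time.perf_counter) -> dict:
    """Consume a token stream and report token count, TTFT (ms), and decode TPS."""
    start = clock()
    first = None
    last = start
    count = 0
    for _ in tokens:
        last = clock()          # timestamp of the token that just arrived
        if first is None:
            first = last        # arrival time of the very first token
        count += 1
    if first is None:
        raise ValueError("the stream produced no tokens")
    ttft_ms = (first - start) * 1000.0
    decode_seconds = last - first
    # Assumption: TPS excludes the first token, whose latency is covered by TTFT.
    tps = (count - 1) / decode_seconds if count > 1 and decode_seconds > 0 else 0.0
    return {"tokens": count, "ttft_ms": ttft_ms, "tps": tps}


def fake_token_stream(n_tokens: int, first_delay_s: float, step_delay_s: float):
    """Hypothetical stand-in for a model's streaming output."""
    time.sleep(first_delay_s)   # simulated prefill time
    yield "tok0"
    for i in range(1, n_tokens):
        time.sleep(step_delay_s)  # simulated per-token decode time
        yield f"tok{i}"


if __name__ == "__main__":
    stats = measure_stream(fake_token_stream(20, 0.05, 0.01))
    print(f"tokens={stats['tokens']} "
          f"TTFT={stats['ttft_ms']:.1f} ms TPS={stats['tps']:.2f}")
```

The `clock` parameter is injectable so that the calculation can be checked with a deterministic clock instead of real timing.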

model	platform	dtype	seqlen	max context	TTFT(ms)	TPS	memory(GB)
DeepSeek-R1-Distill-Qwen-1.5B	S100P	q8	256	1024	109	27.08	1.7
DeepSeek-R1-Distill-Qwen-1.5B	S100P	q4	256	1024	108	39.49	1.1
DeepSeek-R1-Distill-Qwen-1.5B	S100P	q8	256	4096	226	23.80	1.8
DeepSeek-R1-Distill-Qwen-1.5B	S100P	q4	256	4096	224	32.35	1.2
DeepSeek-R1-Distill-Qwen-7B	S100P	q8	256	1024	544	6.76	7.4
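The table above includes both q8 and q4 rows for DeepSeek-R1-Distill-Qwen-1.5B, so the effect of quantization can be worked out directly. The sketch below copies those four rows and computes the q4-to-q8 ratios for TPS and memory. The helper name `compare_q4_to_q8` is hypothetical.

```python
# Rows copied from the DeepSeek-R1-Distill-Qwen-1.5B entries in the table above.
# Fields: (dtype, max_context, ttft_ms, tps, memory_gb)
ROWS = [
    ("q8", 1024, 109, 27.08, 1.7),
    ("q4", 1024, 108, 39.49, 1.1),
    ("q8", 4096, 226, 23.80, 1.8),
    ("q4", 4096, 224, 32.35, 1.2),
]


def compare_q4_to_q8(rows):
    """Return q4/q8 ratios for TPS and memory, keyed by max context length."""
    by_key = {(dtype, ctx): (tps, mem) for dtype, ctx, _ttft, tps, mem in rows}
    result = {}
    for ctx in sorted({row[1] for row in rows}):
        q8_tps, q8_mem = by_key[("q8", ctx)]
        q4_tps, q4_mem = by_key[("q4", ctx)]
        result[ctx] = {
            "tps_speedup": q4_tps / q8_tps,    # above 1.0 means q4 decodes faster
            "memory_ratio": q4_mem / q8_mem,   # below 1.0 means q4 uses less memory
        }
    return result


if __name__ == "__main__":
    for ctx, ratios in compare_q4_to_q8(ROWS).items():
        print(f"max context {ctx}: "
              f"TPS x{ratios['tps_speedup']:.2f}, "
              f"memory x{ratios['memory_ratio']:.2f}")
```

For these rows, q4 decodes roughly 1.46 times faster than q8 at a maximum context of 1024 and roughly 1.36 times faster at 4096, while using about two thirds of the memory. TTFT is nearly unchanged between the two dtypes.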

model	platform	dtype	seqlen	max context	TTFT(ms)	TPS	memory(GB)
InternLM2-1.8B	S100P	q8	256	1024	132	23.83	1.8

model	platform	dtype	seqlen	max context	TTFT(ms)	TPS	memory(GB)
Qwen2.5-1.5B	S100P	q8	256	1024	130	24.04	1.8
Qwen2.5-1.5B-Instruct	S100P	q8	256	1024	130	24.40	1.8
Qwen2.5-7B	S100P	q8	256	1024	535	6.67	7.4
Qwen2.5-7B-Instruct	S100P	q8	256	1024	534	6.75	7.4

model	platform	dtype	seqlen	max context	TTFT(ms)	TPS	memory(GB)
Qwen2.5-Omni-3B	S100P	q8	256	2048	285	14.03	5.5
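The TTFT and TPS figures can be combined into a rough estimate of how long a complete reply takes. The sketch below is a back-of-the-envelope model, not a measured result. It assumes a prompt comparable to the one used in the benchmark, a constant decode rate, and that the first token arrives after TTFT while the remaining tokens arrive at the listed TPS. The function name `estimate_response_seconds` is hypothetical.

```python
def estimate_response_seconds(ttft_ms: float, tps: float, output_tokens: int) -> float:
    """Estimate the wall-clock time to generate output_tokens tokens."""
    if tps <= 0:
        raise ValueError("tps must be positive")
    if output_tokens < 1:
        raise ValueError("output_tokens must be at least 1")
    # First token after TTFT, then (output_tokens - 1) tokens at the steady rate.
    return ttft_ms / 1000.0 + (output_tokens - 1) / tps


if __name__ == "__main__":
    # TTFT and TPS values copied from the q8 rows of the tables above.
    examples = {
        "Qwen2.5-1.5B-Instruct": (130, 24.40),
        "Qwen2.5-7B-Instruct": (534, 6.75),
    }
    for name, (ttft_ms, tps) in examples.items():
        seconds = estimate_response_seconds(ttft_ms, tps, output_tokens=200)
        print(f"{name}: about {seconds:.1f} s for a 200-token reply")
```

Under these assumptions, a 200-token reply takes about 8.3 s on Qwen2.5-1.5B-Instruct and about 30 s on Qwen2.5-7B-Instruct, which shows that decode speed rather than TTFT dominates the wait for longer outputs.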