YOLO-NAS-DeepSparse.cpp
YOLO-NAS is a state-of-the-art object detector by Deci AI. This project implements YOLO-NAS in C++ with a DeepSparse backend. DeepSparse is an inference runtime by Neural Magic that speeds up inference on CPUs by leveraging sparsity.
Features
- Supports both image and video inference.
- Faster CPU inference than the PyTorch, ONNX, and OpenVINO implementations listed under Benchmarks.
Getting Started
The following instructions demonstrate how to build this project on a Linux system. Windows is currently not supported by the DeepSparse library.
Prerequisites
- CMake v3.8+ - found at https://cmake.org/
- GCC/G++ compiler - found at https://gcc.gnu.org/
- Python 3.8+ - Python is used to install the deepsparse library, which is required for the build. Download here.
- OpenCV v4.0+ - Download here.
- DeepSparse v1.6.0+ - Download here.
Building the project
- Set the OpenCV_DIR environment variable to point to your ../../opencv/build directory (if not set).
- Run the following build commands (Linux, Bash):

      cd <yolo-nas-deepsparse-cpp-directory>
      cmake -S. -Bbuild -DCMAKE_BUILD_TYPE=Release
      cd build
      make

- The compiled executable will be in the root folder of the build directory.
Inference
- Export the ONNX file:

      from super_gradients.training import models

      model = models.get("yolo_nas_s", pretrained_weights="coco")
      model.eval()
      model.prep_model_for_conversion(input_size=(1, 3, 640, 640))
      models.convert_to_onnx(model=model, prep_model_for_conversion_kwargs={"input_size":(1, 3, 640, 640)}, out_path="yolo_nas_s.onnx")

- To run the inference, execute the following command:

      yolo-nas-deepsparse-cpp --model <ONNX_MODEL_PATH> [-i <IMAGE_PATH> | -v <VIDEO_PATH>] [--imgsz IMAGE_SIZE] [--gpu] [--iou-thresh IOU_THRESHOLD] [--score-thresh CONFIDENCE_THRESHOLD]
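
The `--score-thresh` and `--iou-thresh` options suggest the usual detection post-processing: discard low-confidence boxes, then apply non-maximum suppression (NMS) to overlapping ones. The following is a minimal Python sketch of that general technique, not this project's C++ code. The function names, the `(x1, y1, x2, y2)` box layout, and the default thresholds are assumptions chosen for illustration.

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # assumed layout: (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def filter_detections(
    boxes: Sequence[Box],
    scores: Sequence[float],
    score_thresh: float = 0.25,  # hypothetical default
    iou_thresh: float = 0.45,    # hypothetical default
) -> List[int]:
    """Return indices of boxes kept after score filtering and greedy NMS."""
    # Step 1: drop boxes whose confidence is below the score threshold.
    candidates = [i for i, s in enumerate(scores) if s >= score_thresh]
    # Step 2: visit the remaining boxes from highest to lowest score.
    candidates.sort(key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in candidates:
        # Keep a box only if it does not overlap an already kept box too much.
        if all(iou(boxes[i], boxes[j]) <= iou_thresh for j in keep):
            keep.append(i)
    return keep


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30), (0, 0, 5, 5)]
scores = [0.9, 0.8, 0.7, 0.1]
kept = filter_detections(boxes, scores)
print(kept)
```

In this sketch, raising the score threshold removes more weak detections, while lowering the IoU threshold suppresses overlapping boxes more aggressively.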
Benchmarks
The following benchmarks were run on Google Colab using an Intel® Xeon® Processor E5-2699 v4 @ 2.20GHz with 2 vCPUs.
Backend | Latency | FPS | Implementation |
---|---|---|---|
PyTorch | 867.02ms | 1.15 | Native (model.predict() in super_gradients) |
ONNX C++ (via OpenCV DNN) | 962.27ms | 1.04 | Hyuto |
ONNX Python | 626.37ms | 1.59 | Hyuto |
OpenVINO C++ | 628.04ms | 1.59 | Y-T-G |
DeepSparse C++ | 565.75ms | 1.83 | Y-T-G |
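
To collect comparable latency and FPS figures on your own hardware, a small timing harness is usually enough. The following Python sketch is an illustration only and is not the method used to produce the table above. The `benchmark` helper, its warm-up and run counts, and the stand-in workload are all assumptions.

```python
import time
from typing import Callable, Tuple


def benchmark(fn: Callable[[], object], warmup: int = 3, runs: int = 20) -> Tuple[float, float]:
    """Return (mean latency in ms, frames per second) for repeated calls to fn."""
    # Warm-up calls are excluded from timing so one-off setup costs do not skew results.
    for _ in range(warmup):
        fn()
    start = time.perf_counter()
    for _ in range(runs):
        fn()
    elapsed = time.perf_counter() - start
    latency_ms = elapsed / runs * 1000.0
    # Here FPS is derived directly as the reciprocal of the mean per-call latency.
    fps = 1000.0 / latency_ms
    return latency_ms, fps


# Stand-in workload; replace with a call that runs one inference on one frame.
latency_ms, fps = benchmark(lambda: sum(range(1000)), warmup=1, runs=5)
print(f"{latency_ms:.2f}ms, {fps:.2f} FPS")
```

Measured numbers depend on the CPU, the number of threads, and the input size, so they are only comparable across backends when those are held fixed.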
Authors
- Mohammed Yasin - @Y-T-G
Acknowledgements
Thanks to @Hyuto for his work on the ONNX implementation of YOLO-NAS in C++, which was used in this project.
License
This project is licensed under the MIT License - see the LICENSE file for details. The DeepSparse Community edition is only for evaluation, research, and non-production use. See the DeepSparse Community License for more details.