
YOLO-NAS-DeepSparse.cpp

YOLO-NAS is a state-of-the-art object detector by Deci AI. This project implements the YOLO-NAS object detector in C++ with a DeepSparse backend to speed up inference. DeepSparse is an inference runtime by Neural Magic that can greatly improve inference performance on CPUs by leveraging sparsity.

Features

  • Supports both image and video inference.
  • Faster inference speed on CPUs.

Getting Started

The following instructions demonstrate how to build this project on a Linux system. Windows is currently not supported by the DeepSparse library.

Prerequisites

  • CMake v3.8+ - found at https://cmake.org/

  • GCC/G++ compiler - found at https://gcc.gnu.org/

  • Python 3.8+ - used to install the deepsparse library, which is required for the build (see the install example after this list). Available at https://www.python.org/downloads/.

  • OpenCV v4.0+ - available at https://opencv.org/releases/.

  • DeepSparse v1.6.0+ - available at https://github.com/neuralmagic/deepsparse.
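
DeepSparse is distributed as a Python package. A minimal setup sketch, assuming pip and a Python 3.8+ environment (the build locates the DeepSparse library from the installed package):

    # Install the DeepSparse runtime into the active Python environment
    pip install "deepsparse>=1.6.0"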

Building the project

  1. Set the OpenCV_DIR environment variable to point to your ../../opencv/build directory (if not already set), as shown below.
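
    For example, in Bash (the path is illustrative; adjust it to your OpenCV layout):

    # Let CMake's find_package(OpenCV) locate the OpenCV build tree
    export OpenCV_DIR=/path/to/opencv/build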

  2. Run the following build commands (Linux, Bash):

    cd <yolo-nas-deepsparse-cpp-directory>
    cmake -S . -B build -DCMAKE_BUILD_TYPE=Release
    cd build
    make
  3. The compiled executable will be in the root of the build directory.

Inference

  1. Export the model to ONNX using the super_gradients Python package:

    from super_gradients.training import models
    
    # Load YOLO-NAS-S with pretrained COCO weights
    model = models.get("yolo_nas_s", pretrained_weights="coco")
    model.eval()
    # Prepare the model for export with a fixed 1x3x640x640 input
    model.prep_model_for_conversion(input_size=(1, 3, 640, 640))
    models.convert_to_onnx(
        model=model,
        prep_model_for_conversion_kwargs={"input_size": (1, 3, 640, 640)},
        out_path="yolo_nas_s.onnx",
    )
  2. To run inference, execute the following command:

    yolo-nas-deepsparse-cpp --model <ONNX_MODEL_PATH> [-i <IMAGE_PATH> | -v <VIDEO_PATH>] [--imgsz IMAGE_SIZE] [--gpu] [--iou-thresh IOU_THRESHOLD] [--score-thresh CONFIDENCE_THRESHOLD]
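
For example, to run on a single image with the model exported above (the file names are illustrative):

    # Detect objects in image.jpg using the exported YOLO-NAS-S model
    ./yolo-nas-deepsparse-cpp --model yolo_nas_s.onnx -i image.jpg --imgsz 640 --score-thresh 0.5 --iou-thresh 0.45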

Benchmarks

The following benchmarks were run on Google Colab using an Intel® Xeon® Processor E5-2699 v4 @ 2.20GHz with 2 vCPUs.

Backend                    Latency (ms/frame)  FPS   Implementation
PyTorch                    867.02              1.15  Native (model.predict() in super_gradients)
ONNX C++ (via OpenCV DNN)  962.27              1.04  Hyuto
ONNX Python                626.37              1.59  Hyuto
OpenVINO C++               628.04              1.59  Y-T-G
DeepSparse C++             565.75              1.83  Y-T-G

Authors

Acknowledgements

Thanks to @Hyuto for his work on the ONNX implementation of YOLO-NAS in C++, which was used in this project.

License

This project is licensed under the MIT License - see the LICENSE file for details. The DeepSparse Community edition is for evaluation, research, and non-production use only; see the DeepSparse Community License for more details.