
Commit 61c4a46

Authored by ziqi-jin, jiangjiajun, root, DefTruth, felixhjh
Add external model's example code and Docs. (PaddlePaddle#102)
* Add YOLOv7 support: model code, C++/Python bindings (pybind), examples, API docs, release links, and copyright headers; move some helpers to private members of the class and make variables const
* Add YOLOR, ScaledYOLOv4, and YOLOv5-Lite detection models
* Add YOLOX, YOLOv6, and NanoDet support
* Add face models: RetinaFace, UltraFace, YOLOv5Face, SCRFD
* Add MODNet (matting), ArcFace/PartialFC, and InsightFace models, with docs and detection print output
* Add is_dynamic for the YOLO series (PaddlePaddle#22)
* Add docs for YOLOv5/YOLOv7 and fix wrong expressions across the docs
* Merge from develop (PaddlePaddle#9, #11, #13, #14): fix compile problems in different Python versions (PaddlePaddle#26); add PaddleDetection/PPYOLOE model support (PaddlePaddle#22); add convert processor to vision (PaddlePaddle#27); add model_zoo C++/Python demos for YOLOv6/YOLOX; add normalize with alpha and beta; fix bug while the inference result is empty with YOLOv5 (PaddlePaddle#29); add multi-label function for YOLOv5; fix wrong variable name option.trt_max_shape in fastdeploy_runtime.cc; update runtime_option.md (resnet dynamic shape setting renamed from images to x)

Co-authored-by: Jason <[email protected]>
Co-authored-by: root <[email protected]>
Co-authored-by: DefTruth <[email protected]>
Co-authored-by: huangjianhui <[email protected]>
Parent: dd3d742 · Commit: 61c4a46


93 files changed: +4,100 −309 lines

docs/api/vision_results/README.md (+3 −1)

@@ -4,5 +4,7 @@ According to the vision model task type, FastDeploy defines different structs (`csrcs
  | Struct | Docs | Description | Related models |
  | :----- | :--- | :---- | :------- |
- | ClassificationResult | [C++/Python docs](./classificiation_result.md) | Image classification result | ResNet50, MobileNetV3, etc. |
+ | ClassificationResult | [C++/Python docs](./classification_result.md) | Image classification result | ResNet50, MobileNetV3, etc. |
  | DetectionResult | [C++/Python docs](./detection_result.md) | Object detection result | PPYOLOE, YOLOv7 series models, etc. |
+ | FaceDetectionResult | [C++/Python docs](./face_detection_result.md) | Face detection result | SCRFD, RetinaFace, etc. |
+ | MattingResult | [C++/Python docs](./matting_result.md) | Matting result | MODNet, etc. |
docs/api/vision_results/face_detection_result.md (new file, +34)

# FaceDetectionResult

The FaceDetectionResult struct is defined in `csrcs/fastdeploy/vision/common/result.h` and describes the face boxes, face landmarks, and confidence scores detected in an image.

## C++ struct

`fastdeploy::vision::FaceDetectionResult`

```
struct FaceDetectionResult {
  std::vector<std::array<float, 4>> boxes;
  std::vector<std::array<float, 2>> landmarks;
  std::vector<float> scores;
  ResultType type = ResultType::FACE_DETECTION;
  int landmarks_per_face;
  void Clear();
  std::string Str();
};
```

- **boxes**: member variable; all detected boxes in a single image. `boxes.size()` is the number of boxes, and each box is four floats representing xmin, ymin, xmax, ymax, i.e. the top-left and bottom-right coordinates
- **scores**: member variable; the confidence of every detected face; its length equals `boxes.size()`
- **landmarks**: member variable; the landmarks of all faces detected in a single image; its length equals `boxes.size()` × `landmarks_per_face`
- **landmarks_per_face**: member variable; the number of landmarks in each face box
- **Clear()**: member function; clears the results stored in the struct
- **Str()**: member function; dumps the information in the struct as a string (for debugging)

## Python class

`fastdeploy.vision.FaceDetectionResult`

- **boxes**(list of list(float)): member variable; all detected boxes in a single image. It is a list whose elements are length-4 lists, each representing one box as four floats xmin, ymin, xmax, ymax, i.e. the top-left and bottom-right coordinates
- **scores**(list of float): member variable; the confidence of every detected face
- **landmarks**: member variable; the landmarks of all faces detected in a single image
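As a quick illustration (plain Python with made-up values, not FastDeploy API calls), the flat `landmarks` list can be grouped per face using `landmarks_per_face`: face `i` owns the slice starting at `i * landmarks_per_face`.

```python
# Hypothetical FaceDetectionResult-like fields: 2 faces, 5 landmarks per face.
boxes = [[10.0, 20.0, 110.0, 140.0], [200.0, 50.0, 300.0, 170.0]]
scores = [0.98, 0.87]
landmarks_per_face = 5
# Flat landmark list: len(landmarks) == len(boxes) * landmarks_per_face
landmarks = [[float(i), float(i + 1)] for i in range(len(boxes) * landmarks_per_face)]

def landmarks_of_face(landmarks, landmarks_per_face, face_idx):
    """Return the (x, y) landmark pairs belonging to one detected face."""
    start = face_idx * landmarks_per_face
    return landmarks[start:start + landmarks_per_face]

# Each box pairs with one score and one slice of the flat landmark list.
for i, (box, score) in enumerate(zip(boxes, scores)):
    assert len(landmarks_of_face(landmarks, landmarks_per_face, i)) == landmarks_per_face
```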
docs/api/vision_results/matting_result.md (new file, +35)

# MattingResult

The MattingResult struct is defined in `csrcs/fastdeploy/vision/common/result.h` and describes the predicted alpha matte and, when available, the predicted foreground.

## C++ struct

`fastdeploy::vision::MattingResult`

```
struct MattingResult {
  std::vector<float> alpha; // h x w
  std::vector<float> foreground; // h x w x c (c=3 default)
  std::vector<int64_t> shape;
  bool contain_foreground = false;
  void Clear();
  std::string Str();
};
```

- **alpha**: a flat vector of predicted alpha values in [0., 1.], of length h×w, where h and w are the height and width of the input image
- **foreground**: a flat vector of the predicted foreground, values in [0., 255.], of length h×w×c, where h and w are the height and width of the input image and c is usually 3; foreground is not always present, and is only valid when the model itself predicts a foreground
- **contain_foreground**: whether the prediction contains a foreground
- **shape**: the shape of the output; when contain_foreground is false, shape is only (h, w); when contain_foreground is true, shape is (h, w, c), with c usually 3
- **Clear()**: member function; clears the results stored in the struct
- **Str()**: member function; dumps the information in the struct as a string (for debugging)

## Python class

`fastdeploy.vision.MattingResult`

- **alpha**: a flat vector of predicted alpha values in [0., 1.], of length h×w, where h and w are the height and width of the input image
- **foreground**: a flat vector of the predicted foreground, values in [0., 255.], of length h×w×c, where h and w are the height and width of the input image and c is usually 3; foreground is not always present, and is only valid when the model itself predicts a foreground
- **contain_foreground**: whether the prediction contains a foreground
- **shape**: the shape of the output; when contain_foreground is false, shape is only (h, w); when contain_foreground is true, shape is (h, w, c), with c usually 3
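A minimal NumPy sketch (synthetic values, not FastDeploy API calls) of how the flat `alpha` and `foreground` vectors map back to image layout and composite onto a background:

```python
import numpy as np

# Hypothetical MattingResult-like data for a tiny 2x3 image (h=2, w=3, c=3).
h, w, c = 2, 3, 3
alpha = np.linspace(0.0, 1.0, h * w)    # flat, length h*w, values in [0, 1]
foreground = np.full(h * w * c, 255.0)  # flat, length h*w*c, values in [0, 255]
contain_foreground = True
shape = (h, w, c) if contain_foreground else (h, w)

# Reshape the flat vectors back to image layout.
alpha_img = alpha.reshape(h, w)
fg_img = foreground.reshape(h, w, c)

# Composite onto a solid background: out = alpha * fg + (1 - alpha) * bg
background = np.zeros((h, w, c))
out = alpha_img[..., None] * fg_img + (1.0 - alpha_img[..., None]) * background
```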

examples/vision/README.md (+5 −4)

@@ -4,10 +4,11 @@
  | Task type | Description | Prediction result struct |
  |:-------------- |:----------------------------------- |:-------------------------------------------------------------------------------- |
- | Detection | Object detection: takes an image, detects object locations, and returns box coordinates plus classes and confidences | [DetectionResult](../../../../docs/api/vision_results/detection_result.md) |
- | Segmentation | Semantic segmentation: takes an image and gives the class and confidence of every pixel | [SegmentationResult](../../../../docs/api/vision_results/segmentation_result.md) |
- | Classification | Image classification: takes an image and gives its classification result and confidence | [ClassifyResult](../../../../docs/api/vision_results/classification_result.md) |
+ | Detection | Object detection: takes an image, detects object locations, and returns box coordinates plus classes and confidences | [DetectionResult](../../docs/api/vision_results/detection_result.md) |
+ | Segmentation | Semantic segmentation: takes an image and gives the class and confidence of every pixel | [SegmentationResult](../../docs/api/vision_results/segmentation_result.md) |
+ | Classification | Image classification: takes an image and gives its classification result and confidence | [ClassifyResult](../../docs/api/vision_results/classification_result.md) |
+ | FaceDetection | Face detection: takes an image, detects face locations, and returns box coordinates and face landmarks | [FaceDetectionResult](../../docs/api/vision_results/face_detection_result.md) |
+ | Matting | Matting: takes an image and returns the alpha value of every foreground pixel | [MattingResult](../../docs/api/vision_results/matting_result.md) |
  ## FastDeploy API design

  Vision models share a fairly uniform task paradigm. When designing the API (covering C++/Python), FastDeploy splits vision model deployment into four steps

examples/vision/classification/paddleclas/python/infer.py (+2 −1)

@@ -43,6 +43,7 @@ def build_option(args):
  # Configure the runtime and load the model
  runtime_option = build_option(args)
+
  model_file = os.path.join(args.model, "inference.pdmodel")
  params_file = os.path.join(args.model, "inference.pdiparams")
  config_file = os.path.join(args.model, "inference_cls.yaml")
@@ -51,5 +52,5 @@ def build_option(args):
  # Predict the classification result of an image
  im = cv2.imread(args.image)
- result = model.predict(im, args.topk)
+ result = model.predict(im.copy(), args.topk)
  print(result)
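The change to `model.predict(im.copy(), args.topk)` suggests the predictor may mutate its input buffer during preprocessing; passing a copy keeps `im` intact for later use. A small NumPy sketch of the hazard (the mutating predictor here is hypothetical, not the FastDeploy API):

```python
import numpy as np

def predict_inplace(img):
    """Stand-in for a predictor whose preprocessing normalizes its input in place."""
    img -= 1.0  # in-place modification of the caller's buffer
    return float(img.sum())

im = np.ones((2, 2), dtype=np.float32)
original = im.copy()

predict_inplace(im.copy())  # safe: `im` is untouched
assert np.array_equal(im, original)

predict_inplace(im)         # unsafe: `im` was modified in place
assert not np.array_equal(im, original)
```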

examples/vision/detection/README.md (+9 −8)

@@ -1,13 +1,14 @@
  # Object Detection Models

  FastDeploy currently supports deploying the following object detection models

  | Model | Description | Model format | Version |
  | :--- | :--- | :------- | :--- |
- | [PaddleDetection/PPYOLOE](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.4/configs/ppyoloe) | PPYOLOE series models | Paddle | [Release/2.4](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.4) |
- | [PaddleDetection/PicoDet](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.4/configs/ppyoloe) | PicoDet series models | Paddle | [Release/2.4](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.4) |
- | [PaddleDetection/YOLOX](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.4/configs/ppyoloe) | Paddle version of the YOLOX series models | Paddle | [Release/2.4](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.4) |
- | [PaddleDetection/YOLOv3](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.4/configs/ppyoloe) | YOLOv3 series models | Paddle | [Release/2.4](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.4) |
- | [PaddleDetection/PPYOLO](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.4/configs/ppyoloe) | PPYOLO series models | Paddle | [Release/2.4](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.4) |
- | [PaddleDetection/FasterRCNN](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.4/configs/ppyoloe) | FasterRCNN series models | Paddle | [Release/2.4](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.4) |
- | [WongKinYiu/YOLOv7](https://github.com/WongKinYiu/yolov7) | YOLOv7, YOLOv7-X, and other series models | ONNX | [v0.1](https://github.com/WongKinYiu/yolov7/tree/v0.1) |
+ | [nanodet_plus](./nanodet_plus) | NanoDetPlus series models | ONNX | Release/v1.0.0-alpha-1 |
+ | [yolov5](./yolov5) | YOLOv5 series models | ONNX | Release/v6.0 |
+ | [yolov5lite](./yolov5lite) | YOLOv5-Lite series models | ONNX | Release/v1.4 |
+ | [yolov6](./yolov6) | YOLOv6 series models | ONNX | Release/0.1.0 |
+ | [yolov7](./yolov7) | YOLOv7 series models | ONNX | Release/0.1 |
+ | [yolor](./yolor) | YOLOR series models | ONNX | Release/weights |
+ | [yolox](./yolox) | YOLOX series models | ONNX | Release/v0.1.1 |
+ | [scaledyolov4](./scaledyolov4) | ScaledYOLOv4 series models | ONNX | CommitID: 6768003 |

examples/vision/detection/nanodet_plus/README.md (+8 −3)

@@ -2,11 +2,11 @@
  ## Model version notes

- - The NanoDetPlus deployment is implemented from the [NanoDetPlus v1.0.0-alpha-1](https://github.com/RangiLyu/nanodet/tree/v1.0.0-alpha-1) branch code, based on the coco [pretrained models](https://github.com/RangiLyu/nanodet/releases/tag/v1.0.0-alpha-1)
+ - The NanoDetPlus deployment is implemented from the [NanoDetPlus](https://github.com/RangiLyu/nanodet/tree/v1.0.0-alpha-1) code, based on the coco [pretrained models](https://github.com/RangiLyu/nanodet/releases/tag/v1.0.0-alpha-1)

  - (1) The *.onnx [pretrained models](https://github.com/RangiLyu/nanodet/releases/tag/v1.0.0-alpha-1) can be deployed directly;
- - (2) For models you trained yourself, export the ONNX model and follow the [detailed deployment tutorial](#详细部署文档) to finish deployment.
+ - (2) For models you trained yourself, export the ONNX model and follow the [detailed deployment docs](#详细部署文档) to finish deployment.

  ## Download pretrained ONNX models

  For developers' convenience, the exported NanoDetPlus models are provided below and can be downloaded and used directly.
@@ -21,3 +21,8 @@
  - [Python deployment](python)
  - [C++ deployment](cpp)
+
+ ## Version notes
+
+ - This document and code are based on [NanoDetPlus v1.0.0-alpha-1](https://github.com/RangiLyu/nanodet/tree/v1.0.0-alpha-1)

examples/vision/detection/nanodet_plus/cpp/README.md (+10 −7)

@@ -12,7 +12,7 @@
  ```
  mkdir build
  cd build
- wget https://xxx.tgz
+ wget https://bj.bcebos.com/paddlehub/fastdeploy/cpp/fastdeploy-linux-x64-gpu-0.2.0.tgz
  tar xvf fastdeploy-linux-x64-0.2.0.tgz
  cmake .. -DFASTDEPLOY_INSTALL_DIR=${PWD}/fastdeploy-linux-x64-0.2.0
  make -j
@@ -32,7 +32,7 @@ wget https://gitee.com/paddlepaddle/PaddleDetection/raw/release/2.4/demo/0000000

  The visualized result after running is shown below

- <img width="640" src="https://user-images.githubusercontent.com/67993288/183847558-abcd9a57-9cd9-4891-b09a-710963c99b74.jpg">
+ <img width="640" src="https://user-images.githubusercontent.com/67993288/184301689-87ee5205-2eff-4204-b615-24c400f01323.jpg">

  ## NanoDetPlus C++ interface

@@ -74,11 +74,14 @@ NanoDetPlus model loading and initialization, where model_file is the exported ONNX model format

  ### Class member variables
- > > * **size**(vector&lt;int&gt;): change the resize target used in preprocessing; two integers for [width, height], default [640, 640]
- > > * **padding_value**(vector&lt;float&gt;): change the padding value used when the image is resized; three floats, one per channel, default [114, 114, 114]
- > > * **is_no_pad**(bool): whether to resize without padding; `is_no_pad=true` disables padding, default `is_no_pad=false`
- > > * **is_mini_pad**(bool): set the resized width/height to the values closest to `size` while keeping the padded pixel count divisible by the `stride` member; default `is_mini_pad=false`
- > > * **stride**(int): used together with the `is_mini_pad` member, default `stride=32`
+ #### Preprocessing parameters
+ Users can adjust the preprocessing parameters below to their actual needs, which affects the final inference and deployment results
+
+ > > * **size**(vector&lt;int&gt;): change the resize target used in preprocessing; two integers for [width, height], default [320, 320]
+ > > * **padding_value**(vector&lt;float&gt;): change the padding value used when the image is resized; three floats, one per channel, default [0, 0, 0]
+ > > * **keep_ratio**(bool): whether resize keeps the aspect ratio unchanged; default false
+ > > * **reg_max**(int): the reg_max parameter of the GFL regression; default 7
+ > > * **downsample_strides**(vector&lt;int&gt;): change the downsample strides of the feature maps used to generate anchors; integers giving the default anchor-generating downsample strides, default [8, 16, 32, 64]

  - [Model description](../../)
  - [Python deployment](../python)
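To make `size`, `padding_value`, and `keep_ratio` concrete, here is a hedged NumPy sketch of that preprocessing step (the function name and the nearest-neighbour resampling are assumptions for illustration, not FastDeploy internals):

```python
import numpy as np

def resize_and_pad(img, size=(320, 320), padding_value=(0, 0, 0), keep_ratio=True):
    """Nearest-neighbour resize to `size` (width, height); when keep_ratio is
    True, scale uniformly and fill the remaining canvas with `padding_value`."""
    h, w = img.shape[:2]
    out_w, out_h = size
    if keep_ratio:
        scale = min(out_w / w, out_h / h)
        new_w, new_h = int(round(w * scale)), int(round(h * scale))
    else:
        new_w, new_h = out_w, out_h
    # Nearest-neighbour sampling grid.
    ys = (np.arange(new_h) * h / new_h).astype(int).clip(0, h - 1)
    xs = (np.arange(new_w) * w / new_w).astype(int).clip(0, w - 1)
    resized = img[ys][:, xs]
    # Paint the canvas with the padding value, then place the resized image.
    canvas = np.empty((out_h, out_w, img.shape[2]), dtype=img.dtype)
    canvas[...] = np.array(padding_value, dtype=img.dtype)
    canvas[:new_h, :new_w] = resized
    return canvas

img = np.full((240, 320, 3), 128, dtype=np.uint8)  # 4:3 input
out = resize_and_pad(img, size=(320, 320))         # scaled to 320x240, padded to 320x320
```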
New file (+109), the NanoDetPlus C++ inference demo:

```
// Copyright (c) 2022 PaddlePaddle Authors. All Rights Reserved.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
//     http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.

#include "fastdeploy/vision.h"

void CpuInfer(const std::string& model_file, const std::string& image_file) {
  auto model = fastdeploy::vision::detection::NanoDetPlus(model_file);
  if (!model.Initialized()) {
    std::cerr << "Failed to initialize." << std::endl;
    return;
  }

  auto im = cv::imread(image_file);
  auto im_bak = im.clone();

  fastdeploy::vision::DetectionResult res;
  if (!model.Predict(&im, &res)) {
    std::cerr << "Failed to predict." << std::endl;
    return;
  }
  std::cout << res.Str() << std::endl;
  auto vis_im = fastdeploy::vision::Visualize::VisDetection(im_bak, res);
  cv::imwrite("vis_result.jpg", vis_im);
  std::cout << "Visualized result saved in ./vis_result.jpg" << std::endl;
}

void GpuInfer(const std::string& model_file, const std::string& image_file) {
  auto option = fastdeploy::RuntimeOption();
  option.UseGpu();
  auto model =
      fastdeploy::vision::detection::NanoDetPlus(model_file, "", option);
  if (!model.Initialized()) {
    std::cerr << "Failed to initialize." << std::endl;
    return;
  }

  auto im = cv::imread(image_file);
  auto im_bak = im.clone();

  fastdeploy::vision::DetectionResult res;
  if (!model.Predict(&im, &res)) {
    std::cerr << "Failed to predict." << std::endl;
    return;
  }
  std::cout << res.Str() << std::endl;

  auto vis_im = fastdeploy::vision::Visualize::VisDetection(im_bak, res);
  cv::imwrite("vis_result.jpg", vis_im);
  std::cout << "Visualized result saved in ./vis_result.jpg" << std::endl;
}

void TrtInfer(const std::string& model_file, const std::string& image_file) {
  auto option = fastdeploy::RuntimeOption();
  option.UseGpu();
  option.UseTrtBackend();
  option.SetTrtInputShape("images", {1, 3, 320, 320});
  auto model =
      fastdeploy::vision::detection::NanoDetPlus(model_file, "", option);
  if (!model.Initialized()) {
    std::cerr << "Failed to initialize." << std::endl;
    return;
  }

  auto im = cv::imread(image_file);
  auto im_bak = im.clone();

  fastdeploy::vision::DetectionResult res;
  if (!model.Predict(&im, &res)) {
    std::cerr << "Failed to predict." << std::endl;
    return;
  }
  std::cout << res.Str() << std::endl;

  auto vis_im = fastdeploy::vision::Visualize::VisDetection(im_bak, res);
  cv::imwrite("vis_result.jpg", vis_im);
  std::cout << "Visualized result saved in ./vis_result.jpg" << std::endl;
}

int main(int argc, char* argv[]) {
  if (argc < 4) {
    std::cout << "Usage: infer_demo path/to/model path/to/image run_option, "
                 "e.g ./infer_model ./nanodet-plus-m_320.onnx ./test.jpeg 0"
              << std::endl;
    std::cout << "The data type of run_option is int, 0: run with cpu; 1: run "
                 "with gpu; 2: run with gpu and use tensorrt backend."
              << std::endl;
    return -1;
  }

  if (std::atoi(argv[3]) == 0) {
    CpuInfer(argv[1], argv[2]);
  } else if (std::atoi(argv[3]) == 1) {
    GpuInfer(argv[1], argv[2]);
  } else if (std::atoi(argv[3]) == 2) {
    TrtInfer(argv[1], argv[2]);
  }
  return 0;
}
```

examples/vision/detection/nanodet_plus/python/README.md (+9 −7)

@@ -26,7 +26,7 @@ python infer.py --model nanodet-plus-m_320.onnx --image 000000014439.jpg --devic

  The visualized result after running is shown below

- <img width="640" src="https://user-images.githubusercontent.com/67993288/183847558-abcd9a57-9cd9-4891-b09a-710963c99b74.jpg">
+ <img width="640" src="https://user-images.githubusercontent.com/67993288/184301689-87ee5205-2eff-4204-b615-24c400f01323.jpg">

  ## NanoDetPlus Python interface

@@ -62,12 +62,14 @@ NanoDetPlus model loading and initialization, where model_file is the exported ONNX model format
  > > Returns a `fastdeploy.vision.DetectionResult` struct; see the docs [vision model prediction results](../../../../../docs/api/vision_results/) for details

  ### Class member attributes
- > > * **size**(list[int]): change the resize target used in preprocessing; two integers for [width, height], default [640, 640]
- > > * **padding_value**(list[float]): change the padding value used when the image is resized; three floats, one per channel, default [114, 114, 114]
- > > * **is_no_pad**(bool): whether to resize without padding; `is_no_pad=True` disables padding, default `is_no_pad=False`
- > > * **is_mini_pad**(bool): set the resized width/height to the values closest to `size` while keeping the padded pixel count divisible by the `stride` member; default `is_mini_pad=False`
- > > * **stride**(int): used together with the `is_mini_pad` member, default `stride=32`
+ #### Preprocessing parameters
+ Users can adjust the preprocessing parameters below to their actual needs, which affects the final inference and deployment results
+
+ > > * **size**(list[int]): change the resize target used in preprocessing; two integers for [width, height], default [320, 320]
+ > > * **padding_value**(list[float]): change the padding value used when the image is resized; three floats, one per channel, default [0, 0, 0]
+ > > * **keep_ratio**(bool): whether resize keeps the aspect ratio unchanged; default False
+ > > * **reg_max**(int): the reg_max parameter of the GFL regression; default 7
+ > > * **downsample_strides**(list[int]): change the downsample strides of the feature maps used to generate anchors; integers giving the default anchor-generating downsample strides, default [8, 16, 32, 64]
