Face Recognition ROS and bug fixes (#219)

Pavlos-Tosidis · passalis · web-flow · commit c5615eed820c · 2022-02-11T11:50:22.000+02:00
* This commit contains:
1. face_recognition.py script and perception README.md update
2. updated benchmarking_demo.py to correctly measure GPU memory allocation
3. Added torch setting in model_mobilenet.py for faster CPU inference
4. Sped up utils.py pair generation script
5. Fixed a minor bug in face_recognition_learner.py

* This commit contains:
1. face_recognition.py script and perception README.md update
2. updated benchmarking_demo.py to correctly measure GPU memory allocation
3. Added torch setting in model_mobilenet.py for faster CPU inference
4. Sped up utils.py pair generation script
5. Fixed a minor bug in face_recognition_learner.py
6. Added create_new flag on face_recognition_learner.fit_reference()

* Update docs/reference/face-recognition.md

Co-authored-by: Nikolaos Passalis &lt;passalis@users.noreply.github.com&gt;

* Update benchmarking_demo.py

* Update benchmarking_demo.py

Co-authored-by: Nikolaos Passalis &lt;passalis@users.noreply.github.com&gt;
diff --git a/docs/reference/face-recognition.md b/docs/reference/face-recognition.md
@@ -152,7 +152,7 @@ When 'ExternalDataset' is of type 'ImageFolder', images should be placed in a de
 
 #### `FaceRecognitionLearner.fit_reference`
 ```python
-FaceRecognitionLearner.fit_reference(self, path, save_path)
+FaceRecognitionLearner.fit_reference(self, path, save_path, create_new)
 ```
 
 This method is used to create a reference database to be used in inference when mode='backbone_only'.
@@ -164,6 +164,8 @@ Parameters:
   Path containing the reference images. If a reference database was already created can be left blank.
 - **save_path**: *str, default=None*\
   Path to save (load if already created) the .pkl reference file.
+- **create_new**: *bool, default=True*\
+  Whether to create a new or load an existing .pkl reference file.
 
 **Notes**
 
diff --git a/projects/opendr_ws/src/perception/README.md b/projects/opendr_ws/src/perception/README.md
@@ -23,49 +23,50 @@ rosrun perception image_dataset.py
 By default, it downloads a `nano_MOT20` dataset from OpenDR's FTP server and uses it to publish data to the ROS topic. You can create an instance of this node with any `DatasetIterator` object that returns `(Image, Target)` as elements.
 
 ## Pose Estimation ROS Node
-Assuming that you have already [built your workspace](../../README.md) and started roscore (i.e., just run `roscore`), then you can
-
-
-1. Add OpenDR to `PYTHONPATH` (please make sure you do not overwrite `PYTHONPATH` ), e.g.,
-```shell
-export PYTHONPATH="/home/user/opendr/src:$PYTHONPATH"
-```
+Assuming that you have already [activated the OpenDR environment](../../../../docs/reference/installation.md), [built your workspace](../../README.md) and started roscore (i.e., just run `roscore`), then you can
 
-2. Start the node responsible for publishing images. If you have a usb camera, then you can use the corresponding node (assuming you have installed the corresponding package):
+1. Start the node responsible for publishing images. If you have a usb camera, then you can use the corresponding node (assuming you have installed the corresponding package):
 
 ```shell
 rosrun usb_cam usb_cam_node
 ```
 
-3. You are then ready to start the pose detection node
+2. You are then ready to start the pose detection node
 
 ```shell
 rosrun perception pose_estimation.py
 ```
 
-4. You can examine the annotated image stream using `rqt_image_view` (select the topic `/opendr/image_pose_annotated`) or
+3. You can examine the annotated image stream using `rqt_image_view` (select the topic `/opendr/image_pose_annotated`) or
    `rostopic echo /opendr/poses`
 
 ## Face Recognition ROS Node
-Assuming that you have already [built your workspace](../../README.md) and started roscore (i.e., just run `roscore`), then you can
+Assuming that you have already [activated the OpenDR environment](../../../../docs/reference/installation.md), [built your workspace](../../README.md) and started roscore (i.e., just run `roscore`), then you can
 
 
-1. Add OpenDR to `PYTHONPATH` (please make sure you do not overwrite `PYTHONPATH` ), e.g.,
-```shell
-export PYTHONPATH="/home/user/opendr/src:$PYTHONPATH"
-```
-
-2. Start the node responsible for publishing images. If you have a usb camera, then you can use the corresponding node (assuming you have installed the corresponding package):
+1. Start the node responsible for publishing images. If you have a usb camera, then you can use the corresponding node (assuming you have installed the corresponding package):
 
 ```shell
 rosrun usb_cam usb_cam_node
 ```
 
-3. You are then ready to start the face recognition node
+2. You are then ready to start the face recognition node. Note that you should pass the folder containing the images of known faces as argument to create the corresponding database of known persons.
 
 ```shell
-rosrun perception face_recognition.py
+rosrun perception face_recognition.py _database_path:='./database'
 ```
+**Notes**
+
+Reference images should be placed in a defined structure like:
+- imgs
+    - ID1
+      - image1
+      - image2
+    - ID2
+    - ID3
+    - ...
+
+Τhe name of the sub-folder, e.g. ID1, will be published under `/opendr/face_recognition_id`.
 
 4. The database entry and the returned confidence is published under the topic name `/opendr/face_recognition`, and the human-readable ID
 under `/opendr/face_recognition_id`.
diff --git a/projects/opendr_ws/src/perception/scripts/face_recognition.py b/projects/opendr_ws/src/perception/scripts/face_recognition.py
@@ -20,13 +20,18 @@
 from std_msgs.msg import String
 from sensor_msgs.msg import Image as ROS_Image
 from opendr_bridge import ROSBridge
+
 from opendr.perception.face_recognition import FaceRecognitionLearner
+from opendr.perception.object_detection_2d import RetinaFaceLearner
+from opendr.perception.object_detection_2d.datasets.transforms import BoundingBoxListToNumpyArray
 
 
 class FaceRecognitionNode:
 
-    def __init__(self, input_image_topic="/usb_cam/image_raw", face_recognition_topic="/opendr/face_recognition",
-                 face_id_topic="/opendr/face_recognition_id", database_path="./database", device="cuda",
+    def __init__(self, input_image_topic="/usb_cam/image_raw",
+                 face_recognition_topic="/opendr/face_recognition",
+                 face_id_topic="/opendr/face_recognition_id",
+                 database_path="./database", device="cuda",
                  backbone='mobilefacenet'):
         """
         Creates a ROS Node for face recognition
@@ -42,6 +47,18 @@ def __init__(self, input_image_topic="/usb_cam/image_raw", face_recognition_topi
         :type device: str
         """
 
+        # Initialize the face recognizer
+        self.recognizer = FaceRecognitionLearner(device=device, mode='backbone_only', backbone=backbone)
+        self.recognizer.download(path=".")
+        self.recognizer.load(".")
+        self.recognizer.fit_reference(database_path, save_path=".", create_new=True)
+
+        # Initialize the face detector
+        self.face_detector = RetinaFaceLearner(backbone='mnet', device=device)
+        self.face_detector.download(path=".", verbose=True)
+        self.face_detector.load("retinaface_{}".format('mnet'))
+        self.class_names = ["face", "masked_face"]
+
         if face_recognition_topic is not None:
             self.face_publisher = rospy.Publisher(face_recognition_topic, ObjectHypothesis, queue_size=10)
         else:
@@ -52,61 +69,54 @@ def __init__(self, input_image_topic="/usb_cam/image_raw", face_recognition_topi
         else:
             self.face_id_publisher = None
 
-        rospy.Subscriber(input_image_topic, ROS_Image, self.callback)
-
         self.bridge = ROSBridge()
-
-        # Initialize the pose estimation
-        self.recognizer = FaceRecognitionLearner(device=device, mode='backbone_only', backbone=backbone)
-        self.recognizer.download(path=".")
-        self.recognizer.load(".")
-        self.recognizer.fit_reference(database_path, save_path=".")
-
-    def listen(self):
-        """
-        Start the node and begin processing input data
-        """
-        rospy.init_node('opendr_face_recognition', anonymous=True)
-        rospy.loginfo("Face recognition node started!")
-        rospy.spin()
+        rospy.Subscriber(input_image_topic, ROS_Image, self.callback)
 
     def callback(self, data):
         """
         Callback that process the input data and publishes to the corresponding topics
         :param data: input message
         :type data: sensor_msgs.msg.Image
         """
-
         # Convert sensor_msgs.msg.Image into OpenDR Image
         image = self.bridge.from_ros_image(data)
         image = image.opencv()
 
-        # Run face recognition
+        # Run face detection and recognition
         if image is not None:
-            result = self.recognizer.infer(image)
-            if result.data is not None:
-                if self.face_publisher is not None:
-                    ros_face = self.bridge.to_ros_face(result)
-                    self.face_publisher.publish(ros_face)
-
-                if self.face_id_publisher is not None:
-                    ros_face_id = self.bridge.to_ros_face_id(result)
-                    self.face_id_publisher.publish(ros_face_id.data)
-
-            else:
-                result.description = "Unknown"
-                if self.face_publisher is not None:
-                    ros_face = self.bridge.to_ros_face(result)
-                    self.face_publisher.publish(ros_face)
-
-                if self.face_id_publisher is not None:
-                    ros_face_id = self.bridge.to_ros_face_id(result)
-                    self.face_id_publisher.publish(ros_face_id.data)
-
-        # We get can the data back using self.bridge.from_ros_face(ros_face)
-        # e.g.
-        # face = self.bridge.from_ros_face(ros_face)
-        # face.description = self.recognizer.database[face.id][0]
+            bounding_boxes = self.face_detector.infer(image)
+            if bounding_boxes:
+                bounding_boxes = BoundingBoxListToNumpyArray()(bounding_boxes)
+                boxes = bounding_boxes[:, :4]
+                for idx, box in enumerate(boxes):
+                    (startX, startY, endX, endY) = int(box[0]), int(box[1]), int(box[2]), int(box[3])
+                    img = image[startY:endY, startX:endX]
+                    result = self.recognizer.infer(img)
+
+                    if result.data is not None:
+                        if self.face_publisher is not None:
+                            ros_face = self.bridge.to_ros_face(result)
+                            self.face_publisher.publish(ros_face)
+
+                        if self.face_id_publisher is not None:
+                            ros_face_id = self.bridge.to_ros_face_id(result)
+                            self.face_id_publisher.publish(ros_face_id.data)
+
+                    else:
+                        result.description = "Unknown"
+                        if self.face_publisher is not None:
+                            ros_face = self.bridge.to_ros_face(result)
+                            self.face_publisher.publish(ros_face)
+
+                        if self.face_id_publisher is not None:
+                            ros_face_id = self.bridge.to_ros_face_id(result)
+                            self.face_id_publisher.publish(ros_face_id.data)
+
+                # We get can the data back using self.bridge.from_ros_face(ros_face)
+                # e.g.
+                # face = self.bridge.from_ros_face(ros_face)
+                # face.description = self.recognizer.database[face.id][0]
+
 
 if __name__ == '__main__':
     # Select the device for running the
@@ -120,5 +130,19 @@ def callback(self, data):
     except:
         device = 'cpu'
 
-    face_recognition_node = FaceRecognitionNode(device=device)
-    face_recognition_node.listen()
+    # initialize ROS node
+    rospy.init_node('opendr_face_recognition', anonymous=True)
+    rospy.loginfo("Face recognition node started!")
+
+    # get network backbone
+    backbone = rospy.get_param("~backbone", "mobilefacenet")
+    input_image_topic = rospy.get_param("~input_image_topic", "/usb_cam/image_raw")
+    database_path = rospy.get_param('~database_path', './')
+    rospy.loginfo("Using backbone: {}".format(backbone))
+    assert backbone in ["mobilefacenet", "ir_50"], "backbone should be one of ['mobilefacenet', 'ir_50']"
+
+    face_recognition_node = FaceRecognitionNode(device=device, backbone=backbone,
+                                                input_image_topic=input_image_topic,
+                                                database_path=database_path)
+    # begin ROS communications
+    rospy.spin()
diff --git a/projects/perception/face_recognition/demos/benchmarking_demo.py b/projects/perception/face_recognition/demos/benchmarking_demo.py
@@ -25,40 +25,46 @@
     parser.add_argument("--onnx", help="Use ONNX", default=False, action="store_true")
     parser.add_argument("--backbone", help="Backbone to use (mobilefacenet, ir_50)", type=str, default='mobilefacenet')
     parser.add_argument("--device", help="Device to use (cpu, cuda)", type=str, default="cuda")
+    parser.add_argument("--create_new", help="Whether to create or load a database", type=bool, default=True)
     args = parser.parse_args()
 
     onnx, device, backbone = args.onnx, args.device, args.backbone
-
+    nvml = False
+    try:
+        if 'cuda' in device:
+            from pynvml import nvmlInit, nvmlDeviceGetMemoryInfo, nvmlDeviceGetHandleByIndex
+            nvmlInit()
+            nvml = True
+    except ImportError:
+        print('You can install pynvml to also monitor the allocated GPU memory')
+        pass
+    if nvml:
+        info_before = nvmlDeviceGetMemoryInfo(nvmlDeviceGetHandleByIndex(0))
+        info_before = info_before.used / 1024 ** 2
     recognizer = FaceRecognitionLearner(device=device, backbone=backbone, mode='backbone_only')
     recognizer.download(path=".")
     recognizer.load(path=".")
-
+    if nvml:
+        info_after = nvmlDeviceGetMemoryInfo(nvmlDeviceGetHandleByIndex(0))
+        info_after = info_after.used / 1024 ** 2
     # Download one sample image
     recognizer.download(path=".", mode="test_data")
-    recognizer.fit_reference("./test_data", ".")
+    recognizer.fit_reference("./test_data", ".", create_new=args.create_new)
     image_path = join(".", "test_data", "images", "1", "1.jpg")
     img = cv2.imread(image_path)
 
     if onnx:
         recognizer.optimize()
 
     fps_list = []
+
     print("Benchmarking...")
-    for i in tqdm(range(50)):
+    for i in tqdm(range(100)):
         start_time = time.perf_counter()
         # Perform inference
         result = recognizer.infer(img)
         end_time = time.perf_counter()
         fps_list.append(1.0 / (end_time - start_time))
     print("Average FPS: %.2f" % (np.mean(fps_list)))
-
-    # If pynvml is available, try to get memory stats for cuda
-    try:
-        if 'cuda' in device:
-            from pynvml import nvmlInit, nvmlDeviceGetMemoryInfo, nvmlDeviceGetHandleByIndex
-
-            nvmlInit()
-            info = nvmlDeviceGetMemoryInfo(nvmlDeviceGetHandleByIndex(0))
-            print("Memory allocated: %.2f MB " % (info.used / 1024 ** 2))
-    except ImportError:
-        pass
+    if nvml:
+        print("Memory allocated: %.2f MB " % (info_after - info_before))
diff --git a/projects/perception/face_recognition/demos/inference_demo.py b/projects/perception/face_recognition/demos/inference_demo.py
@@ -32,7 +32,7 @@
     recognizer.load(".")
 
     recognizer.download(path=".", mode="test_data")
-    recognizer.fit_reference(path=join(".", "test_data", "images"), save_path=".")
+    recognizer.fit_reference(path=join(".", "test_data", "images"), save_path=".", create_new=True)
     image_path = join(".", "test_data", "images", "1", "1.jpg")
     img = Image.open(image_path)
 
diff --git a/projects/perception/face_recognition/demos/inference_tutorial.ipynb b/projects/perception/face_recognition/demos/inference_tutorial.ipynb
@@ -117,7 +117,7 @@
     "id": "Ms6Xj8GDYt5C"
    },
    "source": [
-    "When we have the sample images, we need to create a reference database of our known people:"
+    "When we have the sample images, we need to create a reference database of our known people (if we already created one we can set 'create_new=False' to load the existing database):"
    ]
   },
   {
@@ -137,7 +137,7 @@
    ],
    "source": [
     "from os.path import join\n",
-    "recognizer.fit_reference(path=join(\".\", \"test_data\", \"images\"), save_path=\".\")"
+    "recognizer.fit_reference(path=join(\".\", \"test_data\", \"images\"), save_path=\".\", create_new=True)"
    ]
   },
   {
diff --git a/src/opendr/perception/face_recognition/algorithm/backbone/model_mobilenet.py b/src/opendr/perception/face_recognition/algorithm/backbone/model_mobilenet.py
@@ -1,4 +1,9 @@
 from torch import nn
+import torch
+import warnings
+
+warnings.simplefilter("ignore", UserWarning)
+torch.set_flush_denormal(True)
 
 
 def _make_divisible(v, divisor, min_value=None):
diff --git a/src/opendr/perception/face_recognition/algorithm/util/utils.py b/src/opendr/perception/face_recognition/algorithm/util/utils.py
@@ -184,7 +184,8 @@ def hflip_batch(imgs_tensor):
 
 
 def ccrop_batch(imgs_tensor):
-    ccropped_imgs = torch.empty_like(imgs_tensor)
+    ccropped_imgs = torch.empty((len(imgs_tensor),
+                                 imgs_tensor[0].shape[0], imgs_tensor[0].shape[1], imgs_tensor[0].shape[2]))
     for i, img_ten in enumerate(imgs_tensor):
         ccropped_imgs[i] = ccrop(img_ten)
 
@@ -247,14 +248,12 @@ def perform_val_imagefolder(device, embedding_size, batch_size, backbone, carray
                     transforms.Normalize(mean=[0.5, 0.5, 0.5],
                                          std=[0.5, 0.5, 0.5])])
                 for i in range(len(image1)):
-                    image1[i] = np.array(transform(Image.open(image1[i])))
-                    image2[i] = np.array(transform(Image.open(image2[i])))
+                    image1[i] = transform(Image.open(image1[i]))
+                    image2[i] = transform(Image.open(image2[i]))
                     if labels[i] == 'True':
                         pairs.append(True)
                     else:
                         pairs.append(False)
-                image1 = torch.tensor(image1)
-                image2 = torch.tensor(image2)
                 if tta:
                     ccropped = ccrop_batch(image1)
                     fliped = hflip_batch(ccropped)
diff --git a/src/opendr/perception/face_recognition/face_recognition_learner.py b/src/opendr/perception/face_recognition/face_recognition_learner.py

Original file line number	Diff line number	Diff line change
`@@ -117,7 +117,7 @@`
`117`	`117`	`"id": "Ms6Xj8GDYt5C"`
`118`	`118`	`},`
`119`	`119`	`"source": [`
`120`		`- "When we have the sample images, we need to create a reference database of our known people:"`
	`120`	`+ "When we have the sample images, we need to create a reference database of our known people (if we already created one we can set 'create_new=False' to load the existing database):"`
`121`	`121`	`]`
`122`	`122`	`},`
`123`	`123`	`{`
`@@ -137,7 +137,7 @@`
`137`	`137`	`],`
`138`	`138`	`"source": [`
`139`	`139`	`"from os.path import join\n",`
`140`		`- "recognizer.fit_reference(path=join(\".\", \"test_data\", \"images\"), save_path=\".\")"`
	`140`	`+ "recognizer.fit_reference(path=join(\".\", \"test_data\", \"images\"), save_path=\".\", create_new=True)"`
`141`	`141`	`]`
`142`	`142`	`},`
`143`	`143`	`{`
-Original file line number
+Diff line change
         return {'Training_statistics': results, 'Evaluation_statistics': eval_results}
 -    def fit_reference(self, path=None, save_path=None):
 +    def fit_reference(self, path=None, save_path=None, create_new=True):
         """
         Implementation to create reference database. Provided with a path with reference images and a save_path,
         it creates a .pkl file containing the features of the reference images, and saves it
         :type path: str
         :param save_path: path to save the .pkl file
         :type save_path: str
 +        :param create_new: create new reference.pkl file. If false it will attempt to load an existing reference.pkl
 +         file from save_path.
 +        :type create_new: bool, default=True
         """
         if self._model is None and self.ort_backbone_session is None:
             raise UserWarning('A model should be loaded first')
 -        if os.path.exists(os.path.join(save_path, 'reference.pkl')):
 +        if os.path.exists(os.path.join(save_path, 'reference.pkl')) and not create_new:
             print('Loading Reference')
             self.database = pickle.load(open(os.path.join(save_path, 'reference.pkl'), "rb"))
         else:
                 transforms.Normalize(mean=self.rgb_mean, std=self.rgb_std)]
+            )
 -            dataset_eval = datasets.ImageFolder(os.path.join(dataset.path, dataset.dataset_type), eval_transform)
 +            dataset_eval = datasets.ImageFolder(dataset.path, eval_transform)
             eval_loader = torch.utils.data.DataLoader(
                 dataset_eval, batch_size=self.batch_size, pin_memory=self.pin_memory,
                 num_workers=self.num_workers, drop_last=self.drop_last