Shape Instance Detector (SID)

Authors: ROBERT CUPEC, IVAN VIDOVIĆ, DAMIR FILKO and PETRA PEJIĆ

Faculty: Faculty of Electrical Engineering, Computer Science and Information Technology Osijek

Country: Croatia

e-mail: dackar@ptfos.hr

SID is an algorithm for recognizing one or more target objects in images captured by a 3D camera such as Microsoft Kinect, Asus Xtion Pro Live, Orbbec Astra or Intel RealSense. As a result, it returns information about the presence and 3D pose of all target objects in an input 3D image. The algorithm requires 3D models of the target objects in the form of triangular meshes. It can recognize objects placed on a flat horizontal surface in any pose, even if multiple objects are present in the scene and the target objects are occluded.

The algorithm consists of four main steps: (1) detection of planar and convex surfaces in the input 3D image; (2) computation of a Convex Template Instance (CTI) descriptor for every convex surface; (3) generation of hypotheses about the presence of target objects in the image in a particular pose; (4) hypothesis evaluation and rejection of low-probability hypotheses. The hypotheses which remain after this procedure represent the final result of the algorithm: a scene interpretation in the form of a set of data structures, each representing an object in the scene described by its identifier and its 3D pose with respect to the camera reference frame.

Hypothesis generation is based on the alignment of the convex hulls of the target object model with the detected convex surfaces. Hypothesis evaluation is performed at three levels. At the first level, the model CTI descriptors are compared to the scene CTI descriptors, and hypotheses with low similarity between the descriptors are rejected. At the second level, for each remaining hypothesis, the target object 3D model is projected onto the image, a scene fitting score is computed, and hypotheses with a low score are rejected. A similar criterion is used at the third level, but before the score is computed, the model is precisely aligned with the overlapping part of the scene using the Iterative Closest Point (ICP) algorithm. Finally, the hypotheses are ranked according to their scene fitting scores, and a greedy search algorithm is used to reject hypotheses which overlap with other, higher-ranked hypotheses.
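
The final ranking and greedy rejection of overlapping hypotheses can be illustrated with a minimal Python sketch. It is not the authors' implementation. The Hypothesis structure, the pixel-set overlap measure, and the thresholds min_score and max_overlap are all assumptions made for this example; the description above states only that each result carries an object identifier and a 3D pose, and that hypotheses overlapping with higher-ranked ones are rejected.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Tuple

Pixel = Tuple[int, int]


@dataclass(frozen=True)
class Hypothesis:
    """Hypothetical container for one object hypothesis (names are illustrative)."""
    object_id: str              # identifier of the target object model
    pose: Tuple[float, ...]     # 4x4 pose in the camera frame, row-major (16 values)
    score: float                # scene fitting score, higher is better
    pixels: FrozenSet[Pixel]    # image pixels covered by the projected model


def overlap_ratio(a: Hypothesis, b: Hypothesis) -> float:
    """Fraction of a's pixels that are also covered by b (assumed overlap measure)."""
    if not a.pixels:
        return 0.0
    return len(a.pixels & b.pixels) / len(a.pixels)


def greedy_select(hypotheses: List[Hypothesis],
                  min_score: float = 0.5,
                  max_overlap: float = 0.3) -> List[Hypothesis]:
    """Drop low-score hypotheses, rank the rest, and keep each one only if it
    does not overlap too much with an already accepted, higher-ranked one."""
    ranked = sorted((h for h in hypotheses if h.score >= min_score),
                    key=lambda h: h.score, reverse=True)
    accepted: List[Hypothesis] = []
    for h in ranked:
        if all(overlap_ratio(h, kept) <= max_overlap for kept in accepted):
            accepted.append(h)
    return accepted


def block(r0: int, c0: int, r1: int, c1: int) -> FrozenSet[Pixel]:
    """Helper for the example: rectangular pixel region [r0, r1) x [c0, c1)."""
    return frozenset((r, c) for r in range(r0, r1) for c in range(c0, c1))


IDENTITY = (1.0, 0.0, 0.0, 0.0,
            0.0, 1.0, 0.0, 0.0,
            0.0, 0.0, 1.0, 0.0,
            0.0, 0.0, 0.0, 1.0)

candidates = [
    Hypothesis("mug", IDENTITY, 0.90, block(0, 0, 4, 4)),
    Hypothesis("bowl", IDENTITY, 0.70, block(1, 1, 5, 5)),     # overlaps "mug"
    Hypothesis("box", IDENTITY, 0.80, block(10, 10, 14, 14)),
    Hypothesis("can", IDENTITY, 0.20, block(20, 20, 22, 22)),  # low score
]

scene = greedy_select(candidates)
print([h.object_id for h in scene])
```

In this toy input, "can" is removed by the score threshold and "bowl" is rejected because 9 of its 16 pixels are already claimed by the higher-ranked "mug", so the resulting scene interpretation contains "mug" and "box". A real system would derive the pixel sets from projecting the aligned 3D models onto the depth image.
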
The algorithm is presented in the scientific paper: Robert Cupec, Ivan Vidović, Damir Filko, Petra Pejić, "Object recognition based on convex hull alignment", published in the highly ranked international scientific journal Pattern Recognition (doi.org/10.1016/j.patcog.2020.107199).