Publications

Publications by CRIIS

2021

A Comparative Analysis for 2D Object Recognition: A Case Study with Tactode Puzzle-Like Tiles

Authors
Silva, D; Sousa, A; Costa, V;

Publication
JOURNAL OF IMAGING

Abstract
Object recognition represents the ability of a system to identify objects, humans or animals in images. Within this domain, this work presents a comparative analysis of different classification methods aimed at Tactode tile recognition. The covered methods include: (i) machine learning with HOG and SVM; (ii) deep learning with CNNs such as VGG16, VGG19, ResNet152, MobileNetV2, SSD and YOLOv4; (iii) matching of handcrafted features with SIFT, SURF, BRISK and ORB; and (iv) template matching. A dataset was created to train the learning-based methods (i and ii), while a template dataset was used for the other methods (iii and iv). To evaluate the performance of the recognition methods, two test datasets were built: tactode_small and tactode_big, consisting of 288 and 12,000 images and holding 2784 and 96,000 regions of interest for classification, respectively. SSD and YOLOv4 performed worst within their group, whereas ResNet152 and MobileNetV2 proved to be strong recognition methods. SURF, ORB and BRISK demonstrated great recognition performance, while SIFT was the weakest of this group. The methods based on template matching attained reasonable recognition results, falling behind most other methods. The top three methods of this study were: VGG16, with an accuracy of 99.96% and 99.95% for tactode_small and tactode_big, respectively; VGG19, with an accuracy of 99.96% and 99.68% for the same datasets; and HOG and SVM, which reached an accuracy of 99.93% for tactode_small and 99.86% for tactode_big while presenting average execution times of 0.323 s and 0.232 s on the respective datasets, making it the fastest method overall. This work demonstrated that VGG16 was the best choice for this case study, since it minimised the misclassifications for both test datasets.
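
As an illustration of the classical pipeline described above, the sketch below pairs HOG descriptors with a multi-class SVM, in the spirit of method (i); the image size, HOG parameters and SVM hyperparameters are illustrative assumptions, not the paper's exact configuration.

```python
# Minimal HOG + SVM classification sketch; parameters are assumptions.
import numpy as np
from skimage.feature import hog
from skimage.transform import resize
from sklearn.svm import SVC

def hog_descriptor(image, size=(64, 64)):
    """Resize a grayscale tile crop and compute its HOG descriptor."""
    image = resize(image, size, anti_aliasing=True)
    return hog(image, orientations=9, pixels_per_cell=(8, 8),
               cells_per_block=(2, 2), block_norm="L2-Hys")

def train(images, labels):
    """Fit a multi-class SVM on HOG descriptors of labelled crops."""
    features = np.array([hog_descriptor(img) for img in images])
    clf = SVC(kernel="rbf", C=10.0)  # hyperparameters are assumptions
    clf.fit(features, labels)
    return clf

def classify(clf, crop):
    """Predict the tile class of a single region of interest."""
    return clf.predict(hog_descriptor(crop).reshape(1, -1))[0]
```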

2021

Visible and Thermal Image-Based Trunk Detection with Deep Learning for Forestry Mobile Robotics

Authors
da Silva, DQ; dos Santos, FN; Sousa, AJ; Filipe, V;

Publication
JOURNAL OF IMAGING

Abstract
Mobile robotics in forests is currently a hugely important topic due to the recurring occurrence of forest wildfires, which makes on-site management of forest inventory and biomass necessary. To tackle this issue, this work presents a study on ground-level detection of forest tree trunks in visible and thermal images using deep learning-based object detection methods. For this purpose, a forestry dataset composed of 2895 images was built and made publicly available. Using this dataset, five models were trained and benchmarked to detect tree trunks: SSD MobileNetV2, SSD Inception-v2, SSD ResNet50, SSDLite MobileDet and YOLOv4 Tiny. Promising results were obtained; YOLOv4 Tiny was the best model, achieving the highest AP (90%) and F1 score (89%). The inference time of these models was also evaluated on CPU and GPU, and the results showed that YOLOv4 Tiny was the fastest detector running on GPU (8 ms). This work will enhance the development of vision perception systems for smarter forestry robots.
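
For readers wishing to reproduce the latency comparison, the sketch below measures per-image inference time for a YOLOv4 Tiny detector through OpenCV's DNN module; the cfg/weights file names, input size and thresholds are assumptions, and the original benchmark setup may differ.

```python
# Per-image inference timing sketch for a YOLOv4 Tiny detector.
import time
import cv2

net = cv2.dnn.readNetFromDarknet("yolov4-tiny-trunks.cfg",      # assumed file
                                 "yolov4-tiny-trunks.weights")  # assumed file
# Uncomment for GPU inference if OpenCV was built with CUDA support:
# net.setPreferableBackend(cv2.dnn.DNN_BACKEND_CUDA)
# net.setPreferableTarget(cv2.dnn.DNN_TARGET_CUDA)
model = cv2.dnn_DetectionModel(net)
model.setInputParams(size=(416, 416), scale=1 / 255.0, swapRB=True)

def mean_inference_ms(images):
    """Average detection latency in milliseconds over a list of images."""
    start = time.perf_counter()
    for img in images:
        model.detect(img, confThreshold=0.5, nmsThreshold=0.4)
    return 1000.0 * (time.perf_counter() - start) / len(images)
```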

2021

Autonomous Robot Visual-Only Guidance in Agriculture Using Vanishing Point Estimation

Authors
Sarmento, J; Aguiar, AS; dos Santos, FN; Sousa, AJ;

Publication
PROGRESS IN ARTIFICIAL INTELLIGENCE (EPIA 2021)

Abstract
Autonomous navigation in agriculture is very challenging, as it usually takes place outdoors, where there is rough terrain, uncontrolled natural lighting, constantly changing organic scenarios and, at times, no Global Navigation Satellite System (GNSS) coverage. In this work, a setup with a single camera and a Google Coral Dev Board with an Edge Tensor Processing Unit (TPU) is proposed to navigate in a woody crop, more specifically a vineyard. Guidance is provided by estimating the vanishing point, observing its position with respect to the centre of the frame, and correcting the steering angle accordingly. The vanishing point is estimated through object detection with Deep Learning (DL) based Neural Networks (NNs), which yield the positions of the trunks in the image. The NNs were trained using Transfer Learning (TL), which requires a smaller dataset than conventional training methods. For this purpose, a dataset of 4221 images was created, covering image collection, annotation and augmentation. Results show that the framework can detect the vanishing point with a mean absolute error of 0.52° and can be considered for autonomous steering.
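
A minimal sketch of the guidance idea, under the assumption that trunk detections arrive as (x1, y1, x2, y2) bounding boxes already split into left and right rows: fit one line per row through the trunk base centres, intersect the lines to obtain the vanishing point, and steer proportionally to its horizontal offset from the image centre. The row-splitting convention and the gain are illustrative, not the paper's.

```python
# Vanishing-point estimation from trunk boxes; conventions are assumptions.
import numpy as np

def row_line(boxes):
    """Least-squares line x = m*y + b through box base centres (x, y)."""
    xs = np.array([(x1 + x2) / 2.0 for x1, _, x2, _ in boxes])
    ys = np.array([y2 for _, _, _, y2 in boxes])  # bottom edge of each box
    m, b = np.polyfit(ys, xs, 1)
    return m, b

def vanishing_point(left_boxes, right_boxes):
    """Intersect the two row lines; returns (x, y) in image coordinates."""
    m1, b1 = row_line(left_boxes)
    m2, b2 = row_line(right_boxes)
    y = (b2 - b1) / (m1 - m2)
    return m1 * y + b1, y

def steering_correction(vp_x, image_width, gain=0.005):
    """Proportional steering command from the horizontal VP offset."""
    return gain * (image_width / 2.0 - vp_x)
```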

2021

Robot navigation in vineyards based on the visual vanish point concept

Authors
Sarmento, J; Aguiar, AS; Santos, FND; Sousa, AJ;

Publication
2021 International Symposium of Asian Control Association on Intelligent Robotics and Industrial Automation, IRIA 2021

Abstract
Autonomous navigation in agriculture is very challenging, as it usually takes place outdoors, where there is rough terrain, uncontrolled natural lighting, constantly changing organic scenarios and, at times, no Global Navigation Satellite System (GNSS) signal. In this work, a monocular visual system is proposed to estimate angular orientation and navigate between woody crops, more specifically a vineyard, using a Proportional Integral Derivative (PID)-based controller. Guidance is provided by combining two ways of finding the center of the vineyard row: first, by estimating the vanishing point, and second, by averaging the positions of the two closest trunk base detections. The angular error is then determined through monocular angle perception. Trunk positions in the image are obtained by object detection with Deep Learning (DL) based Neural Networks (NNs). To evaluate the proposed controller, a visual vineyard simulation was created in Gazebo. The proposed joint controller is able to travel along a simulated straight vineyard with an RMS error of 1.17 cm. Moreover, a simulated curved vineyard modeled after the Douro region was also tested, where the robot was able to steer with an RMS error of 7.28 cm. © 2021 IEEE.
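
Below is a minimal discrete PID controller of the kind the abstract describes, acting on the angular error computed upstream (from the vanishing point and the trunk-midpoint estimate); the gains and the control period are illustrative assumptions.

```python
# Discrete PID sketch for the angular-error controller; gains are assumptions.
class PID:
    def __init__(self, kp, ki, kd, dt):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0
        self.prev_error = 0.0

    def step(self, error):
        """Return the steering command for the current angular error."""
        self.integral += error * self.dt
        derivative = (error - self.prev_error) / self.dt
        self.prev_error = error
        return (self.kp * error
                + self.ki * self.integral
                + self.kd * derivative)

# Example: a 20 Hz control loop with illustrative gains.
controller = PID(kp=1.2, ki=0.05, kd=0.1, dt=0.05)
# steering = controller.step(angular_error_rad)
```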

2021

Unimodal and Multimodal Perception for Forest Management: Review and Dataset

Authors
da Silva, DQ; dos Santos, FN; Sousa, AJ; Filipe, V; Boaventura Cunha, J;

Publication
COMPUTATION

Abstract
Robotics navigation and perception for forest management are challenging due to the many obstacles to detect and avoid and the sharp illumination changes. Advanced perception systems are needed because they enable the development of robotic and machinery solutions for smarter, more precise, and more sustainable forestry. This article presents a state-of-the-art review of unimodal and multimodal perception in forests, detailing current work on perception using a single type of sensor (unimodal) and on combining data from different kinds of sensors (multimodal). This work also compares existing perception datasets in the literature and presents a new multimodal dataset, composed of images and laser scanning data, as a contribution to this research field. Lastly, a critical analysis of the collected works is conducted, identifying strengths and research trends in this domain.

2021

Robust human position estimation in cooperative robotic cells

Authors
Amorim, A; Guimarães, D; Mendonça, T; Neto, P; Costa, P; Moreira, AP;

Publication
ROBOTICS AND COMPUTER-INTEGRATED MANUFACTURING

Abstract
Robots are increasingly present in our lives, sharing the workspace and tasks with human co-workers. However, existing interfaces for human-robot interaction / cooperation (HRI/C) have limited levels of intuitiveness, and safety is a major concern when humans and robots share the same workspace. Often, this is due to the lack of a reliable estimate of the human pose in space, which is the primary input both for calculating the human-robot minimum distance (required for safety and collision avoidance) and for HRI/C machine learning algorithms that classify human behaviours / gestures. Each sensor type has its own characteristics, resulting in problems such as occlusions (vision) and drift (inertial) when used in isolation. In this paper, a combined system is proposed that merges the human tracking provided by a 3D vision sensor with the pose estimation provided by a set of inertial measurement units (IMUs) placed on the limbs of the human body. The IMUs compensate for the gaps in occluded areas to ensure tracking continuity. To mitigate the lingering effects of the IMU offset, a continuous online calculation of the offset value is proposed. Experimental tests were designed to simulate human motion in a human-robot collaborative environment where the robot moves away to avoid unexpected collisions with the human. Results indicate that this approach is able to capture the human's position, for example the forearm, with precision in the millimetre range and robustness to occlusions.
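
The sketch below illustrates one plausible reading of the fusion rule: while the 3D vision tracker sees the limb, the vision-minus-IMU offset is re-estimated online with an exponential moving average; during occlusions, the IMU position corrected by the latest offset bridges the gap. The class, the smoothing factor and the exact update rule are assumptions, not the paper's implementation.

```python
# Vision + IMU fusion sketch with online offset estimation; all names and
# the smoothing factor are illustrative assumptions.
import numpy as np

class LimbTracker:
    def __init__(self, alpha=0.1):
        self.alpha = alpha           # smoothing factor for the offset
        self.offset = np.zeros(3)    # vision-minus-IMU bias estimate

    def update(self, imu_position, vision_position=None):
        """Fused 3D limb position; vision_position is None when occluded."""
        if vision_position is not None:
            # Continuous online offset estimation while vision is valid.
            sample = vision_position - imu_position
            self.offset = (1 - self.alpha) * self.offset + self.alpha * sample
            return vision_position
        # Occlusion: compensate IMU drift with the latest offset estimate.
        return imu_position + self.offset
```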
