About

Ricardo P. M. Cruz is an Assistant Professor at the Faculty of Engineering of the University of Porto and a researcher at INESC TEC. His work focuses on machine learning, particularly deep learning, and computer vision. He received a B.Sc. degree in Computer Science (2012) and an M.Sc. degree in Mathematical Engineering (2015), both from the University of Porto, and a Ph.D. degree in Computer Science (2021) jointly from the Universities of Porto, Aveiro, and Minho. His research covers transversal aspects of machine learning with applications to health and autonomous driving, detailed in over 20 publications with 100+ citations.


Details

  • Name

    Ricardo Pereira Cruz
  • Role

    External Research Collaborator
  • Since

    1st October 2013
Publications

2025

CNN explanation methods for ordinal regression tasks

Authors
Barbero-Gómez, J; Cruz, RPM; Cardoso, JS; Gutiérrez, PA; Hervás-Martínez, C;

Publication
NEUROCOMPUTING

Abstract
The use of Convolutional Neural Network (CNN) models for image classification tasks has gained significant popularity. However, the lack of interpretability in CNN models poses challenges for debugging and validation. To address this issue, various explanation methods have been developed to provide insights into CNN models. This paper focuses on the validity of these explanation methods for ordinal regression tasks, where the classes have a predefined order relationship. Different modifications are proposed for two explanation methods to exploit the ordinal relationships between classes: Grad-CAM based on Ordinal Binary Decomposition (GradOBD-CAM) and Ordinal Information Bottleneck Analysis (OIBA). The performance of these modified methods is compared to existing popular alternatives. Experimental results demonstrate that GradOBD-CAM outperforms other methods in terms of interpretability for three out of four datasets, while OIBA achieves superior performance compared to the original IBA.
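
As a rough illustration of the ordinal-binary-decomposition idea behind GradOBD-CAM, the sketch below (in PyTorch) computes a Grad-CAM map for each binary task "is the label greater than k?" and averages them; the function name `gradobd_cam`, the tensor names, and the averaging step are assumptions, and the paper's exact aggregation may differ.

```python
# A minimal sketch, assuming an ordinal-binary-decomposition head with K-1
# sigmoid outputs ("is the label > k?"); not necessarily the paper's exact method.
import torch
import torch.nn.functional as F

def gradobd_cam(features: torch.Tensor, obd_logits: torch.Tensor) -> torch.Tensor:
    """features: (1, C, H, W) conv activations kept in the autograd graph;
       obd_logits: (1, K-1) outputs of the binary decomposition head."""
    cams = []
    for k in range(obd_logits.size(1)):
        # gradient of the k-th binary output w.r.t. the feature maps
        grads, = torch.autograd.grad(obd_logits[0, k], features, retain_graph=True)
        weights = grads.mean(dim=(2, 3), keepdim=True)   # global-average-pooled gradients
        cam = F.relu((weights * features).sum(dim=1))    # standard Grad-CAM map, (1, H, W)
        cams.append(cam)
    return torch.stack(cams).mean(dim=0)                 # aggregate across binary tasks
```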

2025

Learning Ordinality in Semantic Segmentation

Authors
Cruz, RPM; Cristino, R; Cardoso, JS;

Publication
IEEE ACCESS

Abstract
Semantic segmentation consists of predicting a semantic label for each image pixel. While existing deep learning approaches achieve high accuracy, they often overlook the ordinal relationships between classes, which can provide critical domain knowledge (e.g., the pupil lies within the iris, and lane markings are part of the road). This paper introduces novel methods for spatial ordinal segmentation that explicitly incorporate these inter-class dependencies. By treating each pixel as part of a structured image space rather than as an independent observation, we propose two loss regularization terms and a new metric that penalize predictions of non-ordinal adjacent classes and enforce ordinal consistency between neighboring pixels. Experiments on five biomedical datasets and multiple configurations of autonomous driving datasets demonstrate the efficacy of the proposed methods. Our approach achieves improvements in ordinal metrics and enhances generalization, with up to a 15.7% relative increase in the Dice coefficient. Importantly, these benefits come without additional inference time costs. This work highlights the significance of spatial ordinal relationships in semantic segmentation and provides a foundation for further exploration in structured image representations.
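
As a hedged illustration of the kind of regularization described above, the sketch below penalizes neighbouring pixels whose expected ordinal class indices differ by more than one level; the name `ordinal_neighbour_penalty` and the exact formulation are assumptions, not the paper's definition of its loss terms.

```python
# A minimal sketch (assumed formulation): encourage ordinal consistency between
# adjacent pixels by penalising jumps of more than one ordinal level.
import torch
import torch.nn.functional as F

def ordinal_neighbour_penalty(logits: torch.Tensor) -> torch.Tensor:
    """logits: (B, K, H, W) raw scores for K ordered classes."""
    probs = F.softmax(logits, dim=1)                          # (B, K, H, W)
    levels = torch.arange(probs.size(1), device=probs.device, dtype=probs.dtype)
    expected = (probs * levels.view(1, -1, 1, 1)).sum(dim=1)  # (B, H, W) expected class index
    dh = (expected[:, 1:, :] - expected[:, :-1, :]).abs()     # vertical neighbours
    dw = (expected[:, :, 1:] - expected[:, :, :-1]).abs()     # horizontal neighbours
    # only differences larger than one ordinal level are penalised
    return F.relu(dh - 1).mean() + F.relu(dw - 1).mean()
```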

2024

Active Supervision: Human in the Loop

Authors
Cruz, RPM; Shihavuddin, ASM; Maruf, MH; Cardoso, JS;

Publication
PROGRESS IN PATTERN RECOGNITION, IMAGE ANALYSIS, COMPUTER VISION, AND APPLICATIONS, CIARP 2023, PT I

Abstract
After the learning process, certain types of images may not be modeled correctly because they were not well represented in the training set. These failures can then be compensated for by collecting more images from the real world and incorporating them into the learning process, an expensive process known as active learning. The proposed twist, called active supervision, uses the model itself to change existing images in the direction where the decision boundary is less defined and requests feedback from the user on how the new image should be labeled. Experiments in the context of class imbalance show that the technique is able to increase model performance on rare classes. Active human supervision helps provide crucial information to the model during training that the training set lacks.
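
The following sketch shows one plausible reading of the active-supervision loop described above: an existing image is pushed towards the region where the model's top two class scores meet, and the resulting image is then shown to a human for labeling. The helper name `push_towards_boundary`, the step size, and the margin objective are illustrative assumptions, not the paper's exact procedure.

```python
# A minimal sketch, assuming a classifier `model` and a single image tensor `image`.
import torch

def push_towards_boundary(model, image, steps=10, lr=0.01):
    x = image.clone().requires_grad_(True)
    for _ in range(steps):
        logits = model(x.unsqueeze(0))[0]
        top2 = logits.topk(2).values
        margin = top2[0] - top2[1]        # small margin = close to the decision boundary
        margin.backward()
        with torch.no_grad():
            x -= lr * x.grad              # step that shrinks the margin
            x.grad = None
    return x.detach()                     # shown to the annotator to be labeled
```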

2024

YOLOMM - You Only Look Once for Multi-modal Multi-tasking

Authors
Campos, F; Cerqueira, FG; Cruz, RPM; Cardoso, JS;

Publication
PROGRESS IN PATTERN RECOGNITION, IMAGE ANALYSIS, COMPUTER VISION, AND APPLICATIONS, CIARP 2023, PT I

Abstract
Autonomous driving can reduce the number of road accidents due to human error and result in safer roads. One important part of the system is the perception unit, which provides information about the environment surrounding the car. Currently, most manufacturers are using not only RGB cameras, which are passive sensors that capture light already present in the environment, but also LiDAR, a sensor that actively emits laser pulses onto a surface or object and measures reflection and time-of-flight. Previous work, YOLOP, already proposed a model for object detection and semantic segmentation, but only using RGB. This work extends it to LiDAR and evaluates performance on KITTI, a public autonomous driving dataset. The implementation shows improved precision across objects of all sizes. The implementation is made entirely available: https://github.com/filipepcampos/yolomm.
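
As an assumed illustration of combining the two modalities (not necessarily how YOLOMM fuses them), the sketch below stacks image-plane LiDAR channels with the RGB channels before a shared backbone stem; the class name and channel counts are hypothetical.

```python
# A minimal sketch of early RGB + LiDAR fusion, assuming the point cloud has
# already been projected to the image plane as extra channels.
import torch
import torch.nn as nn

class RGBLidarFusion(nn.Module):
    def __init__(self, lidar_channels=1, out_channels=64):
        super().__init__()
        # stem that accepts 3 RGB channels plus the projected LiDAR channels
        self.stem = nn.Conv2d(3 + lidar_channels, out_channels, kernel_size=3, padding=1)

    def forward(self, rgb, lidar_image):
        # rgb: (B, 3, H, W); lidar_image: (B, lidar_channels, H, W)
        return self.stem(torch.cat([rgb, lidar_image], dim=1))
```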

2024

Condition Invariance for Autonomous Driving by Adversarial Learning

Authors
Silva, DTE; Cruz, RPM;

Publication
PROGRESS IN PATTERN RECOGNITION, IMAGE ANALYSIS, COMPUTER VISION, AND APPLICATIONS, CIARP 2023, PT I

Abstract
Object detection is a crucial task in autonomous driving, where domain shift between the training and the test set is one of the main reasons behind the poor performance of a detector when deployed. Some erroneous priors may be learned from the training set; therefore, a model must be invariant to conditions that might promote such priors. To tackle this problem, we propose an adversarial learning framework consisting of an encoder, an object detector, and a condition classifier. The encoder is trained to deceive the condition classifier and aid the object detector as much as possible throughout the learning stage, in order to obtain highly discriminative features. Experiments showed that this framework is not very competitive regarding the trade-off between precision and recall, but it does improve the ability of the model to detect smaller objects and some object classes.
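
One common way to make an encoder "deceive" a condition classifier is a gradient reversal layer; the sketch below shows that mechanism as a plausible reading of the framework described above. The helper `condition_adversarial_logits` and the scaling factor `lambd` are illustrative assumptions, not the paper's exact training scheme.

```python
# A minimal sketch of adversarial condition invariance via gradient reversal.
import torch

class GradReverse(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x, lambd):
        ctx.lambd = lambd
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        # gradients flowing back to the encoder are reversed and scaled, so
        # training the classifier to predict the condition pushes the encoder
        # towards condition-invariant features
        return -ctx.lambd * grad_output, None

def condition_adversarial_logits(encoder_features, condition_classifier, lambd=1.0):
    reversed_features = GradReverse.apply(encoder_features, lambd)
    return condition_classifier(reversed_features)
```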

Supervised Theses

2023

Uncertainty-Driven Out-of-Distribution Detection in 3D LiDAR Object Detection for Autonomous Driving

Author
José António Barbosa da Fonseca Guerra

Institution
UP-FEUP

2023

Introducing Domain Knowledge to Scene Parsing in Autonomous Driving

Author
Rafael Valente Cristino

Institution
UP-FEUP