Publications

Publications by CTM

2022

A Survey on Attention Mechanisms for Medical Applications: are we Moving Toward Better Algorithms?

Authors
Goncalves, T; Rio-Torto, I; Teixeira, LF; Cardoso, JS;

Publication
IEEE ACCESS

Abstract
The increasing popularity of attention mechanisms in deep learning algorithms for computer vision and natural language processing made these models attractive to other research domains. In healthcare, there is a strong need for tools that may improve the routines of clinicians and patients. Naturally, the use of attention-based algorithms for medical applications occurred smoothly. However, since healthcare is a domain that depends on high-stakes decisions, the scientific community must ponder whether these high-performing algorithms fit the needs of medical applications. With this motto, this paper extensively reviews the use of attention mechanisms in machine learning methods (including Transformers) for several medical applications, based on the types of tasks that may be part of several work pipelines of the medical domain. This work distinguishes itself from its predecessors by proposing a critical analysis of the claims and potentialities of attention mechanisms presented in the literature through an experimental case study on medical image classification with three different use cases. These experiments focus on the process of integrating attention mechanisms into established deep learning architectures, the analysis of their predictive power, and a visual assessment of their saliency maps generated by post-hoc explanation methods. This paper concludes with a critical analysis of the claims and potentialities presented in the literature about attention mechanisms and proposes future research lines in medical applications that may benefit from these frameworks.


2022

Toward Vehicle Occupant-Invariant Models for Activity Characterization

Authors
Capozzi, L; Barbosa, V; Pinto, C; Pinto, JR; Pereira, A; Carvalho, PM; Cardoso, JS;

Publication
IEEE ACCESS

Abstract
With the advent of self-driving cars and the push by large companies into fully driverless transportation services, monitoring passenger behaviour in vehicles is becoming increasingly important for several reasons, such as ensuring safety and comfort. Although several human action recognition (HAR) methods have been proposed, developing a true HAR system remains a very challenging task. If the dataset used to train a model contains a small number of actors, the model can become biased towards these actors and their unique characteristics. This can cause the model to generalise poorly when confronted with new actors performing the same actions. This limitation is particularly acute when developing models to characterise the activities of vehicle occupants, for which data sets are short and scarce. In this study, we describe and evaluate three different methods that aim to address this actor bias and assess their performance in detecting in-vehicle violence. These methods work by removing specific information about the actor from the model's features during training or by using data that is independent of the actor, such as information about body posture. The experimental results show improvements over the baseline model when evaluated with real data. On the Hanau03 Vito dataset, the accuracy improved from 65.33% to 69.41%. On the Sunnyvale dataset, the accuracy improved from 82.81% to 86.62%.


2022

Deep Anomaly Detection for In-Vehicle Monitoring-An Application-Oriented Review

Authors
Caetano, F; Carvalho, P; Cardoso, J;

Publication
APPLIED SCIENCES-BASEL

Abstract
Anomaly detection has been an active research area for decades, with high application potential. Recent work has explored deep learning approaches to the detection of abnormal behaviour and abandoned objects in outdoor video surveillance scenarios. The extension of this recent work to in-vehicle monitoring using solely visual data represents a relevant research opportunity that has been overlooked in the accessible literature. With the increasing importance of public and shared transportation for urban mobility, it becomes imperative to provide autonomous intelligent systems capable of detecting abnormal behaviour that threatens passenger safety. To investigate the applicability of current works to this scenario, a recapitulation of relevant state-of-the-art techniques and resources is presented, including available datasets for their training and benchmarking. The lack of public datasets dedicated to in-vehicle monitoring is addressed alongside other issues not considered in previous works, such as moving backgrounds and frequent illumination changes. Despite its relevance, similar surveys and reviews have disregarded this scenario and its specificities. This work initiates an important discussion on application-oriented issues, proposing solutions to be followed in future works, particularly synthetic data augmentation to achieve representative instances with the low amount of available sequences.


2022

Increased Robustness in Chest X-Ray Classification Through Clinical Report-Driven Regularization

Authors
Mata, D; Silva, W; Cardoso, JS;

Publication
PATTERN RECOGNITION AND IMAGE ANALYSIS (IBPRIA 2022)

Abstract
In highly regulated areas such as healthcare there is a demand for explainable and trustworthy systems that are capable of providing some sort of foundation or logical reasoning for their functionality. Therefore, deep learning applications in such industries are increasingly held to this sense of accountability regarding their production value. Additionally, it is of utmost importance to take advantage of all possible data resources, in order to make such intelligent frameworks more efficient while maintaining a realistic medical scenario. As a way to explore this issue, we propose two models trained with information retained in chest radiographs and regularized by the associated medical reports. We argue that the knowledge extracted from the free-text radiology reports, in a multimodal training context, promotes more coherence, leading to better decisions and better interpretability saliency maps. Our proposed models proved more robust than their baseline counterparts, showing better classification performance and producing more concise, consistent and less dispersed saliency maps. Our proof-of-concept experiments were done using the publicly available multimodal radiology dataset MIMIC-CXR, which contains a myriad of chest X-rays and their corresponding free-text reports.


2022

Deep Aesthetic Assessment and Retrieval of Breast Cancer Treatment Outcomes

Authors
Silva, W; Carvalho, M; Mavioso, C; Cardoso, MJ; Cardoso, JS;

Publication
PATTERN RECOGNITION AND IMAGE ANALYSIS (IBPRIA 2022)

Abstract
Treatments for breast cancer have continued to evolve and improve in recent years, resulting in a substantial increase in survival rates, with approximately 80% of patients having a 10-year survival period. Given the serious impact that breast cancer treatments can have on a patient's body image, consequently affecting her self-confidence and sexual and intimate relationships, it is paramount to ensure that women receive the treatment that optimizes both survival and aesthetic outcomes. Currently, there is no gold standard for evaluating the aesthetic outcome of breast cancer treatment. In addition, there is no standard way to show patients the potential outcome of surgery. The presentation of similar cases from the past would be extremely important to manage women's expectations of the possible outcome. In this work, we propose a deep neural network to perform the aesthetic evaluation. As a proof-of-concept, we focus on a binary aesthetic evaluation. Besides its use for classification, this deep neural network can also be used to find the most similar past cases by searching for nearest neighbours in the high-semantic space before classification. We performed the experiments on a dataset consisting of 143 photos of women after conservative treatment for breast cancer. The results for accuracy and balanced accuracy showed the superior performance of our proposed model compared to the state of the art in aesthetic evaluation of breast cancer treatments. In addition, the model showed a good ability to retrieve similar previous cases, with the retrieved cases having the same or adjacent class (in the 4-class setting) and having similar types of asymmetry. Finally, a qualitative interpretability assessment was also performed to analyse the robustness and trustworthiness of the model.


2022

Electrocardiogram lead conversion from single-lead blindly-segmented signals

Authors
Beco, SC; Pinto, JR; Cardoso, JS;

Publication
BMC MEDICAL INFORMATICS AND DECISION MAKING

Abstract
Background: The standard configuration's set of twelve electrocardiogram (ECG) leads is optimal for the medical diagnosis of diverse cardiac conditions. However, it requires ten electrodes on the patient's limbs and chest, which is uncomfortable and cumbersome. Interlead conversion methods can reconstruct missing leads and enable more comfortable acquisitions, including in wearable devices, while still allowing for adequate diagnoses. Currently, methodologies for interlead ECG conversion either require multiple reference (input) leads and/or require input signals to be temporally aligned considering the ECG landmarks. Methods: Unlike the methods in the literature, this paper studies the possibility of converting ECG signals into all twelve standard configuration leads using signal segments from only one reference lead, without temporal alignment (blindly-segmented). The proposed methodology is based on a deep learning encoder-decoder U-Net architecture, which is compared with adaptations based on convolutional autoencoders and label refinement networks. Moreover, the method is explored for conversion with one single shared encoder or multiple individual encoders for each lead. Results: Despite the more challenging settings, the proposed methodology was able to attain state-of-the-art level performance in multiple target leads, and both lead I and lead II seem especially suitable to convert certain sets of leads. In cross-database tests, the methodology offered promising results despite acquisition setup differences. Furthermore, results show that the presence of medical conditions does not have a considerable effect on the method's performance. Conclusions: This study shows the feasibility of converting ECG signals using single-lead blindly-segmented inputs. Although the results are promising, further efforts should be devoted towards the improvement of the methodologies, especially the robustness to diverse acquisition setups, in order to be applicable to cardiac health monitoring in wearable devices and less obtrusive clinical scenarios.
