Publications

Publications by Davide Rua Carneiro

2021

A Conversational Interface for interacting with Machine Learning models

Authors
Carneiro, D; Veloso, P; Guimarães, M; Baptista, J; Sousa, M;

Publication
Proceedings of the 4th International Workshop on eXplainable and Responsible AI and Law, co-located with the 18th International Conference on Artificial Intelligence and Law (ICAIL 2021), Virtual Event, São Paulo, Brazil, June 21, 2021.

Abstract

2023

Comparison of Supervised Learning Algorithms for Quality Assessment of Wearable Electrocardiograms With Paroxysmal Atrial Fibrillation

Authors
Huerta, A; Martinez, A; Carneiro, D; Bertomeu González, V; Rieta, JJ; Alcaraz, R;

Publication
IEEE Access

Abstract
Emerging wearable technology able to monitor the electrocardiogram (ECG) continuously for long periods of time without disrupting the patient's daily life represents a great opportunity to improve the currently suboptimal diagnosis of paroxysmal atrial fibrillation (AF). However, its integration into clinical practice is still limited because the acquired ECG recording is often strongly contaminated by transient noise, leading to numerous false alarms of AF and requiring manual interpretation of extensive amounts of ECG data. To improve this situation, automated selection of ECG segments with sufficient quality for precise diagnosis has been widely proposed, and numerous algorithms for such ECG quality assessment can be found. Although most have reported successful performance on ECG signals acquired from healthy subjects, only a recent algorithm based on a well-known pre-trained convolutional neural network (CNN), AlexNet, has maintained a similar efficiency in the context of paroxysmal AF. Hence, bearing in mind the latest major advances in the development of neural networks, the main goal of this work was to compare the most recent pre-trained CNN models in terms of computational time and classification performance between high- and low-quality ECG excerpts. Overall, all models reported a similar classification performance, which was significantly superior to that of previous methods based on combining hand-crafted ECG features with conventional machine learning classifiers. Nonetheless, shallow networks (such as AlexNet) tended to detect high-quality ECG excerpts better, and deep CNN models tended to identify noisy ECG segments better. The networks with a moderate depth of about 20 layers presented the best balanced performance on both groups of ECG excerpts. Indeed, GoogLeNet (with a depth of 22 layers) obtained very close values of sensitivity and specificity, both about 87%. It also maintained a misclassification rate of AF episodes similar to AlexNet and an acceptable computation time, thus constituting the best alternative for quality assessment of wearable, long-term ECG recordings acquired from patients with paroxysmal AF.
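The abstract above compares models by sensitivity and specificity. As a minimal sketch of how these two figures are computed for a binary high-/low-quality classifier, the snippet below assumes that "high quality" is the positive class (the abstract does not say which class is treated as positive) and uses made-up labels, not data from the paper.

```python
def sensitivity_specificity(y_true, y_pred, positive=1):
    """Return (sensitivity, specificity) for binary labels.

    Assumption: `positive` marks high-quality ECG excerpts.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    sensitivity = tp / (tp + fn)  # share of positives correctly detected
    specificity = tn / (tn + fp)  # share of negatives correctly rejected
    return sensitivity, specificity


# Illustrative labels only (1 = high quality, 0 = low quality).
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1]
sens, spec = sensitivity_specificity(y_true, y_pred)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

A balanced classifier in the sense used by the abstract is one where these two values are close to each other.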

2023

The Impact of Data Selection Strategies on Distributed Model Performance

Authors
Guimarães, M; Oliveira, F; Carneiro, D; Novais, P;

Publication
Ambient Intelligence - Software and Applications - 14th International Symposium on Ambient Intelligence, ISAmI 2023, Guimarães, Portugal, July 12-14, 2023

Abstract
Distributed Machine Learning, in which data and learning tasks are scattered across a cluster of computers, is one of the field's answers to the challenges posed by Big Data. Yet even in an era in which data abounds, decisions must still be made regarding which specific data to use for training the model, either because the amount of available data is simply too large, or because the training time or complexity of the model must be kept low. Typical approaches include, for example, selection based on data freshness. However, old data are not necessarily outdated and might still contain relevant patterns. Likewise, relying only on recent data may significantly decrease data diversity and representativity, and decrease model quality. The goal of this paper is to compare different heuristics for selecting data in a distributed Machine Learning scenario. Specifically, we ascertain whether selecting data based on their characteristics (meta-features), and optimizing for maximum diversity, improves model quality while potentially allowing model complexity to be reduced. This will allow the development of more informed data selection strategies in distributed settings, in which the criteria are not only the location of the data or the state of each node in the cluster, but also include intrinsic and relevant characteristics of the data. © The Author(s), under exclusive license to Springer Nature Switzerland AG 2023.
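One way to picture "selecting data based on meta-features and optimizing for maximum diversity" is a greedy farthest-point selection over per-block meta-feature vectors. The sketch below is only an illustration under that assumption: the abstract does not specify the paper's heuristics, and the block names, meta-feature values, and the `select_diverse` helper are all hypothetical.

```python
import math


def select_diverse(meta_features, k):
    """Greedily pick up to k block ids whose meta-feature vectors are far apart.

    Hypothetical helper: starts from the first id (sorted order) and
    repeatedly adds the block whose nearest selected block is farthest away.
    """
    ids = sorted(meta_features)
    if k <= 0 or not ids:
        return []
    selected = [ids[0]]
    while len(selected) < min(k, len(ids)):
        remaining = [i for i in ids if i not in selected]
        best = max(
            remaining,
            key=lambda i: min(
                math.dist(meta_features[i], meta_features[s]) for s in selected
            ),
        )
        selected.append(best)
    return selected


# Made-up meta-features per data block, e.g. (normalized mean, normalized skew).
blocks = {
    "b0": (0.0, 0.0),
    "b1": (0.1, 0.0),  # nearly a duplicate of b0
    "b2": (1.0, 1.0),
    "b3": (0.0, 1.0),
}
chosen = select_diverse(blocks, 3)
print(chosen)
```

In this toy example the near-duplicate block `b1` is the one left out, which is the behavior a diversity-driven criterion is meant to produce.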

2023

Speculative Computation: Application Scenarios

Authors
Ramos, J; Oliveira, T; Carneiro, D; Satoh, K; Novais, P;

Publication
Handbook of Abductive Cognition

Abstract
Artificial intelligence and machine learning have been widely applied in several areas with the twofold goal of improving people’s well-being and accelerating computational processes. This can be seen in medical assistance (e.g., automatic verification of MRI images), in personal assistants that adapt content to the user's preferences, and in efforts to optimize query response times in relational databases and accelerate the information retrieval process. Most machine learning algorithms in use need a dataset to train on, so that the resulting models can be used, for example, to predict a value or enable user-specific results. For predictive methods, the model may need to be retrained when new data arrives. Speculative computation is a machine learning subfield that seeks to keep computation one step ahead of the user by speculating on the value that will be received. A change in the environment may affect the execution, but the adjustments are rapidly performed. This paper intends to provide an overview of the field of speculative computation, describing its main characteristics and advantages, and different scenarios in the medical field in which it is applied. It also provides a critical and comparative analysis with other machine learning methods and a description of how to apply different algorithms to create better systems. © Springer Nature Switzerland AG 2023.

2023

Block size, parallelism and predictive performance: finding the sweet spot in distributed learning

Authors
Oliveira, F; Carneiro, D; Guimarães, M; Oliveira, O; Novais, P;

Publication
International Journal of Parallel, Emergent and Distributed Systems

Abstract
As distributed and multi-organization Machine Learning emerges, new challenges must be solved, such as diverse and low-quality data or real-time delivery. In this paper, we use a distributed learning environment to analyze the relationship between block size, parallelism, and predictor quality. Specifically, the goal is to find the optimum block size and the best heuristic to create distributed Ensembles. We evaluated three different heuristics and five block sizes on four publicly available datasets. Results show that using fewer but better base models matches or outperforms a standard Random Forest, and that 32 MB is the best block size.
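The idea of "fewer but better base models" can be sketched as ranking base models by validation accuracy, keeping only the top k, and combining them by majority vote. This is an illustrative stand-in, not the paper's heuristics (which the abstract does not describe); the threshold-rule models, validation data, and helper names are hypothetical.

```python
from collections import Counter


def top_k_models(models, X_val, y_val, k):
    """Keep the k base models with the highest validation accuracy."""
    def accuracy(model):
        hits = sum(1 for x, y in zip(X_val, y_val) if model(x) == y)
        return hits / len(y_val)
    return sorted(models, key=accuracy, reverse=True)[:k]


def majority_vote(models, x):
    """Predict the label chosen by most of the given models."""
    return Counter(m(x) for m in models).most_common(1)[0][0]


# Toy base models: simple threshold rules standing in for trained learners.
models = [
    lambda x: int(x > 0.5),
    lambda x: int(x > 0.4),
    lambda x: int(x > 0.9),
    lambda x: 0,
    lambda x: int(x > 0.6),
]
X_val = [0.1, 0.3, 0.45, 0.55, 0.7, 0.95]
y_val = [0, 0, 0, 1, 1, 1]

best = top_k_models(models, X_val, y_val, k=3)
print(majority_vote(best, 0.58), majority_vote(best, 0.2))
```

In a distributed setting each base model would be trained on one data block, so the block size determines both how many candidate models exist and how much data each one sees.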

2023

Selection of Replicas with Predictions of Resources Consumption

Authors
Monteiro, J; Oliveira, Ó; Carneiro, D;

Publication
Lecture Notes in Networks and Systems

Abstract
