2023
Authors
Baghcheband, H; Soares, C; Reis, LP;
Publication
PROGRESS IN ARTIFICIAL INTELLIGENCE, EPIA 2023, PT I
Abstract
In recent years, the increasing availability of distributed data has led to a growing interest in transfer learning across multiple nodes. However, local data may not be adequate to learn sufficiently accurate models, and the problem of learning from multiple distributed sources remains a challenge. To address this issue, Machine Learning Data Markets (MLDM) have been proposed as a potential solution. In MLDM, autonomous agents exchange relevant data in a cooperative relationship to improve their models. Previous research has shown that data exchange can lead to better models, but this has been demonstrated with only two agents. In this paper, we present an extended evaluation of a simple version of the MLDM framework in a collaborative scenario. Our experiments show that data exchange has the potential to improve learning performance, even in a simple version of MLDM. The findings indicate a direct correlation between the number of agents and the performance gained, and an inverse correlation between performance and data batch size. The results of this study provide important insights into the effectiveness of MLDM and how it can be used to improve learning performance in distributed systems. Increasing the number of agents can yield a more efficient system, while larger data batch sizes can decrease the global performance of the system. These observations highlight the importance of considering both the number of agents and the data batch size when designing distributed learning systems using the MLDM framework.
2023
Authors
dos Santos, MR; de Carvalho, ACPLF; Soares, C;
Publication
CoRR
Abstract
2024
Authors
Cerqueira, V; Moniz, N; Inácio, R; Soares, C;
Publication
CoRR
Abstract
2024
Authors
Lopes, TRS; Roberto, GF; Soares, C; Tosta, TAA; Silva, AB; Loyola, AM; Cardoso, SV; de Faria, PR; do Nascimento, MZ; Neves, LA;
Publication
Proceedings of the 19th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications, VISIGRAPP 2024, Volume 2: VISAPP, Rome, Italy, February 27-29, 2024.
Abstract
In this work, a method that combines explainable artificial intelligence techniques with multiscale and multidimensional fractal techniques is presented to investigate histological images stained with Hematoxylin-Eosin. The neural activation patterns of the GoogLeNet CNN were explored, as obtained from the gradient-weighted class activation mapping (Grad-CAM) and local interpretable model-agnostic explanations (LIME) techniques. The feature vectors were generated with multiscale and multidimensional fractal techniques, specifically fractal dimension, lacunarity and percolation. The features were evaluated by ranking each entry using the ReliefF algorithm. The discriminative power of each solution was assessed via classifiers with different heuristics. The best results were obtained from LIME, with a significant increase in accuracy and AUC rates when compared to those provided by GoogLeNet. The details presented here can contribute to the development of models aimed at the classification of histological images. © 2024 by SCITEPRESS – Science and Technology Publications, Lda.
2021
Authors
Soares, C; Torgo, L;
Publication
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Abstract
2024
Authors
Baghcheband, H; Soares, C; Reis, LP;
Publication
FOUNDATIONS OF INTELLIGENT SYSTEMS, ISMIS 2024
Abstract
Data valuation, the process of assigning value to data based on its utility and usefulness, is a critical and largely unexplored aspect of data markets. Within the Machine Learning Data Market (MLDM), a platform that enables data exchange among multiple agents, the challenge of quantifying the value of data becomes particularly prominent. Agents within MLDM are motivated to exchange data based on its potential impact on their individual performance. Shapley Value-based methods have gained traction in addressing this challenge, prompting our study to investigate their effectiveness within the MLDM context. Specifically, we propose the Gain Data Shapley Value (GDSV) method tailored for MLDM and compare it to the original data valuation method used in MLDM. Our analysis focuses on two common learning algorithms, Decision Tree (DT) and K-nearest neighbors (KNN), within a simulated society of five agents, tested on 45 classification datasets. Results show that GDSV leads to incremental improvements in predictive performance across both DT and KNN algorithms compared to performance-based valuation or the baseline. These findings underscore the potential of Shapley Value-based methods in identifying high-value data within MLDM while indicating areas for further improvement.