2015
Authors
Garcia Laencina, PJ; Abreu, PH; Abreu, MH; Afonso, N;
Publication
COMPUTERS IN BIOLOGY AND MEDICINE
Abstract
Breast cancer is the most frequently diagnosed cancer in women. Using historical patient information stored in clinical datasets, data mining and machine learning approaches can be applied to predict the survival of breast cancer patients. A common drawback is the absence of information, i.e., missing data, in certain clinical trials. However, most standard prediction methods cannot handle incomplete samples, so missing data imputation is widely applied to address this problem. Therefore, taking into account the characteristics of each breast cancer dataset, a detailed analysis is required to determine the most appropriate imputation and prediction methods in each clinical environment. This research work analyzes a real breast cancer dataset from the Portuguese Institute of Oncology of Porto with a high percentage of unknown categorical information (most clinical data of the patients are incomplete), which is a challenge in terms of complexity. Four scenarios are evaluated: (I) 5-year survival prediction without imputation, and 5-year survival prediction from a cleaned dataset with (II) Mode imputation, (III) Expectation-Maximization imputation and (IV) K-Nearest Neighbors imputation. Prediction models for breast cancer survivability are constructed using four different methods: K-Nearest Neighbors, Classification Trees, Logistic Regression and Support Vector Machines. Experiments are performed in a nested ten-fold cross-validation procedure and, according to the obtained results, the best performance is provided by the K-Nearest Neighbors algorithm: more than 81% accuracy and more than 0.78 area under the Receiver Operating Characteristic curve, which are very good results in this complex scenario.
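As an illustration of two of the imputation scenarios named in this abstract, the following is a minimal sketch of mode imputation and K-Nearest Neighbors imputation for categorical records. It is not the authors' implementation: the `None` missing-value marker, the function names, and the mismatch-rate distance are assumptions made for the example, and Expectation-Maximization imputation is not covered.

```python
from collections import Counter

MISSING = None  # assumed marker for an unknown categorical value


def mode_impute(rows):
    """Replace each missing entry with the most frequent observed value of its column."""
    n_cols = len(rows[0])
    modes = []
    for j in range(n_cols):
        observed = [row[j] for row in rows if row[j] is not MISSING]
        # most_common(1) yields the (value, count) pair with the highest count
        modes.append(Counter(observed).most_common(1)[0][0] if observed else MISSING)
    return [[modes[j] if v is MISSING else v for j, v in enumerate(row)] for row in rows]


def knn_impute(rows, k=3):
    """Fill each missing entry by majority vote among the k nearest rows observing that column."""

    def distance(a, b):
        # Mismatch rate over the features observed in both rows (an assumed choice)
        shared = [(x, y) for x, y in zip(a, b) if x is not MISSING and y is not MISSING]
        if not shared:
            return 1.0
        return sum(x != y for x, y in shared) / len(shared)

    filled = [list(row) for row in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not MISSING:
                continue
            # Candidate donors: every other row where column j is observed
            donors = [r for idx, r in enumerate(rows) if idx != i and r[j] is not MISSING]
            donors.sort(key=lambda r: distance(row, r))
            votes = Counter(r[j] for r in donors[:k])
            if votes:
                filled[i][j] = votes.most_common(1)[0][0]
    return filled
```

Mode imputation ignores the other features of a record, whereas the nearest-neighbor variant borrows values from similar records, so the two can fill the same gap differently.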
2019
Authors
Pereira, R; Abreu, P; Polisciuc, E; Machado, P;
Publication
PROCEEDINGS OF THE 14TH INTERNATIONAL JOINT CONFERENCE ON COMPUTER VISION, IMAGING AND COMPUTER GRAPHICS THEORY AND APPLICATIONS - VOL 3: IVAPP
Abstract
Automatic Identification System data has been used in several studies with different aims, such as traffic forecasting, pollution control or anomalous behavior detection in vessel trajectories. Regarding this last subject, the intersection between vessels is often related to abnormal behaviors, but this topic has not yet been explored. In this paper, an approach to assist domain experts in the task of analyzing these intersections is introduced, based on data processing and visualization. The work was tested on a proprietary dataset that covers the Portuguese maritime zone, containing an average of 6460 intersections per day. The results show that several intersections were only noticeable with the visualization strategies proposed here.
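The abstract does not define how an intersection between vessels is computed. As one possible reading, the sketch below treats each trajectory as a polyline of planar (x, y) positions and reports the pairs of segments that cross. The purely spatial definition, the planar coordinates, and all function names are assumptions for illustration only.

```python
def _orient(p, q, r):
    """Sign of the cross product (q - p) x (r - p): 1 left turn, -1 right turn, 0 collinear."""
    val = (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])
    return (val > 0) - (val < 0)


def _in_box(p, q, r):
    """True if r lies inside the bounding box of segment pq (used for collinear cases)."""
    return (min(p[0], q[0]) <= r[0] <= max(p[0], q[0])
            and min(p[1], q[1]) <= r[1] <= max(p[1], q[1]))


def segments_intersect(a, b, c, d):
    """True if segment ab and segment cd share at least one point."""
    o1, o2 = _orient(a, b, c), _orient(a, b, d)
    o3, o4 = _orient(c, d, a), _orient(c, d, b)
    if o1 != o2 and o3 != o4:
        return True  # general case: endpoints lie on opposite sides of each other's line
    # Special cases: an endpoint is collinear with the other segment and lies on it
    if o1 == 0 and _in_box(a, b, c):
        return True
    if o2 == 0 and _in_box(a, b, d):
        return True
    if o3 == 0 and _in_box(c, d, a):
        return True
    if o4 == 0 and _in_box(c, d, b):
        return True
    return False


def track_crossings(track_a, track_b):
    """Return index pairs (i, j) where segment i of track_a meets segment j of track_b."""
    return [
        (i, j)
        for i in range(len(track_a) - 1)
        for j in range(len(track_b) - 1)
        if segments_intersect(track_a[i], track_a[i + 1], track_b[j], track_b[j + 1])
    ]
```

A time-aware variant would additionally compare the timestamps of the crossing segments, which this sketch deliberately leaves out.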
2016
Authors
Gomes, A; Correia, FB; Abreu, PH;
Publication
2016 IEEE Frontiers in Education Conference, FIE 2016, Erie, PA, USA, October 12-15, 2016
Abstract
High failure and dropout rates are common in introductory programming courses at higher education institutions. Some researchers argue that teachers sometimes do not use correct methods of assessment and that many students pass programming courses without knowing how to program. In this paper, the authors describe the assessment methodology applied to a first-year, first-semester Biomedical Engineering programming course (2015/2016). Students' programming skills were tested by playing a game in the first class; they were then assessed with three tests and a final exam, each covering topics the authors considered fundamental for the students to master. A correlation analysis between the different types of test and exam questions is carried out to identify the most suitable ones for assessing programming knowledge. It shows that different question types can be used as a pedagogical strategy to assess student difficulty levels and programming skills, helping students acquire abstract, reasoning and algorithmic thinking at an acceptable level. It is also shown that different forms of questions are equivalent for assessing the same knowledge and that it is possible to predict a student's ability to program at an early stage.
2020
Authors
Amorim, JP; Abreu, PH; Reyes, M; Santos, J;
Publication
Proceedings of the International Joint Conference on Neural Networks
Abstract
Saliency maps have been used as one possibility to interpret deep neural networks. This method estimates the relevance of each pixel in the image classification, with higher values representing pixels which contribute positively to classification. The goal of this study is to understand how the complexity of the network affects the interpretability of the saliency maps in classification tasks. To achieve that, we investigate how changes in the regularization affect the saliency maps produced, and their fidelity to the overall classification process of the network. The experimental setup consists of calculating and comparing the fidelity of five saliency map methods, applying them to models trained on the CIFAR-10 dataset, using different levels of weight decay on some or all of the layers. The results show that models with lower regularization are statistically (significance level of 5%) more interpretable than the other models. Also, regularization applied only to the higher convolutional layers or fully-connected layers produces saliency maps with more fidelity. © 2020 IEEE.
2014
Authors
Abreu, PH; Amaro, H; Silva, DC; Machado, P; Abreu, MH;
Publication
IFMBE Proceedings
Abstract
The prediction of overall survival in patients has an important role, especially in diseases with a high mortality rate. In this context, patients with oncological diseases, particularly the more frequent ones like female breast cancer, can benefit from highly customized treatment, which in some cases may even lead to a disease-free life. In order to achieve this customization, this work proposes a comparison between three clustering algorithms (evolutionary, hierarchical and k-medoids). After constructing a database with more than 800 breast cancer patients from a single oncology center, with 15 clinical variables (heterogeneous data) and 25% of the data missing, which illustrates a real clinical scenario, the algorithms were used to group similar patients into clusters. Using Tukey's HSD (Honestly Significant Difference) test, statistically significant differences were detected in both comparisons between k-medoids and the other two approaches (evolutionary and hierarchical clustering) (p-value < 0.0000001), as well as in the remaining comparison (evolutionary versus hierarchical clustering, p-value = 0.0061354), at a 95% confidence level. Future work will consist primarily of dealing with the missing data, in order to achieve better results in future prediction. © 2014, Springer International Publishing Switzerland.
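To make the k-medoids step concrete, the sketch below clusters heterogeneous records that contain missing values. It is only an illustration under stated assumptions: the Gower-style distance that skips missing features, the scaling of numeric features to [0, 1], the naive choice of the first k records as initial medoids, and the function names are all hypothetical and are not taken from the paper.

```python
def gower_like(a, b):
    """Mean per-feature dissimilarity over the features observed in both records.

    Numeric features are assumed to be pre-scaled to [0, 1]; categorical
    features are compared by equality. None marks a missing value.
    """
    total, used = 0.0, 0
    for x, y in zip(a, b):
        if x is None or y is None:
            continue  # skip features missing in either record
        if isinstance(x, (int, float)) and isinstance(y, (int, float)):
            total += abs(x - y)
        else:
            total += 0.0 if x == y else 1.0
        used += 1
    return total / used if used else 1.0


def k_medoids(points, k, max_iter=100):
    """Alternating k-medoids; returns (medoid indices, medoid index assigned to each point)."""
    medoids = list(range(k))  # naive deterministic start: the first k records
    for _ in range(max_iter):
        # Assignment step: attach every record to its nearest medoid
        clusters = {m: [] for m in medoids}
        for i, p in enumerate(points):
            nearest = min(medoids, key=lambda m: gower_like(p, points[m]))
            clusters[nearest].append(i)
        # Update step: pick the member with the smallest total distance to its cluster
        new_medoids = []
        for m, members in clusters.items():
            if not members:
                new_medoids.append(m)  # keep the medoid of an empty cluster
                continue
            best = min(
                members,
                key=lambda c: sum(gower_like(points[c], points[o]) for o in members),
            )
            new_medoids.append(best)
        if set(new_medoids) == set(medoids):
            break  # converged: medoids no longer change
        medoids = new_medoids
    labels = [min(medoids, key=lambda m: gower_like(p, points[m])) for p in points]
    return medoids, labels
```

Because medoids are actual records and only pairwise distances are needed, this style of clustering can be applied to mixed numeric and categorical data without first imputing the missing entries.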
2018
Authors
Santos, MS; Abreu, PH; Rodriguez Bermudez, G; Garcia Laencina, PJ;
Publication
INTERNATIONAL JOURNAL OF COMPUTATIONAL INTELLIGENCE SYSTEMS
Abstract
Brain-Computer Interface systems based on motor imagery are able to identify an individual's intent to initiate control through the classification of encephalography patterns. Correctly classifying such patterns is instrumental and strongly depends on a robust machine learning block that is able to properly process the features extracted from a subject's encephalograms. The main objective of this work is to provide an overview of the machine learning stages, aiming to answer the following question: "What are the steps in the classification process that we should worry about?". The obtained results suggest that future research in the field should focus on two main aspects: exploring techniques for dimensionality reduction, in particular supervised linear approaches, and evaluating adequate validation schemes to allow a more precise interpretation of results.