Publications

Publications by Tânia Pereira

2022

Semi-Supervised Approach for EGFR Mutation Prediction on CT Images

Authors
Pinheiro, C; Silva, F; Pereira, T; Oliveira, HP;

Publication
MATHEMATICS

Abstract
The use of deep learning methods in medical imaging has delivered promising results; however, the success of such models relies heavily on large, properly annotated datasets. The annotation of medical images is a laborious, expensive, and time-consuming process. This difficulty is greater for the mutation status label, since obtaining it requires additional exams (usually biopsies). On the other hand, raw images, without annotations, are extensively collected as part of the clinical routine. This work investigated methods that could mitigate the labelled data scarcity problem by using both labelled and unlabelled data to improve the efficiency of predictive models. A semi-supervised learning (SSL) approach was developed to predict epidermal growth factor receptor (EGFR) mutation status in lung cancer in a less invasive manner using 3D CT scans. The proposed approach combines a variational autoencoder (VAE) with adversarial training, with the intention that the features extracted from unlabelled data to discriminate images can help in the classification task. To incorporate labelled and unlabelled images, adversarial training was used, extending a traditional variational autoencoder. With the developed method, a mean AUC of 0.701 was achieved with the best-performing model, with only 14% of the training data being labelled. This SSL approach improved the discrimination ability by nearly 7 percentage points over a fully supervised model developed with the same amount of labelled data, confirming the advantage of using such methods when few annotated examples are available.

2022

Synthesizing 3D Lung CT scans with Generative Adversarial Networks

Authors
Ferreira, A; Pereira, T; Silva, F; Vilares, AT; Da Silva, MC; Cunha, A; Oliveira, HP;

Publication
44th Annual International Conference of the IEEE Engineering in Medicine & Biology Society, EMBC 2022, Glasgow, Scotland, United Kingdom, July 11-15, 2022

Abstract
In the healthcare domain, datasets are often private and lack large amounts of samples, making it difficult to cope with the inherent patient data heterogeneity. As an attempt to mitigate data scarcity, generative models are being used due to their ability to produce new data, using a dataset as a reference. However, synthesis studies often rely on a 2D representation of data, a seriously limited form of information when it comes to lung computed tomography scans where, for example, pathologies like nodules can manifest anywhere in the organ. Here, we develop a 3D Progressive Growing Generative Adversarial Network capable of generating thoracic CT volumes at a resolution of 128³, and analyze the model outputs through a quantitative metric (3D Multi-Scale Structural Similarity) and a Visual Turing Test. Clinical relevance - This paper is a novel application of the 3D PGGAN model to synthesize CT lung scans. This preliminary study focuses on synthesizing the entire volume of the lung rather than just the lung nodules. The synthesized data represent an attempt to mitigate data scarcity, which is one of the major limitations in creating learning models with good generalization in healthcare.

2022

Unsupervised Approach for Malignancy Assessment of Lung Nodules in Computed Tomography Scans Using Radiomic Features

Authors
Teixeira, M; Pereira, T; Silva, F; Cunha, A; Oliveira, HP;

Publication
44th Annual International Conference of the IEEE Engineering in Medicine & Biology Society, EMBC 2022, Glasgow, Scotland, United Kingdom, July 11-15, 2022

Abstract
Lung cancer is the leading cause of cancer death worldwide. Early low-dose computed tomography (CT) screening can decrease its mortality rate, and computer-aided diagnosis systems may make these screenings more accessible. Radiomic features and supervised machine learning have traditionally been employed in these systems. Unlike supervised methods, unsupervised learning techniques require neither large amounts of annotated data, which are labor-intensive to gather, nor long training times. Therefore, recent approaches have used unsupervised methods, such as clustering, to improve the performance of supervised models. However, an analysis of purely unsupervised methods for malignancy prediction of lung nodules from CT images has not been performed. This work studies nodule malignancy in the LIDC-IDRI image collection of chest CT scans using established radiomic features and unsupervised learning methods based on k-Means, Spectral Clustering, and Gaussian Mixture clustering. All tested methods resulted in clusters with high malignancy homogeneity. Results suggest convex feature distributions and well-separated feature subspaces associated with different diagnoses. Furthermore, diagnosis uncertainty may be explained by common characteristics captured by radiomic features. The k-Means and Gaussian Mixture models are able to generalize to unseen data, achieving balanced accuracies of 87.23% and 86.96%, respectively, when inference was tested. These results motivate the usage of unsupervised approaches for malignancy prediction of lung nodules, such as cluster-then-label models. Clinical Relevance - Unsupervised clustering of radiomic features of lung nodules in chest CT scans can differentiate between malignant and benign cases and reflects experts' diagnosis uncertainty.
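As an illustration only, the cluster-then-label idea and the balanced accuracy metric mentioned in this abstract can be sketched as below. This is not the authors' code: the function names `cluster_then_label` and `balanced_accuracy` are hypothetical, the cluster assignments are assumed to come from any clustering method, and the toy values are invented for the example.

```python
from collections import Counter, defaultdict


def cluster_then_label(cluster_ids, labels):
    """Map each cluster to the majority label among its labelled members."""
    votes = defaultdict(Counter)
    for cluster, label in zip(cluster_ids, labels):
        votes[cluster][label] += 1
    return {c: counts.most_common(1)[0][0] for c, counts in votes.items()}


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall, so each class weighs equally."""
    total = Counter()
    correct = Counter()
    for truth, pred in zip(y_true, y_pred):
        total[truth] += 1
        if truth == pred:
            correct[truth] += 1
    return sum(correct[c] / total[c] for c in total) / len(total)


# Toy data (hypothetical): 1 = malignant, 0 = benign.
clusters = [0, 0, 0, 1, 1]
truth = [1, 1, 0, 0, 0]
mapping = cluster_then_label(clusters, truth)
predicted = [mapping[c] for c in clusters]
score = balanced_accuracy(truth, predicted)
```

Balanced accuracy averages recall over classes, which keeps a majority class from dominating the score when benign and malignant nodules are unevenly represented.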

2022

Multiple instance learning for lung pathophysiological findings detection using CT scans

Authors
Frade, J; Pereira, T; Morgado, J; Silva, F; Freitas, C; Mendes, J; Negrao, E; de Lima, BF; da Silva, MC; Madureira, AJ; Ramos, I; Costa, JL; Hespanhol, V; Cunha, A; Oliveira, HP;

Publication
MEDICAL & BIOLOGICAL ENGINEERING & COMPUTING

Abstract
Lung diseases affect the lives of billions of people worldwide, and each year 4 million people die prematurely from these conditions. These pathologies are characterized by specific imaging findings in CT scans. Traditional Computer-Aided Diagnosis (CAD) approaches have shown promising results in helping clinicians; however, CADs normally consider only a small part of the medical image for analysis, excluding information that may be relevant for clinical evaluation. The Multiple Instance Learning (MIL) approach takes into consideration different small pieces that are relevant for the final classification and creates a comprehensive analysis of pathophysiological changes. This study uses MIL-based approaches to identify the presence of lung pathophysiological findings in CT scans for the characterization of lung disease development. This work focused on the detection of the following: Fibrosis, Emphysema, Satellite Nodules in Primary Lesion Lobe, Nodules in Contralateral Lung and Ground Glass, with Fibrosis and Emphysema yielding the best results, reaching an Area Under the Curve (AUC) of 0.89 and 0.72, respectively. Additionally, the MIL-based approach was used for EGFR mutation status prediction - the most relevant oncogene in lung cancer - with an AUC of 0.69. The results showed that this comprehensive approach can be a useful tool for lung pathophysiological characterization.

2022

Impact of Label Noise on the Learning Based Models for a Binary Classification of Physiological Signal

Authors
Ding, C; Pereira, T; Xiao, R; Lee, RJ; Hu, X;

Publication
SENSORS

Abstract
Label noise is omnipresent in the annotation process and has an impact on supervised learning algorithms. This work focuses on the impact of label noise on the performance of learning models by examining the effect of random and class-dependent label noise on a binary classification task: quality assessment for photoplethysmography (PPG). The PPG signal is used to detect physiological changes and its quality can have a significant impact on the subsequent tasks, which makes PPG quality assessment a particularly good target for examining the impact of label noise in the field of biomedicine. Random and class-dependent label noise was introduced separately into the training set to emulate the errors associated with fatigue and bias in labeling data samples. We also tested different representations of the PPG, including features defined by domain experts, 1D raw signal and 2D image. Four classifiers were tested on the noisy training data: support vector machine (SVM), XGBoost, 1D Resnet and 2D Resnet, which together cover the three representations. The results showed that the two deep learning models were more robust than the two traditional machine learning models for both the random and class-dependent label noise. From the representation perspective, the 2D image shows better robustness compared to the 1D raw signal. The logits from three classifiers are also analyzed; the predicted probabilities tend to be more dispersed when more label noise is introduced. In this work, we investigated various factors related to label noise, including representations, label noise type, and data imbalance, which can serve as a guide for designing more robust methods for label noise in future work.
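To make the two noise types in this abstract concrete, a minimal sketch of how random and class-dependent label noise might be injected into binary training labels is shown below. It is an assumption for illustration, not the authors' procedure: the function names, the flip rates, and the seed handling are all hypothetical.

```python
import random


def add_random_noise(labels, rate, seed=0):
    """Flip each binary label (0/1) with the same probability `rate`."""
    rng = random.Random(seed)
    return [1 - y if rng.random() < rate else y for y in labels]


def add_class_dependent_noise(labels, rate_by_class, seed=0):
    """Flip each binary label with a probability that depends on its class.

    `rate_by_class` maps a class (0 or 1) to its flip probability,
    for example {0: 0.05, 1: 0.30} to mimic a bias against class 1.
    """
    rng = random.Random(seed)
    return [1 - y if rng.random() < rate_by_class[y] else y for y in labels]


# Toy labels (hypothetical): 1 = good-quality signal, 0 = poor-quality signal.
clean = [0, 1, 1, 0, 1, 0, 0, 1]
uniform_noisy = add_random_noise(clean, rate=0.2)
biased_noisy = add_class_dependent_noise(clean, {0: 0.0, 1: 0.5})
```

Random noise flips labels of both classes at the same rate, loosely emulating fatigue, while class-dependent noise flips one class more often than the other, loosely emulating annotator bias.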

2023

Special Issue on Novel Applications of Artificial Intelligence in Medicine and Health

Authors
Pereira, T; Cunha, A; Oliveira, HP;

Publication
APPLIED SCIENCES-BASEL

Abstract
Artificial Intelligence (AI) is one of the big hopes for the future of a positive revolution in the use of medical data to improve clinical routine and personalized medicine [...]
