Publications

Publications by Cláudia Vanessa Brito

2023

Soteria: Preserving Privacy in Distributed Machine Learning

Authors
Brito, C; Ferreira, P; Portela, B; Oliveira, R; Paulo, J;

Publication
38TH ANNUAL ACM SYMPOSIUM ON APPLIED COMPUTING, SAC 2023

Abstract
We propose Soteria, a system for distributed privacy-preserving Machine Learning (ML) that leverages Trusted Execution Environments (e.g. Intel SGX) to run code in isolated containers (enclaves). Unlike previous work, where all ML-related computation is performed at trusted enclaves, we introduce a hybrid scheme, combining computation done inside and outside these enclaves. The conducted experimental evaluation validates that our approach reduces the runtime of ML algorithms by up to 41%, when compared to previous related work. Our protocol is accompanied by a security proof, as well as a discussion regarding resilience against a wide spectrum of ML attacks.


2017

Continuous Ambulatory Peritoneal Dialysis: Business Intelligence applied to patient monitoring CAPD study and statistics

Authors
Peixoto, C; Brito, C; Fontainhas, M; Peixoto, H; Machado, J; Abelha, A;

Publication
2017 5TH INTERNATIONAL CONFERENCE ON FUTURE INTERNET OF THINGS AND CLOUD WORKSHOPS (FICLOUDW) 2017

Abstract
Continuous Ambulatory Peritoneal Dialysis (CAPD) is one of the many treatments for patients with advanced kidney disease. It requires regular monitoring and an understanding of all the factors in each patient's blood and urine samples to determine whether the treatment is going well. This article explores data from patients undergoing the CAPD procedure. This data helps to show how interoperability works in a Health Information System: because it contains patients' personal information as well as their blood and urine sample results, all the services must be connected. In this work, a Business Intelligence process is used to show that all the available information can be useful for understanding the above-mentioned treatment, and to study, through indicators, how several factors may or may not influence the number of patients going through kidney failure and CAPD.


2023

Privacy-Preserving Machine Learning on Apache Spark

Authors
Brito, CV; Ferreira, PG; Portela, BL; Oliveira, RC; Paulo, JT;

Publication
IEEE ACCESS

Abstract
The adoption of third-party machine learning (ML) cloud services is highly dependent on the security guarantees and the performance penalty they incur on workloads for model training and inference. This paper explores security/performance trade-offs for the distributed Apache Spark framework and its ML library. Concretely, we build upon a key insight: in specific deployment settings, one can reveal carefully chosen non-sensitive operations (e.g. statistical calculations). This allows us to considerably improve the performance of privacy-preserving solutions without exposing the protocol to pervasive ML attacks. In more detail, we propose Soteria, a system for distributed privacy-preserving ML that leverages Trusted Execution Environments (e.g. Intel SGX) to run computations over sensitive information in isolated containers (enclaves). Unlike previous work, where all ML-related computation is performed at trusted enclaves, we introduce a hybrid scheme, combining computation done inside and outside these enclaves. The experimental evaluation validates that our approach reduces the runtime of ML algorithms by up to 41% when compared to previous related work. Our protocol is accompanied by a security proof and a discussion regarding resilience against a wide spectrum of ML attacks.


2023

Promoting sustainable and personalised travel behaviours while preserving data privacy

Authors
Pina, N; Brito, C; Vitorino, R; Cunha, I;

Publication
Transportation Research Procedia

Abstract
Cities worldwide have agreed on ambitious goals regarding carbon neutrality; thus, smart cities face challenges regarding active and shared mobility due to public transportation's low attractiveness and lack of real-time multimodal information. These issues have led to a lack of data on the community's mobility choices, traffic commuters' carbon footprint and corresponding low motivation to change habits. Besides, many consumers are reluctant to use some software tools due to the lack of data privacy guarantees. This paper presents a methodology developed in the FranchetAI project that addresses these issues by providing distributed privacy-preserving machine learning models that identify travel behaviour patterns and respective GHG emissions to recommend alternative options. Also, the paper presents the developed FranchetAI mobile prototype. © 2023 The Authors. Published by ELSEVIER B.V. This is an open access article under the CC BY-NC-ND license (https://creativecommons.org/licenses/by-nc-nd/4.0)


2024

A Distributed Computing Solution for Privacy-Preserving Genome-Wide Association Studies

Authors
Brito, C; Ferreira, P; Paulo, J;

Publication

Abstract
Breakthroughs in sequencing technologies led to an exponential growth of genomic data, providing unprecedented biological insights and new therapeutic applications. However, analyzing such large amounts of sensitive data raises key concerns regarding data privacy, specifically when the information is outsourced to third-party infrastructures for data storage and processing (e.g., cloud computing). Current solutions for data privacy protection resort to centralized designs or cryptographic primitives that impose considerable computational overheads, limiting their applicability to large-scale genomic analysis. We introduce Gyosa, a secure and privacy-preserving distributed genomic analysis solution. Unlike in previous work, Gyosa follows a distributed processing design that enables handling larger amounts of genomic data in a scalable and efficient fashion. Further, by leveraging trusted execution environments (TEEs), namely Intel SGX, Gyosa allows users to confidentially delegate their GWAS analysis to untrusted third-party infrastructures. To overcome the memory limitations of SGX, we implement a computation partitioning scheme within Gyosa. This scheme reduces the number of operations done inside the TEEs while safeguarding the users’ genomic data privacy. By integrating this security scheme in Glow, Gyosa provides a secure and distributed environment that facilitates diverse GWAS studies. The experimental evaluation validates the applicability and scalability of Gyosa, reinforcing its ability to provide enhanced security guarantees. Further, the results show that, by distributing GWAS computations, one can achieve a practical and usable privacy-preserving solution.


2024

To FID or not to FID: Applying GANs for MRI Image Generation in HPC

Authors
Cepa, B; Brito, C; Sousa, A;

Publication

Abstract
With the rapid growth of Deep Learning models and neural networks, the medical data available for training – which is already significantly less than other types of data – is becoming scarce. For that purpose, Generative Adversarial Networks (GANs) have received increased attention due to their ability to synthesize new realistic images. Our preliminary work shows promising results for brain MRI images; however, there is a need to distribute the workload, which can be supported by High-Performance Computing (HPC) environments. In this paper, we generate 256×256 MRI images of the brain in a distributed setting. We obtained an FID_RadImageNet of 10.67 for the DCGAN and 23.54 for the WGAN-GP, which are consistent with results reported in several works published in this scope. This allows us to conclude that distributing the GAN generation process is a viable option to overcome the computational constraints imposed by these models and, therefore, facilitate the generation of new data for training purposes.
