Pedro Gabriel Ferreira


Pedro G. Ferreira graduated in Systems and Informatics Engineering (2002) and completed a PhD in Artificial Intelligence at the University of Minho (2007). He was a Postdoctoral Fellow at the Center for Genomic Regulation, Barcelona (2008-2012) and at the University of Geneva (2012-2014). He has participated in several major international consortia, including ICGC-CLL, ENCODE, GEUVADIS and GTEx. He is currently an Assistant Professor at the Department of Computer Science, Faculty of Sciences of the University of Porto, and a researcher at INESCTEC-LIADD and i3s/Ipatimup. His main research focus is genomic data science. In particular, he is interested in unraveling the role of genomics in human health and disease. He has been involved in several bioinformatics start-ups.

Details

Name: Pedro Gabriel Ferreira
Role: Senior Researcher
Since: 20th September 2018
Nationality: Portugal
Centre: Artificial Intelligence and Decision Support
Contacts: +351220402963, pedro.g.ferreira@inesctec.pt

Publications

2024

A Distributed Computing Solution for Privacy-Preserving Genome-Wide Association Studies

Authors
Brito, C; Ferreira, P; Paulo, J;

Publication

Abstract
Breakthroughs in sequencing technologies led to an exponential growth of genomic data, providing unprecedented biological insights and new therapeutic applications. However, analyzing such large amounts of sensitive data raises key concerns regarding data privacy, specifically when the information is outsourced to third-party infrastructures for data storage and processing (e.g., cloud computing). Current solutions for data privacy protection resort to centralized designs or cryptographic primitives that impose considerable computational overheads, limiting their applicability to large-scale genomic analysis. We introduce Gyosa, a secure and privacy-preserving distributed genomic analysis solution. Unlike in previous work, Gyosa follows a distributed processing design that enables handling larger amounts of genomic data in a scalable and efficient fashion. Further, by leveraging trusted execution environments (TEEs), namely Intel SGX, Gyosa allows users to confidentially delegate their GWAS analysis to untrusted third-party infrastructures. To overcome the memory limitations of SGX, we implement a computation partitioning scheme within Gyosa. This scheme reduces the number of operations done inside the TEEs while safeguarding the users’ genomic data privacy. By integrating this security scheme in Glow, Gyosa provides a secure and distributed environment that facilitates diverse GWAS studies. The experimental evaluation validates the applicability and scalability of Gyosa, reinforcing its ability to provide enhanced security guarantees. Further, the results show that, by distributing GWASes computations, one can achieve a practical and usable privacy-preserving solution.

2024

Integration of multi-modal datasets to estimate human aging

Authors
Ribeiro, R; Moraes, A; Moreno, M; Ferreira, PG;

Publication
MACHINE LEARNING

Abstract
Aging involves complex biological processes leading to the decline of living organisms. As population lifespan increases worldwide, the importance of identifying factors underlying healthy aging has become critical. Integration of multi-modal datasets is a powerful approach for the analysis of complex biological systems, with the potential to uncover novel aging biomarkers. In this study, we leveraged publicly available epigenomic, transcriptomic and telomere length data along with histological images from the Genotype-Tissue Expression project to build tissue-specific regression models for age prediction. Using data from two tissues, lung and ovary, we aimed to compare model performance across data modalities, as well as to assess the improvement resulting from integrating multiple data types. Our results demonstrate that methylation outperformed the other data modalities, with a mean absolute error of 3.36 and 4.36 in the test sets for lung and ovary, respectively. These models achieved lower error rates when compared with established state-of-the-art tissue-agnostic methylation models, emphasizing the importance of a tissue-specific approach. Additionally, this work has shown how the application of Hierarchical Image Pyramid Transformers for feature extraction significantly enhances age modeling using histological images. Finally, we evaluated the benefits of integrating multiple data modalities into a single model. Combining methylation data with other data modalities only marginally improved performance, likely due to the limited number of available samples. Combining gene expression with histological features yielded more accurate age predictions compared with the individual performance of these data types. Given these results, this study shows how machine learning applications can be extended to multi-modal aging research. Code used is available at https://github.com/zroger49/multi_modal_age_prediction.

2024

The molecular impact of cigarette smoking resembles aging across tissues

Authors
Ramirez, JM; Ribeiro, R; Soldatkina, O; Moraes, A; García-Pérez, R; Ferreira, PG; Melé, M;

Publication

Abstract
Tobacco smoke is the main cause of preventable mortality worldwide. Smoking increases the risk of developing many diseases and has been proposed as an aging accelerator. Yet, the molecular mechanisms driving smoking-related health decline and aging acceleration in most tissues remain unexplored. Here, we characterize gene expression, alternative splicing, DNA methylation and histological alterations induced by cigarette smoking across human tissues. We show that smoking impacts tissue architecture and triggers systemic inflammation. We find that in many tissues, the effects of smoking significantly overlap those of aging in the same direction. Specifically, both age and smoking upregulate inflammatory genes and drive hypomethylation at enhancers. In addition, we observe widespread smoking-driven hypermethylation at target regions of the Polycomb repressive complex, which is a well-known aging effect. Smoking-induced epigenetic changes overlap causal aging CpGs, suggesting that these methylation changes may directly mediate aging acceleration observed in smokers. Finally, we find that smoking effects that are shared with aging are more persistent over time. Overall, our multi-tissue and multi-omic analysis of the effects of cigarette smoking provides an extensive characterization of the impact of tobacco smoke across tissues and unravels the molecular mechanisms driving smoking-induced tissue homeostasis decline and aging acceleration.

2024

APAtizer: a tool for alternative polyadenylation analysis of RNA-Seq data

Authors
Sousa, B; Bessa, M; de Mendonca, FL; Ferreira, PG; Moreira, A; Pereira-Castro, I;

Publication
BIOINFORMATICS

Abstract
APAtizer is a tool designed to analyze alternative polyadenylation events on RNA-sequencing data. The tool handles different file formats, including BAM, htseq, and DaPars bedGraph files. It provides a user-friendly interface that allows users to generate informative visualizations, including Volcano plots, heatmaps, and gene lists. These outputs allow the user to retrieve useful biological insights such as the occurrence of polyadenylation events when comparing two biological conditions. In addition, it can perform differential gene expression, gene ontology analysis, visualization of Venn diagram intersections, and correlation analysis.

2023

A systematic evaluation of deep learning methods for the prediction of drug synergy in cancer

Authors
Baptista, D; Ferreira, PG; Rocha, M;

Publication
PLOS COMPUTATIONAL BIOLOGY

Abstract
Author summary: Cancer therapies often fail because tumor cells become resistant to treatment. One way to overcome resistance is by treating patients with a combination of two or more drugs. Some combinations may be more effective than when considering individual drug effects, a phenomenon called drug synergy. Computational drug synergy prediction methods can help to identify new, clinically relevant drug combinations. In this study, we developed several deep learning models for drug synergy prediction. We examined the effect of using different types of deep learning architectures, and different ways of representing drugs and cancer cell lines. We explored the use of biological prior knowledge to select relevant cell line features, and also tested data-driven feature reduction methods. We tested both precomputed drug features and deep learning methods that can directly learn features from raw representations of molecules. We also evaluated whether including genomic features, in addition to gene expression data, improves the predictive performance of the models. Through these experiments, we were able to identify strategies that will help guide the development of new deep learning models for drug synergy prediction in the future.

One of the main obstacles to the successful treatment of cancer is the phenomenon of drug resistance. A common strategy to overcome resistance is the use of combination therapies. However, the space of possibilities is huge and efficient search strategies are required. Machine Learning (ML) can be a useful tool for the discovery of novel, clinically relevant anti-cancer drug combinations. In particular, deep learning (DL) has become a popular choice for modeling drug combination effects. Here, we set out to examine the impact of different methodological choices on the performance of multimodal DL-based drug synergy prediction methods, including the use of different input data types, preprocessing steps and model architectures. Focusing on the NCI ALMANAC dataset, we found that feature selection based on prior biological knowledge has a positive impact: limiting gene expression data to cancer or drug response-specific genes improved performance. Drug features appeared to be more predictive of drug response, with a 41% increase in coefficient of determination (R²) and 26% increase in Spearman correlation relative to a baseline model that used only cell line and drug identifiers. Molecular fingerprint-based drug representations performed slightly better than learned representations: ECFP4 fingerprints increased R² by 5.3% and Spearman correlation by 2.8% relative to the best learned representations. In general, fully connected feature-encoding subnetworks outperformed other architectures. DL outperformed other ML methods by more than 35% (R²) and 14% (Spearman). Additionally, an ensemble combining the top DL and ML models improved performance by about 6.5% (R²) and 4% (Spearman). Using a state-of-the-art interpretability method, we showed that DL models can learn to associate drug and cell line features with drug response in a biologically meaningful way. The strategies explored in this study will help to improve the development of computational methods for the rational design of effective drug combinations for cancer therapy.

Supervised Thesis

2023

New antidotes for Bothrops asper venom: a study of PLA2 protein

Author
Roberto Miguel Pais Pinto

Institution
UP-FCUP

2023

Omics-based prediction of human phenotypes using scalable machine learning approaches

Author
Marta Carolina Cabral Moreno

Institution
UP-FCUP

2023

BioPredictor: a tool to predict the outcome of molecular alterations

Author
Marta Patrícia Ribeiro Ferreira

Institution
UP-FCUP

2023

Integration of multi-modal genomics datasets with expert data: a patient centered approach to improve diagnosis and prognosis

Author
Rogério Eduardo Ramos Ribeiro

Institution
UP-FCUP

2023

Unravelling the Complexity of Human Disease: Transcriptomic Networks of Phenotype - Gene Expression Data

Author
Darmit Manish Kumar

Institution
UP-FCUP
