About

I am currently an assistant professor at University of Minho and a senior researcher at INESC TEC. I obtained a PhD degree in Computer Science from the MAP-i Doctoral Program in Computer Science. My current work focuses on large-scale distributed systems, with an emphasis on the scalability, performance, security and dependability of storage and database systems. I am also interested in applying this research to complex data management challenges in Cloud Computing and HPC centres.

I am the coordinator of the PAStor PT-UTAustin exploratory project and the “Efficient and Secure Data Management for HPC and Cloud Computing” CENTRA project, and I lead INESC TEC’s activities on the Compete2020 BigHPC project and the ACTPM PT-UTAustin exploratory project. I also have several publications in renowned journals and international conferences (e.g., ACM Computing Surveys, IEEE Transactions on Computers, ACM Transactions on Storage, EuroSys, SRDS, SYSTOR).

For more information, see my personal web page at https://jtpaulo.github.io, as well as HASLab's research lines on the topics mentioned above: https://dsr-haslab.github.io and https://dbr-haslab.github.io

Details

  • Name

    João Tiago Paulo
  • Role

    Senior Researcher
  • Since

    1st November 2011
Publications

2024

When Amnesia Strikes: Understanding and Reproducing Data Loss Bugs with Fault Injection

Authors
Ramos, M; Azevedo, J; Kingsbury, K; Pereira, J; Esteves, T; Macedo, R; Paulo, J;

Publication
Proc. VLDB Endow.

Abstract
We present LazyFS, a new fault injection tool that simplifies the debugging and reproduction of complex data durability bugs experienced by databases, key-value stores, and other data-centric systems in crashes. Our tool simulates persistence properties of POSIX file systems (e.g., operations ordering and atomicity) and enables users to inject lost and torn write faults with a precise and controlled approach. Further, it provides profiling information about the system’s operations flow and persisted data, enabling users to better understand the root cause of errors. We use LazyFS to study seven important systems: PostgreSQL, etcd, Zookeeper, Redis, LevelDB, PebblesDB, and Lightning Network. Our fault injection campaign shows that LazyFS automates and facilitates the reproduction of five known bug reports containing manual and complex reproducibility steps. Further, it aids in understanding and reproducing seven ambiguous bugs reported by users. Finally, LazyFS is used to find eight new bugs, which lead to data loss, corruption, and unavailability.

2023

Distributed Applications and Interoperable Systems - 23rd IFIP WG 6.1 International Conference, DAIS 2023, Held as Part of the 18th International Federated Conference on Distributed Computing Techniques, DisCoTec 2023, Lisbon, Portugal, June 19-23, 2023, Proceedings

Authors
Martínez, MP; Paulo, J;

Publication
DAIS

Abstract

2023

Soteria: Preserving Privacy in Distributed Machine Learning

Authors
Brito, C; Ferreira, P; Portela, B; Oliveira, R; Paulo, J;

Publication
38th Annual ACM Symposium on Applied Computing, SAC 2023

Abstract
We propose Soteria, a system for distributed privacy-preserving Machine Learning (ML) that leverages Trusted Execution Environments (e.g. Intel SGX) to run code in isolated containers (enclaves). Unlike previous work, where all ML-related computation is performed at trusted enclaves, we introduce a hybrid scheme, combining computation done inside and outside these enclaves. The conducted experimental evaluation validates that our approach reduces the runtime of ML algorithms by up to 41%, when compared to previous related work. Our protocol is accompanied by a security proof, as well as a discussion regarding resilience against a wide spectrum of ML attacks.

2023

Diagnosing applications' I/O behavior through system call observability

Authors
Esteves, T; Macedo, R; Oliveira, R; Paulo, J;

Publication
CoRR

Abstract

2023

Taming Metadata-intensive HPC Jobs Through Dynamic, Application-agnostic QoS Control

Authors
Macedo, R; Miranda, M; Tanimura, Y; Haga, J; Ruhela, A; Harrell, SL; Evans, RT; Pereira, J; Paulo, J;

Publication
2023 IEEE/ACM 23rd International Symposium on Cluster, Cloud and Internet Computing, CCGrid

Abstract
Modern I/O applications that run on HPC infrastructures are increasingly becoming read and metadata intensive. However, having multiple applications submitting large amounts of metadata operations can easily saturate the shared parallel file system's metadata resources, leading to overall performance degradation and I/O unfairness. We present PADLL, an application and file system agnostic storage middleware that enables QoS control of data and metadata workflows in HPC storage systems. It adopts ideas from Software-Defined Storage, building data plane stages that mediate and rate limit POSIX requests submitted to the shared file system, and a control plane that holistically coordinates how all I/O workflows are handled. We demonstrate its performance and feasibility under multiple QoS policies using synthetic benchmarks, real-world applications, and traces collected from a production file system. Results show that PADLL can enforce complex storage QoS policies over concurrent metadata-aggressive jobs, ensuring fairness and prioritization.

Supervised theses

2023

MulletBench: Multi-layer Edge Time Series Database Benchmark

Author
Pedro Pereira

Institution
INESC TEC

2023

User-level software-defined storage data planes

Author
Ricardo Gonçalves Macedo

Institution
INESC TEC

2023

Otimizações de Armazenamento Distribuído para Aprendizagem Profunda

Author
Maria Beatriz Moreira

Institution
INESC TEC

2023

Towards a Privacy-Preserving Distributed Machine Learning Framework

Author
Cláudia Vanessa Martins de Brito

Institution
INESC TEC

2023

Injeção de Faltas Reprodutível em Sistemas de Armazenamento Local

Author
Maria Ramos

Institution
INESC TEC