About

I am currently an assistant professor at the University of Minho and an assistant researcher at INESC TEC. I obtained my PhD in Computer Science through the MAP-i Doctoral Programme. My work focuses on large-scale distributed systems, namely the scalability, performance, security, and dependability of storage and database systems. I am also interested in applying this research to complex data management challenges in Cloud Computing and Advanced Computing.

I coordinate the PT-UTAustin exploratory project PAStor and the CENTRA project "Efficient and Secure Data Management for HPC and Cloud Computing", and I lead INESC TEC's activities in the Compete2020 project BigHPC and the PT-UTAustin exploratory project ACTPM. I have also published in renowned international journals and conferences (for example, ACM Computing Surveys, IEEE Transactions on Computers, ACM Transactions on Storage, EuroSys, SRDS, SYSTOR).

For more information, see my personal web page at https://jtpaulo.github.io, as well as HASLab's research lines on the topics mentioned above: https://dsr-haslab.github.io and https://dbr-haslab.github.io

Topics of interest
Details

  • Name

    João Tiago Paulo
  • Position

    Senior Researcher
  • Since

    1 November 2011

Publications

2024

When Amnesia Strikes: Understanding and Reproducing Data Loss Bugs with Fault Injection

Authors
Ramos, M; Azevedo, J; Kingsbury, K; Pereira, J; Esteves, T; Macedo, R; Paulo, J;

Publication
Proc. VLDB Endow.

Abstract
We present LazyFS, a new fault injection tool that simplifies the debugging and reproduction of complex data durability bugs experienced by databases, key-value stores, and other data-centric systems in crashes. Our tool simulates persistence properties of POSIX file systems (e.g., operations ordering and atomicity) and enables users to inject lost and torn write faults with a precise and controlled approach. Further, it provides profiling information about the system’s operations flow and persisted data, enabling users to better understand the root cause of errors. We use LazyFS to study seven important systems: PostgreSQL, etcd, Zookeeper, Redis, LevelDB, PebblesDB, and Lightning Network. Our fault injection campaign shows that LazyFS automates and facilitates the reproduction of five known bug reports containing manual and complex reproducibility steps. Further, it aids in understanding and reproducing seven ambiguous bugs reported by users. Finally, LazyFS is used to find eight new bugs, which lead to data loss, corruption, and unavailability.

2023

Distributed Applications and Interoperable Systems - 23rd IFIP WG 6.1 International Conference, DAIS 2023, Held as Part of the 18th International Federated Conference on Distributed Computing Techniques, DisCoTec 2023, Lisbon, Portugal, June 19-23, 2023, Proceedings

Authors
Martínez, MP; Paulo, J;

Publication
DAIS

2023

Soteria: Preserving Privacy in Distributed Machine Learning

Authors
Brito, C; Ferreira, P; Portela, B; Oliveira, R; Paulo, J;

Publication
38th Annual ACM Symposium on Applied Computing, SAC 2023

Abstract
We propose Soteria, a system for distributed privacy-preserving Machine Learning (ML) that leverages Trusted Execution Environments (e.g. Intel SGX) to run code in isolated containers (enclaves). Unlike previous work, where all ML-related computation is performed at trusted enclaves, we introduce a hybrid scheme, combining computation done inside and outside these enclaves. The conducted experimental evaluation validates that our approach reduces the runtime of ML algorithms by up to 41%, when compared to previous related work. Our protocol is accompanied by a security proof, as well as a discussion regarding resilience against a wide spectrum of ML attacks.

2023

Diagnosing applications' I/O behavior through system call observability

Authors
Esteves, T; Macedo, R; Oliveira, R; Paulo, J;

Publication
CoRR

2023

Taming Metadata-intensive HPC Jobs Through Dynamic, Application-agnostic QoS Control

Authors
Macedo, R; Miranda, M; Tanimura, Y; Haga, J; Ruhela, A; Harrell, SL; Evans, RT; Pereira, J; Paulo, J;

Publication
2023 IEEE/ACM 23rd International Symposium on Cluster, Cloud and Internet Computing, CCGRID

Abstract
Modern I/O applications that run on HPC infrastructures are increasingly becoming read and metadata intensive. However, having multiple applications submitting large amounts of metadata operations can easily saturate the shared parallel file system's metadata resources, leading to overall performance degradation and I/O unfairness. We present PADLL, an application and file system agnostic storage middleware that enables QoS control of data and metadata workflows in HPC storage systems. It adopts ideas from Software-Defined Storage, building data plane stages that mediate and rate limit POSIX requests submitted to the shared file system, and a control plane that holistically coordinates how all I/O workflows are handled. We demonstrate its performance and feasibility under multiple QoS policies using synthetic benchmarks, real-world applications, and traces collected from a production file system. Results show that PADLL can enforce complex storage QoS policies over concurrent metadata-aggressive jobs, ensuring fairness and prioritization.

Supervised theses

2023

Otimizações de Armazenamento Distribuído para Aprendizagem Profunda

Author
Maria Beatriz Moreira

Institution
INESC TEC

2023

Towards a Privacy-Preserving Distributed Machine Learning Framework

Author
Cláudia Vanessa Martins de Brito

Institution
INESC TEC

2023

Injeção de Faltas Reprodutível em Sistemas de Armazenamento Local

Author
Maria Ramos

Institution
INESC TEC

2023

End-to-End Software-Defined Security for Big Data Ecosystem

Author
Tânia da Conceição Araújo Esteves

Institution
INESC TEC

2023

Distributed and Dependable SDS Control Plane for HPC

Author
Mariana Martins de Sá Miranda

Institution
INESC TEC