Publications by José Orlando Pereira

2018

Falcon: A Practical Log-based Analysis Tool for Distributed Systems

Authors
Neves, F; Machado, N; Pereira, J;

Publication
2018 48TH ANNUAL IEEE/IFIP INTERNATIONAL CONFERENCE ON DEPENDABLE SYSTEMS AND NETWORKS (DSN)

Abstract
Programmers and support engineers typically rely on log data to narrow down the root cause of unexpected behaviors in dependable distributed systems. Unfortunately, the inherently distributed nature and complexity of such distributed executions often lead to multiple independent logs, scattered across different physical machines, with thousands or millions of entries poorly correlated in terms of event causality. This renders log-based debugging a tedious, time-consuming, and potentially inconclusive task. We present Falcon, a tool aimed at making log-based analysis of distributed systems practical and effective. Falcon's modular architecture, designed as an extensible pipeline, allows it to seamlessly combine several distinct logging sources and generate a coherent space-time diagram of distributed executions. To preserve event causality, even in the presence of logs collected from independent unsynchronized machines, Falcon introduces a novel happens-before symbolic formulation and relies on an off-the-shelf constraint solver to obtain a coherent event schedule. Our case study with the popular distributed coordination service Apache ZooKeeper shows that Falcon eases the log-based analysis of complex distributed protocols and is helpful in bridging the gap between protocol design and implementation.

2019

Towards Intra-Datacentre High-Availability in CloudDBAppliance

Authors
Ferreira, L; Coelho, F; Alonso, AN; Pereira, J;

Publication
CLOSER: PROCEEDINGS OF THE 9TH INTERNATIONAL CONFERENCE ON CLOUD COMPUTING AND SERVICES SCIENCE

Abstract
In the context of the CloudDBAppliance (CDBA) project, fault tolerance and high availability are provided in layers: within each appliance, within a data centre, and between data centres. This paper presents the proposed replication architecture for providing fault tolerance and high availability within a data centre. This layer configuration, along with specific deployment constraints, requires a custom replication architecture. In particular, replication must be implemented at the middleware level, to avoid constraining the backing operational database. This paper focuses on the design of the CDBA Replication Manager, along with an evaluation, using micro-benchmarking, of components for the replication middleware. Results show the impact, on both throughput and latency, of the replication mechanisms in place.

2019

Recovery in CloudDBAppliance's High-availability Middleware

Authors
Abreu, H; Ferreira, L; Coelho, F; Alonso, AN; Pereira, J;

Publication
PROCEEDINGS OF THE 8TH INTERNATIONAL CONFERENCE ON DATA SCIENCE, TECHNOLOGY AND APPLICATIONS (DATA)

Abstract
In the context of the CloudDBAppliance (CDBA) project, fault tolerance and high availability are provided in layers: within each appliance, within a data centre, and between data centres. This paper presents the recovery mechanisms in place to provide high availability within a data centre. The recovery mechanism takes advantage of CDBA's in-middleware replication mechanism to bring failed replicas up to date. Along with the description of different variants of the recovery mechanism, this paper provides their comparative evaluation, focusing on the time it takes to recover a failed replica and how the recovery process impacts throughput.

2019

TRUSTFS: An SGX-enabled Stackable File System Framework

Authors
Esteves, T; Macedo, R; Faria, A; Portela, B; Paulo, J; Pereira, J; Harnik, D;

Publication
2019 38TH INTERNATIONAL SYMPOSIUM ON RELIABLE DISTRIBUTED SYSTEMS WORKSHOPS (SRDSW 2019)

Abstract
Data confidentiality in cloud services is commonly ensured by encrypting information before uploading it. However, this approach limits the use of content-aware functionalities, such as deduplication and compression. Although this issue has been addressed individually for some of these functionalities, no unified framework for building secure storage systems exists that can leverage such operations over encrypted data. We present TRUSTFS, a programmable and modular stackable file system framework for implementing secure content-aware storage functionalities over hardware-assisted trusted execution environments. This framework extends the original SAFEFS architecture to provide the isolated execution guarantees of Intel SGX. We demonstrate its usability by implementing an SGX-enabled stackable file system prototype, and a preliminary evaluation shows that it incurs reasonable performance overhead when compared to conventional storage systems. Finally, we highlight open research challenges that must be further pursued in order for TRUSTFS to be fully adequate for building production-ready secure storage solutions.

2019

A Case for Dynamically Programmable Storage Background Tasks

Authors
Macedo, R; Faria, A; Paulo, J; Pereira, J;

Publication
2019 38TH INTERNATIONAL SYMPOSIUM ON RELIABLE DISTRIBUTED SYSTEMS WORKSHOPS (SRDSW 2019)

Abstract
Modern storage infrastructures feature long and complicated I/O paths composed of several layers, each employing its own optimizations to serve varied applications with fluctuating requirements. However, as these layers do not have global infrastructure visibility, they are unable to optimally tune their behavior to achieve maximum performance. Background storage tasks, in particular, can rapidly overload shared resources, but are executed either periodically or whenever a certain threshold is hit, regardless of the overall load on the system. In this paper, we argue that to achieve optimal holistic performance, these tasks should be dynamically programmable and handled by a controller with global visibility. To support this argument, we evaluate the impact on performance of compaction and checkpointing in the context of HBase and PostgreSQL. We find that these tasks can increase 99th percentile latencies by 955.2% and 61.9%, respectively. We also identify future research directions to achieve programmable background tasks.

2019

Distributed Applications and Interoperable Systems - 19th IFIP WG 6.1 International Conference, DAIS 2019, Held as Part of the 14th International Federated Conference on Distributed Computing Techniques, DisCoTec 2019, Kongens Lyngby, Denmark, June 17-21, 2019, Proceedings

Authors
Pereira, J; Ricci, L;

Publication
DAIS
