Publicacoes - INESC TEC

Publicações

Publicações por Fábio André Coelho

2014

pH1: A Transactional Middleware for NoSQL

Autores
Coelho, F; Cruz, F; Vilaca, R; Pereira, J; Oliveira, R;

Publicação
2014 IEEE 33RD INTERNATIONAL SYMPOSIUM ON RELIABLE DISTRIBUTED SYSTEMS (SRDS)

Abstract
NoSQL databases opt not to offer important abstractions traditionally found in relational databases in order to achieve high levels of scalability and availability: transactional guarantees and strong data consistency. In this work we propose pH1, a generic middleware layer over NoSQL databases that offers transactional guarantees with Snapshot Isolation. This is achieved in a non-intrusive manner, requiring no modifications to servers and no native support for multiple versions. Instead, the transactional context is achieved by means of a multiversion distributed cache and an external transaction certifier, exposed by extending the client's interface with transaction bracketing primitives. We validate and evaluate pH1 with Apache Cassandra and Hyperdex. First, using the YCSB benchmark, we show that the cost of providing ACID guarantees to these NoSQL databases amounts to 11% decrease in throughput. Moreover, using the transaction intensive TPC-C workload, pH1 presented an impact of 22% decrease in throughput. This contrasts with OMID, a previous proposal that takes advantage of HBase's support for multiple versions, with a throughput penalty of 76% in the same conditions

FecharLer Abstract

2016

Towards Quantifiable Eventual Consistency

Autores
Maia, F; Matos, M; Coelho, F;

Publicação
PROCEEDINGS OF THE 6TH INTERNATIONAL CONFERENCE ON CLOUD COMPUTING AND SERVICES SCIENCE, VOL 1 (CLOSER)

Abstract
In the pursuit of highly available systems, storage systems began offering eventually consistent data models. These models are suitable for a number of applications but not applicable for all. In this paper we discuss a system that can offer a eventually consistent data model but can also, when needed, offer a strong consistent one.

FecharLer Abstract

2017

HTAPBench: Hybrid Transactional and Analytical Processing Benchmark

Autores
Coelho, F; Paulo, J; Vilaça, R; Pereira, JO; Oliveira, R;

Publicação
Proceedings of the 8th ACM/SPEC on International Conference on Performance Engineering, ICPE 2017, L'Aquila, Italy, April 22-26, 2017

Abstract
The increasing demand for real-time analytics requires the fusion of Transactional (OLTP) and Analytical (OLAP) systems, eschewing ETL processes and introducing a plethora of proposals for the so-called Hybrid Analytical and Trans-actional Processing (HTAP) systems. Unfortunately, current benchmarking approaches are not able to comprehensively produce a unified metric from the assessment of an HTAP system. The evaluation of both engine types is done separately, leading to the use of disjoint sets of benchmarks such as TPC-C or TPC-H. In this paper we propose a new benchmark, HTAPBench, providing a unified metric for HTAP systems geared toward the execution of constantly increasing OLAP requests limited by an admissible impact on OLTP performance. To achieve this, a load balancer within HTAPBench regulates the coexistence of OLTP and OLAP workloads, proposing a method for the generation of both new data and requests, so that OLAP requests over freshly modified data are comparable across runs. We demonstrate the merit of our approach by validating it with different types of systems: OLTP, OLAP and HTAP; showing that the benchmark is able to highlight the differences between them, while producing queries with comparable complexity across experiments with negligible variability. © 2017 ACM.

FecharLer Abstract

2019

Towards Intra-Datacentre High-Availability in CloudDBAppliance

Autores
Ferreira, L; Coelho, F; Alonso, AN; Pereira, J;

Publicação
CLOSER: PROCEEDINGS OF THE 9TH INTERNATIONAL CONFERENCE ON CLOUD COMPUTING AND SERVICES SCIENCE

Abstract
In the context of the CloudDBAppliance (CDBA) project, fault tolerance and high-availability are provided in layers: within each appliance, within a data centre and between data centres. This paper presents the proposed replication architecture for providing fault tolerance and high availability within a data centre. This layer configuration, along with specific deployment constraints require a custom replication architecture. In particular, replication must be implemented at the middleware-level, to avoid constraining the backing operational database. This paper is focused on the design of the CDBA Replication Manager along with an evaluation, using micro-benchmarking, of components for the replication middleware. Results show the impact, on both throughput and latency, of the replication mechanisms in place.

FecharLer Abstract

2019

Recovery in CloudDBAppliance's High-availability Middleware

Autores
Abreu, H; Ferreira, L; Coelho, F; Alonso, AN; Pereira, J;

Publicação
PROCEEDINGS OF THE 8TH INTERNATIONAL CONFERENCE ON DATA SCIENCE, TECHNOLOGY AND APPLICATIONS (DATA)

Abstract
In the context of the CloudDBAppliance (CDBA) project, fault tolerance and high-availability are provided in layers: within each appliance, within a data centre and between datacentres. This paper presents the recovery mechanisms in place to fulfill the provision of high-availability within a datacentre. The recovery mechanism takes advantage of CDBA's in-middleware replication mechanism to bring failed replicas up-to-date. Along with the description of different variants of the recovery mechanism, this paper provides their comparative evaluation, focusing on the time it takes to recover a failed replica and how the recovery process impacts throughput.

FecharLer Abstract

2019

Minha: Large-Scale Distributed Systems Testing Made Practical

Autores
Machado, N; Maia, F; Neves, F; Coelho, F; Pereira, J;

Publicação
23rd International Conference on Principles of Distributed Systems, OPODIS 2019, December 17-19, 2019, Neuchâtel, Switzerland.

Abstract
Testing large-scale distributed system software is still far from practical as the sheer scale needed and the inherent non-determinism make it very expensive to deploy and use realistically large environments, even with cloud computing and state-of-the-art automation. Moreover, observing global states without disturbing the system under test is itself difficult. This is particularly troubling as the gap between distributed algorithms and their implementations can easily introduce subtle bugs that are disclosed only with suitably large scale tests. We address this challenge with Minha, a framework that virtualizes multiple JVM instances in a single JVM, thus simulating a distributed environment where each host runs on a separate machine, accessing dedicated network and CPU resources. The key contributions are the ability to run off-the-shelf concurrent and distributed JVM bytecode programs while at the same time scaling up to thousands of virtual nodes; and enabling global observation within standard software testing frameworks. Our experiments with two distributed systems show the usefulness of Minha in disclosing errors, evaluating global properties, and in scaling tests orders of magnitude with the same hardware resources. © Nuno Machado, Francisco Maia, Francisco Neves, Fábio Coelho, and José Pereira; licensed under Creative Commons License CC-BY 23rd International Conference on Principles of Distributed Systems (OPODIS 2019).

FecharLer Abstract