Publications - INESC TEC

Publications

Publications by Paulo Sérgio Almeida

2017

Fault-tolerant aggregation: Flow-Updating meets Mass-Distribution

Authors
Almeida, PS; Baquero, C; Farach-Colton, M; Jesus, P; Mosteiro, MA;

Publication
DISTRIBUTED COMPUTING

Abstract
Flow-Updating (FU) is a fault-tolerant technique that has proved to be efficient in practice for the distributed computation of aggregate functions in communication networks where individual processors do not have access to global information. Previous distributed aggregation protocols, based on repeated sharing of input values (or mass) among processors, sometimes called Mass-Distribution (MD) protocols, are not resilient to communication failures (or message loss) because such failures yield a loss of mass. In this paper, we present a protocol which we call Mass-Distribution with Flow-Updating (MDFU). We obtain MDFU by applying FU techniques to classic MD. We analyze the convergence time of MDFU, showing that stochastic message loss produces low overhead. This is the first convergence proof of an FU-based algorithm. We evaluate MDFU experimentally, comparing it with previous MD and FU protocols, and verifying the behavior predicted by the analysis. Finally, given that MDFU incurs a fixed deviation proportional to the message-loss rate, we adjust the accuracy of MDFU heuristically in a new protocol called MDFU with Linear Prediction (MDFU-LP). The evaluation shows that both MDFU and MDFU-LP behave very well in practice, even under high rates of message loss and even when the input values change dynamically.
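
The mass-loss problem described in the abstract can be illustrated with a minimal sketch. This is not MDFU or any protocol from the paper: the functions `md_round` and `run`, the "keep half, send half to a random peer" rule, and the loss model are all assumptions chosen only to show that a dropped message removes mass from a plain mass-distribution scheme.

```python
import random

def md_round(values, loss_rate, rng):
    """One synchronous round of a toy mass-distribution scheme (hypothetical).

    Every node keeps half of its mass and sends the other half to a
    randomly chosen other node. A lost message simply discards that half.
    """
    n = len(values)
    nxt = [v / 2 for v in values]          # the half each node keeps
    for i, v in enumerate(values):
        j = rng.choice([k for k in range(n) if k != i])
        if rng.random() >= loss_rate:      # message delivered
            nxt[j] += v / 2
        # else: the message is dropped and its mass is gone for good
    return nxt

def run(values, rounds, loss_rate, seed=0):
    """Run several rounds with a seeded generator for reproducibility."""
    rng = random.Random(seed)
    for _ in range(rounds):
        values = md_round(values, loss_rate, rng)
    return values

initial = [10.0, 20.0, 30.0, 40.0]
clean = run(initial, rounds=50, loss_rate=0.0)
lossy = run(initial, rounds=50, loss_rate=0.5)
print(sum(initial), sum(clean), sum(lossy))
```

Without loss the total stays at the initial sum (up to floating-point rounding), while with loss the total shrinks, so any aggregate derived from it drifts away from the true value.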

2017

Composition in State-based Replicated Data Types

Authors
Baquero, C; Almeida, PS; Cunha, A; Ferreira, C;

Publication
BULLETIN OF THE EUROPEAN ASSOCIATION FOR THEORETICAL COMPUTER SCIENCE

Abstract
Keeping replicated data strongly consistent is convenient when communication is fast and available. In internet-scale distributed systems, the reality of high communication latencies and the likelihood of partitions lead developers to adopt more relaxed consistency models, such as eventual consistency. Conflict-free Replicated Data Types bring structure to the design of eventually consistent data management solutions by precisely describing the behaviour under concurrent updates and guaranteeing a path to reconciliation. This paper offers a survey of the mathematical structures that support state-based multi-master replication with reconciliation, and shows how state structures and state transformations can be composed to provide data types that are now used in practice in many geo-replicated systems.

2018

Delta State replicated data types

Authors
Almeida, PS; Shoker, A; Baquero, C;

Publication
JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING

Abstract
Conflict-free Replicated Data Types (CRDTs) are distributed data types that make eventual consistency of a distributed object possible and non ad-hoc. Specifically, state-based CRDTs ensure convergence by disseminating the entire state, which may be large, and merging it into other replicas. We introduce Delta State Conflict-Free Replicated Data Types (delta-CRDT) that can achieve the best of both operation-based and state-based CRDTs: small messages with an incremental nature, as in operation-based CRDTs, disseminated over unreliable communication channels, as in traditional state-based CRDTs. This is achieved by defining delta-mutators to return a delta-state, typically with a much smaller size than the full state, that is to be joined with both local and remote states. We introduce the delta-CRDT framework, and we explain it by establishing a correspondence to current state-based CRDTs. In addition, we present an anti-entropy algorithm for eventual convergence, and another one that ensures causal consistency. Finally, we introduce several delta-CRDT specifications of both well-known replicated datatypes and novel datatypes, including a generic map composition.
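
The delta-mutator idea in the abstract can be sketched on a grow-only counter (G-Counter). This is a minimal illustration under stated assumptions, not the paper's specification: the dictionary representation and the names `join`, `inc_delta`, and `value` are hypothetical, and no anti-entropy algorithm is shown.

```python
def join(a, b):
    """Join (least upper bound) of two G-Counter states: pointwise maximum."""
    return {k: max(a.get(k, 0), b.get(k, 0)) for k in a.keys() | b.keys()}

def inc_delta(state, replica_id):
    """Delta-mutator: return only the entry that changes, not the full state."""
    return {replica_id: state.get(replica_id, 0) + 1}

def value(state):
    """The counter value is the sum of all per-replica entries."""
    return sum(state.values())

# Two replicas start from the same state.
a = {"a": 3, "b": 5}
b = {"a": 3, "b": 5}

delta = inc_delta(a, "a")   # small delta-state produced at replica "a"
a = join(a, delta)          # joined with the local state
b = join(b, delta)          # shipped and joined with a remote state
b = join(b, delta)          # a duplicated delivery changes nothing
print(delta, a, b, value(a))
```

Because the join is idempotent, commutative, and associative, the delta can be duplicated or reordered on an unreliable channel without harming convergence, while the message carries one entry instead of the whole map.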

2019

Scalable eventually consistent counters over unreliable networks

Authors
Almeida, PS; Baquero, C;

Publication
DISTRIBUTED COMPUTING

Abstract
Counters are an important abstraction in distributed computing, and play a central role in large-scale geo-replicated systems, counting events such as web page impressions or social network likes. Classic distributed counters, which are strongly consistent via linearisability or sequential consistency, cannot be made both available and partition-tolerant because of the CAP theorem, which makes them unsuitable for large-scale scenarios. This paper defines Eventually Consistent Distributed Counters (ECDCs) and presents an implementation of the concept, Handoff Counters, that is scalable and works over unreliable networks. By giving up the total operation ordering of classic distributed counters, ECDC implementations can be made AP in the CAP design space, while retaining the essence of counting. Handoff Counters are the first Conflict-free Replicated Data Type (CRDT) based mechanism that overcomes the identity explosion problem in naive CRDTs, such as G-Counters (where state size is linear in the number of independent actors that ever incremented the counter), by managing identities so as to avoid global propagation and by garbage collecting temporary entries. The approach used in Handoff Counters is not restricted to counters, being more generally applicable to other data types with associative and commutative operations.

2019

Higher-Order Patterns in Replicated Data Types

Authors
Leijnse, A; Almeida, PS; Baquero, C;

Publication
Proceedings of the 6th Workshop on Principles and Practice of Consistency for Distributed Data, PaPoC@EuroSys 2019, Dresden, Germany, March 25-28, 2019

Abstract
The design of Conflict-free Replicated Data Types traditionally requires implementing new designs from scratch to meet a desired behavior. Although there are composition rules that can guide the process, there has not been much work explaining how existing data types relate to each other, nor work that factors out common patterns. To bring clarity to the field, we explain underlying patterns that are common to flags, sets, and registers. The identified patterns are succinct and composable, which gives them the power both to explain current designs and to open up the space for new ones. © 2019 Copyright held by the owner/author(s). Publication rights licensed to ACM.

2019

Efficient Synchronization of State-based CRDTs

Authors
Enes, V; Almeida, PS; Baquero, C; Leitão, J;

Publication
2019 IEEE 35TH INTERNATIONAL CONFERENCE ON DATA ENGINEERING (ICDE 2019)

Abstract
To ensure high availability in large-scale distributed systems, Conflict-free Replicated Data Types (CRDTs) relax consistency by allowing immediate query and update operations at the local replica, with no need for remote synchronization. State-based CRDTs synchronize replicas by periodically sending their full state to other replicas, which can become extremely costly as the CRDT state grows. Delta-based CRDTs address this problem by producing small incremental states (deltas) to be used in synchronization instead of the full state. However, current synchronization algorithms for delta-based CRDTs induce redundant, wasteful delta propagation, performing worse than expected and, surprisingly, no better than state-based synchronization. In this paper we: 1) identify two sources of inefficiency in current synchronization algorithms for delta-based CRDTs; 2) bring the concept of join decomposition to state-based CRDTs; 3) exploit join decompositions to obtain optimal deltas; 4) improve the efficiency of synchronization algorithms; and finally, 5) experimentally evaluate the improved algorithms.
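
The role of join decompositions in items 2) and 3) can be sketched on the simplest lattice, a grow-only set. This is an illustration under assumptions rather than the paper's algorithm: the names `decompose` and `min_delta` are hypothetical, and the set lattice (join is union, order is inclusion) is chosen only because its irreducible parts are easy to see.

```python
def decompose(state):
    """Join decomposition of a grow-only set: its singleton subsets.

    Joining (uniting) all parts gives back the original state, and no
    part can itself be split further.
    """
    return [frozenset([e]) for e in state]

def min_delta(a, b):
    """Join of the irreducible parts of `a` that `b` does not already include.

    Sending this to a replica holding `b` is enough for it to catch up
    with `a`, and nothing in it is redundant for `b`.
    """
    delta = frozenset()
    for part in decompose(a):
        if not part <= b:      # `b` has not seen this part yet
            delta |= part
    return delta

a = frozenset({"x", "y", "z"})   # local state
b = frozenset({"y", "w"})        # what the remote replica is known to hold
delta = min_delta(a, b)
print(sorted(delta), sorted(b | delta))
```

For sets this reduces to the set difference of `a` and `b`, but the same recipe (decompose, filter against the other state, join the survivors) carries over to other lattices once their irreducible parts are identified.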
