Publicacoes - INESC TEC

Publicações

Publicações por Carlos Baquero

2014

Scalable and Accurate Causality Tracking for Eventually Consistent Stores

Autores
Sergio Almeida, PS; Baquero, C; Goncalves, R; Preguica, N; Fonte, V;

Publicação
DISTRIBUTED APPLICATIONS AND INTEROPERABLE SYSTEMS (DAIS 2014)

Abstract
In cloud computing environments, data storage systems often rely on optimistic replication to provide good performance and availability even in the presence of failures or network partitions. In this scenario, it is important to be able to accurately and efficiently identify updates executed concurrently. Current approaches to causality tracking in optimistic replication have problems with concurrent updates: they either (1) do not scale, as they require replicas to maintain information that grows linearly with the number of writes or unique clients; (2) lose information about causality, either by removing entries from client-id based version vectors or using server-id based version vectors, which cause false conflicts. We propose a new logical clock mechanism and a logical clock framework that together support a traditional key-value store API, while capturing causality in an accurate and scalable way, avoiding false conflicts. It maintains concise information per data replica, only linear on the number of replica servers, and allows data replicas to be compared and merged linear with the number of replica servers and versions.

FecharLer Abstract

2015

A Survey of Distributed Data Aggregation Algorithms

Autores
Jesus, P; Baquero, C; Almeida, PS;

Publicação
IEEE COMMUNICATIONS SURVEYS AND TUTORIALS

Abstract
Distributed data aggregation is an important task, allowing the decentralized determination of meaningful global properties, which can then be used to direct the execution of other applications. The resulting values are derived by the distributed computation of functions like COUNT, SUM, and AVERAGE. Some application examples deal with the determination of the network size, total storage capacity, average load, majorities and many others. In the last decade, many different approaches have been proposed, with different trade-offs in terms of accuracy, reliability, message and time complexity. Due to the considerable amount and variety of aggregation algorithms, it can be difficult and time consuming to determine which techniques will be more appropriate to use in specific settings, justifying the existence of a survey to aid in this task. This work reviews the state of the art on distributed data aggregation algorithms, providing three main contributions. First, it formally defines the concept of aggregation, characterizing the different types of aggregation functions. Second, it succinctly describes the main aggregation techniques, organizing them in a taxonomy. Finally, it provides some guidelines toward the selection and use of the most relevant techniques, summarizing their principal characteristics.

FecharLer Abstract

2017

The Single-Writer Principle in CRDT Composition

Autores
Enes, V; Almeida, PS; Baquero, C;

Publicação
Proceedings of the Workshop on Programming Models and Languages for Distributed Computing, Barcelona, Spain, June 20, 2017

Abstract
Multi-master replication in a distributed system setting allows each node holding a replica to update and query the local replica, and disseminate updates to other nodes. Obtaining high availability typically entails allowing replicas to diverge and requires a background mechanism for re-establishing consistency. Conflict-free Replicated Data Types (CRDTs) extend standard sequential data-Types with appropriate merge functions, and often can be composed together to create more complex ones. In this work we add a generic CRDT composition approach that explores the single-writer principle. By carefully controlling which part of the composition can be updated by each replica, we can derive efficient designs that cover new usecases. After introducing the new construction we exemplify some uses, including how to emulate a simple Doodle functionality for selecting a common meeting schedule among different participants. © 2017 Association for Computing Machinery.

FecharLer Abstract

2017

Quality-Aware Reactive Programming for the Internet of Things

Autores
Proenca, J; Baquero, C;

Publicação
FUNDAMENTALS OF SOFTWARE ENGINEERING, FSEN 2017

Abstract
The reactive paradigm recently became very popular in user-interface development: updates - such as the ones from the mouse, keyboard, or from the network - can trigger a chain of computations organised in a dependency graph, letting the underlying engine control the scheduling of these computations. In the context of the Internet of Things (IoT), typical applications deploy components in distributed nodes and link their interfaces, employing a publish-subscribe architecture. The paradigm for Distributed Reactive Programming marries these two concepts, treating each distributed component as a reactive computation. However, existing approaches either require expensive synchronisation mechanisms or they do not support pipelining, i.e., allowingmultiple "waves" of updates to be executed in parallel. We propose Quarp (Quality-Aware Reactive Programming), a scalable and light-weight mechanism aimed at the IoT to orchestrate components triggered by updates of data-producing components or of aggregating components. This mechanism appends meta-information tomessages between components capturing the context of the data, used to dynamically monitor and guarantee useful properties of the dynamic applications. These include the so-called glitch freedom, time synchronisation, and geographical proximity. We formalise Quarp using a simple operational semantics, provide concrete examples of useful instances of contexts, and situate our approach in the realm of distributed reactive programming.

FecharLer Abstract

2016

The problem with embedded CRDT counters and a solution

Autores
Baquero, C; Almeida, PS; Lerche, C;

Publicação
PROCEEDINGS OF THE 2ND WORKSHOP ON THE PRINCIPLES AND PRACTICE OF CONSISTENCY FOR DISTRIBUTED DATA, PAPOC 2016

Abstract
Conflict-free Replicated Data Types (CRDTs) can simplify the design of deterministic eventual consistency. Considering the several CRDTs that have been deployed in production systems, counters are among the first. Counters are apparently simple, with a straightforward inc/dec/read API, but can require complex implementations and several variants have been specified and coded. Unlike sets and registers, that can be adapted to operate inside maps, current counter approaches exhibit anomalies when embedded in maps. Here, we illustrate the anomaly and propose a solution, based on a new counter model and implementation.

FecharLer Abstract

2017

Transparent cross-system consistency

Autores
Loff, J; Porto, D; Baquero, C; Garcia, J; Preguica, N; Rodrigues, R;

Publicação
PROCEEDINGS OF THE 3RD INTERNATIONAL WORKSHOP ON PRINCIPLES AND PRACTICE OF CONSISTENCY FOR DISTRIBUTED DATA (PAPOC 17)

Abstract
This paper discusses the motivation and the challenges for providing a systematic and transparent approach for dealing with cross-system consistency. Our high level goal is to provide a way to avoid violations of causality when multiple systems interact, while (a) avoiding the redesign of existing systems, (b) minimizing the overhead, and (c) requiring as little developer input as possible.

FecharLer Abstract