Publications

Publications by Carlos Baquero

2019

Conflict-Free Replicated Data Types CRDTs

Authors
Preguiça, NM; Baquero, C; Shapiro, M;

Publication
Encyclopedia of Big Data Technologies.

Abstract

2019

Higher-Order Patterns in Replicated Data Types

Authors
Leijnse, A; Almeida, PS; Baquero, C;

Publication
Proceedings of the 6th Workshop on Principles and Practice of Consistency for Distributed Data, PaPoC@EuroSys 2019, Dresden, Germany, March 25-28, 2019

Abstract
The design of Conflict-free Replicated Data Types traditionally requires implementing new designs from scratch to meet a desired behavior. Although there are composition rules that can guide the process, there has not been a lot of work explaining how existing data types relate to each other, nor work that factors out common patterns. To bring clarity to the field we explain underlying patterns that are common to flags, sets, and registers. The identified patterns are succinct and composable, which gives them the power to explain both current designs and open up the space for new ones.

2019

Efficient Synchronization of State-based CRDTs

Authors
Enes, V; Almeida, PS; Baquero, C; Leitao, J;

Publication
2019 IEEE 35th International Conference on Data Engineering (ICDE 2019)

Abstract
To ensure high availability in large scale distributed systems, Conflict-free Replicated Data Types (CRDTs) relax consistency by allowing immediate query and update operations at the local replica, with no need for remote synchronization. State-based CRDTs synchronize replicas by periodically sending their full state to other replicas, which can become extremely costly as the CRDT state grows. Delta-based CRDTs address this problem by producing small incremental states (deltas) to be used in synchronization instead of the full state. However, current synchronization algorithms for delta-based CRDTs induce redundant wasteful delta propagation, performing worse than expected, and surprisingly, no better than state-based. In this paper we: 1) identify two sources of inefficiency in current synchronization algorithms for delta-based CRDTs; 2) bring the concept of join decomposition to state-based CRDTs; 3) exploit join decompositions to obtain optimal deltas and 4) improve the efficiency of synchronization algorithms; and finally, 5) experimentally evaluate the improved algorithms.
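
The difference between shipping full state and shipping a delta can be illustrated with a grow-only set, whose join is set union. The sketch below is only an illustration under stated assumptions, not the algorithm from the paper: the function names (`join`, `decompose`, `min_delta`) are hypothetical, and it assumes the sender already knows the receiver's state. It decomposes the local state into singleton parts and sends only the parts the remote replica does not yet contain.

```python
# Illustrative sketch only: a grow-only set (G-Set) as a state-based CRDT.
# All names here are hypothetical and are not taken from the paper.

def join(a: frozenset, b: frozenset) -> frozenset:
    """Join (least upper bound) of two G-Set states: set union."""
    return a | b

def decompose(state: frozenset) -> list:
    """Split a state into singleton parts whose join equals the state."""
    return [frozenset({x}) for x in state]

def min_delta(local: frozenset, remote: frozenset) -> frozenset:
    """Join of the parts of `local` that `remote` does not already include.

    Assumes the sender knows `remote`; real protocols must estimate it.
    """
    delta = frozenset()
    for part in decompose(local):
        if not part <= remote:  # skip parts the remote replica already has
            delta = join(delta, part)
    return delta

local = frozenset({"a", "b", "c"})
remote = frozenset({"b", "d"})
delta = min_delta(local, remote)

# Applying the small delta has the same effect as applying the full state.
merged_with_delta = join(remote, delta)
merged_with_state = join(remote, local)
```

Here the delta carries two elements instead of three, yet joining it into the remote state gives the same result as joining the full local state.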

2020

State-Machine Replication for Planet-Scale Systems

Authors
Enes, V; Baquero, C; Rezende, TF; Gotsman, A; Perrin, M; Sutra, P;

Publication
Proceedings of the Fifteenth European Conference on Computer Systems (EuroSys '20)

Abstract
Online applications now routinely replicate their data at multiple sites around the world. In this paper we present ATLAS, the first state-machine replication protocol tailored for such planet-scale systems. ATLAS does not rely on a distinguished leader, so clients enjoy the same quality of service independently of their geographical locations. Furthermore, client-perceived latency improves as we add sites closer to clients. To achieve this, ATLAS minimizes the size of its quorums using an observation that concurrent data center failures are rare. It also processes a high percentage of accesses in a single round trip, even when these conflict. We experimentally demonstrate that ATLAS consistently outperforms state-of-the-art protocols in planet-scale scenarios. In particular, ATLAS is up to two times faster than Flexible Paxos with identical failure assumptions, and more than doubles the performance of Egalitarian Paxos in the YCSB benchmark.

2021

The CoronaSurveys System for COVID-19 Incidence Data Collection and Processing

Authors
Baquero, C; Casari, P; Anta, AF; Garcia Garcia, A; Frey, D; Garcia Agundez, A; Georgiou, C; Girault, B; Ortega, A; Goessens, M; Hernandez Roig, HA; Nicolaou, N; Stavrakis, E; Ojo, O; Roberts, JC; Sanchez, I;

Publication
Frontiers in Computer Science

Abstract
CoronaSurveys is an ongoing interdisciplinary project developing a system to infer the incidence of COVID-19 around the world using anonymous open surveys. The surveys have been translated into 60 languages and are continuously collecting participant responses from any country in the world. The responses collected are pre-processed, organized, and stored in a version-controlled repository, which is publicly available to the scientific community. In addition, the CoronaSurveys team has devised several estimates computed on the basis of survey responses and other data, and makes them available on the project's website in the form of tables, as well as interactive plots and maps. In this paper, we describe the computational system developed for the CoronaSurveys project. The system includes multiple components and processes, including the web survey, the mobile apps, the cleaning and aggregation process of the survey responses, the process of storage and publication of the data, the processing of the data and the computation of estimates, and the visualization of the results. In this paper we describe the system architecture and the major challenges we faced in designing and deploying it.

2021

Efficient Replication via Timestamp Stability

Authors
Enes, V; Baquero, C; Gotsman, A; Sutra, P;

Publication
Proceedings of the Sixteenth European Conference on Computer Systems (EuroSys '21)

Abstract
Modern web applications replicate their data across the globe and require strong consistency guarantees for their most critical data. These guarantees are usually provided via state-machine replication (SMR). Recent advances in SMR have focused on leaderless protocols, which improve the availability and performance of traditional Paxos-based solutions. We propose Tempo - a leaderless SMR protocol that, in comparison to prior solutions, achieves superior throughput and offers predictable performance even in contended workloads. To achieve these benefits, Tempo timestamps each application command and executes it only after the timestamp becomes stable, i.e., all commands with a lower timestamp are known. Both the timestamping and stability detection mechanisms are fully decentralized, thus obviating the need for a leader replica. Our protocol furthermore generalizes to partial replication settings, enabling scalability in highly parallel workloads. We evaluate the protocol in both real and simulated geo-distributed environments and demonstrate that it outperforms state-of-the-art alternatives.
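
The idea of executing a command only once its timestamp is stable can be sketched as follows. This is a deliberately simplified illustration, not Tempo's actual mechanism: the class `StableExecutor` and its methods are hypothetical, and it assumes every replica reports a watermark below or at which it will assign no further timestamps, whereas the abstract only states that stability detection is decentralized. A timestamp is treated as stable once all watermarks have reached it.

```python
# Illustrative sketch only: timestamp-ordered execution gated on stability.
# StableExecutor and its methods are hypothetical names, not Tempo's API.
import heapq

class StableExecutor:
    def __init__(self, replica_ids):
        # Assumption: each replica promises not to assign timestamps
        # less than or equal to its reported watermark.
        self.watermarks = {r: 0 for r in replica_ids}
        self.pending = []   # min-heap of (timestamp, command_id)
        self.executed = []  # command ids in execution order

    def submit(self, timestamp, command_id):
        """Record a timestamped command; it is not executed yet."""
        heapq.heappush(self.pending, (timestamp, command_id))

    def stable_timestamp(self):
        """Highest timestamp at or below which no new command can appear."""
        return min(self.watermarks.values())

    def advance(self, replica_id, watermark):
        """Raise a replica's watermark and execute newly stable commands."""
        current = self.watermarks[replica_id]
        self.watermarks[replica_id] = max(current, watermark)
        stable = self.stable_timestamp()
        while self.pending and self.pending[0][0] <= stable:
            _, command_id = heapq.heappop(self.pending)
            self.executed.append(command_id)

ex = StableExecutor(["r1", "r2", "r3"])
ex.submit(2, "x")
ex.submit(1, "y")
ex.advance("r1", 2)
ex.advance("r2", 2)
after_two = list(ex.executed)   # r3 has not reported, so nothing is stable
ex.advance("r3", 1)
after_three = list(ex.executed)
ex.advance("r3", 3)
after_final = list(ex.executed)
```

Commands run in timestamp order regardless of submission order, and none runs while a replica could still assign a lower timestamp.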
