2022
Authors
Weidner, M; Almeida, PS;
Publication
PAPOC'22: PROCEEDINGS OF THE 9TH PRINCIPLES AND PRACTICE OF CONSISTENCY FOR DISTRIBUTED DATA
Abstract
Embedding CRDT counters has shown to be a challenging topic, since their introduction in Riak Maps. The desire for obliviousness, where all information about a counter is fully removed upon key removal, faces problems due to the possibility of concurrency between increments and key removals. Previous state-based proposals exhibit undesirable reset-wins semantics, which lead to losing increments, unsatisfactorily solved through manual generation management in the API. Previous operation-based approaches depend on causal stability, being prone to unbounded counter growth under network partitions. We introduce a novel embeddable operation-based CRDT counter which achieves both desirable observed-reset semantics and obliviousness upon resets. Moreover, it achieves this while merely requiring FIFO delivery, allowing a tradeoff between causal consistency and faster information propagation, being more robust under network partitions.
2022
Authors
Kassam, Z; Almeida, PS; Shoker, A;
Publication
2022 31ST INTERNATIONAL CONFERENCE ON COMPUTER COMMUNICATIONS AND NETWORKS (ICCCN 2022)
Abstract
TCP is typically the default transport protocol of choice for its supposed reliability, even for message-oriented middleware (e.g., ZeroMQ) or inter-actor communication (e.g., distributed Erlang). However, under network issues, TCP connections can fail, which requires ensuring both at-least-once and at-most-once delivery at the upper middleware layer. Moreover, the use of TCP at scale, in highly concurrent systems, can lead to drastic performance loss due to the need for TCP connection multiplexing and the resulting head-of-line blocking. This paper introduces Exon, an oblivious exactly-once messaging protocol, and a corresponding lightweight library implementation. Exon uses a novel strategy of a per-message four-way protocol to ensure oblivious exactly-once messaging, with on-demand protocol-level soft half-connections that are established when needed and safely discarded. This achieves correctness, obliviousness, and performance, through merging and pipelining basic protocol messages. The empirical evaluation of Exon demonstrates significant improvements in throughput and latency under packet loss, while maintaining a negligible overhead over TCP in healthy networks.
2023
Authors
Barros, M; Ramos, M; Gomes, A; Cunha, A; Pereira, J; Almeida, PS;
Publication
FORMAL TECHNIQUES FOR DISTRIBUTED OBJECTS, COMPONENTS, AND SYSTEMS, FORTE 2023
Abstract
Automatic grading based on unit tests is a key feature of massive open online courses (MOOC) on programming, as it allows instant feedback to students and enables courses to scale up. This technique works well for sequential programs, by checking outputs against a sample of inputs, but unfortunately it is not adequate for detecting races and deadlocks, which precludes its use for concurrent programming, a key subject in parallel and distributed computing courses. In this paper we provide a hands-on evaluation of verification and testing tools for concurrent programs, collecting a precise set of requirements, and describing to what extent they can or can not be used for this purpose. Our conclusion is that automatic grading of concurrent programming exercises remains an open challenge.
2023
Authors
Rodrigues, A; Shtul, A; Baquero, C; Almeida, PS;
Publication
38TH ANNUAL ACM SYMPOSIUM ON APPLIED COMPUTING, SAC 2023
Abstract
A Bloom Filter is a probabilistic data structure designed to check, rapidly and memory-efficiently, whether an element is present in a set. It has been vastly used in various computing areas and several variants, allowing deletions, dynamic sets and working with sliding windows, have surfaced over the years. When summarizing data streams, it becomes relevant to identify the more recent elements in the stream. However, most of the sliding window schemes consider the most recent items of a data stream without considering time as a factor. While this allows, e.g., storing the most recent 10000 elements, it does not easily translate into storing elements received in the last 60 seconds, unless the insertion rate is stable and known in advance. In this paper, we present the Time-limited Bloom Filter, a new BF-based approach that can save information of a given time period and correctly identify it as present when queried, while also being able to retire data when it becomes stale. The approach supports variable insertion rates while striving to keep a target false positive rate. We also make available a reference implementation of the data structure as a Redis module.
2023
Authors
Almeida, PS;
Publication
IEEE TRANSACTIONS ON COMPUTERS
Abstract
In a partitioned Bloom Filter (PBF) the bit vector is split into disjoint parts, one per hash function. Contrary to hardware designs, where they prevail, software implementations mostly ignore PBFs, considering them worse than standard Bloom filters (SBF), due to the slightly larger false positive rate (FPR). In this paper, by performing an in-depth analysis, first we show that the FPR advantage of SBFs is smaller than thought; more importantly, by deriving the per-element FPR, we show that SBFs have weak spots in the domain: elements that test as false positives much more frequently than expected. This is relevant in scenarios where an element is tested against many filters. Moreover, SBFs are prone to exhibit extremely weak spots if naive double hashing is used, something occurring in mainstream libraries. PBFs exhibit a uniform distribution of the FPR over the domain, with no weak spots, even using naive double hashing. Finally, we survey scenarios beyond set membership testing, identifying many advantages of having disjoint parts, in designs using SIMD techniques, for filter size reduction, test of set disjointness, and duplicate detection in streams. PBFs are better, and should replace SBFs, in general purpose libraries and as the base for novel designs.
1997
Authors
Almeida, PS;
Publication
ECOOP
Abstract
Current data abstraction mechanisms are not adequate to control sharing of state in the general case involving objects in linked structures. The pervading possibility of sharing is a source of errors and an obstacle to language implementation techniques. We present a general extension to programming languages which makes the ability to share state a first class property of a data type, resolving a long-standing flaw in existing data abstraction mechanisms. Balloon types enforce a strong form of encapsulation: no state reachable (directly or transitively) by a balloon object is referenced by any external object. Syntactic simplicity is achieved by relying on a non-trivial static analysis as the checking mechanism. Balloon types are applicable in a wide range of areas such as program transformation, memory management and distributed systems. They are the key to obtaining self-contained composite objects, truly opaque data abstractions and value types-important concepts for the development of large scale, provably correct programs. © Springer-Verlag Berhn Heidelberg 1997.
The access to the final selection minute is only available to applicants.
Please check the confirmation e-mail of your application to obtain the access code.