2007
Autores
Almeida, PS; Baquero, C; Fonte, V;
Publicação
ON THE MOVE TO MEANINGFUL INTERNET SYSTEMS 2007: OTM 2007 WORKSHOPS, PT 2, PROCEEDINGS
Abstract
Optimistic distributed systems often rely on version vectors or their variants in order to track updates on replicated objects. Some of these mechanisms rely on some form of global configuration or distributed naming protocol in order to assign unique identifiers to each replica. These approaches are incompatible with replica creation under arbitrary partitions, a typical operation mode in mobile or poorly connected environments. Other mechanisms assign unique identifiers relying on statistical correctness. In previous work we have introduced an update tracking mechanism that overcomes these limitations. This paper presents results from recent experimentation, that brought to surface a particular pattern of operation that results in an unforeseen, unlimited growth in space consumption. We also describe informally a new update tracking mechanism that does not exhibit this pathological growth while providing guaranteed unique identifiers for a dynamic number of replicas under arbitrary partitions and the same functionality of version vectors.
2007
Autores
Lopes, N; Baquero, C;
Publicação
NETWORK-BASED INFORMATION SYSTEMS, PROCEEDINGS
Abstract
Range queries, retrieving all keys within a given range, is an important add-on for Distributed Hash Tables (DHTs), as they rely only on exact key matching lookup. In this paper we support range queries through a balanced tree algorithm, Decentralized Balanced Tree, that runs over any DHT system. Our algorithm is based on the B(+)-tree design that efficiently stores clustered data while maintaining a balanced load on hosts. The internal structure of the balanced tree is suited for range queries operations over many data distributions since it easily handles clustered data without losing performance. We analyzed, and evaluated our algorithm under a simulated environment, to show it's operation scalability for both insertions and queries. We will show that the system design. imposes a fixed penalty over the DHT access cost, and thus inherits the scalability properties of the chosen underlying DHT.
2007
Autores
Almeida, PS; Baquero, C; Preguica, N; Hutchison, D;
Publicação
INFORMATION PROCESSING LETTERS
Abstract
Bloom filters provide space-efficient storage of sets at the cost of a probability of false positives on membership queries. The size of the filter must be defined a priori based on the number of elements to store and the desired false positive probability, being impossible to store extra elements without increasing the false positive probability. This leads typically to a conservative assumption regarding maximum set size, possibly by orders of magnitude, and a consequent space waste. This paper proposes Scalable Bloom Filters, a variant of Bloom filters that can adapt dynamically to the number of elements stored, while assuring a maximum false positive probability.
2008
Autores
Almeida, PS; Baquero, C; Fonte, V;
Publicação
PRINCIPLES OF DISTRIBUTED SYSTEMS, 12TH INTERNATIONAL CONFERENCE, OPODIS 2008
Abstract
Causality tracking mechanisms, such as vector clocks and version vectors, rely on mappings from globally unique identifiers to integer counters. In a system with a well known set of entities these ids can be preconfigured and given distinct positions in a vector or distinct names in a mapping. Id management is more problematic in dynamic systems, with large and highly variable number of entities, being worsened when network partitions occur. Present solutions for causality tracking are not appropriate to these increasingly common scenarios. In this paper we introduce Interval Tree Clocks, a novel causality tracking mechanism that can be used in scenarios with a dynamic number of entities, allowing a completely decentralized creation of processes/replicas without need for global identifiers or global coordination. The mechanism has a variable size representation that adapts automatically to the number of existing entities, growing or shrinking appropriately. The representation is so compact that the mechanism can even be considered for scenarios with a fixed number of entities, which makes it a general substitute for vector clocks and version vectors.
2009
Autores
Sousa, P; Preguica, N; Baquero, C;
Publicação
GROUPWARE-DESIGN: IMPLEMENTATION, AND USE, PROCEEDINGS
Abstract
Intensive research and development has been conducted in the design and creation of groupware systems for distributed users. While for some activities, these groupware tools are widely used, for other activities the impact in the groupware community has been smaller and can be improved. One reason for this fact is that the mostly common used applications do not support collaborative features and users are reluctant to change to a different application. In this paper we discuss how available file system mechanisms can help to address this problem. In this context, we present Forby, a system that allows to provide groupware features to distributed users by combining filesystem monitoring and distributed event dissemination. To demonstrate our solution, we present three systems that rely on Forby for providing groupware features to users running unmodified applications.
2009
Autores
Jesus, P; Baquero, C; Almeida, PS;
Publicação
DISTRIBUTED APPLICATIONS AND INTEROPERABLE SYSTEMS, PROCESSINGS
Abstract
Data aggregation plays an important role in the design of scalable systems, allowing the determination of meaningful system-wide properties to direct the execution of distributed applications. In the particular case of wireless sensor networks, data collection is often only practicable if aggregation is performed. Several aggregation algorithms have been proposed in the last few years, exhibiting different properties in terms of accuracy, speed and communication tradeoffs. Nonetheless, existing approaches are found lacking in terms of fault tolerance. In this paper, we introduce a novel fault-tolerant averaging based data aggregation algorithm. It tolerates substantial message loss (link failures), while competing algorithms in the same class can be affected by a Single lost message. The algorithm is based on manipulating flows (in the graph theoretical sense), that are updated using idempotent messages, providing it with unique robustness capabilities. Furthermore, evaluation results obtained by comparing it with other averaging approaches have revealed that it outperforms them in terms of time and message complexity.
The access to the final selection minute is only available to applicants.
Please check the confirmation e-mail of your application to obtain the access code.