
Publications by CSE

2014

Improvements to Efficient Retrieval of Very Large Temporal Datasets with the TravelLight Method

Authors
de Carvalho, AV; Oliveira, MA; Rocha, A;

Publication
PROCEEDINGS OF THE 2014 9TH IBERIAN CONFERENCE ON INFORMATION SYSTEMS AND TECHNOLOGIES (CISTI 2014)

Abstract
A considerable number of domains deal with large and complex volumes of temporal data. The management of these volumes, from capture and storage to search, transfer, analysis, and visualization, still poses interesting challenges. One critical task is the efficient retrieval of data (raw data or intermediate results from analytic tools). Previous work proposed the TravelLight method, which reduced the turnaround time and improved interactive retrieval of data from large temporal datasets by exploiting the temporal consistency of records in a database. In this work we propose improvements to the method by adopting a new paradigm focused on the management of time intervals instead of solely on data items. A major advantage of this paradigm shift is that it separates the method's implementation from any particular temporal data source, as it is autonomous and efficient in the management of retrieved data. Our work demonstrates that the overheads introduced by the new paradigm are smaller than the prior overall overheads, further reducing the turnaround time. Reported results concern experiments with a temporally linear navigation across two datasets of one million items. From the obtained results it is possible to conclude that the improvements presented in this work further reduce turnaround time, thus enhancing the response of interactive tasks over very large temporal datasets.
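The interval-centric paradigm the abstract describes can be pictured as a cache that remembers which time spans have already been retrieved, so a subsequent request only queries the data source for the uncovered gaps. The following is an illustrative sketch, not the authors' implementation; the `IntervalCache` name and the half-open-interval representation are assumptions.

```python
from dataclasses import dataclass, field

@dataclass
class IntervalCache:
    """Tracks which time intervals have already been retrieved, so a new
    request only fetches the portions not yet covered (sketch of an
    interval-centric retrieval manager)."""
    covered: list = field(default_factory=list)  # disjoint, sorted [start, end) pairs

    def missing(self, start, end):
        """Return the sub-intervals of [start, end) not yet covered."""
        gaps, cursor = [], start
        for s, e in self.covered:
            if e <= cursor:
                continue          # interval entirely before the request
            if s >= end:
                break             # interval entirely after the request
            if s > cursor:
                gaps.append((cursor, min(s, end)))
            cursor = max(cursor, e)
            if cursor >= end:
                break
        if cursor < end:
            gaps.append((cursor, end))
        return gaps

    def add(self, start, end):
        """Merge [start, end) into the covered set, coalescing overlaps."""
        merged = []
        for s, e in sorted(self.covered + [(start, end)]):
            if merged and s <= merged[-1][1]:
                merged[-1] = (merged[-1][0], max(merged[-1][1], e))
            else:
                merged.append((s, e))
        self.covered = merged
```

Managing coverage at the interval level keeps the cache independent of what the data items actually are, which matches the decoupling from the data source that the abstract claims.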

2013

Ensemble - an E-Learning Framework

Authors
Queiros, R; Leal, JP;

Publication
JOURNAL OF UNIVERSAL COMPUTER SCIENCE

Abstract
E-learning frameworks are conceptual tools for organizing networks of e-learning services. Most frameworks cover areas that go beyond the scope of e-learning, from course to financial management, and neglect the typical everyday activities of teachers and students at schools, such as the creation, delivery, resolution, and evaluation of assignments. This paper presents the Ensemble framework, an e-learning framework exclusively focused on the teaching-learning process through the coordination of pedagogical services. The framework presents an abstract data, integration, and evaluation model based on content and communications specifications. These specifications underpin the implementation of networks in specialized domains with complex evaluations. In this paper we specialize the framework for two domains with complex evaluation: computer programming and computer-aided design (CAD). For each domain we highlight two Ensemble hotspots: data and evaluation procedures. In the former we formally describe the exercise and present possible extensions. In the latter, we describe the automatic evaluation procedures.

2013

Making Programming Exercises Interoperable with PExIL

Authors
Queiros, R; Leal, JP;

Publication
INNOVATIONS IN XML APPLICATIONS AND METADATA MANAGEMENT: ADVANCING TECHNOLOGIES

Abstract
Several standards have appeared in recent years to formalize the metadata of learning objects, but they are still insufficient to fully describe a specialized domain. In particular, the programming exercise domain requires interdependent resources (e.g. test cases, solution programs, exercise descriptions) usually processed by different services in the programming exercise lifecycle. Moreover, the manual creation of these resources is time-consuming and error-prone, hindering the fast development of programming exercises of good quality. This chapter focuses on the definition of an XML dialect called PExIL (Programming Exercises Interoperability Language). The aim of PExIL is to consolidate all the data required in the programming exercise lifecycle, from when an exercise is created to when it is graded, covering also its resolution, evaluation, and feedback. The authors introduce the XML Schema used to formalize the relevant data of the programming exercise lifecycle. The validation of this approach is made through the evaluation of the usefulness and expressiveness of the PExIL definition. For the former, the authors present the tools that consume the PExIL definition to automatically generate the specialized resources. For the latter, they use the PExIL definition to capture all the constraints of a set of programming exercises stored in a learning objects repository. Copyright (C) 2013, IGI Global. Copying or distributing in print or electronic forms without written permission of IGI Global is prohibited.
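The idea of consolidating an exercise's interdependent resources into a single XML document can be sketched as follows. The element names below are illustrative assumptions, not the actual PExIL schema; the point is only that statement and test cases travel together through the lifecycle.

```python
import xml.etree.ElementTree as ET

def build_exercise(title, statement, tests):
    """Assemble a minimal exercise descriptor in the spirit of PExIL:
    one document carrying the statement and the test cases consumed by
    different services. Element names are hypothetical, not the PExIL
    schema."""
    root = ET.Element("exercise")
    ET.SubElement(root, "title").text = title
    ET.SubElement(root, "statement").text = statement
    cases = ET.SubElement(root, "tests")
    for stdin, stdout in tests:
        case = ET.SubElement(cases, "case")
        ET.SubElement(case, "input").text = stdin    # program's stdin
        ET.SubElement(case, "output").text = stdout  # expected stdout
    return ET.tostring(root, encoding="unicode")
```

A generator along these lines is what lets downstream services (evaluators, repositories) derive their specialized resources from one authoritative definition instead of hand-maintaining each file.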

2013

Towards an accurate evaluation of deduplicated storage systems

Authors
Paulo, J; Reis, P; Pereira, J; Sousa, A;

Publication
COMPUTER SYSTEMS SCIENCE AND ENGINEERING

Abstract
Deduplication has proven to be a valuable technique for eliminating duplicate data in backup and archival systems and is now being applied to new storage environments with distinct requirements and performance trade-offs. Namely, deduplication systems are now targeting large-scale cloud computing storage infrastructures holding unprecedented data volumes with a significant share of duplicate content. It is, however, hard to assess the usefulness of deduplication in particular settings and which techniques provide the best results. In fact, existing disk I/O benchmarks follow simplistic approaches for generating data content, leading to unrealistic amounts of duplicates that do not evaluate deduplication systems accurately. Moreover, deduplication systems are now targeting heterogeneous storage environments, with specific duplication ratios, which benchmarks must also simulate. We address these issues with DEDISbench, a novel micro-benchmark for evaluating the disk I/O performance of block-based deduplication systems. As the main contribution, DEDISbench generates content following realistic duplicate content distributions extracted from real datasets. Then, as a second contribution, we analyze and extract the duplicates found in three real storage systems, showing that DEDISbench can easily simulate several workloads. The usefulness of DEDISbench is shown by comparing it with the Bonnie++ and IOzone open-source disk I/O micro-benchmarks in assessing two open-source deduplication systems, Opendedup and Lessfs, using Ext4 as a baseline. Our results lead to novel insights into the performance of these file systems.
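The core point, that benchmark content must exhibit a controlled duplicate distribution rather than trivially repeated or purely random data, can be sketched as below. This is a simplified stand-in for DEDISbench's distribution-driven generation (which extracts distributions from real datasets); the parameters and pool model are assumptions.

```python
import hashlib
import random

def generate_blocks(n, dup_ratio=0.3, hot_pool=100, block_size=4096, seed=0):
    """Yield n content blocks where roughly dup_ratio of the writes repeat
    a block drawn from a small 'hot' pool, giving the stream a controlled
    share of duplicate content (sketch; not DEDISbench's actual model)."""
    rng = random.Random(seed)
    pool = [rng.randbytes(block_size) for _ in range(hot_pool)]
    for _ in range(n):
        if rng.random() < dup_ratio:
            yield rng.choice(pool)           # duplicate-prone content
        else:
            yield rng.randbytes(block_size)  # effectively unique content

def dedup_ratio(blocks):
    """Fraction of blocks whose content was already seen earlier."""
    seen, dups, total = set(), 0, 0
    for b in blocks:
        digest = hashlib.sha1(b).digest()
        dups += digest in seen
        seen.add(digest)
        total += 1
    return dups / total
```

Feeding such a stream to a block device under test exercises the deduplication engine at a chosen duplicate share, which purely random generators (near 0% duplicates) or constant fills (near 100%) cannot do.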

2013

Retrieval of Very Large Temporal Datasets for Interactive Tasks

Authors
de Carvalho, AV; Oliveira, MA; Rocha, A;

Publication
PROCEEDINGS OF THE 2013 8TH IBERIAN CONFERENCE ON INFORMATION SYSTEMS AND TECHNOLOGIES (CISTI 2013)

Abstract
Many tasks dealing with temporal data, such as interactive browsing through temporal datasets, require intensive retrieval from the database. Depending on the user's task, the data retrieved may be too large to fit in local memory. Even if it fits, the time taken to retrieve the data may compromise user interaction. This work proposes a method, TravelLight, which improves interactive traveling across very large temporal datasets by exploiting the temporal consistency of data items. The proposed method consists of two algorithms, data retrieval and memory management, both contributing to improved memory usage and, most importantly, to reduced turnaround time. Results are reported concerning experiments with a temporally linear navigation across two datasets of one million items, which differ in the average time span of items. From the obtained results it is possible to conclude that the proposed method reduces turnaround time, thus enhancing the response of interactive tasks over very large temporal datasets.
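The interplay of the two algorithms, retrieval and memory management, can be sketched as a window cache: when the visible time window slides, only the newly exposed range is fetched, and items whose lifespan no longer overlaps the window are evicted. All names are illustrative assumptions; `fetch` stands in for the actual database query, and items are `(id, start, end)` triples.

```python
class WindowCache:
    """Keeps in memory the items whose lifespan overlaps the current time
    window, fetching only newly exposed ranges on a slide (sketch of
    exploiting temporal consistency; not the TravelLight implementation)."""
    def __init__(self, fetch):
        self.fetch = fetch    # fetch(t0, t1) -> [(id, start, end), ...]
        self.items = {}       # id -> (start, end)
        self.window = None

    def move_to(self, t0, t1):
        if self.window is None:
            new_items = self.fetch(t0, t1)
        else:
            o0, o1 = self.window
            new_items = []
            if t0 < o0:   # window grew/moved toward the past
                new_items += self.fetch(t0, min(o0, t1))
            if t1 > o1:   # window grew/moved toward the future
                new_items += self.fetch(max(t0, o1), t1)
        for iid, s, e in new_items:
            self.items[iid] = (s, e)
        # memory management: evict items no longer overlapping the window
        self.items = {i: (s, e) for i, (s, e) in self.items.items()
                      if s < t1 and e > t0}
        self.window = (t0, t1)
        return self.items
```

Items with long time spans survive many consecutive windows without being re-fetched, which is where the "temporal consistency" saving comes from during linear navigation.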

2013

Composing Least-change Lenses

Authors
Macedo, N; Pacheco, H; Cunha, A; Oliveira, JN;

Publication
ECEASST

Abstract
Non-trivial bidirectional transformations (BXs) are inherently ambiguous, as there are in general many different ways to consistently translate an update from one side to the other. Existing BX languages and frameworks typically satisfy fundamental first principles which ensure acceptable and stable (well-behaved) translation. Unfortunately, these give little insight into how a particular update translation is chosen among the myriad of possibilities. From the user's perspective, such unpredictability may hinder the adoption of BX frameworks. The problem can be remedied by imposing a "principle of least change" which, in a state-based framework, amounts to translating each update in such a way that its result is as close as possible to the original state, according to some distance measure. After formalizing such BXs, focusing on the particular framework of lenses, this paper discusses whether least-change lenses can be defined by composition, an essential construct of BX frameworks. For sequential composition, two (dual) update translation alternatives are presented: a classical deterministic one and a nondeterministic one. A key ingredient of the approach is the elegant formalization of the main concepts in relation algebra, which exposes several similarities and dualities. © Bidirectional Transformations 2013.
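For readers unfamiliar with state-based lenses, the classical sequential composition the abstract refers to can be sketched as follows: a lens is a `get`/`put` pair, and the composed `put` threads the update back through the intermediate view. This shows only the standard composition scheme; the paper's actual contribution, least-change guarantees under composition, is not captured by this sketch.

```python
from dataclasses import dataclass
from typing import Callable, Generic, TypeVar

A = TypeVar("A"); B = TypeVar("B"); C = TypeVar("C")

@dataclass
class Lens(Generic[A, B]):
    """State-based lens: get extracts a view from a source; put translates
    an updated view back into an updated source."""
    get: Callable[[A], B]
    put: Callable[[A, B], A]

def compose(l1: "Lens[A, B]", l2: "Lens[B, C]") -> "Lens[A, C]":
    """Sequential composition: get chains the views; put first translates
    the update through l2 (using the intermediate view), then through l1."""
    return Lens(
        get=lambda a: l2.get(l1.get(a)),
        put=lambda a, c: l1.put(a, l2.put(l1.get(a), c)),
    )

# Example lenses: focus on the first component of a pair,
# then view that component as a string.
fst = Lens(get=lambda p: p[0], put=lambda p, v: (v, p[1]))
as_str = Lens(get=str, put=lambda _b, s: int(s))
first_as_str = compose(fst, as_str)
```

The ambiguity the paper targets lives inside each `put`: many sources can map to the same view, and nothing in this classical scheme says the chosen result is the one closest to the original state.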
