2012
Autores
Paulo, J; Reis, P; Pereira, J; Sousa, A;
Publicação
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Abstract
Deduplication is widely accepted as an effective technique for eliminating duplicated data in backup and archival systems. Nowadays, deduplication is also becoming appealing in cloud computing, where large-scale virtualized storage infrastructures hold huge data volumes with a significant share of duplicated content. There have thus been several proposals for embedding deduplication in storage appliances and file systems, providing different performance trade-offs while targeting both user and application data, as well as virtual machine images. It is however hard to determine to what extent is deduplication useful in a particular setting and what technique will provide the best results. In fact, existing disk I/O micro-benchmarks are not designed for evaluating deduplication systems, following simplistic approaches for generating data written that lead to unrealistic amounts of duplicates. We address this with DEDISbench, a novel micro-benchmark for evaluating disk I/O performance of block based deduplication systems. As the main contribution, we introduce the generation of a realistic duplicate distribution based on real datasets. Moreover, DEDISbench also allows simulating access hotspots and different load intensities for I/O operations. The usefulness of DEDISbench is shown by comparing it with Bonnie++ and IOzone open-source disk I/O micro-benchmarks on assessing two open-source deduplication systems, Opendedup and Lessfs, using Ext4 as a baseline. As a secondary contribution, our results lead to novel insight on the performance of these file systems. © 2012 Springer-Verlag.
2023
Autores
Brito, C; Ferreira, P; Portela, B; Oliveira, R; Paulo, J;
Publicação
38TH ANNUAL ACM SYMPOSIUM ON APPLIED COMPUTING, SAC 2023
Abstract
We propose Soteria, a system for distributed privacy-preserving Machine Learning (ML) that leverages Trusted Execution Environments (e.g. Intel SGX) to run code in isolated containers (enclaves). Unlike previous work, where all ML-related computation is performed at trusted enclaves, we introduce a hybrid scheme, combining computation done inside and outside these enclaves. The conducted experimental evaluation validates that our approach reduces the runtime of ML algorithms by up to 41%, when compared to previous related work. Our protocol is accompanied by a security proof, as well as a discussion regarding resilience against a wide spectrum of ML attacks.
2023
Autores
Esteves, T; Macedo, R; Oliveira, R; Paulo, J;
Publicação
CoRR
Abstract
2021
Autores
Esteves, T; Miranda, M; Paulo, J; Portela, B;
Publicação
IACR Cryptol. ePrint Arch.
Abstract
2021
Autores
Brito, C; Ferreira, P; Portela, B; Oliveira, R; Paulo, J;
Publicação
IACR Cryptol. ePrint Arch.
Abstract
2023
Autores
Macedo, R; Miranda, M; Tanimura, Y; Haga, J; Ruhela, A; Harrell, SL; Evans, RT; Pereira, J; Paulo, J;
Publicação
2023 IEEE/ACM 23RD INTERNATIONAL SYMPOSIUM ON CLUSTER, CLOUD AND INTERNET COMPUTING, CCGRID
Abstract
Modern I/O applications that run on HPC infrastructures are increasingly becoming read and metadata intensive. However, having multiple applications submitting large amounts of metadata operations can easily saturate the shared parallel file system's metadata resources, leading to overall performance degradation and I/O unfairness. We present PADLL, an application and file system agnostic storage middleware that enables QoS control of data and metadata workflows in HPC storage systems. It adopts ideas from Software-Defined Storage, building data plane stages that mediate and rate limit POSIX requests submitted to the shared file system, and a control plane that holistically coordinates how all I/O workflows are handled. We demonstrate its performance and feasibility under multiple QoS policies using synthetic benchmarks, real-world applications, and traces collected from a production file system. Results show that PADLL can enforce complex storage QoS policies over concurrent metadata-aggressive jobs, ensuring fairness and prioritization.
The access to the final selection minute is only available to applicants.
Please check the confirmation e-mail of your application to obtain the access code.