Details

  • Name

    Paula Raissa Silva
  • Role

    Research Assistant
  • Since

    13th September 2017
Publications

2023

A DTW Approach for Complex Data: A Case Study with Network Data Streams

Authors
Silva, PR; Vinagre, J; Gama, J;

Publication
38TH ANNUAL ACM SYMPOSIUM ON APPLIED COMPUTING, SAC 2023

Abstract
Dynamic Time Warping (DTW) is a robust method to measure the similarity between two sequences. This paper proposes a method based on DTW to analyse high-speed data streams. The central idea is to decompose the network traffic into sequences of histograms of packet sizes and then calculate the distance between pairs of such sequences using DTW with Kullback-Leibler (KL) distance. As a baseline, we also compute the Euclidean Distance between the sequences of histograms. Since our preliminary experiments indicate that the distance between two sequences falls within a different range of values for distinct types of streams, we then exploit this distance information for stream classification using a Random Forest. The approach was investigated using recent internet traffic data from a telecommunications company. To illustrate the application of our approach, we conducted a case study with encrypted Internet Protocol Television (IPTV) network traffic data. The goal was to use our DTW-based approach to detect the video codec used in the streams, as well as the IPTV channel. Results strongly suggest that the DTW distance value between the data streams is highly informative for such classification tasks.
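
The distance computation described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The function names are hypothetical, and two details are assumptions the abstract does not specify: the KL term is symmetrised and the histograms are smoothed with a small constant. Each stream is assumed to be already decomposed into a list of packet-size histograms.

```python
import math


def kl_divergence(p, q, eps=1e-9):
    """KL divergence between two histograms (assumption: eps-smoothed, then normalised)."""
    p = [x + eps for x in p]
    q = [x + eps for x in q]
    sp, sq = sum(p), sum(q)
    return sum((a / sp) * math.log((a / sp) / (b / sq)) for a, b in zip(p, q))


def symmetric_kl(p, q):
    """Symmetrised KL, used here as the local distance (an assumption, not from the paper)."""
    return 0.5 * (kl_divergence(p, q) + kl_divergence(q, p))


def dtw_distance(seq_a, seq_b, local_dist=symmetric_kl):
    """Classic dynamic-programming DTW between two sequences of histograms."""
    n, m = len(seq_a), len(seq_b)
    inf = float("inf")
    # cost[i][j] = minimal accumulated cost of aligning seq_a[:i] with seq_b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = local_dist(seq_a[i - 1], seq_b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # step in seq_a only
                                 cost[i][j - 1],      # step in seq_b only
                                 cost[i - 1][j - 1])  # step in both
    return cost[n][m]


# Toy histograms over three packet-size bins (made-up values for illustration).
small_packets = [8, 1, 1]
large_packets = [1, 1, 8]
stream_a = [small_packets, large_packets]
stream_b = [small_packets, small_packets, large_packets]  # same pattern, stretched in time
stream_c = [large_packets, large_packets, large_packets]

print(dtw_distance(stream_a, stream_b))
print(dtw_distance(stream_a, stream_c))
```

In the toy data, stream_b is a time-stretched copy of stream_a, so DTW aligns them at no cost, while stream_c yields a positive distance. Distances of this kind are what the abstract describes feeding to a Random Forest as classification features.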

2023

Towards federated learning: An overview of methods and applications

Authors
Silva, PR; Vinagre, J; Gama, J;

Publication
WILEY INTERDISCIPLINARY REVIEWS-DATA MINING AND KNOWLEDGE DISCOVERY

Abstract
Federated learning (FL) is a collaborative, decentralized, privacy-preserving method for addressing the challenges of data storage and data privacy. Artificial intelligence, machine learning, smart devices, and deep learning have strongly marked recent years. Two challenges arose in data science as a result. First, data became protected by regulation, namely the General Data Protection Regulation, under which organizations are not allowed to keep or transfer data without the owner's authorization. Second, the large volume of data generated in the era of big data makes keeping that data on a single server increasingly difficult. Therefore, the data is distributed across different locations or generated by devices, creating the need to build models or perform calculations without transferring data to a single location. The term FL emerged as a sub-area of machine learning that aims to solve the challenge of building distributed models with privacy considerations. This survey starts by describing relevant concepts, definitions, and methods, followed by an in-depth investigation of federated model evaluation. Finally, we discuss three promising applications for further research: anomaly detection, distributed data streams, and graph representation. This article is categorized under: Technologies > Machine Learning; Technologies > Artificial Intelligence
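
The idea of building a model without moving data to a single location can be sketched with federated averaging, a commonly used FL scheme. The abstract does not say which algorithms the survey covers, so this choice is an assumption. All function names, the toy linear model, and the data are hypothetical. Only model weights leave each client; the raw data stays local.

```python
def local_update(weights, data, lr=0.1, epochs=5):
    """Run a few gradient-descent epochs for y = w*x + b on one client's private data."""
    w, b = weights
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in data:
            err = (w * x + b) - y
            grad_w += 2 * err * x
            grad_b += 2 * err
        w -= lr * grad_w / len(data)
        b -= lr * grad_b / len(data)
    return (w, b)


def federated_average(client_weights, client_sizes):
    """Average client weights, weighting each client by its number of samples."""
    total = sum(client_sizes)
    dims = len(client_weights[0])
    return tuple(
        sum(w[k] * n for w, n in zip(client_weights, client_sizes)) / total
        for k in range(dims)
    )


def train(clients, rounds=100):
    """Server loop: broadcast global weights, collect local updates, average them."""
    global_weights = (0.0, 0.0)
    for _ in range(rounds):
        updates = [local_update(global_weights, data) for data in clients]
        sizes = [len(data) for data in clients]
        global_weights = federated_average(updates, sizes)
    return global_weights


# Two clients hold disjoint samples of the same relation y = 2x + 1 (toy data).
client_1 = [(x, 2 * x + 1) for x in (0.0, 0.5, 1.0)]
client_2 = [(x, 2 * x + 1) for x in (1.5, 2.0)]

w, b = train([client_1, client_2])
print(w, b)
```

The server never sees client_1 or client_2 directly, only the weight pairs returned by local_update. Real deployments add secure aggregation, client sampling, and handling of non-identically distributed data, all of which this sketch omits.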

2022

Federated Anomaly Detection over Distributed Data Streams

Authors
Silva, PR; Vinagre, J; Gama, J;

Publication
CoRR

Abstract

2020

Student Research Abstract: Multimodal Deep Learning Based Approach for Cells State Classification

Authors
Silva, PR;

Publication
PROCEEDINGS OF THE 35TH ANNUAL ACM SYMPOSIUM ON APPLIED COMPUTING (SAC'20)

Abstract
With the advances of the big data era in biology, deep learning has been incorporated into analysis pipelines to transform biological information into valuable knowledge. Deep learning has demonstrated its power in advancing the field of bioinformatics, including sequence analysis, bio-molecular property and function prediction, automatic medical diagnosis, and the analysis of cell imaging data. The ambition of this work is to create an approach that can fully explore the relationships across modalities and subjects by mining and fusing features from multi-modality data for cell state classification. The system should be able to classify cell state through multimodal deep learning techniques using heterogeneous data such as biological images, genomics, and clinical annotations. Our pilot study addresses the data acquisition process and a framework capable of extracting biological parameters from cell images.

2018

An Approach to Extract Proper Implications Set from High-dimension Formal Contexts using Binary Decision Diagram

Authors
Santos, P; Neves, J; Silva, P; Dias, SM; Zárate, L; Song, M;

Publication
Proceedings of the 20th International Conference on Enterprise Information Systems

Abstract