Publications

2025

Online boxplot derived outlier detection

Authors
Mazarei, A; Sousa, R; Mendes Moreira, J; Molchanov, S; Ferreira, HM;

Publication
INTERNATIONAL JOURNAL OF DATA SCIENCE AND ANALYTICS

Abstract
Outlier detection is a widely used technique for identifying anomalous or exceptional events across various contexts. It has proven to be valuable in applications like fault detection, fraud detection, and real-time monitoring systems. Detecting outliers in real time is crucial in several industries, such as financial fraud detection and quality control in manufacturing processes. In the context of big data, the amount of data generated is enormous, and traditional batch mode methods are not practical since the entire dataset is not available. The limited computational resources further compound this issue. Boxplot is a widely used batch mode algorithm for outlier detection that involves several derivations. However, the lack of an incremental closed form for statistical calculations during boxplot construction poses considerable challenges for its application within the realm of big data. We propose an incremental/online version of the boxplot algorithm to address these challenges. Our proposed algorithm is based on an approximation approach that involves numerical integration of the histogram and calculation of the cumulative distribution function. This approach is independent of the dataset's distribution, making it effective for all types of distributions, whether skewed or not. To assess the efficacy of the proposed algorithm, we conducted tests using simulated datasets featuring varying degrees of skewness. Additionally, we applied the algorithm to a real-world dataset concerning software fault detection, which posed a considerable challenge. The experimental results underscored the robust performance of our proposed algorithm, highlighting its efficacy comparable to batch mode methods that access the entire dataset. Our online boxplot method, leveraging dataset distribution to define whiskers, consistently achieved exceptional outlier detection results. Notably, our algorithm demonstrated computational efficiency, maintaining constant memory usage with minimal hyperparameter tuning.
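
The idea summarised in the abstract (approximating quartiles from a running histogram and its cumulative counts) can be illustrated with a minimal sketch. This is not the authors' algorithm: the class name `OnlineBoxplot`, the fixed value range `[lo, hi]`, the bin count, and the classic Tukey 1.5 × IQR fences are all assumptions chosen for illustration, whereas the paper derives its whiskers from the data distribution.

```python
class OnlineBoxplot:
    """Hypothetical sketch: approximate boxplot statistics from a fixed-size histogram."""

    def __init__(self, lo, hi, bins=100):
        # Assumption: the value range [lo, hi] is known in advance.
        self.lo, self.hi, self.bins = lo, hi, bins
        self.width = (hi - lo) / bins
        self.counts = [0] * bins  # memory stays constant as data arrives
        self.n = 0

    def update(self, x):
        # Place x in its bin; out-of-range values are clamped to the edge bins.
        i = int((x - self.lo) / self.width)
        i = min(max(i, 0), self.bins - 1)
        self.counts[i] += 1
        self.n += 1

    def quantile(self, q):
        # Accumulate bin counts (a discrete CDF) and interpolate
        # linearly inside the bin where the target rank falls.
        target = q * self.n
        cum = 0
        for i, c in enumerate(self.counts):
            if c > 0 and cum + c >= target:
                frac = (target - cum) / c
                return self.lo + (i + frac) * self.width
            cum += c
        return self.hi

    def fences(self, k=1.5):
        # Assumption: Tukey-style fences, not the whisker rule from the paper.
        q1, q3 = self.quantile(0.25), self.quantile(0.75)
        iqr = q3 - q1
        return q1 - k * iqr, q3 + k * iqr

    def is_outlier(self, x, k=1.5):
        low, high = self.fences(k)
        return x < low or x > high


# Usage: feed values one at a time, then query the current fences.
bp = OnlineBoxplot(0.0, 100.0, bins=100)
for v in range(100):
    bp.update(v)
print(bp.fences(), bp.is_outlier(500))
```

The histogram is the only state kept, so memory does not grow with the stream; accuracy depends on the bin width, which is the main tuning parameter in this sketch.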


2025

Modelling sustainability in cyber-physical systems: A systematic mapping study

Authors
Barisic, A; Cunha, J; Ruchkin, I; Moreira, A; Araújo, J; Challenger, M; Savic, D; Amaral, V;

Publication
SUSTAINABLE COMPUTING-INFORMATICS & SYSTEMS

Abstract
Supporting sustainability through modelling and analysis has become an active area of research in Software Engineering. Therefore, it is important and timely to survey the current state of the art in sustainability in Cyber-Physical Systems (CPS), one of the most rapidly evolving classes of complex software systems. This work presents the findings of a Systematic Mapping Study (SMS) that aims to identify key primary studies reporting on CPS modelling approaches that address sustainability over the last 10 years. Our literature search retrieved 2209 papers, of which 104 primary studies were deemed relevant for a detailed characterisation. These studies were analysed based on nine research questions designed to extract information on sustainability attributes, methods, models/meta-models, metrics, processes, and tools used to improve the sustainability of CPS. These questions also aimed to gather data on domain-specific modelling approaches and relevant application domains. The final results report findings for each of our questions, highlight interesting correlations among them, and identify literature gaps worth investigating in the near future.


2025

Automated Social Media Feedback Analysis for Software Requirements Elicitation: A Case Study in the Streaming Industry

Authors
Silva, M; Faria, JP;

Publication
Proceedings of the 20th International Conference on Evaluation of Novel Approaches to Software Engineering, ENASE 2025, Porto, Portugal, April 4-6, 2025.

Abstract

2025

Exploring Interactivity and Interpassivity in Digital Narratives: A Critical Examination

Authors
Monteiro, AC; Carvalhais, M; Torres, R;

Publication
ADVANCES IN DESIGN, MUSIC AND ARTS III, EIMAD 2024, VOL 1

Abstract
The interaction between code and language shapes emergence and innovation in computational systems, turning them not merely into a series of connected structures but into narrative spaces. Interactive Digital Narratives (IDNs) are characterized by a tension between the control exerted by the system to engage readers and the autonomy that readers desire over the narrative's direction. This results in a ludic paradox, where the role of the narrative system is to enable and facilitate play while simultaneously being capable of communicating the outcomes of the readers' actions. On the other hand, the reader must be able to participate actively by playing along the system's rules. Based on the notion of interpassivity, which refers to the delegation of the cognitive activity to the object, thus transforming the reader into a passive observer of the system's interactions, this paper aims to explore the interplay between interpassivity and interactivity. As we navigate IDNs, we engage with narratives that challenge and empower readers, that create immersive and enriching experiences, and transform their relationships with the computational system. This contributes to understanding the pleasure of playing and the reader's role. Based on the premise that readers can derive pleasure from automation but also yearn for control over the narrative, we can investigate the playful interaction between humans and machines. This paper will analyze Emissaries (2015-2017), defined by its creator, Ian Cheng, as a video game that plays itself, and where the reader can seemingly only visualize the work. In this case study, we will look for narrative mechanics and the specificity of the medium in which the IDN is instantiated. We will discuss how the computational system actively shapes the narrative without direct reader input and consequently propose a reconceptualization of the concept of interpassivity and its relationship with interactivity.


2025

Automatic Generation of Loop Invariants in Dafny with Large Language Models

Authors
Faria, JP; Trigo, E; Abreu, R;

Publication
Fundamentals of Software Engineering - 11th IFIP WG 2.2 International Conference, FSEN 2025, Västerås, Sweden, April 7-8, 2025, Proceedings

Abstract
Recent verification tools aim to make formal verification more accessible for software engineers by automating most of the verification process. However, the manual work and expertise required to write verification helper code, such as loop invariants and auxiliary lemmas and assertions, remains a barrier. This paper explores the use of Large Language Models (LLMs) to automate the generation of loop invariants for programs in Dafny. We tested the approach on a curated dataset of 100 programs in Dafny involving arrays, strings, and numeric types. Using a multimodel approach that combines GPT-4o and Claude 3.5 Sonnet, correct loop invariants (passing the Dafny verifier) were generated at the first attempt for 92% of the programs, and in at most five attempts for 95% of the programs. Additionally, we developed an extension to the Dafny plugin for Visual Studio Code to incorporate automatic loop invariant generation into the IDE. Our work stands out from related approaches by handling a broader class of problems and offering IDE integration. © IFIP International Federation for Information Processing 2025.


2025

Spatio-Temporal Predictive Modeling Techniques for Different Domains: a Survey

Authors
Kumar, R; Bhanu, M; Mendes-Moreira, J; Chandra, J;

Publication
ACM COMPUTING SURVEYS

Abstract
Spatio-temporal prediction tasks play a crucial role in facilitating informed decision-making through anticipatory insights. By accurately predicting future outcomes, the ability to strategize, preemptively address risks, and minimize their potential impact is enhanced. The precision in forecasting spatial and temporal patterns holds significant potential for optimizing resource allocation, land utilization, and infrastructure development. While existing review and survey papers predominantly focus on specific forecasting domains such as intelligent transportation, urban planning, pandemics, disease prediction, climate and weather forecasting, environmental data prediction, and agricultural yield projection, limited attention has been devoted to comprehensive surveys encompassing multiple objects concurrently. This article addresses this gap by comprehensively analyzing techniques employed in traffic, pandemics, disease forecasting, climate and weather prediction, agricultural yield estimation, and environmental data prediction. Furthermore, it elucidates challenges inherent in spatio-temporal forecasting and outlines potential avenues for future research exploration.

