Publications

2026

Turning web data into official statistics: Classifying Portuguese retail products with NLP models

Authors
Machado, JDU; Veloso, B;

Publication
STATISTICAL JOURNAL OF THE IAOS

Abstract
The growing availability of online data creates new opportunities to improve the timeliness and detail of official statistics, particularly in domains such as price monitoring and inflation measurement. However, leveraging web-scraped data for official use requires alignment with standardized classification frameworks such as the European Classification of Individual Consumption According to Purpose (ECOICOP). We train two natural-language models, a lightweight convolutional neural network (CNN) and a fine-tuned BERTimbau transformer, to classify Portuguese food and beverage items into ECOICOP categories. Using 100,000 product titles scraped from six national supermarket sites and labeled via a human-in-the-loop workflow, the CNN reaches a macro-F1 of 92.19 % with minimal computing cost, while the transformer attains 94.00 %, the first such result for Portuguese. Both models are published on Hugging Face, enabling reproducible inference at scale while the source data remain confidential. The study delivers the first open-source Portuguese ECOICOP classifiers for food and beverage products, a replicable low-resource labeling workflow, and a benchmark of accuracy-speed trade-offs to guide researchers in similar tasks.
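
The macro-F1 figures quoted in this abstract average the per-class F1 scores with equal weight, so rare ECOICOP categories count as much as frequent ones. The following is a minimal sketch of that metric in plain Python, not the authors' evaluation code; the function name `macro_f1` and the category codes in the usage lines are illustrative assumptions.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over every label seen in either list."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Illustrative labels only (assumed category codes, not data from the paper)
truth = ["01.1.1", "01.1.1", "01.1.4", "01.2.1"]
preds = ["01.1.1", "01.1.4", "01.1.4", "01.2.1"]
score = macro_f1(truth, preds)
```

Because each class contributes equally, a classifier that ignores a small category is penalised more heavily than plain accuracy would suggest.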

2026

Students’ perspective on satisfaction of a higher education institution: A preliminary statistical approach

Authors
Ana Catarina Fernandes; Manuel José Fonseca; Jorge Esparteiro Garcia; Helena Sofia Rodrigues;

Publication
AIP conference proceedings

Abstract

2026

Economic benchmarking of assisted pollination methods for kiwifruit flowers: Assessment of cost-effectiveness of robotic solution

Authors
Pinheiro, I; Moura, P; Rodrigues, L; Pacheco, AP; Teixeira, J; Valente, A; Cunha, M; Dos Santos, FN;

Publication
AGRICULTURAL SYSTEMS

Abstract
In 2023, global kiwifruit production reached over 4.4 million tonnes, highlighting the crop's significant economic importance. However, achieving high yields depends on adequate pollination. In Actinidia species, pollen is transferred by insects from male to female flowers on separate plants. Natural pollination faces increasing challenges due to the decline in pollinator populations and climate variability, driving the adoption of assisted pollination methods. This study examines the Portuguese kiwifruit sector, one of the world's top 12 producers, using a novel mixed-methods approach that integrates both qualitative and quantitative analyses to assess the feasibility of robotic pollination. The qualitative study identifies the benefits and challenges of current methods and explores how robotic pollination could address these challenges. The quantitative analysis explores the cost-effectiveness and practicality of implementing robotic pollination as a product and service. Findings indicate that most farmers use handheld pollination devices but face pollen wastage and application timing challenges. Economic analysis establishes a break-even point of €685 per hectare for an annual single application, with a first robotic pollination of €17 146 becoming cost-effective for orchards of at least 3.5 hectares and a second robotic solution of €34 293 becoming cost-effective for orchards up to 7 hectares. A robotic pollination service priced at €685 per hectare per application presents a low-risk and viable alternative for growers. This study provides robust economic insights supporting the adoption of robotic pollination technologies. This study is crucial to make informed decisions to enhance kiwifruit production's productivity and sustainability through precise robotic-assisted pollination.

2026

Connecting the dots: Logistics, blue, and circular economy in fishing activities

Authors
Alves, W; Gomes, A; Garcia, J;

Publication
New Economics for Sustainability

Abstract

2026

ChatBot for student service based on RASA framework

Authors
Rodrigues, F; Fonseca, J;

Publication
KNOWLEDGE AND INFORMATION SYSTEMS

Abstract
The limited in-person availability of administrative services at higher education institutions can delay the resolution of student queries and reduce satisfaction levels. To address this issue, we developed a conversational agent capable of understanding and responding to student questions in Portuguese using natural language processing and machine learning techniques. To enable non-technical management of the agent's knowledge base, a web-based service was implemented, allowing staff to update content and trigger model retraining. The system was evaluated by comparing multiple learning models, with the best performance achieved using Google's BERT language model combined with the DIET classifier, yielding an F1-score of 0.965. In a real-world deployment involving 256 questions, the chatbot achieved approximately 70% accuracy and received an average user satisfaction rating of 4.20 on a 0-5 scale. These results demonstrate the effectiveness of the proposed solution for improving accessibility and efficiency in academic student services.

2026

Online Data Augmentation for Forecasting with Deep Learning

Authors
Cerqueira, V; Santos, M; Roque, L; Baghoussi, Y; Soares, C;

Publication
PROGRESS IN ARTIFICIAL INTELLIGENCE, EPIA 2025, PT I

Abstract
Deep learning approaches are increasingly used to tackle forecasting tasks but require substantial training data. When samples are limited, synthetic data generation techniques can effectively augment datasets to improve model performance. Data augmentation is typically applied offline before training a model. However, when training with mini-batches, some batches may contain a disproportionate number of synthetic samples that do not align well with the original data characteristics. This work introduces an online data augmentation framework that generates synthetic samples during the training of neural networks. By creating synthetic samples for each batch alongside their original counterparts, we maintain a balanced representation between real and synthetic data throughout the training process. This approach fits naturally with the iterative nature of neural network training and eliminates the need to store large augmented datasets. We validated the proposed framework using 3797 time series from 6 benchmark datasets, three neural architectures, and seven synthetic data generation techniques. The experiments suggest that online data augmentation leads to better forecasting performance compared to offline data augmentation or no augmentation approaches. The framework and experiments are publicly available.
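
The per-batch idea described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' published framework: the generator `jitter` (Gaussian noise scaled by each series' standard deviation) stands in for the seven generation techniques, which the abstract does not name, and `online_augmented_batches` is a hypothetical helper.

```python
import numpy as np


def jitter(batch, rng, scale=0.05):
    """Assumed generator: add Gaussian noise scaled by each series' std."""
    std = batch.std(axis=1, keepdims=True)
    return batch + rng.normal(0.0, 1.0, size=batch.shape) * scale * std


def online_augmented_batches(series, batch_size, rng, generator=jitter):
    """Yield mini-batches holding each original sample plus one synthetic copy.

    Synthetic samples are created when the batch is drawn, so nothing
    augmented is stored and every batch keeps a 1:1 real/synthetic ratio.
    """
    order = rng.permutation(len(series))
    for start in range(0, len(series), batch_size):
        original = series[order[start:start + batch_size]]
        synthetic = generator(original, rng)
        yield np.concatenate([original, synthetic], axis=0)


rng = np.random.default_rng(0)
data = rng.normal(size=(10, 24))  # 10 toy series windows of length 24
batches = list(online_augmented_batches(data, batch_size=4, rng=rng))
```

In a training loop, each yielded array would be passed to the optimizer step in place of the raw batch. Offline augmentation would instead enlarge `data` once before shuffling, which is how a batch can end up dominated by synthetic samples.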
