Publications by LIAAD

2024

A Legal Framework for Natural Language Processing Model Training in Portugal

Authors
Almeida, R; Amorim, E;

Publication
Legal and Ethical Issues in Human Language Technologies 2024, LEGAL 2024 at LREC-COLING 2024 - Workshop Proceedings

Abstract
Recent advances in deep learning have promoted the advent of many computational systems capable of performing intelligent actions that, until recently, were restricted to the human intellect. In the particular case of human languages, these advances allowed the introduction of applications like ChatGPT that are capable of generating coherent text without being explicitly programmed to do so. Instead, these models use large volumes of textual data to learn meaningful representations of human languages. Associated with these advances, concerns about copyright and data privacy infringements caused by these applications have emerged. Despite these concerns, the pace at which new natural language processing applications continue to be developed has far outpaced the introduction of new regulations. Today, communication barriers between legal experts and computer scientists lead to many unintentional legal infringements during the development of such applications. In this paper, a multidisciplinary team intends to bridge this communication gap and promote more compliant Portuguese NLP research by presenting a series of everyday NLP use cases, while highlighting the Portuguese legislation that may apply during their development. © 2024 ELRA Language Resource Association.

2024

KDBI special issue: Time-series pattern verification in CNC turning-A comparative study of one-class and binary classification

Authors
da Silva, JP; Nogueira, AR; Pinto, J; Curral, M; Alves, AC; Sousa, R;

Publication
EXPERT SYSTEMS

Abstract
Integrating Industry 4.0 and Quality 4.0 optimises manufacturing through IoT and ML, improving processes and product quality. The primary challenge involves identifying patterns in computer numerical control (CNC) machining time-series data to boost manufacturing quality control. The proposed solution involves an experimental study comparing one-class and binary classification algorithms. This study aims to classify time-series data from CNC turning machines, offering insight into monitoring and adjusting tool wear to maintain product quality. The methodology entails extracting spectral features from time-series data to train both one-class and binary classification algorithms, assessing their effectiveness and computational efficiency. Although certain models consistently outperform others, no single best-performing model can be identified, as a trade-off between classification and computational performance is observed, with gradient boosting standing out for effectively balancing both aspects. Thus, the choice between one-class and binary classification ultimately depends on the dataset's features and the task objectives.
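The abstract does not say which spectral features the authors extract, so the following is only a minimal sketch of the general idea: turn a raw time series into a few frequency-domain descriptors that a classifier could consume. The function names, the chosen features (dominant frequency, spectral centroid, energy) and the `sample_rate` parameter are all assumptions made for illustration, not the paper's pipeline.

```python
import cmath
import math


def magnitude_spectrum(signal):
    """Magnitudes of the one-sided discrete Fourier transform (naive O(n^2) DFT)."""
    n = len(signal)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(signal)))
        for k in range(n // 2 + 1)
    ]


def spectral_features(signal, sample_rate):
    """Hypothetical feature set: dominant frequency, spectral centroid, energy."""
    n = len(signal)
    mags = magnitude_spectrum(signal)
    freqs = [k * sample_rate / n for k in range(len(mags))]
    total = sum(mags)
    # Skip the DC component (k = 0) when looking for the dominant frequency.
    peak = max(range(1, len(mags)), key=mags.__getitem__)
    centroid = sum(f * m for f, m in zip(freqs, mags)) / total if total else 0.0
    return {
        "dominant_freq": freqs[peak],
        "centroid": centroid,
        "energy": sum(m * m for m in mags),
    }


# Example: a pure 5 Hz sine sampled at 64 Hz for one second.
signal = [math.sin(2 * math.pi * 5 * t / 64) for t in range(64)]
features = spectral_features(signal, sample_rate=64)
```

Feature vectors like `features` could then be passed to either a one-class or a binary classifier; which of the two is appropriate is the question the study itself examines.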

2024

Multilayer quantile graph for multivariate time series analysis and dimensionality reduction

Authors
Silva, VF; Silva, ME; Ribeiro, P; Silva, F;

Publication
INTERNATIONAL JOURNAL OF DATA SCIENCE AND ANALYTICS

Abstract
In recent years, there has been a surge in the prevalence of high- and multidimensional temporal data across various scientific disciplines. These datasets are characterized by their vast size and challenging potential for analysis. Such data typically exhibit serial and cross-dependency and possess high dimensionality, thereby introducing additional complexities to conventional time series analysis methods. To address these challenges, a recent and complementary approach has emerged, known as network-based analysis methods for multivariate time series. In univariate settings, quantile graphs have been employed to capture temporal transition properties and reduce data dimensionality by mapping observations to a smaller set of sample quantiles. To confront the increasingly prominent issue of high dimensionality, we propose an extension of quantile graphs into a multivariate variant, which we term Multilayer Quantile Graphs. In this innovative mapping, each time series is transformed into a quantile graph, and inter-layer connections are established to link contemporaneous quantiles of pairwise series. This enables the analysis of dynamic transitions across multiple dimensions. In this study, we demonstrate the effectiveness of this new mapping using synthetic and benchmark multivariate time series datasets. We delve into the resulting network's topological structures, extract network features, and employ these features for original dataset analysis. Furthermore, we compare our results with a recent method from the literature. The resulting multilayer network offers a significant reduction in the dimensionality of the original data while capturing serial and cross-dimensional transitions. This approach facilitates the characterization and analysis of large multivariate time series datasets through network analysis techniques.
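As a rough illustration of the mapping described in this abstract, the sketch below assigns each observation to a sample-quantile bin, counts transitions between consecutive bins within one series (intra-layer edges), and counts bins that occur at the same time step in two series (inter-layer edges). The function names, the simple order-statistic rule for the cut points and the use of raw counts as edge weights are assumptions; the paper's exact construction may differ.

```python
from bisect import bisect_right
from collections import Counter


def quantile_bins(series, n_quantiles):
    """Map each observation to the index of its sample-quantile bin (0 .. n_quantiles - 1)."""
    ordered = sorted(series)
    n = len(ordered)
    # Interior cut points taken from order statistics (an assumed, simple rule).
    cuts = [ordered[(k * n) // n_quantiles] for k in range(1, n_quantiles)]
    return [bisect_right(cuts, x) for x in series]


def intra_layer_edges(bins):
    """Count transitions between the bins of consecutive observations in one series."""
    return Counter(zip(bins[:-1], bins[1:]))


def inter_layer_edges(bins_a, bins_b):
    """Count pairs of bins observed at the same time step in two series."""
    return Counter(zip(bins_a, bins_b))


# Two toy series: one increasing, one decreasing.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [8, 7, 6, 5, 4, 3, 2, 1]
bx, by = quantile_bins(x, 2), quantile_bins(y, 2)

layer_x = intra_layer_edges(bx)      # edges inside the layer for x
cross_xy = inter_layer_edges(bx, by)  # edges linking the layers of x and y
```

With `n_quantiles` much smaller than the series length, each layer has only `n_quantiles` nodes regardless of how many observations there are, which is where the dimensionality reduction mentioned in the abstract comes from.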

2024

Predicting macroeconomic indicators from online activity data: A review

Authors
Costa, EA; Silva, ME;

Publication
Statistical Journal of the IAOS

Abstract
Predictors of macroeconomic indicators rely primarily on traditional data sourced from National Statistical Offices. However, new data sources made available from recent technological advancements, namely data from online activities, have the potential to bring about fresh perspectives on monitoring economic activities and enhance the accuracy of forecasting. This paper reviews the literature on predicting macroeconomic indicators, such as the gross domestic product, unemployment rate, consumer price index or private consumption, based on online activity data sourced from Google Trends, Twitter (rebranded to X) and mobile devices. Based on a systematic search of publications indexed on the Web of Science and Scopus databases, the analysis of a final set of 56 publications covers the publication history of the data sources, the methods used to model the data and the predictive accuracy of information from such data sources. The paper also discusses the limitations and challenges of using online activity data for macroeconomic predictions. The review concludes that online activity data can be a valuable source of information for predicting macroeconomic indicators. However, one must consider certain limitations and challenges to improve the models' accuracy and reliability. © 2024 - IOS Press. All rights reserved.

2024

Real-time nowcasting the monthly unemployment rates with daily Google Trends data

Authors
Costa, EA; Silva, ME; Galvao, AB;

Publication
SOCIO-ECONOMIC PLANNING SCIENCES

Abstract
Policymakers often have to make decisions based on incomplete economic data because of the usual delay in publishing official statistics. To circumvent this issue, researchers use data from Google Trends (GT) as an early indicator of economic performance. Such data have emerged in the literature as alternative and complementary predictors of macroeconomic outcomes, such as the unemployment rate, featuring readiness, public availability and no costs. This study deals with extensive daily GT data to develop a framework to nowcast monthly unemployment rates tailored to work with real-time data availability, resorting to Mixed Data Sampling (MIDAS) regressions. Portugal is chosen as a use case for the methodology since extracting GT data requires the selection of culturally dependent keywords. The nowcasting period spans 2019 to 2021, encompassing the time frame in which the coronavirus pandemic began. The findings indicate that using daily GT data with MIDAS provides timely and accurate insights into the unemployment rate, especially during the COVID-19 pandemic, showing accuracy gains even when compared to nowcasts obtained from typical monthly GT data via traditional ARMAX models.
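To make the MIDAS idea more concrete, the sketch below shows how a window of daily values can be collapsed into a single monthly regressor using a parameterised lag-weighting function. The abstract does not state which weighting scheme the authors use; the exponential Almon polynomial shown here is one common choice in the MIDAS literature and is an assumption, as are the function names and parameter values.

```python
import math


def exp_almon_weights(n_lags, theta1, theta2):
    """Exponential Almon lag weights: w_j proportional to exp(theta1*j + theta2*j^2), summing to 1."""
    raw = [math.exp(theta1 * j + theta2 * j * j) for j in range(1, n_lags + 1)]
    total = sum(raw)
    return [r / total for r in raw]


def midas_aggregate(daily_values, weights):
    """Weighted sum of the most recent daily values; weights[0] applies to the latest day."""
    if len(daily_values) < len(weights):
        raise ValueError("need at least as many daily values as weights")
    recent = daily_values[-len(weights):][::-1]  # newest observation first
    return sum(w * x for w, x in zip(weights, recent))


# Hypothetical daily search-interest values for the last four days.
daily = [1.0, 2.0, 3.0, 4.0]

flat = exp_almon_weights(4, 0.0, 0.0)       # theta = (0, 0) gives equal weights
decaying = exp_almon_weights(4, -0.5, 0.0)  # negative theta1 favours recent days

x_flat = midas_aggregate(daily, flat)
x_recent = midas_aggregate(daily, decaying)
```

In a MIDAS regression the parameters `theta1` and `theta2` are estimated from the data together with the regression coefficients, so the model itself decides how quickly the influence of older daily observations decays.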

2024

Implications of seasonal and daily variation on methane and ammonia emissions from naturally ventilated dairy cattle barns in a Mediterranean climate: A two-year study

Authors
Rodrigues, ARF; Silva, ME; Silva, VF; Maia, MRG; Cabrita, ARJ; Trindade, H; Fonseca, AJM; Pereira, JLS;

Publication
SCIENCE OF THE TOTAL ENVIRONMENT

Abstract
Seasonal and daily variations of gaseous emissions from naturally ventilated dairy cattle barns are important figures for the establishment of effective and specific mitigation plans. The present study aimed to measure methane (CH₄) and ammonia (NH₃) emissions in three naturally ventilated dairy cattle barns covering the four seasons for two consecutive years. In each barn, air samples from five indoor locations were drawn by a multipoint sampler to a photoacoustic infrared multigas monitor, along with temperature and relative humidity. Milk production data were also recorded. Results showed seasonal differences for CH₄ and NH₃ emissions in the three barns with no clear trends within years. Globally, diel CH₄ emissions increased in the daytime with high intra-hour variability. The average hourly CH₄ emissions (g h⁻¹ livestock unit⁻¹ (LU)) varied from 8.1 to 11.2 and 6.2 to 20.3 in the dairy barn 1, from 10.1 to 31.4 and 10.9 to 22.8 in the dairy barn 2, and from 1.5 to 8.2 and 13.1 to 22.1 in the dairy barn 3, respectively, in years 1 and 2. Diel NH₃ emissions varied highly within hours and increased in the daytime. The average hourly NH₃ emissions (g h⁻¹ LU⁻¹) varied from 0.78 to 1.56 and 0.50 to 1.38 in the dairy barn 1, from 1.04 to 3.40 and 0.93 to 1.98 in the dairy barn 2, and from 0.66 to 1.32 and 1.67 to 1.73 in the dairy barn 3, respectively, in years 1 and 2. Moreover, the emission factors of CH₄ and NH₃ were 309.5 and 30.6 (g day⁻¹ LU⁻¹), respectively, for naturally ventilated dairy cattle barns. Overall, this study provided a detailed characterization of seasonal and daily gaseous emissions variations, highlighting the need for future longitudinal emission studies and identifying an opportunity to better adapt the existing mitigation strategies according to season and daytime.