Publications

Publications by CEGI

2022

Best Frame Selection to Enhance Training Step Efficiency in Video-Based Human Action Recognition

Authors
Gharahbagh, AA; Hajihashemi, V; Ferreira, MC; Machado, JJM; Tavares, JMRS;

Publication
APPLIED SCIENCES-BASEL

Abstract
In recent years, with the growth of digital media and modern imaging equipment, the use of video processing algorithms and semantic film and image management has expanded. The use of different video datasets in training artificial intelligence algorithms is also rapidly expanding in various fields. Due to the high volume of information in a video, its processing is still expensive for most hardware systems, mainly in terms of required runtime and memory. Hence, the optimal selection of keyframes to minimize redundant information in video processing systems has become important in mitigating this problem. Eliminating some frames can simultaneously reduce the required computational load, hardware cost, memory and processing time of intelligent video-based systems. For these reasons, this research proposes a method for selecting keyframes and adaptively cropping input video for human action recognition (HAR) systems. The proposed method combines edge detection, simple difference, adaptive thresholding and 1D and 2D average filter algorithms in a hierarchical method. Some HAR methods are trained with videos processed by the proposed method to assess its efficiency. The results demonstrate that the application of the proposed method increases the accuracy of the HAR system by up to 3% compared to random image selection and cropping methods. Additionally, in most cases, the proposed method reduces the training time of the machine learning algorithm used.
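The abstract names simple difference, a 1D average filter and adaptive thresholding as building blocks but does not specify how they are combined. The following is only a minimal sketch of that general idea, not the authors' hierarchical method: the function name `select_keyframes`, the parameters `window` and `k`, the grayscale `(T, H, W)` input layout and the mean-plus-`k`-standard-deviations threshold are all assumptions made for illustration. Edge detection, the 2D filter and cropping are omitted.

```python
import numpy as np

def select_keyframes(frames, window=3, k=0.5):
    """Return indices of frames whose smoothed change score exceeds an adaptive threshold.

    `frames` is assumed to be a grayscale stack of shape (T, H, W).
    """
    frames = np.asarray(frames, dtype=np.float64)
    # Simple difference: mean absolute change between consecutive frames.
    diffs = np.abs(frames[1:] - frames[:-1]).mean(axis=(1, 2))
    # 1D average filter over the per-frame change scores.
    kernel = np.ones(window) / window
    smooth = np.convolve(diffs, kernel, mode="same")
    # Adaptive threshold derived from the scores themselves (assumed form).
    threshold = smooth.mean() + k * smooth.std()
    # diffs[i] measures the change from frame i to frame i + 1, so shift by one.
    return [int(i) + 1 for i in np.flatnonzero(smooth > threshold)]

# Usage: a static clip with one abrupt change at frame 5.
clip = np.zeros((10, 4, 4))
clip[5:] = 1.0
keyframes = select_keyframes(clip, window=1)
```

With `window=1` the filter is a no-op, so only the frame at the abrupt change is selected; larger windows also pick up its neighbours.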

2022

Binaural Acoustic Scene Classification Using Wavelet Scattering, Parallel Ensemble Classifiers and Nonlinear Fusion

Authors
Hajihashemi, V; Gharahbagh, AA; Cruz, PM; Ferreira, MC; Machado, JJM; Tavares, JMRS;

Publication
SENSORS

Abstract
The analysis of ambient sounds can be very useful when developing sound-based intelligent systems. Acoustic scene classification (ASC) is defined as identifying the area of a recorded sound or clip among some predefined scenes. ASC has huge potential to be used in urban sound event classification systems. This research presents a hybrid method that includes a novel mathematical fusion step, which aims to tackle the challenges of ASC accuracy and adaptability in current state-of-the-art models. The proposed method uses a stereo signal, two ensemble classifiers (random subspace), and a novel mathematical fusion step. In the proposed method, a stable, invariant signal representation of the stereo signal is built using the Wavelet Scattering Transform (WST). For each mono channel, i.e., left and right, a different random subspace classifier is trained using WST. A novel mathematical formula for the fusion step was developed, with its parameters found using a genetic algorithm. The results on the DCASE 2017 dataset showed that the proposed method has higher classification accuracy (about 95%), pushing the boundaries of existing methods.

2022

Restart: A Route Planner to Encourage the Use of Public Transport Services in a Pandemic Context

Authors
Fulgêncio, R; Ferreira, MC; Abrantes, D; Coimbra, M;

Publication
Transportation Research Procedia

Abstract
Public transport services play an important role in the mobility of the population in urban centers, allowing a decrease in the number of private vehicles in circulation and contributing to a more sustainable mobility. However, the emergence of the COVID-19 pandemic had a serious impact on the mobility habits of the population, with a substantial reduction in the number of public transport passengers due to the fear of contagion, which raises questions about the future sustainability of cities. Thus, it is essential to restore the confidence of travelers to feel safe and comfortable using public transport services. Taking advantage of the widespread use of mobile technologies, this article intends to propose a route planning system for public transport that meets the needs of passengers in terms of safety and comfort. After a systematic review of the existing literature and a series of focus group sessions, a prototype of the system was developed, and subsequently evaluated by potential users through usability tests. The results obtained are a good indicator of the system's functionality and ease of use. This assessment allowed us to corroborate the potential that the proposed route planning system has in promoting the use of public transport services as a means of mobility.

2022

Traffic State Prediction Using One-Dimensional Convolution Neural Networks and Long Short-Term Memory

Authors
Reza, S; Ferreira, MC; Machado, JJM; Tavares, JMRS;

Publication
APPLIED SCIENCES-BASEL

Abstract
Traffic prediction is a keystone of an intelligent transportation system (ITS). It aims to improve travel route selection, reduce overall carbon emissions, mitigate congestion, and enhance safety. However, efficiently modelling traffic flow is challenging due to its dynamic and non-linear behaviour. With the availability of a vast number of data samples, deep neural network-based models are best suited to solve these challenges. However, conventional network-based models lack robustness and accuracy because they cannot capture the spatial and temporal correlations of traffic. Besides, they usually require data from adjacent roads to achieve accurate predictions. Hence, this article presents a one-dimensional (1D) convolution neural network (CNN) and long short-term memory (LSTM)-based traffic state prediction model, which was evaluated using the Zenodo and PeMS datasets. The model used three stacked layers of 1D CNN, and LSTM with a logarithmic hyperbolic cosine loss function. The 1D CNN layers extract the features from the data, and the LSTM retains past events so that they can be combined with the learnt features for traffic state prediction. A comparative performance analysis of the proposed model against support vector regression, standard LSTM, gated recurrent units (GRUs), and CNN and GRU-based models under the same conditions is also presented. The results demonstrate very encouraging performance of the proposed model, improving the mean absolute error, root mean squared error, mean absolute percentage error, and coefficient of determination scores by a mean of 16.97%, 52.1%, 54.15%, and 7.87%, respectively, relative to the baselines under comparison.
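For readers unfamiliar with the logarithmic hyperbolic cosine (log-cosh) loss mentioned above, a minimal sketch follows. It computes the mean of log(cosh(prediction - target)) using the identity log(cosh(x)) = |x| + log(1 + exp(-2|x|)) - log(2), which avoids overflow for large errors. The function name `log_cosh_loss` is illustrative, and the code is not taken from the paper.

```python
import math

def log_cosh_loss(y_true, y_pred):
    """Mean log-cosh of the prediction errors, computed in an overflow-safe form."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for target, pred in zip(y_true, y_pred):
        err = abs(pred - target)
        # log(cosh(err)) rewritten so that exp() never receives a large positive argument.
        total += err + math.log1p(math.exp(-2.0 * err)) - math.log(2.0)
    return total / len(y_true)

# Usage: small errors behave roughly like err**2 / 2, large errors like |err| - log(2).
small = log_cosh_loss([0.0], [0.01])
large = log_cosh_loss([0.0], [1000.0])
```

This mix of quadratic behaviour near zero and linear behaviour for large errors is why the loss is often described as less sensitive to outliers than mean squared error.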

2022

A multi-head attention-based transformer model for traffic flow forecasting with a comparative analysis to recurrent neural networks

Authors
Reza, S; Ferreira, MC; Machado, JJM; Tavares, JMRS;

Publication
EXPERT SYSTEMS WITH APPLICATIONS

Abstract
Traffic flow forecasting is an essential component of an intelligent transportation system to mitigate congestion. Recurrent neural networks, particularly gated recurrent units and long short-term memory, have been the state-of-the-art traffic flow forecasting models for the last few years. However, a more sophisticated and resilient model is necessary to effectively capture long-range correlations in the time-series data under analysis. The dominant performance of transformers in natural language processing, where they overcome the drawbacks of recurrent neural networks, might address this need and lead to successful time-series forecasting. This article presents a multi-head attention-based transformer model for traffic flow forecasting, with a comparative analysis against gated recurrent unit and long short-term memory-based models on the PeMS dataset. The model uses 5 heads with 5 identical layers of encoder and decoder and relies on square subsequent masking. The results demonstrate the promising performance of the transformer-based model in predicting long-term traffic flow patterns effectively after feeding it with a substantial amount of data. It also demonstrates its worthiness by improving the mean squared errors and mean absolute percentage errors by (1.25 - 47.8)% and (32.4 - 83.8)%, respectively, relative to the current baselines.
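As an illustration of the square subsequent masking mentioned above, the sketch below builds the additive attention mask in plain Python: position i may attend to positions 0..i (mask value 0.0) and is blocked from later positions (mask value negative infinity, which becomes zero weight after the softmax). The function name `square_subsequent_mask` is illustrative; the paper's own implementation is not shown in the abstract.

```python
def square_subsequent_mask(size):
    """Return a size x size additive mask that hides future positions."""
    neg_inf = float("-inf")
    # Row i is the query position, column j the key position.
    return [[0.0 if j <= i else neg_inf for j in range(size)]
            for i in range(size)]

# Usage: a mask for a sequence of three time steps.
mask = square_subsequent_mask(3)
for row in mask:
    print(row)
```

In a forecasting setting this keeps each decoder step from seeing the future traffic values it is supposed to predict.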

2022

A customized residual neural network and bi-directional gated recurrent unit-based automatic speech recognition model

Authors
Reza, S; Ferreira, MC; Machado, JJM; Tavares, JMRS;

Publication
EXPERT SYSTEMS WITH APPLICATIONS

Abstract
Speech recognition aims to convert human speech into text and has applications in security, healthcare, commerce, automobiles, and technology, to name just a few. Inserting residual neural networks before recurrent neural network cells improves accuracy and cuts training time by a good margin. Furthermore, layer normalization instead of batch normalization is more effective in model training and performance enhancement. Also, the size of the datasets strongly influences the achievable performance. Leveraging these techniques, this article proposes an automatic speech recognition model with five stacked layers of customized Residual Convolution Neural Network and seven layers of Bi-Directional Gated Recurrent Units, including a logarithmic softmax for the model output. Each of them incorporates a learnable per-element affine parameter-based layer normalization technique. The training and testing of the new model were conducted on the LibriSpeech corpus and LJ Speech dataset. The experimental results demonstrate a character error rate (CER) of 4.7% and 3.61% on the two datasets, respectively, with only 33 million parameters and without requiring any external language model.
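The character error rate reported above is conventionally defined as the character-level edit distance between the recognized text and the reference transcript, divided by the reference length. The sketch below shows that standard definition; the function name `character_error_rate` is illustrative, the result is a fraction rather than a percentage, and any text normalization the authors applied before scoring is not known from the abstract.

```python
def character_error_rate(reference, hypothesis):
    """Levenshtein distance between the two strings divided by the reference length."""
    if not reference:
        raise ValueError("reference must be non-empty")
    # prev[j] holds the distance between the processed reference prefix and hypothesis[:j].
    prev = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        curr = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            substitution = prev[j - 1] + (0 if ref_char == hyp_char else 1)
            deletion = prev[j] + 1        # reference character missing from hypothesis
            insertion = curr[j - 1] + 1   # extra character in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(reference)

# Usage: three edits against a six-character reference.
cer = character_error_rate("kitten", "sitting")
```

Multiplying the returned fraction by 100 gives a percentage comparable in form to the figures quoted in the abstract.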
