About

I obtained my BSc and MSc in Computer Science at the Faculty of Science of the University of Porto.

Since 2014 I have been working at INESC TEC, mainly on computer vision, and I am currently also a PhD student at the Faculty of Engineering of the University of Porto.

My main research interests are in Computer Vision, with an emphasis on Machine Learning and Virtual Reality.

Details

  • Name

    Américo José Pereira
  • Role

    Research Assistant
  • Since

    1st February 2014
Publications

2023

From a Visual Scene to a Virtual Representation: A Cross-Domain Review

Authors
Pereira, A; Carvalho, P; Pereira, N; Viana, P; Corte-Real, L;

Publication
IEEE ACCESS

Abstract
The widespread use of smartphones and other low-cost equipment as recording devices, the massive growth in bandwidth, and the ever-growing demand for new applications with enhanced capabilities, made visual data a must in several scenarios, including surveillance, sports, retail, entertainment, and intelligent vehicles. Despite significant advances in analyzing and extracting data from images and video, there is a lack of solutions able to analyze and semantically describe the information in the visual scene so that it can be efficiently used and repurposed. Scientific contributions have focused on individual aspects or addressing specific problems and application areas, and no cross-domain solution is available to implement a complete system that enables information passing between cross-cutting algorithms. This paper analyses the problem from an end-to-end perspective, i.e., from the visual scene analysis to the representation of information in a virtual environment, including how the extracted data can be described and stored. A simple processing pipeline is introduced to set up a structure for discussing challenges and opportunities in different steps of the entire process, allowing to identify current gaps in the literature. The work reviews various technologies specifically from the perspective of their applicability to an end-to-end pipeline for scene analysis and synthesis, along with an extensive analysis of datasets for relevant tasks.

2023

Synthesizing Human Activity for Data Generation

Authors
Romero, A; Carvalho, P; Corte-Real, L; Pereira, A;

Publication
JOURNAL OF IMAGING

Abstract
The problem of gathering sufficiently representative data, such as those about human actions, shapes, and facial expressions, is costly and time-consuming and also requires training robust models. This has led to the creation of techniques such as transfer learning or data augmentation. However, these are often insufficient. To address this, we propose a semi-automated mechanism that allows the generation and editing of visual scenes with synthetic humans performing various actions, with features such as background modification and manual adjustments of the 3D avatars to allow users to create data with greater variability. We also propose an evaluation methodology for assessing the results obtained using our method, which is two-fold: (i) the usage of an action classifier on the output data resulting from the mechanism and (ii) the generation of masks of the avatars and the actors to compare them through segmentation. The avatars were robust to occlusion, and their actions were recognizable and accurate to their respective input actors. The results also showed that even though the action classifier concentrates on the pose and movement of the synthetic humans, it strongly depends on contextual information to precisely recognize the actions. Generating the avatars for complex activities also proved problematic for action recognition and the clean and precise formation of the masks.

2022

Toward Vehicle Occupant-Invariant Models for Activity Characterization

Authors
Capozzi, L; Barbosa, V; Pinto, C; Pinto, JR; Pereira, A; Carvalho, PM; Cardoso, JS;

Publication
IEEE ACCESS

Abstract
With the advent of self-driving cars and the push by large companies into fully driverless transportation services, monitoring passenger behaviour in vehicles is becoming increasingly important for several reasons, such as ensuring safety and comfort. Although several human action recognition (HAR) methods have been proposed, developing a true HAR system remains a very challenging task. If the dataset used to train a model contains a small number of actors, the model can become biased towards these actors and their unique characteristics. This can cause the model to generalise poorly when confronted with new actors performing the same actions. This limitation is particularly acute when developing models to characterise the activities of vehicle occupants, for which data sets are short and scarce. In this study, we describe and evaluate three different methods that aim to address this actor bias and assess their performance in detecting in-vehicle violence. These methods work by removing specific information about the actor from the model's features during training or by using data that is independent of the actor, such as information about body posture. The experimental results show improvements over the baseline model when evaluated with real data. On the Hanau03 Vito dataset, the accuracy improved from 65.33% to 69.41%. On the Sunnyvale dataset, the accuracy improved from 82.81% to 86.62%.

2022

Boosting color similarity decisions using the CIEDE2000_PF Metric

Authors
Pereira, A; Carvalho, P; Corte-Real, L;

Publication
SIGNAL IMAGE AND VIDEO PROCESSING

Abstract
Color comparison is a key aspect in many areas of application, including industrial applications, and different metrics have been proposed. In many applications, this comparison is required to be closely related to human perception of color differences, thus adding complexity to the process. To tackle this, different approaches were proposed through the years, culminating in the CIEDE2000 formulation. In our previous work, we showed that simple color properties could be used to reduce the computational time of a color similarity decision process that employed this metric, which is recognized as having high computational complexity. In this paper, we show mathematically and experimentally that these findings can be adapted and extended to the recently proposed CIEDE2000 PF metric, which has been recommended by the CIE for industrial applications. Moreover, we propose new efficient models that not only achieve lower error rates, but also outperform the results obtained for the CIEDE2000 metric.

2021

Automatic TV Logo Identification for Advertisement Detection without Prior Data

Authors
Carvalho, P; Pereira, A; Viana, P;

Publication
APPLIED SCIENCES-BASEL

Abstract
Advertisements are often inserted in multimedia content, and this is particularly relevant in TV broadcasting as they have a key financial role. In this context, the flexible and efficient processing of TV content to identify advertisement segments is highly desirable as it can benefit different actors, including the broadcaster, the contracting company, and the end user. In this context, detecting the presence of the channel logo has been seen in the state-of-the-art as a good indicator. However, the difficulty of this challenging process increases as less prior data is available to help reduce uncertainty. As a result, the literature proposals that achieve the best results typically rely on prior knowledge or pre-existent databases. This paper proposes a flexible method for processing TV broadcasting content aiming at detecting channel logos, and consequently advertising segments, without using prior data about the channel or content. The final goal is to enable stream segmentation identifying advertisement slices. The proposed method was assessed over available state-of-the-art datasets as well as additional and more challenging stream captures. Results show that the proposed method surpasses the state-of-the-art.