About

Miguel Guimarães completed his bachelor's and master's degrees in Informatics Engineering in 2022 at the Escola Superior de Tecnologia e Gestão of the Politécnico do Porto. His master's work was carried out in the area of Machine Learning (ML). Specifically, it addresses explainability in ML and explores meta-learning as a way to tackle some of the current challenges in ML, namely those posed by data streaming. He is currently an Invited Assistant Professor at the same institution. He has a strong passion for research, having participated as a Research Fellow in 2 projects. Miguel has authored several publications in the areas of machine learning and hybrid systems. He has published 5 journal articles, 6 book chapters, and 3 conference papers, and has received 2 best paper awards.

He is currently a researcher at INESC TEC, where he has worked on developing and applying artificial intelligence techniques in industrial settings, gaining on-the-job knowledge and combining academic theory with practical experience. More recently, he has been investigating the application and the drawbacks of generative AI (GAI) models in industry.

Specifically, he focuses on a view centred on people and on the organisational context, incorporating sociotechnical factors into the GAI life cycle so that model outputs are adapted to the user's specific context. He hopes this will encourage wider adoption of GAI, such as large language models (LLMs), in industrial environments.

Details

  • Name

    Miguel Ângelo Guimarães
  • Position

    Research Assistant
  • Since

    1 June 2023
Publications

2024

Supervised and unsupervised techniques in textile quality inspections

Authors
Ferreira, M; Carneiro, R; Guimarães, M; Oliveira, V;

Publication
Procedia Computer Science

Abstract
Quality inspection is a critical step in ensuring the quality and efficiency of textile production processes. With the increasing complexity and scale of modern textile manufacturing systems, the need for accurate and efficient quality inspection and defect detection techniques has become paramount. This paper compares supervised and unsupervised Machine Learning techniques for defect detection in the context of industrial textile production, in terms of their respective advantages and disadvantages, and their implementation and computational costs. We explore the use of an autoencoder for the detection of defects in textiles. The goal of this preliminary work is to find out whether unsupervised methods can successfully train models with good performance without the need for defect-labelled data. © 2024 The Authors. Published by Elsevier B.V. This is an open access article under the CC BY-NC-ND license (https://creativecommons.org/licenses/by-nc-nd/4.0)

2023

Real-Time Algorithm Recommendation Using Meta-Learning

Authors
Palumbo, G; Guimarães, M; Carneiro, D; Novais, P; Alves, V;

Publication
Ambient Intelligence - Software and Applications - 13th International Symposium on Ambient Intelligence

Abstract
As the field of Machine Learning evolves, the number of available learning algorithms and their parameters continues to grow. On the one hand, this is positive as it allows for the finding of potentially more accurate models. On the other hand, however, it also makes the process of finding the right model more complex, given the number of possible configurations. Traditionally, data scientists rely on trial-and-error or brute force procedures, which are costly, or on their own intuition or expertise, which is hard to acquire. In this paper we propose an approach for algorithm recommendation based on meta-learning. The approach can be used in real-time to predict the best n algorithms (based on a selected performance metric) and their configuration, for a given ML problem. We evaluate it through cross-validation, and by comparing it against an Auto ML approach, in terms of accuracy and time. Results show that the proposed approach recommends algorithms that are similar to those of traditional approaches, in terms of performance, in just a fraction of the time.

2023

Algorithm Recommendation and Performance Prediction Using Meta-Learning

Authors
Palumbo, G; Carneiro, D; Guimarães, M; Alves, V; Novais, P;

Publication
International Journal of Neural Systems

Abstract
In recent years, the number of machine learning algorithms and their parameters has increased significantly. On the one hand, this increases the chances of finding better models. On the other hand, it increases the complexity of the task of training a model, as the search space expands significantly. As the size of datasets also grows, traditional approaches based on extensive search start to become prohibitively expensive in terms of computational resources and time, especially in data streaming scenarios. This paper describes an approach based on meta-learning that tackles two main challenges. The first is to predict key performance indicators of machine learning models. The second is to recommend the best algorithm/configuration for training a model for a given machine learning problem. When compared to a state-of-the-art method (AutoML), the proposed approach is up to 130x faster and only 4% worse in terms of average model quality. Hence, it is especially suited for scenarios in which models need to be updated regularly, such as in streaming scenarios with big data, in which some accuracy can be traded for a much shorter model training time.

2023

The Impact of Data Selection Strategies on Distributed Model Performance

Authors
Guimarães, M; Oliveira, F; Carneiro, D; Novais, P;

Publication
Lecture Notes in Networks and Systems

Abstract
Distributed Machine Learning, in which data and learning tasks are scattered across a cluster of computers, is one of the field's answers to the challenges posed by Big Data. Still, in an era in which data abounds, decisions must be made regarding which specific data to use for training the model, either because the amount of available data is simply too large, or because the training time or complexity of the model must be kept low. Typical approaches include, for example, selection based on data freshness. However, old data are not necessarily outdated and might still contain relevant patterns. Likewise, relying only on recent data may significantly decrease data diversity and representativity, and decrease model quality. The goal of this paper is to compare different heuristics for selecting data in a distributed Machine Learning scenario. Specifically, we ascertain whether selecting data based on their characteristics (meta-features), and optimizing for maximum diversity, improves model quality while, eventually, allowing model complexity to be reduced. This will allow the development of more informed data selection strategies in distributed settings, in which the criteria are not only the location of the data or the state of each node in the cluster, but also include intrinsic and relevant characteristics of the data. © The Author(s), under exclusive license to Springer Nature Switzerland AG 2023.

2023

Block size, parallelism and predictive performance: finding the sweet spot in distributed learning

Authors
Oliveira, F; Carneiro, D; Guimarães, M; Oliveira, O; Novais, P;

Publication
International Journal of Parallel, Emergent and Distributed Systems

Abstract
As distributed and multi-organization Machine Learning emerges, new challenges must be solved, such as diverse and low-quality data or real-time delivery. In this paper, we use a distributed learning environment to analyze the relationship between block size, parallelism, and predictor quality. Specifically, the goal is to find the optimum block size and the best heuristic to create distributed Ensembles. We evaluated three different heuristics and five block sizes on four publicly available datasets. Results show that using fewer but better base models matches or outperforms a standard Random Forest, and that 32 MB is the best block size.