Publications - INESC TEC

Publications

Publications by Luís Pimentel Trigo

2015

Retrieval, visualization and validation of affinities between documents

Authors
Trigo, L; Víta, M; Sarmento, R; Brazdil, P;

Publication
IC3K 2015 - Proceedings of the 7th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management

Abstract
We present an Information Retrieval tool that helps users search for particular information of interest. Our system processes a given set of documents to produce a graph in which nodes represent documents and links represent the similarities between them. The aim is to offer the user a tool for navigating this space easily. It is possible to collapse/expand nodes. Our case study shows affinity groups based on the similarities of text production of researchers. This goes beyond the already established communities revealed by co-authorship. The system characterizes the activity of each author by a set of automatically generated keywords and by membership to a particular affinity group. The importance of each author is highlighted visually by the size of the node corresponding to the number of publications and different measures of centrality. Regarding the validation of the method, we analyse the impact of using different combinations of titles, abstracts and keywords on capturing the similarity between researchers.


2014

A comprehensive workflow for enhancing business bankruptcy prediction

Authors
Sarmento, R; Trigo, L; Fonseca, L;

Publication
Integration of Data Mining in Business Intelligence Systems

Abstract
Forecasting enterprise bankruptcy is a critical area for Business Intelligence. It is a major concern for investors and credit institutions on risk analysis. It may also enable the sustainability assessment of critical suppliers and clients, as well as competitors and the business environment. Data Mining may deliver a faster and more precise insight about this issue. Widespread software tools offer a broad spectrum of Artificial Intelligence algorithms and the most difficult task may be the decision of selecting that algorithm. Trying to find an answer for this decision in the relatively large amount of available literature in this area with so many options, advantages, and pitfalls may be as informative as distracting. In this chapter, the authors present an empirical study with a comprehensive Knowledge Discovery and Data Mining (KDD) workflow. The proposed classifier selection automation selects an algorithm that has better prediction performance than the most widely documented in the literature. © 2015, IGI Global.


2016

Predicting Business Bankruptcy: A Comprehensive Case Study

Authors
Sarmento, R; Trigo, L; Fonseca, L;

Publication
IJSODIT

Abstract

2021

Towards a Human-AI Hybrid Framework for Inter-Researcher Similarity Detection

Authors
Guimaraes, D; Paulino, D; Correia, A; Trigo, L; Brazdil, P; Paredes, H;

Publication
Proceedings of the 2021 IEEE International Conference on Human-Machine Systems (ICHMS)

Abstract
Understanding the intellectual landscape of scientific communities and their collaborations has become an indispensable part of research per se. In this regard, measuring similarities among scientific documents can help researchers to identify groups with similar interests as a basis for strengthening collaboration and university-industry linkages. To this end, we intend to evaluate the performance of hybrid crowd-computing methods in measuring the similarity between document pairs by comparing the results achieved by crowds and artificial intelligence (AI) algorithms. That said, in this paper we designed two types of experiments to illustrate some issues in calculating how similar an automatic solution is to a given ground truth. In the first type of experiments, we created a crowdsourcing campaign consisting of four human intelligence tasks (HITs) in which the participants had to indicate whether or not a set of papers belonged to the same author. The second type involves a set of natural language processing (NLP) processes in which we used the TF-IDF measure and the Bidirectional Encoder Representation from Transformers (BERT) model. The results of the two types of experiments carried out in this study provide preliminary insight into detecting major contributions from human-AI cooperation at similarity calculation in order to achieve better decision support. We believe that in this case decision makers can be better informed about potential collaborators based on content-based insights enhanced by hybrid human-AI mechanisms.


2022

Comparing Lexical and Usage Frequencies of Palatal Segments in Portuguese

Authors
Trigo, L; Silva, C;

Publication
Computational Processing of the Portuguese Language, PROPOR 2022

Abstract
Palatal consonants in Portuguese are considered complex or marked segments because they are inherently heavy and restricted in terms of their distribution, in relation to other consonants. Moreover, they appear to display differences between themselves, as first language acquisition and creoles' adaptation suggest that /ʎ/ is more complex than /ɲ/. The arguments for complexity are endorsed by some qualitative studies but are still lacking quantitative support. This paper aims at analyzing the phonological restrictiveness of these consonants by comparing their actual frequency in several different corpora, reporting both lexical entries and usage in discourse. In addition to their context-free frequency, we control for their word position and phonetic adjacency. We find that palatals are less frequent than other consonants. However, relative to each other, they do not display proportional lexical and usage frequencies. These results shed new light not only on the representation of /ɲ/ and /ʎ/ but also on the relation between frequency and markedness in language studies.


2022

Exploring consonant frequency in Sri Lanka Portuguese

Authors
Silva, C; Trigo, L;

Publication
Proceedings of the Second Workshop on Digital Humanities and Natural Language Processing (2nd DHandNLP 2022) co-located with International Conference on the Computational Processing of Portuguese (PROPOR 2022), Virtual Event, Fortaleza, Brazil, 21st March, 2022.

Abstract
Although phoneme selection is a well-studied subject in contact linguistics, phoneme integration is mostly unexplored. This study aims at assessing phoneme integration by measuring consonant frequency in Sri Lanka Portuguese and Portuguese. For that, we select two large lexical corpora and take several preparation steps to make the data uniform, consistent and reusable. In terms of integration, we find that the more unconstrained a consonant is concerning its phonotactic patterns, the more frequent it is. We also find that being coronal has a positive impact on integration, whereas being palatal has a negative impact. Moreover, we find that in spite of the apparently random changes in the consonant frequency, consonant classes are robustly transmitted from the lexifier to this creole. Copyright © 2022 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
