Paula Brito


I am an Associate Professor at the Faculty of Economics of the University of Porto, where I teach Statistics and Multivariate Data Analysis at undergraduate, master's, and doctoral level, and a member of the Laboratory of Artificial Intelligence and Decision Support (LIAAD) of INESC TEC. I hold a PhD in Applied Mathematics from Université Paris Dauphine (1991). My current research focuses on the analysis of complex multidimensional data, usually known as symbolic data (data that represent the variability inherent in the records, in the form of intervals or distributions), for which I develop statistical approaches and multivariate analysis methodologies. More generally, I am interested in multivariate data analysis, with a focus on cluster analysis.

Details

Name: Paula Brito
Position: Coordinating Researcher
Since: 1 January 2008
Nationality: Portugal
Centre: Laboratory of Artificial Intelligence and Decision Support
Contacts: +351220402963, paula.brito@inesctec.pt

Publications

2025

Parametric models for distributional data

Authors
Brito, P; Silva, APD

Publication
Advances in Data Analysis and Classification

Abstract
We present parametric probabilistic models for numerical distributional variables. The proposed models are based on the representation of each distribution by a location measure and inter-quantile ranges, for given quantiles, thereby characterizing the underlying empirical distributions in a flexible way. Multivariate Normal distributions are assumed for the whole set of indicators, considering alternative structures of the variance-covariance matrix. For all cases, maximum likelihood estimators of the corresponding parameters are derived. This modelling allows for hypothesis testing and multivariate parametric analysis. The proposed framework is applied to Analysis of Variance and parametric Discriminant Analysis of distributional data. A simulation study examines the performance of the proposed models in classification problems under different data conditions. Applications to Internet traffic data and Portuguese official data illustrate the relevance of the proposed approach.
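The representation described in the abstract (a location measure plus inter-quantile ranges for given quantiles) can be illustrated with a minimal sketch. The choice of the median as location measure, the quartile cut points, the linear-interpolation quantile rule, and the names `quantile` and `summarize_distribution` are all assumptions made for illustration; the paper's own definitions may differ.

```python
def quantile(sorted_values, p):
    """Linear-interpolation quantile of an already sorted list, 0 <= p <= 1."""
    if not sorted_values:
        raise ValueError("empty sample")
    pos = p * (len(sorted_values) - 1)
    lower = int(pos)
    upper = min(lower + 1, len(sorted_values) - 1)
    frac = pos - lower
    return sorted_values[lower] * (1 - frac) + sorted_values[upper] * frac


def summarize_distribution(values, probs=(0.0, 0.25, 0.5, 0.75, 1.0)):
    """Describe an empirical distribution by its median and the ranges
    between consecutive quantiles (cut points are an assumed choice)."""
    ordered = sorted(values)
    cuts = [quantile(ordered, p) for p in probs]
    location = quantile(ordered, 0.5)
    ranges = [hi - lo for lo, hi in zip(cuts, cuts[1:])]
    return location, ranges


# Made-up sample, used only to show the shape of the output.
location, ranges = summarize_distribution([1, 2, 3, 4, 5, 6, 7, 8, 9])
print(location, ranges)
```

Each distribution-valued observation is thus reduced to a short numeric vector, which is the kind of indicator set on which a multivariate model could then be specified.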


2025

Air Quality Data Analysis with Symbolic Principal Components

Authors
Loureiro, P; Oliveira, M; Brito, P; Oliveira, L

Publication
Springer Proceedings in Mathematics and Statistics

Abstract
Air pollution is a global challenge with deep implications in public health and environment. We examine air quality data from a monitoring station in Entrecampos, Lisbon, Portugal, using Symbolic Data Analysis. The dataset consists of hourly concentrations of nine pollutants during three years, which are logarithmically transformed and aggregated in intervals, taking the daily minimum and maximum values. The symbolic mean and variance are estimated for each variable through the method of moments, and the pairwise dependencies are captured using a bivariate copula. Symbolic principal component scores are obtained from the estimated covariance matrix and used to fit generalized extreme value distributions. Outlier maps, based on these distributions’ quantiles, are used to identify outlying observations. A comparative analysis with daily average-based outlier detection methods is conducted. The results show the relevance of Symbolic Data Analysis in revealing new insights into air quality. © The Author(s), under exclusive license to Springer Nature Switzerland AG 2025.
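The aggregation step mentioned in the abstract (log-transforming hourly concentrations and keeping the daily minimum and maximum as an interval) can be sketched as follows. This is only an illustration under assumptions: the natural logarithm, the `(day, concentration)` input layout, the function name `daily_log_intervals`, and the sample readings are all hypothetical and not taken from the paper.

```python
import math
from collections import defaultdict


def daily_log_intervals(readings):
    """Aggregate (day, concentration) pairs into {day: (min_log, max_log)}.

    Concentrations must be strictly positive so the logarithm is defined.
    """
    by_day = defaultdict(list)
    for day, concentration in readings:
        by_day[day].append(math.log(concentration))
    return {day: (min(logs), max(logs)) for day, logs in by_day.items()}


# Made-up hourly readings for one pollutant on two days.
hourly = [
    ("2021-01-01", 1.0),
    ("2021-01-01", math.e),
    ("2021-01-02", 10.0),
    ("2021-01-02", 100.0),
]
intervals = daily_log_intervals(hourly)
for day, (low, high) in sorted(intervals.items()):
    print(day, low, high)
```

Each day then becomes one interval-valued observation per pollutant, which is the form of data that symbolic methods take as input.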


2024

Community detection in interval-weighted networks

Authors
Alves, H; Brito, P; Campos, P

Publication
Data Mining and Knowledge Discovery

Abstract
In this paper we introduce and develop the concept of interval-weighted networks (IWN), a novel approach in Social Network Analysis, where the edge weights are represented by closed intervals composed with precise information, comprehending intrinsic variability. We extend IWN for both Newman's modularity and modularity gain and the Louvain algorithm, considering a tabular representation of networks by contingency tables. We apply our methodology to two real-world IWN. The first is a commuter network in mainland Portugal, between the twenty three NUTS 3 Regions (IWCN). The second focuses on annual merchandise trade between 28 European countries, from 2003 to 2015 (IWTN). The optimal partition of geographic locations (regions or countries) is developed and compared using two new different approaches, designated as Classic Louvain and Hybrid Louvain, which allow taking into account the variability observed in the original network, thereby minimizing the loss of information present in the raw data. Our findings suggest the division of the twenty three Portuguese regions in three main communities for the IWCN and between two to three country communities for the IWTN. However, we find different geographical partitions according to the community detection methodology used. This analysis can be useful in many real-world applications, since it takes into account that the weights may vary within the ranges, rather than being constant.


2024

Anomaly detection-based undersampling for imbalanced classification problems

Authors
Park, YJ; Brito, P; Ma, YC

Publication
Engineering Optimization

Abstract
In various machine learning applications, classification plays an important role in categorizing and predicting data. To improve the classification performance, it is crucial to identify and remove the anomalies. Also, class imbalance in many machine learning applications is a very common problem since most classifiers tend to be biased toward the majority class by ignoring the minority class instances. Thus, in this research, we propose a new under-sampling technique based on anomaly detection and removal to enhance the performance of imbalanced classification problems. To demonstrate the effectiveness of the proposed method, comprehensive experiments are conducted on forty imbalanced data sets and two non-parametric hypothesis tests are employed to show the statistical difference in classification performances between the proposed method and other traditional resampling methods. From the experiment, it is shown that the proposed method improves the classification performance by effectively detecting and eliminating the anomalies among true-majority or pseudo-majority class instances.


2024

Immigrant groups in the Luxembourgish labour market: A Symbolic Data Analysis approach

Authors
Silva, CC; Brito, P; Campos, P

Publication
Statistical Journal of the IAOS

Abstract
Luxembourg, known for its immigration history, attracts immigrants to work. This study analyses different immigrant groups in the labour market from 2014 to 2022 by using Labor Force Survey (LFS) data, Symbolic Data Analysis (SDA), and the Monitoring the Evolution of Clusters (MEC) framework. Based on the birthplace and length of residence in Luxembourg, in each year, microdata were aggregated into 21 symbolic objects. They were primarily described by 16 modal variables which are multi-valued variables with a frequency attached to each category. Moreover, clustering using complete linkage and the Chernoff’s distance was applied. The Heuristic Identification of Noisy Variables (HINoV) suggested that with just six variables, objects may be grouped homogeneously. The MEC framework traced temporal relations and transitions between the clusters, revealing some movements across the different years. Results indicate that people from the European Union (EU) and Neighbouring countries have similar profiles while the Portuguese have opposite characteristics. The Luxembourgers are somewhere in between. Profiling people from non-EU countries was challenging. The data and methodology used make it easy to replicate the work in other nations, enabling comparison of results and monitoring to continue in the future.


Supervised theses

2024

The evolution of immigrant groups in Luxembourg - What are the different pathways in the labour market?

Author
Catarina Campos de Melo Sousa Silva

Institution
UP-FEP

2024

Anomaly Detection Methods for Complex Data: Applications to Internet Traffic and Financial Markets

Author
Catarina Padrela Loureiro

Institution
UP-FEP

2023

Multi-class Classification of Distributional Data

Author
Ana Carolina Silva Rodrigues dos Santos

Institution
UP-FEP

2023

The evolution of immigrant groups in Luxembourg - What are the different pathways in the labour market?

Author
Catarina Campos de Melo Sousa Silva

Institution
UP-FEP

2023

Searching for Symbolic Patterns in Attributed Networks

Author
Maria Hermínia Esteves de Carvalho

Institution
UP-FEP
