Publications

Publications by Pedro Pereira Rodrigues

2020

Informatics as Support for Changes in Health Policy: A Case in Obstetrics

Authors
Gelatti, GJ; Rodrigues, PP; Cruz Correia, RJC;

Publication
PROCEEDINGS OF THE 13TH INTERNATIONAL JOINT CONFERENCE ON BIOMEDICAL ENGINEERING SYSTEMS AND TECHNOLOGIES, VOL 5: HEALTHINF

Abstract
Introduction: In 2015, the Directorate-General for Health of Portugal published new standards (DGS 001/2015) for the registration of cesarean section indicators. At the time, a lack of data was affecting the quality of the indicators and of the analyses based on them. The use of a single software tool was encouraged to register and compare indicators between hospitals, with special attention to the Robson Classification, which uses basic pregnancy information to classify all deliveries into 10 groups. The selected tool was the Obscare software. Aim: To describe data quality by analyzing the completeness, in obstetric records from 2016 to 2018, of the variables used in the Robson Classification and collected by the Obscare tool. Methods: Completeness is evaluated by the number of missing values: the lower the completeness, the higher the number of missing values. We also imputed data based on basic concepts and analyzed the contribution of these data to the indication of the type of delivery to be performed, according to the classification suggested by DGS 001/2015. Results: From 2016 to 2018, 5922 pregnancies resulted in 5922 Robson classifications. The variables with the lowest completeness were those related to previous cesarean section (77%) and previous pregnancies (43%). After imputation, these figures fell to 3.9% and 0.56%, respectively, leaving 4.6% of the total data to be discarded. Discussion: There is a significant amount of missing data in the basic variables used to study the classification of delivery type. We believe that encouraging data completion, with the possibility of comparing data between hospitals, should be a priority in the health area.
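The completeness measure described in the Methods can be illustrated with a minimal sketch. It assumes completeness is the share of non-missing entries per variable. The field names and toy records below are hypothetical and do not reflect the Obscare schema or the study data.

```python
from typing import Dict, List, Optional, Sequence


def completeness(values: Sequence[Optional[object]]) -> float:
    """Return the share of non-missing entries (None marks a missing value)."""
    if not values:
        raise ValueError("cannot compute completeness of an empty column")
    present = sum(1 for v in values if v is not None)
    return present / len(values)


def column_completeness(rows: List[Dict[str, object]], field: str) -> float:
    """Completeness of one variable across all records."""
    return completeness([row.get(field) for row in rows])


# Hypothetical toy records; the field names are illustrative only.
records = [
    {"previous_cesarean": "no", "previous_pregnancies": 1},
    {"previous_cesarean": None, "previous_pregnancies": 0},
    {"previous_cesarean": "yes", "previous_pregnancies": None},
    {"previous_cesarean": None, "previous_pregnancies": 2},
]

for field in ("previous_cesarean", "previous_pregnancies"):
    print(field, column_completeness(records, field))
```

A variable with more missing entries gets a lower score, which matches the inverse relation between completeness and missing values stated in the abstract.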

2020

tsmp: An R Package for Time Series with Matrix Profile

Authors
Bischoff, F; Rodrigues, PP;

Publication
R JOURNAL

Abstract
This article describes tsmp, an R package that implements the Matrix Profile (MP) concept for time series (TS). The tsmp package is a toolkit that supports all-pairs similarity joins, motif, discord, and chain discovery, semantic segmentation, and more. Here we describe how the tsmp package may be used by showing some of the use cases from the original articles, and we evaluate the algorithm speed in the R environment. The package can be downloaded at https://CRAN.R-project.org/package=tsmp.
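To give a sense of what a matrix profile is, here is a brute-force sketch in Python. It is not the tsmp API and not one of the fast algorithms the package implements. The function names, the toy series, and the exclusion zone of half the window length are assumptions made for illustration.

```python
import math
from typing import List, Sequence


def z_normalize(window: Sequence[float]) -> List[float]:
    """Rescale a window to zero mean and unit variance (flat windows map to zeros)."""
    mean = sum(window) / len(window)
    std = math.sqrt(sum((x - mean) ** 2 for x in window) / len(window))
    if std == 0.0:
        return [0.0] * len(window)
    return [(x - mean) / std for x in window]


def naive_matrix_profile(series: Sequence[float], m: int) -> List[float]:
    """Distance from each length-m subsequence to its nearest non-trivial neighbour."""
    n_subs = len(series) - m + 1
    if m < 2 or n_subs < 2:
        raise ValueError("need m >= 2 and at least two subsequences")
    subs = [z_normalize(series[i:i + m]) for i in range(n_subs)]
    exclusion = m // 2  # assumed exclusion zone that skips trivial self-matches
    profile = [math.inf] * n_subs
    for i in range(n_subs):
        for j in range(n_subs):
            if abs(i - j) <= exclusion:
                continue
            dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(subs[i], subs[j])))
            profile[i] = min(profile[i], dist)
    return profile


# Toy series: the pattern 0, 1, 3, 1, 0 appears at the start and at the end.
toy = [0, 1, 3, 1, 0, 7, 2, 9, 4, 0, 1, 3, 1, 0]
profile = naive_matrix_profile(toy, m=5)
motif_index = min(range(len(profile)), key=profile.__getitem__)      # best-matched subsequence
discord_index = max(range(len(profile)), key=profile.__getitem__)    # most isolated subsequence
```

Low values in the profile point to motifs (repeated patterns) and high values to discords (anomalies). The quadratic loop here is only practical for short series.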

2020

VAE-BRIDGE: Variational Autoencoder Filter for Bayesian Ridge Imputation of Missing Data

Authors
Pereira, RC; Abreu, PH; Rodrigues, PP;

Publication
2020 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN)

Abstract
The missing data issue is often found in real-world datasets, and it is usually handled with imputation strategies that replace the missing values with new data. Recently, generative models such as Variational Autoencoders have been applied to this imputation task. However, they were always used to perform the entire imputation, which has produced limited results when compared to other state-of-the-art methods. In this work, a new approach called Variational Autoencoder Filter for Bayesian Ridge Imputation is introduced. It uses a Variational Autoencoder at the beginning of the imputation pipeline to filter the instances that are later fitted to a Bayesian ridge regression used to predict the new values. The approach was compared to four state-of-the-art imputation methods using 10 datasets from the healthcare context covering clinical trials, all injected with missing values at different rates. The proposed approach significantly outperformed the remaining methods in all settings, achieving an overall improvement between 26% and 67%.

2021

CMIID: A comprehensive medical information identifier for clinical search harmonization in Data Safe Havens

Authors
Domingues, MAP; Camacho, R; Rodrigues, PP;

Publication
JOURNAL OF BIOMEDICAL INFORMATICS

Abstract
Over the last decades, clinical research has been driven by informatics changes nourished by distinct research endeavors. Inherent to this evolution, several issues have been the focus of a variety of studies: multi-location patient data access, interoperability between terminological and classification systems, and harmonization of clinical practice and records. With these problems in mind, the Data Safe Haven paradigm emerged to promote a new architecture, better reasoning, and safe and easy access to distinct Clinical Data Repositories. The aim of this study is to present a novel solution for clinical search harmonization within a safe environment, making use of a hybrid coding taxonomy that enables researchers to collect information from multiple repositories based on a clinical domain query definition. Results show that it is possible to query multiple repositories using a single query definition based on clinical domains and the capabilities of the Unified Medical Language System, although this leads to a deterioration of the framework's response times. Participants in a Focus Group and a System Usability Scale questionnaire rated the framework with a median value of 72.5, indicating that the hybrid coding taxonomy could be enriched with additional metadata to further refine the results and to enable the use of this system as a data quality tagging mechanism.

2019

Predicting Blood Donations in a Tertiary Care Center Using Time Series Forecasting

Authors
Bischoff, F; Carmo Koch, Md; Rodrigues, PP;

Publication
ICT for Health Science Research - Proceedings of the EFMI 2019 Special Topic Conference - 7-10 April 2019, Hanover, Germany

Abstract
The current algorithm to support platelet stock management assumes that there are always sufficient whole blood donations (WBD) to produce the required amount of pooled platelets. Unfortunately, the blood donation rate is uncertain, so pooled platelet production needs to be backed up with single-donor (apheresis) collections to compensate for periods of low WBD. The aim of this work was to predict the daily number of WBD to a tertiary care center in order to preemptively account for a decrease in platelet production. We collected 62,248 blood donations over 3 years, the daily count of which was used to feed (standalone and ensemble versions of) six prediction models, which were evaluated using the Mean Absolute Error (MAE). Forecast models performed better, with an MAE of about 8.6 donations, 34% better than using means or medians alone. Trend lines of donations are better modeled by an autoregressive integrated moving average (ARIMA) model using a frequency of 365 days, the trade-off being the need for at least two years of data.
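The evaluation metric named above can be sketched as follows. The daily counts are made-up toy numbers, not the study data. The relative-improvement formula is an assumption about how a figure such as "34% better" might be computed against a mean baseline.

```python
from typing import Sequence


def mean_absolute_error(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Average absolute difference between observed and predicted values."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical daily donation counts (illustrative only).
history = [50, 60, 55, 65]    # past days used to build the baseline
actual = [52, 61, 57, 66]     # days being forecast
forecast = [51, 60, 56, 64]   # output of some forecasting model

# Baseline: predict the historical mean for every day.
baseline_value = sum(history) / len(history)
baseline = [baseline_value] * len(actual)

mae_model = mean_absolute_error(actual, forecast)
mae_baseline = mean_absolute_error(actual, baseline)

# Assumed definition of relative improvement over the baseline.
improvement = (mae_baseline - mae_model) / mae_baseline
print(f"model MAE={mae_model}, baseline MAE={mae_baseline}, improvement={improvement:.0%}")
```

Because MAE is expressed in the same unit as the target, an MAE of about 8.6 reads directly as being off by about 8.6 donations per day on average.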

2019

MNAR Imputation with Distributed Healthcare Data

Authors
Pereira, RC; Santos, MS; Rodrigues, PP; Abreu, PH;

Publication
Progress in Artificial Intelligence, 19th EPIA Conference on Artificial Intelligence, EPIA 2019, Vila Real, Portugal, September 3-6, 2019, Proceedings, Part II.

Abstract
Missing data is a problem found in real-world datasets that has a considerable impact on the learning process of classifiers. Although extensive work has been done in this field, the MNAR mechanism remains a challenge for existing imputation methods, mainly because it is not related to any observed information. In healthcare contexts, MNAR is present in multiple scenarios, such as clinical trials where participants may quit the study for reasons related to the outcome being measured. This work proposes an approach that uses different sources of information from the same healthcare context to improve the imputation quality and classification performance for datasets with missing data under MNAR. The experiment was performed with several databases from the medical context, and the results show that the use of multiple sources of data has a positive impact on the imputation error and classification performance. © 2019, Springer Nature Switzerland AG.
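The following sketch illustrates why MNAR is hard to handle. It is not the approach proposed in the paper. It assumes one simple MNAR scheme, in which the largest values of a variable are the ones that go missing, so missingness depends on the unobserved value itself. The function name and toy values are hypothetical.

```python
from typing import List, Optional, Sequence


def inject_mnar_upper_tail(values: Sequence[float], missing_rate: float) -> List[Optional[float]]:
    """Mark the largest values as missing (None), so missingness depends on the value itself."""
    if not 0.0 <= missing_rate <= 1.0:
        raise ValueError("missing_rate must be between 0 and 1")
    n_missing = int(len(values) * missing_rate)
    # Indices ordered from largest to smallest value.
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    to_remove = set(order[:n_missing])
    return [None if i in to_remove else v for i, v in enumerate(values)]


def observed_mean(values: Sequence[Optional[float]]) -> float:
    """Mean of the entries that are still observed."""
    seen = [v for v in values if v is not None]
    return sum(seen) / len(seen)


# Hypothetical measurements (illustrative only).
measurements = [3.0, 9.0, 1.0, 7.0, 5.0]
with_missing = inject_mnar_upper_tail(measurements, missing_rate=0.4)

true_mean = sum(measurements) / len(measurements)
biased_mean = observed_mean(with_missing)
print(with_missing, true_mean, biased_mean)
```

The observed mean is lower than the true mean, and nothing in the remaining data reveals why. This is the difficulty that motivates drawing on additional sources of information.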
