Publications

Publications by CRACS

2017

Managing Diabetes: Pattern Discovery and Counselling supported by user data in a mobile platform

Authors
Machado, D; Paiva, T; Dutra, I; Costa, VS; Brandao, P;

Publication
2017 IEEE SYMPOSIUM ON COMPUTERS AND COMMUNICATIONS (ISCC)

Abstract
Diabetes management is a complex and sensitive problem, as each diabetic is a unique case with particular needs. The optimal solution would be constant monitoring of the diabetic's values, with automatic action taken accordingly. We propose an approach that guides the user and analyses the data gathered to give individual advice. By using data mining algorithms and methods, we uncover hidden behaviour patterns that may lead to crisis situations. These patterns can then be transformed into logical rules, able to trigger in a particular context, and advise the user. We believe that this solution is beneficial not only for the diabetic but also for the doctor following the case. The advice and rules are useful input that the medical expert can use while prescribing a particular treatment. During the data gathering phase, when the number of records is not enough to reach useful conclusions, a base set of logical rules, defined from medical protocols, directives and/or advice, is responsible for advising and guiding the user. The proposed system will accompany the user with generic advice at the start and, through constant learning, advise the user more specifically. We discuss this approach, describing the architecture of the system, its base rules and data mining component. The system is to be incorporated in a diabetes management application for Android that is currently under development.
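
The idea of context-triggered logical rules described in this abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the abstract does not give the system's rule format, field names, or thresholds, so `Rule`, `BASE_RULES`, `advise`, the context keys, and the numeric cut-offs are hypothetical placeholders and not taken from the paper.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# A context is a snapshot of the user's current readings (hypothetical keys).
Context = Dict[str, float]


@dataclass
class Rule:
    """A logical rule: when `condition` holds for a context, emit `advice`."""
    name: str
    condition: Callable[[Context], bool]
    advice: str


# Hypothetical base rules. The thresholds are illustrative placeholders,
# not values taken from the paper or from any medical protocol.
BASE_RULES: List[Rule] = [
    Rule(
        name="low_glucose",
        condition=lambda c: c["glucose_mg_dl"] < 70,
        advice="Glucose is low: consider fast-acting carbohydrates.",
    ),
    Rule(
        name="high_glucose_long_after_meal",
        condition=lambda c: c["glucose_mg_dl"] > 180
        and c["minutes_since_meal"] >= 120,
        advice="Glucose is still high two hours after a meal: log it for review.",
    ),
]


def advise(context: Context, rules: List[Rule]) -> List[str]:
    """Return the advice of every rule whose condition triggers in this context."""
    return [rule.advice for rule in rules if rule.condition(context)]


if __name__ == "__main__":
    reading = {"glucose_mg_dl": 60.0, "minutes_since_meal": 30.0}
    for message in advise(reading, BASE_RULES):
        print(message)
```

In a design along these lines, rules derived later from mined patterns could be appended to the same list, so generic and user-specific advice would share one evaluation path.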

2017

Markov logic networks for adverse drug event extraction from text

Authors
Natarajan, S; Bangera, V; Khot, T; Picado, J; Wazalwar, A; Costa, VS; Page, D; Caldwell, M;

Publication
KNOWLEDGE AND INFORMATION SYSTEMS

Abstract
Adverse drug events (ADEs) are a major concern and point of emphasis for the medical profession, government, and society. A diverse set of techniques from epidemiology, statistics, and computer science are being proposed and studied for ADE discovery from observational health data (e.g., EHR and claims data), social network data (e.g., Google and Twitter posts), and other information sources. Methodologies are needed for evaluating, quantitatively measuring and comparing the ability of these various approaches to accurately discover ADEs. This work is motivated by the observation that text sources such as the Medline/Medinfo library provide a wealth of information on human health. Unfortunately, ADEs often result from unexpected interactions, and the connection between conditions and drugs is not explicit in these sources. Thus, in this work, we address the question of whether we can quantitatively estimate relationships between drugs and conditions from the medical literature. This paper proposes and studies a state-of-the-art NLP-based extraction of ADEs from text.

2017

Pharmacovigilance via Baseline Regularization with Large-Scale Longitudinal Observational Data

Authors
Kuang, Z; Peissig, PL; Costa, VS; Maclin, R; Page, D;

Publication
Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Halifax, NS, Canada, August 13 - 17, 2017

Abstract
Several prominent public health incidents [29] that occurred at the beginning of this century due to adverse drug events (ADEs) have raised international awareness of governments and industries about pharmacovigilance (PhV) [6, 7], the science and activities to monitor and prevent adverse events caused by pharmaceutical products after they are introduced to the market. A major data source for PhV is large-scale longitudinal observational databases (LODs) [6] such as electronic health records (EHRs) and medical insurance claim databases. Inspired by the Multiple Self-Controlled Case Series (MSCCS) model [27], arguably the leading method for ADE discovery from LODs, we propose baseline regularization, a regularized generalized linear model that leverages the diverse health profiles available in LODs across different individuals at different times. We apply the proposed method as well as MSCCS to the Marshfield Clinic EHR. Experimental results suggest that incorporating the heterogeneity among different patients and different times helps to improve the performance in identifying benchmark ADEs from the Observational Medical Outcomes Partnership ground truth [26]. © 2017 Copyright held by the owner/author(s).

2017

On the use of stochastic local search techniques to revise first-order logic theories from examples

Authors
Paes, A; Zaverucha, G; Costa, VS;

Publication
MACHINE LEARNING

Abstract
Theory Revision from Examples is the process of repairing incorrect theories and/or improving incomplete theories from a set of examples. This process usually results in more accurate and comprehensible theories than purely inductive learning. However, so far, progress on the use of theory revision techniques has been limited by the large search space they yield. In this article, we argue that it is possible to reduce the search space of a theory revision system by introducing stochastic local search. More precisely, we introduce a number of stochastic local search components at the key steps of the revision process, and implement them on a state-of-the-art revision system that makes use of the most specific clause to constrain the search space. We show that with the use of these SLS techniques it is possible for the revision system to be executed in a feasible time, while still improving the initial theory and in a number of cases even reaching better accuracies than the deterministic revision process. Moreover, in some cases the revision process can be faster and still achieve better accuracies than an ILP system learning from an empty initial hypothesis or assuming an initial theory to be correct.

2017

Automatic Documents Counterfeit Classification Using Image Processing and Analysis

Authors
Vieira, R; Antunes, M; Silva, C; Assis, A;

Publication
PATTERN RECOGNITION AND IMAGE ANALYSIS (IBPRIA 2017)

Abstract
Counterfeit detection in official documents challenges forensic experts, who try to correlate counterfeits in order to help criminal investigators identify the authors of forgeries. Past counterfeit investigations at the Portuguese Police Forensic Laboratory allowed the construction of an organized set of digital images of counterfeited documents, helping the manual identification of new counterfeiters' modus operandi. However, these images are usually stored in distinct resolutions, may have different sizes and could have been captured under different types of illumination. In this paper we present a methodology to automate the identification of a counterfeit modus operandi by comparing a given document image with a database of previously catalogued images of counterfeited documents. The proposed method ranks the identified counterfeited documents and allows forensic experts to focus their attention on the most similar documents. It takes advantage of scalable algorithms under the OpenCV framework that compare images, match patterns and analyse textures and colours. We present a set of tests on distinct datasets, with promising results.

2017

Performance Metrics for Model Fusion in Twitter Data Drifts

Authors
Costa, J; Silva, C; Antunes, M; Ribeiro, B;

Publication
PATTERN RECOGNITION AND IMAGE ANALYSIS (IBPRIA 2017)

Abstract
Ensemble approaches have shown remarkable abilities to tackle different learning challenges, namely in dynamic scenarios with concept drift, e.g. in social networks such as Twitter. Several efforts have been made to define strategies for combining the models that constitute an ensemble. In this work, we investigate the effect of using different metrics for combining an ensemble's models, specifically performance-based metrics. We propose five performance-based combining metrics, bearing in mind that we may take advantage of diversity in classifiers, as their individual performance takes a leading role in defining their contribution to the ensemble. Experimental results on a Twitter dataset, artificially timestamped, suggest that using performance metrics to combine the models that constitute an ensemble can introduce relevant improvements in the overall ensemble performance.
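
As a rough illustration of combining ensemble members by a performance metric, the sketch below weights each model's vote by its accuracy on recent data. The abstract does not say which five metrics the authors propose, so using accuracy as the weight is an assumption, and the function names `accuracy` and `weighted_vote` are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Hashable, Sequence


def accuracy(y_true: Sequence[Hashable], y_pred: Sequence[Hashable]) -> float:
    """Fraction of positions where the prediction matches the true label."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equally long")
    hits = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return hits / len(y_true)


def weighted_vote(predictions: Sequence[Hashable],
                  weights: Sequence[float]) -> Hashable:
    """Return the label with the largest total weight across models.

    predictions[i] is the label proposed by model i for one example;
    weights[i] is that model's performance score (here: assumed accuracy).
    """
    if len(predictions) != len(weights) or not predictions:
        raise ValueError("need one weight per prediction")
    scores: Dict[Hashable, float] = defaultdict(float)
    for label, weight in zip(predictions, weights):
        scores[label] += weight
    return max(scores, key=lambda label: scores[label])


if __name__ == "__main__":
    # Three hypothetical models scored on a recent labelled batch.
    truth = ["pos", "neg", "pos", "pos"]
    recent = [
        ["pos", "neg", "pos", "pos"],  # model 0: all four correct
        ["neg", "neg", "neg", "pos"],  # model 1: two of four correct
        ["neg", "pos", "pos", "neg"],  # model 2: one of four correct
    ]
    weights = [accuracy(truth, preds) for preds in recent]
    # On a new example, the strongest model outvotes the two weaker ones.
    print(weighted_vote(["pos", "neg", "neg"], weights))
```

With unweighted majority voting the two weaker models would win here; weighting by recent performance lets a single strong model prevail, which is the kind of effect a performance-based combination is meant to capture under drift.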
