Publications

Publications by Alípio Jorge

2016

An Overview of Evolutionary Computing for Interpretation in the Oil and Gas Industry

Authors
Lopes, RL; Jahromi, HN; Jorge, AM;

Publication
Proceedings of the Ninth International C* Conference on Computer Science & Software Engineering, C3S2E '16, Porto, Portugal, July 20-22, 2016

Abstract
The Oil and Gas Exploration & Production (E&P) field deals with high-dimensional heterogeneous data, collected at different stages of the E&P activities from various sources. Over the years different soft-computing algorithms have been proposed for data-driven oil and gas applications. The most popular by far are Artificial Neural Networks, but there are applications of Fuzzy Logic systems, Support Vector Machines, and Evolutionary Algorithms (EAs) as well. This article provides an overview of the applications of EAs in the oil and gas E&P industry. The relevant literature is reviewed and categorised, showing an increasing interest amongst the geoscience community. © 2016 ACM.

2015

An overview on the exploitation of time in collaborative filtering

Authors
Vinagre, J; Jorge, AM; Gama, J;

Publication
WILEY INTERDISCIPLINARY REVIEWS-DATA MINING AND KNOWLEDGE DISCOVERY

Abstract
Classic Collaborative Filtering (CF) algorithms rely on the assumption that data are static and we usually disregard the temporal effects in natural user-generated data. These temporal effects include user preference drifts and shifts, seasonal effects, inclusion of new users and items entering the system (and old ones leaving), user and item activity rate fluctuations, and other similar time-related phenomena. These phenomena continuously change the underlying relations between users and items that recommendation algorithms essentially try to capture. In the past few years, a new generation of CF algorithms has emerged, using the time dimension as a key factor to improve recommendation models. In this overview, we present a comprehensive analysis of these algorithms and identify important challenges to be faced in the near future. © 2015 John Wiley & Sons, Ltd.

2018

Assessment of predictive learning methods for the completion of gaps in well log data

Authors
Lopes, RL; Jorge, AM;

Publication
JOURNAL OF PETROLEUM SCIENCE AND ENGINEERING

Abstract
Well logs are records of petro-physical data acquired along a borehole, providing direct information about what is in the subsurface. The data collected by logging wells can have significant economic consequences in oil and gas exploration, not only because it has a direct impact on the following decisions, but also due to the subsequent costs inherent to drilling wells, and the potential return of oil deposits. These logs frequently present gaps of varied sizes in the sensor recordings, that happen for diverse reasons. These gaps result in less information used by the interpreter to build the stratigraphic models, and consequently larger uncertainty regarding what will be encountered when the next well is drilled. The main goal of this work is to compare Gradient Tree Boosting, Random Forests, Artificial Neural Networks, and three algorithms of Linear Regression on the prediction of the gaps in well log data. Given the logs from a specific well, we use the intervals with complete information as the training data to learn a regression model of one of the sensors for that well. The algorithms are compared with each other using a few individual example wells with complete information, on which we build artificial gaps to cross validate the results. We show that the ensemble algorithms tend to perform significantly better, and that the results hold when addressing the different examples individually. Moreover, we performed a grid search over the ensembles parameters space, but did not find a statistically significant difference in any situation.
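The evaluation protocol described in this abstract (train on intervals with complete information, then hide a known interval to measure the error) can be illustrated with a minimal sketch. It assumes a single predictor sensor and ordinary least squares, which is a simplification and not the authors' implementation; the function names `fit_line`, `fill_gaps` and `artificial_gap_rmse` are hypothetical.

```python
import math

def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept (one predictor)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def fill_gaps(predictor, target):
    """Fill None entries of `target` using a model trained on the complete rows."""
    train = [(x, y) for x, y in zip(predictor, target) if y is not None]
    slope, intercept = fit_line([x for x, _ in train], [y for _, y in train])
    return [y if y is not None else slope * x + intercept
            for x, y in zip(predictor, target)]

def artificial_gap_rmse(predictor, target, start, stop):
    """Hide target[start:stop], predict it, and return the RMSE on the hidden part."""
    masked = list(target)
    masked[start:stop] = [None] * (stop - start)
    filled = fill_gaps(predictor, masked)
    squared = [(filled[i] - target[i]) ** 2 for i in range(start, stop)]
    return math.sqrt(sum(squared) / len(squared))

# Toy log (assumed data): the target sensor is an exact linear function of the predictor.
predictor = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
target = [3.0, 5.0, 7.0, 9.0, 11.0, 13.0]
rmse = artificial_gap_rmse(predictor, target, 2, 4)
```

On real logs the relation between sensors is rarely this clean, which is where the ensemble methods compared in the paper would replace `fit_line`.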

2014

Classifying Heart Sounds using SAX Motifs, Random Forests and Text Mining techniques

Authors
Gomes, EF; Jorge, AM; Azevedo, PJ;

Publication
PROCEEDINGS OF THE 18TH INTERNATIONAL DATABASE ENGINEERING AND APPLICATIONS SYMPOSIUM (IDEAS14)

Abstract
In this paper we describe an approach to classifying heart sounds (classes Normal, Murmur and Extra-systole) that is based on the discretization of sound signals using the SAX (Symbolic Aggregate Approximation) representation. The ability to automatically classify heart sounds, or at least to support human decision in this task, is socially relevant to spread the reach of medical care using simple mobile devices or digital stethoscopes. In our approach, sounds are first pre-processed using signal processing techniques (decimate, low-pass filter, normalize, Shannon envelope). Then the pre-processed signals are transformed into sequences of discrete SAX symbols. These sequences are subject to a process of motif discovery. Frequent sequences of symbols (motifs) are adopted as features. Each sound is then characterized by the frequent motifs that occur in it and their respective frequency. This is similar to the term frequency (TF) model used in text mining. In this paper we compare the TF model with the application of the TFIDF (Term Frequency - Inverse Document Frequency) and the use of bi-grams (frequent size two sequences of motifs). Results show the ability of the motif-based TF approach to separate classes and the relative value of the TFIDF and the bi-grams variants. The separation of the Extra-systole class is overly difficult and much better results are obtained for separating the Murmur class. Empirical validation is conducted using real data collected in noisy environments. We have also assessed the cost-reduction potential of the proposed methods by considering a fixed cost model and using a cost sensitive meta algorithm.
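The pipeline in this abstract (discretize a signal into SAX symbols, then count symbol sequences as if they were terms) can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' code: the alphabet size of 4, the function names, and the toy signal are assumptions, and every contiguous substring is counted as a motif, whereas the paper keeps only frequent motifs found by a discovery step.

```python
import math
from collections import Counter

# Breakpoints splitting a standard normal distribution into 4 equiprobable
# regions (alphabet size 4 is an assumption made for this sketch).
BREAKPOINTS = [-0.6745, 0.0, 0.6745]
ALPHABET = "abcd"

def znormalize(series):
    """Rescale the series to zero mean and unit standard deviation."""
    mean = sum(series) / len(series)
    std = math.sqrt(sum((x - mean) ** 2 for x in series) / len(series))
    if std == 0:
        return [0.0 for _ in series]
    return [(x - mean) / std for x in series]

def paa(series, n_segments):
    """Piecewise Aggregate Approximation: mean of each equal-length segment."""
    seg_len = len(series) // n_segments
    return [sum(series[i * seg_len:(i + 1) * seg_len]) / seg_len
            for i in range(n_segments)]

def sax(series, n_segments):
    """Map a numeric series to a SAX word with one symbol per segment."""
    symbols = []
    for value in paa(znormalize(series), n_segments):
        index = sum(value > b for b in BREAKPOINTS)  # region the value falls in
        symbols.append(ALPHABET[index])
    return "".join(symbols)

def motif_term_frequency(word, motif_len):
    """Count every contiguous substring of length motif_len (simplified motifs)."""
    return Counter(word[i:i + motif_len]
                   for i in range(len(word) - motif_len + 1))

# Toy signal (assumed data) alternating between a low and a high level.
signal = [0, 0, 10, 10, 0, 0, 10, 10]
word = sax(signal, n_segments=4)
tf = motif_term_frequency(word, motif_len=2)
```

The resulting counts play the role of a TF vector for one sound; a TFIDF variant would additionally down-weight motifs that occur in most recordings.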

2017

Identifying top relevant dates for implicit time sensitive queries

Authors
Campos, R; Dias, G; Jorge, AM; Nunes, C;

Publication
INFORMATION RETRIEVAL JOURNAL

Abstract
Despite a clear improvement of search and retrieval temporal applications, current search engines are still mostly unaware of the temporal dimension. Indeed, in most cases, systems are limited to offering the user the chance to restrict the search to a particular time period or to simply rely on an explicitly specified time span. If the user is not explicit in his/her search intents (e.g., "philip seymour hoffman"), search engines may likely fail to present an overall historic perspective of the topic. In most such cases, they are limited to retrieving the most recent results. One possible solution to this shortcoming is to understand the different time periods of the query. In this context, most state-of-the-art methodologies consider any occurrence of temporal expressions in web documents and other web data as equally relevant to an implicit time sensitive query. To approach this problem in a more adequate manner, we propose in this paper the detection of relevant temporal expressions to the query. Unlike previous metadata and query log-based approaches, we show how to achieve this goal based on information extracted from document content. However, instead of simply focusing on the detection of the most obvious date we are also interested in retrieving the set of dates that are relevant to the query. Towards this goal, we define a general similarity measure that makes use of co-occurrences of words and years based on corpus statistics and a classification methodology that is able to identify the set of top relevant dates for a given implicit time sensitive query, while filtering out the non-relevant ones. Through extensive experimental evaluation, we mean to demonstrate that our approach offers promising results in the field of temporal information retrieval (T-IR), as demonstrated by the experiments conducted over several baselines on web corpora collections.

2015

Accelerating Recommender Systems using GPUs

Authors
Rodrigues, AV; Jorge, A; Dutra, I;

Publication
30TH ANNUAL ACM SYMPOSIUM ON APPLIED COMPUTING, VOLS I AND II

Abstract
We describe GPU implementations of the matrix recommender algorithms CCD++ and ALS. We compare the processing time and predictive ability of the GPU implementations with existing multi-core versions of the same algorithms. Results on the GPU are better than the results of the multi-core versions (maximum speedup of 14.8).
