Publications

Publications by João Mendes Moreira

2016

An online learning approach to eliminate Bus Bunching in real-time

Authors
Moreira Matias, L; Cats, O; Gama, J; Mendes Moreira, J; de Sousa, JF;

Publication
APPLIED SOFT COMPUTING

Abstract
Recent advances in telecommunications created new opportunities for monitoring public transport operations in real-time. This paper presents an automatic control framework to mitigate the Bus Bunching phenomenon in real-time. The framework depicts a powerful combination of distinct Machine Learning principles and methods to extract valuable information from raw location-based data. State-of-the-art tools and methodologies such as Regression Analysis, Probabilistic Reasoning and Perceptron's learning with Stochastic Gradient Descent constitute building blocks of this predictive methodology. The prediction's output is then used to select and deploy a corrective action to automatically prevent Bus Bunching. The performance of the proposed method is evaluated using data collected from 18 bus routes in Porto, Portugal over a period of one year. Simulation results demonstrate that the proposed method can potentially reduce bunching by 68% and decrease average passenger waiting times by 4.5%, without prolonging in-vehicle times. The proposed system could be embedded in a decision support system to improve control room operations. (C) 2016 Published by Elsevier B.V.

CloseRead Abstract

2016

Concept Neurons - Handling Drift Issues for Real-Time Industrial Data Mining

Authors
Moreira Matias, L; Gama, J; Mendes Moreira, J;

Publication
MACHINE LEARNING AND KNOWLEDGE DISCOVERY IN DATABASES, ECML PKDD 2016, PT III

Abstract
Learning from data streams is a challenge faced by data science professionals from multiple industries. Most of them struggle hardly on applying traditional Machine Learning algorithms to solve these problems. It happens so due to their high availability on ready-to-use software libraries on big data technologies (e.g. SparkML). Nevertheless, most of them cannot cope with the key characteristics of this type of data such as high arrival rate and/or non-stationary distributions. In this paper, we introduce a generic and yet simplistic framework to fill this gap denominated Concept Neurons. It leverages on a combination of continuous inspection schemas and residual-based updates over the model parameters and/or the model output. Such framework can empower the resistance of most of induction learning algorithms to concept drifts. Two distinct and hence closely related flavors are introduced to handle different drift types. Experimental results on successful distinct applications on different domains along transportation industry are presented to uncover the hidden potential of this methodology.

CloseRead Abstract

2015

Improving the accuracy of long-term travel time prediction using heterogeneous ensembles

Authors
Mendes Moreira, J; Jorge, AM; de Sousa, JF; Soares, C;

Publication
NEUROCOMPUTING

Abstract
This paper is about long-term travel time prediction in public transportation. However, it can be useful for a wider area of applications. It follows a heterogeneous ensemble approach with dynamic selection. A vast set of experiments with a pool of 128 tuples of algorithms and parameter sets (a&ps) has been conducted for each of the six studied routes. Three different algorithms, namely, random forest, projection pursuit regression and support vector machines, were used. Then, ensembles of different sizes were obtained after a pruning step. The best approach to combine the outputs is also addressed. Finally, the best ensemble approach for each of the six routes is compared with the best individual a&ps. The results confirm that heterogeneous ensembles are adequate for long-term travel time prediction. Namely, they achieve both higher accuracy and robustness along time than state-of-the-art learners.

CloseRead Abstract

2016

Time-evolving O-D matrix estimation using high-speed GPS data streams

Authors
Moreira Matias, L; Gama, J; Ferreira, M; Mendes Moreira, J; Damas, L;

Publication
EXPERT SYSTEMS WITH APPLICATIONS

Abstract
Portable digital devices equipped with GPS antennas are ubiquitous sources of continuous information for location-based Expert and Intelligent Systems. The availability of these traces on the human mobility patterns is growing explosively. To mine this data is a fascinating challenge which can produce a big impact on both travelers and transit agencies. This paper proposes a novel incremental framework to maintain statistics on the urban mobility dynamics over a time-evolving origin-destination (O-D) matrix. The main motivation behind such task is to be able to learn from the location-based samples which are continuously being produced, independently on their source, dimensionality or (high) communicational rate. By doing so, the authors aimed to obtain a generalist framework capable of summarizing relevant context-aware information which is able to follow, as close as possible, the stochastic dynamics on the human mobility behavior. Its potential impact ranges Expert Systems for decision support across multiple industries, from demand estimation for public transportation planning till travel time prediction for intelligent routing systems, among others. The proposed methodology settles on three steps: (i) Half-Space trees are used to divide the city area into dense subregions of equal mass. The uncovered regions form an O-D matrix which can be updated by transforming the trees'leaves into conditional nodes (and vice-versa). The (ii) Partioning Incremental Algorithm is then employed to discretize the target variable's historical values on each matrix cell. Finally, a (iii) dimensional hierarchy is defined to discretize the domains of the independent variables depending on the cell's samples. A Taxi Network running on a mid-sized city in Portugal was selected as a case study. The Travel Time Estimation (TTE) problem was regarded as a real-world application. Experiments using one million data samples were conducted to validate the methodology. The results obtained highlight the straightforward contribution of this method: it is capable of resisting to the drift while still approximating context-aware solutions through a multidimensional discretization of the feature space. It is a step ahead in estimating the real-time mobility dynamics, regardless of its application field.

CloseRead Abstract

2016

Towards Automatic Generation of Metafeatures

Authors
Pinto, F; Soares, C; Mendes Moreira, J;

Publication
ADVANCES IN KNOWLEDGE DISCOVERY AND DATA MINING, PAKDD 2016, PT I

Abstract
The selection of metafeatures for metalearning (MtL) is often an ad hoc process. The lack of a proper motivation for the choice of a metafeature rather than others is questionable and may originate a loss of valuable information for a given problem (e.g., use of class entropy and not attribute entropy). We present a framework to systematically generate metafeatures in the context of MtL. This framework decomposes a metafeature into three components: meta-function, object and post-processing. The automatic generation of metafeatures is triggered by the selection of a meta-function used to systematically generate metafeatures from all possible combinations of object and post-processing alternatives. We executed experiments by addressing the problem of algorithm selection in classification datasets. Results show that the sets of systematic metafeatures generated from our framework are more informative than the non-systematic ones and the set regarded as state-of-the-art.

CloseRead Abstract

2015

Urban Logistics Integrated in a Multimodal Mobility System

Authors
de Sousa, JF; Mendes Moreira, J;

Publication
2015 IEEE 18TH INTERNATIONAL CONFERENCE ON INTELLIGENT TRANSPORTATION SYSTEMS

Abstract
In this paper we briefly present our feelings about urban logistic and its role in urban mobility. In some way, we can say that this is a position paper based on an extensive review of all known related published material. We support the development of new approaches for the management of passenger and freight transport together as a single logistics system; based on the access to more and more sophisticated flows of data and better communication means, we envisage the dissemination of sufficient information for the correct decision of every citizens between several mobility options in real time (especially with the support of mobile technology); and we sustain that new tools are needed to help the design of innovative business models and policies, and the change of habits and behaviors. We visualize urban logistics as a multi-stakeholder, multi-criteria and multimodal mobility dynamic system.

CloseRead Abstract