Publications

Publications by João Gama

2016

Probabilistic Forecasting of Day-ahead Electricity Prices for the Iberian Electricity Market

Authors
Moreira, R; Bessa, R; Gama, J;

Publication
2016 13TH INTERNATIONAL CONFERENCE ON THE EUROPEAN ENERGY MARKET (EEM)

Abstract
With the liberalization of the electricity markets, price forecasting has become crucial for the decision-making process of market agents. The unique features of electricity prices, such as non-stationarity, non-linearity and high volatility, make this a very difficult task. For this reason, rather than a simple point forecast, market participants are more interested in a probabilistic forecast, which is essential to estimate the uncertainty involved in the price. Focusing on this issue, the aim of this paper is to analyze the impact of external factors on the electricity price and to present a methodology for probabilistic forecasting of day-ahead electricity prices in the Iberian electricity market. The models are built using regression techniques and aim to obtain, for each hour, the quantiles from 5% to 95% in steps of 5%.
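As a rough illustration of the quantile grid mentioned in this abstract, the sketch below lists the 19 nominal levels from 5% to 95% and scores a quantile forecast with the pinball loss. The pinball loss is a standard score for quantile forecasts, assumed here for illustration; the abstract does not state which regression technique or evaluation score the authors used, and the function names are hypothetical.

```python
def quantile_levels(start=5, stop=95, step=5):
    """Return nominal quantile levels as fractions: 0.05, 0.10, ..., 0.95."""
    return [p / 100 for p in range(start, stop + 1, step)]


def pinball_loss(y_true, y_pred, tau):
    """Average pinball (quantile) loss of forecasts y_pred at level tau.

    Assumption: this is a generic scoring rule, not necessarily the one
    used in the paper.
    """
    total = 0.0
    for y, q in zip(y_true, y_pred):
        diff = y - q
        # Under-forecasts are weighted by tau, over-forecasts by (1 - tau).
        total += tau * diff if diff >= 0 else (tau - 1) * diff
    return total / len(y_true)
```

For a high level such as 0.9, forecasting below the observed price is penalized more heavily than forecasting above it, which is what pushes the fitted value toward the upper tail.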

2016

Sampling Evolving Ego-Networks with forgetting Factor

Authors
Tabassum, Shazia; Gama, Joao;

Publication
IEEE 17th International Conference on Mobile Data Management, MDM 2016, Porto, Portugal, June 13-16, 2016 - Workshops

Abstract

2016

Sampling massive streaming call graphs

Authors
Tabassum, S; Gama, J;

Publication
Proceedings of the 31st Annual ACM Symposium on Applied Computing, Pisa, Italy, April 4-8, 2016

Abstract
The problem of analyzing massive graph streams in real time is growing along with the size of the streams. Sampling techniques have been used to analyze these streams in real time. However, it is difficult to answer questions such as: Which structures are well preserved by the sampling techniques over the evolution of streams? Which sampling techniques yield proper estimates for directed and weighted graphs? Which techniques have the lowest time complexity? In this work, we answer these questions by comparing and analyzing the evolutionary samples of such graph streams. We evaluate sequential sampling techniques by comparing the structural metrics of their samples. We also present a biased version of reservoir sampling, which shows better comparative results in our scenario. We carried out rigorous experiments over a massive stream of 300 million calls made by 11 million anonymous subscribers over 31 days. We evaluated node-based and edge-based sampling methods, comparing the samples generated by sequential algorithms such as the space-saving algorithm for finding top-K items, reservoir sampling, and a biased version of reservoir sampling. Our overall results and observations show that edge-based samples perform well in our scenario. We have also compared the distribution of degrees and biases of evolutionary samples. © 2016 ACM.
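For readers unfamiliar with reservoir sampling, a minimal sketch of the classic uniform version (often called Algorithm R) applied to a stream of edges follows. The biased variant proposed in the paper is not described in this abstract, so it is not reproduced here; the function name and the `seed` parameter are illustrative assumptions.

```python
import random


def reservoir_sample(stream, k, seed=None):
    """Keep a uniform random sample of k items from a stream of unknown length.

    Classic uniform reservoir sampling; NOT the biased variant from the paper.
    """
    rng = random.Random(seed)
    reservoir = []
    for i, item in enumerate(stream):
        if i < k:
            # Fill the reservoir with the first k items.
            reservoir.append(item)
        else:
            # Replace a stored item with probability k / (i + 1).
            j = rng.randint(0, i)  # inclusive on both ends
            if j < k:
                reservoir[j] = item
    return reservoir
```

Each call edge, for example a `(caller, callee)` pair, is seen once and the memory used stays bounded by `k`, which is what makes the approach suitable for streams.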

2016

Social Network Analysis in Streaming Call Graphs

Authors
Sarmento, R; Oliveira, M; Cordeiro, M; Tabassum, S; Gama, J;

Publication
Studies in Big Data

Abstract
Mobile phones are powerful tools to connect people. The streams of Call Detail Records (CDRs) generated by these devices provide a powerful abstraction of social interactions between individuals, representing social structures. Call graphs can be deduced from these CDRs, where nodes represent subscribers and edges represent the phone calls made. These graphs may easily reach millions of nodes and billions of edges. Besides being large-scale and generated in real time, the underlying social networks are inherently complex and, thus, difficult to analyze. Conventional data analysis performed by telecom operators is slow, done on request and implies heavy costs in data warehouses. In the face of these challenges, real-time streaming analysis becomes an ever-increasing need for mobile operators, since it enables them to quickly detect important network events and optimize business operations. Sampling, together with visualization techniques, is required for online exploratory data analysis and event detection in such networks. In this chapter, we report on the burgeoning body of research in network sampling, visualization of streaming social networks, stream analysis and the solutions proposed so far. © 2016, Springer International Publishing Switzerland.

2016

Time-evolving O-D matrix estimation using high-speed GPS data streams

Authors
Moreira Matias, L; Gama, J; Ferreira, M; Mendes Moreira, J; Damas, L;

Publication
EXPERT SYSTEMS WITH APPLICATIONS

Abstract
Portable digital devices equipped with GPS antennas are ubiquitous sources of continuous information for location-based Expert and Intelligent Systems. The availability of these traces of human mobility patterns is growing explosively. Mining this data is a fascinating challenge which can have a big impact on both travelers and transit agencies. This paper proposes a novel incremental framework to maintain statistics on urban mobility dynamics over a time-evolving origin-destination (O-D) matrix. The main motivation behind this task is to be able to learn from the location-based samples which are continuously being produced, independently of their source, dimensionality or (high) communication rate. By doing so, the authors aimed to obtain a generalist framework capable of summarizing relevant context-aware information and of following, as closely as possible, the stochastic dynamics of human mobility behavior. Its potential impact ranges across Expert Systems for decision support in multiple industries, from demand estimation for public transportation planning to travel time prediction for intelligent routing systems, among others. The proposed methodology rests on three steps: (i) Half-Space trees are used to divide the city area into dense subregions of equal mass. The uncovered regions form an O-D matrix which can be updated by transforming the trees' leaves into conditional nodes (and vice versa). (ii) The Partitioning Incremental Algorithm is then employed to discretize the target variable's historical values on each matrix cell. Finally, (iii) a dimensional hierarchy is defined to discretize the domains of the independent variables depending on the cell's samples. A Taxi Network running in a mid-sized city in Portugal was selected as a case study. The Travel Time Estimation (TTE) problem was regarded as a real-world application. Experiments using one million data samples were conducted to validate the methodology. The results obtained highlight the straightforward contribution of this method: it is capable of resisting drift while still approximating context-aware solutions through a multidimensional discretization of the feature space. It is a step ahead in estimating real-time mobility dynamics, regardless of the application field.
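To make the idea of incrementally maintained O-D statistics concrete, the sketch below keeps a running mean travel time per origin-destination cell, updated one sample at a time. It is a deliberate simplification: regions are assumed to be given as fixed labels rather than discovered with Half-Space trees, and no discretization step is included. The class and method names are hypothetical.

```python
from collections import defaultdict


class ODMatrix:
    """Running travel-time statistics per (origin, destination) cell.

    Assumption: origin/destination regions are fixed labels supplied by
    the caller; the paper instead derives regions with Half-Space trees.
    """

    def __init__(self):
        self.count = defaultdict(int)
        self.mean = defaultdict(float)

    def update(self, origin, destination, travel_time):
        """Fold one trip into the cell's running mean without storing the trip."""
        key = (origin, destination)
        self.count[key] += 1
        self.mean[key] += (travel_time - self.mean[key]) / self.count[key]

    def estimate(self, origin, destination):
        """Return the current mean travel time for a cell, or None if unseen."""
        key = (origin, destination)
        return self.mean[key] if key in self.count else None
```

Because each update touches only one cell and needs constant memory, the matrix can follow a high-rate GPS stream without keeping the raw samples.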

2013

Predicting Taxi-Passenger Demand Using Streaming Data

Authors
Moreira Matias, L; Gama, J; Ferreira, M; Mendes Moreira, J; Damas, L;

Publication
IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS

Abstract
Informed driving is increasingly becoming a key feature for increasing the sustainability of taxi companies. The sensors that are installed in each vehicle are providing new opportunities for automatically discovering knowledge, which, in return, delivers information for real-time decision making. Intelligent transportation systems for taxi dispatching and for finding time-saving routes are already exploring these sensing data. This paper introduces a novel methodology for predicting the spatial distribution of taxi-passengers for a short-term time horizon using streaming data. First, the information was aggregated into a histogram time series. Then, three time-series forecasting techniques were combined to produce a prediction. Experimental tests were conducted using the online data that are transmitted by 441 vehicles of a fleet running in the city of Porto, Portugal. The results demonstrated that the proposed framework can provide effective insight into the spatiotemporal distribution of taxi-passenger demand for a 30-min horizon.
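The first step named in this abstract, aggregating events into a histogram time series, can be sketched as follows. The event format (a timestamp in seconds plus a zone label) and the 30-minute bin width are assumptions made for illustration; the three forecasting techniques that the paper combines are not named here and are not reproduced.

```python
from collections import defaultdict


def histogram_time_series(events, bin_seconds=1800):
    """Count events per zone in consecutive fixed-width time bins.

    events: iterable of (timestamp_seconds, zone) pairs (hypothetical format).
    Returns {zone: [count_bin0, count_bin1, ...]} with equal-length lists.
    """
    counts = defaultdict(lambda: defaultdict(int))
    for timestamp, zone in events:
        counts[zone][int(timestamp // bin_seconds)] += 1
    if not counts:
        return {}
    last_bin = max(b for bins in counts.values() for b in bins)
    # Fill bins with no events with zero so all series align in time.
    return {
        zone: [bins.get(b, 0) for b in range(last_bin + 1)]
        for zone, bins in counts.items()
    }
```

Each resulting list is one time series per zone, which is the kind of input a short-term demand forecaster would consume.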
