Publications

Publications by João Gama

2020

BRIGHT-Drift-Aware Demand Predictions for Taxi Networks

Authors
Saadallah, A; Moreira Matias, L; Sousa, R; Khiari, J; Jenelius, E; Gama, J;

Publication
IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING

Abstract
Massive data broadcast by GPS-equipped vehicles provide unprecedented opportunities. One of the main tasks in order to optimize our transportation networks is to build data-driven real-time decision support systems. However, the dynamic environments where the networks operate disallow the traditional assumptions required to put in practice many off-the-shelf supervised learning algorithms, such as finite training sets or stationary distributions. In this paper, we propose BRIGHT: a drift-aware supervised learning framework to predict demand quantities. BRIGHT aims to provide accurate predictions for short-term horizons through a creative ensemble of time series analysis methods that handles distinct types of concept drift. By selecting neighborhoods dynamically, BRIGHT reduces the likelihood of overfitting. By ensuring diversity among the base learners, BRIGHT ensures a high reduction of variance while keeping bias stable. Experiments were conducted using three large-scale heterogeneous real-world transportation networks in Porto (Portugal), Shanghai (China), and Stockholm (Sweden), as well as with controlled experiments using synthetic data where multiple distinct drifts were artificially induced. The obtained results illustrate the advantages of BRIGHT in relation to state-of-the-art methods for this task.

2019

Main Factors Driving the Open Rate of Email Marketing Campaigns

Authors
Conceição, A; Gama, J;

Publication
Discovery Science - 22nd International Conference, DS 2019, Split, Croatia, October 28-30, 2019, Proceedings

Abstract
Email Marketing is one of the most important traffic sources in Digital Marketing. It yields a high return on investment for the company and offers a cheap and fast way to reach existing or potential clients. Getting the recipients to open the email is the first step for a successful campaign. Thus, it is important to understand how marketers can improve the open rate of a marketing campaign. In this work, we analyze the main factors driving the open rate of financial email marketing campaigns. For that purpose, we develop a classification algorithm that can accurately predict if a campaign will be labeled as Successful or Failure. A campaign is classified as Successful if it has an open rate higher than the average; otherwise it is labeled as Failure. To achieve this, we have employed and evaluated three different classifiers. Our results showed that it is possible to predict the performance of a campaign with approximately 82% accuracy, by using the Random Forest algorithm and the redundant filter selection technique. With this model, marketers will have the chance to correct sooner potential problems in a campaign that could highly impact its revenue. Additionally, a text analysis of the subject line and preheader was performed to discover which keywords and keyword combinations trigger a higher open rate. The results obtained were then validated in a real setting through A/B testing. © Springer Nature Switzerland AG 2019.
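The labeling rule stated in this abstract (Successful when the open rate is above the average, Failure otherwise) can be illustrated with a minimal sketch. The function name `label_campaigns`, the campaign names, and the open-rate values are hypothetical and not taken from the paper; the sketch covers only the target-label definition, not the classifiers or the feature selection.

```python
from statistics import mean

def label_campaigns(open_rates):
    """Label each campaign by comparing its open rate with the average.

    open_rates: dict mapping a campaign name to its open rate (in percent).
    Returns a dict mapping each campaign name to "Successful" or "Failure".
    """
    average = mean(open_rates.values())
    return {
        name: "Successful" if rate > average else "Failure"
        for name, rate in open_rates.items()
    }

# Hypothetical open rates in percent (illustrative values only).
rates = {"newsletter_a": 30, "promo_b": 10, "alert_c": 25}
labels = label_campaigns(rates)
```

Here the average is about 21.7%, so the first and third campaigns fall above it and the second falls below.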

2020

Impact of Trust and Reputation Based Brokerage on the CloudAnchor Platform

Authors
Veloso, B; Malheiro, B; Burguillo, JC; Gama, J;

Publication
Advances in Practical Applications of Agents, Multi-Agent Systems, and Trustworthiness. The PAAMS Collection - 18th International Conference, PAAMS 2020, L'Aquila, Italy, October 7-9, 2020, Proceedings

Abstract
This paper analyses the impact of trust and reputation modelling on CloudAnchor, a business-to-business brokerage platform for the transaction of single and federated resources on behalf of Small and Medium Sized Enterprises (SME). In CloudAnchor, businesses act as providers or consumers of Infrastructure as a Service (IaaS) resources. The platform adopts a multi-layered multi-agent architecture, where providers, consumers and virtual providers, representing provider coalitions, engage in trust & reputation-based provider look-up, invitation, acceptance and resource negotiations. The goal of this work is to assess the relevance of the distributed trust model and centralised fuzzified reputation service in the number of resources successfully transacted, the global turnover, brokerage fees, losses, expenses and time response. The results show that trust and reputation based brokerage has a positive impact on the CloudAnchor performance by reducing losses and the execution time for the provision of both single and federated resources and increasing considerably the number of federated resources provided. © 2020, Springer Nature Switzerland AG.

2020

REST framework: A modelling approach towards cooling energy stress mitigation plans for future cities in warming Global South

Authors
Bardhan, R; Debnath, R; Gama, J; Vijay, U;

Publication
SUSTAINABLE CITIES AND SOCIETY

Abstract
Future cities of the Global South will not only rapidly urbanise but will also get warmer from climate change and urbanisation induced effects. This will trigger a multi-fold increase in cooling demand, especially at a residential level, the mitigation of which remains a policy and research gap. This study puts forward a novel residential energy stress mitigation framework called REST to estimate warming climate-induced energy stress in residential buildings using a GIS-driven urban heat island and energy modelling approach. REST further estimates rooftop solar potential to enable solar photo-voltaic (PV) based decentralised energy solutions and establish an optimised routine for peer-to-peer energy sharing at a neighbourhood scale. The optimised network is classified through a decision tree algorithm to derive sustainability rules for mitigating energy stress at an urban planning scale. These sustainability rules established distributive energy justice variables in an urban planning context. The REST framework is applied as a proof-of-concept on a future smart city of India, named Amaravati. Results show that cooling energy stress can be reduced by 80 % in the study area through sensitive use of planning variables like Floor Space Index (FSI) and built-up density. This has crucial policy implications for the design and implementation of national-level cooling action plans in the future cities of the Global South to meet the UN-SDG - 7 (clean and affordable energy) and SDG - 11 (sustainable cities and communities) targets.

2019

Pruned Sets for Multi-Label Stream Classification without True Labels

Authors
Costa Junior, JD; Faria, ER; Silva, JA; Gama, J; Cerri, R;

Publication
2019 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN)

Abstract
In multi-label classification problems an example can be simultaneously classified into more than one class. This is also a challenging task in Data Streams (DS) classification, where unbounded and non-stationary distributed multi-label data contain multiple concepts that drift at different rates and patterns. In addition, the true labels of the examples may never become available, so updating classification models in a supervised fashion is unfeasible. In this paper, we propose a Multi-Label Stream Classification (MLSC) method that applies a Novelty Detection (ND) procedure to update the classification model by detecting any new patterns in the examples, which differ in some aspects from observed patterns, in an unsupervised fashion without any external feedback. Although ND is suitable for multi-class stream classification, it is still not a well-investigated task for multi-label problems. We improve an initial work proposed in [1] and extend it with a new Pruned Sets (PS) transformation strategy. The experiments showed that our method presents competitive performance over data sets with different concept drifts, and outperforms, in some aspects, the baseline methods.
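For readers unfamiliar with the Pruned Sets idea, the following is a simplified batch sketch of the general transformation as it is commonly described: each distinct label set is treated as a single class, infrequent label sets are pruned, and pruned examples are reintroduced with frequent subsets of their labels. It is not the streaming strategy proposed in the paper; the function name `pruned_sets`, the `min_count` parameter, and the toy data are assumptions made for illustration.

```python
from collections import Counter

def pruned_sets(examples, min_count=2):
    """Simplified Pruned Sets transformation (illustrative sketch).

    examples: list of (features, frozenset_of_labels) pairs.
    Label sets seen fewer than min_count times are pruned; each pruned
    example is reintroduced once per frequent proper subset of its labels.
    """
    counts = Counter(labels for _, labels in examples)
    frequent = {ls for ls, c in counts.items() if c >= min_count}
    # Sort the frequent label sets so the output order is deterministic.
    ordered = sorted(frequent, key=sorted)
    transformed = []
    for features, labels in examples:
        if labels in frequent:
            transformed.append((features, labels))
        else:
            for subset in ordered:
                if subset < labels:  # proper subset of the pruned label set
                    transformed.append((features, subset))
    return transformed

# Toy data: the label set {a, b, c} occurs only once, so it is pruned.
data = [
    ("x1", frozenset({"a", "b"})),
    ("x2", frozenset({"a", "b"})),
    ("x3", frozenset({"c"})),
    ("x4", frozenset({"c"})),
    ("x5", frozenset({"a", "b", "c"})),
]
result = pruned_sets(data)
```

In this toy run, `x5` is reintroduced twice, once with `{a, b}` and once with `{c}`, so every remaining label set is a class with at least `min_count` occurrences in the original data.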

2020

A drift detection method based on dynamic classifier selection

Authors
Pinage, F; dos Santos, EM; Gama, J;

Publication
DATA MINING AND KNOWLEDGE DISCOVERY

Abstract
Machine learning algorithms can be applied to several practical problems, such as spam, fraud and intrusion detection, and customer preferences, among others. In most of these problems, data come in streams, which means that the data distribution may change over time, leading to concept drift. The literature provides abundant supervised methods based on error monitoring for explicit drift detection. However, these methods may become infeasible in some real-world applications where no fully labeled data are available, and may depend on a significant decrease in accuracy to be able to detect drifts. There are also methods based on blind approaches, where the decision model is updated constantly. However, this may lead to unnecessary system updates. In order to overcome these drawbacks, we propose in this paper a semi-supervised drift detector that uses an ensemble of classifiers based on self-training online learning and dynamic classifier selection. For each unknown sample, a dynamic selection strategy is used to choose, among the ensemble's component members, the classifier most likely to be the correct one for classifying it. The prediction assigned by the chosen classifier is used to compute an estimate of the error produced by the ensemble members. The proposed method monitors such a pseudo-error in order to detect drifts and to update the decision model only after drift detection. The achievement of this method is relevant in that it allows drift detection and reaction and is applicable in several practical problems. The experiments conducted indicate that the proposed method attains high performance and detection rates, while reducing the amount of labeled data used to detect drift.
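The pseudo-error monitoring described in this abstract can be pictured with a minimal sketch. The abstract does not specify the detection rule or the selection strategy, so everything here is an assumption: the class name `PseudoErrorMonitor`, the sliding-window comparison against a reference level, the `window` and `threshold` parameters, and the fact that the index of the chosen classifier is simply passed in rather than computed by dynamic selection.

```python
from collections import deque

class PseudoErrorMonitor:
    """Flags drift when the windowed pseudo-error rises above a reference level.

    The pseudo-error of a sample is the fraction of ensemble members whose
    prediction disagrees with the prediction of the chosen classifier.
    """

    def __init__(self, window=5, threshold=0.3):
        self.window = window
        self.threshold = threshold
        self.reference = None  # mean pseudo-error of the first full window
        self.recent = deque(maxlen=window)

    def update(self, member_predictions, chosen_index):
        # The chosen classifier's prediction plays the role of the true label.
        pseudo_label = member_predictions[chosen_index]
        disagreements = sum(p != pseudo_label for p in member_predictions)
        self.recent.append(disagreements / len(member_predictions))
        if len(self.recent) < self.window:
            return False
        current = sum(self.recent) / self.window
        if self.reference is None:
            self.reference = current
            return False
        return current - self.reference > self.threshold

# Hypothetical stream: the members agree at first, then start to disagree.
stream = [[1, 1, 1]] * 5 + [[1, 0, 0]] * 5
monitor = PseudoErrorMonitor()
flags = [monitor.update(preds, chosen_index=0) for preds in stream]
first_drift = flags.index(True)
```

No true labels are consumed at any point, which is the property the abstract emphasizes; a fuller version would retrain or replace ensemble members once a drift is flagged.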
