Publications

Publications by João Gama

2008

Knowledge discovery from sensor data

Authors
Ganguly, AR; Gama, J; Omitaomu, OA; Gaber, MM; Vatsavai, RR;

Publication
Knowledge Discovery from Sensor Data

Abstract
As sensors become ubiquitous, a set of broad requirements is beginning to emerge across high-priority applications including disaster preparedness and management, adaptability to climate change, national or homeland security, and the management of critical infrastructures. This book presents innovative solutions in offline data mining and real-time analysis of sensor or geographically distributed data. It discusses the challenges and requirements for sensor-data-based knowledge discovery solutions in high-priority applications, illustrated with case studies. It explores the fusion between heterogeneous data streams from multiple sensor types and applications in science, engineering, and security. © 2009 by Taylor & Francis Group, LLC.

2008

Introduction

Authors
Ganguly, AR; Gama, J; Omitaomu, OA; Gaber, MM; Vatsavai, RR;

Publication
Knowledge Discovery from Sensor Data

Abstract

2007

OLINDDA: A cluster-based approach for detecting novelty and concept drift in data streams

Authors
Spinosa, EJ; de Carvalho, APDF; Gama, J;

Publication
APPLIED COMPUTING 2007, VOL 1 AND 2

Abstract
A machine learning approach that is capable of treating data streams presents new challenges and enables the analysis of a variety of real problems in which concepts change over time. In this scenario, the ability to identify novel concepts as well as to deal with concept drift are two important attributes. This paper presents a technique based on the k-means clustering algorithm aimed at considering those two situations in a single learning strategy. Experimental results performed with data from various domains provide insight into how clustering algorithms can be used for the discovery of new concepts in streams of data.
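
The general idea of cluster-based novelty detection can be illustrated with a small sketch. This is not the OLINDDA algorithm itself, only a simplified illustration under stated assumptions: the normal concept is summarised by k-means clusters, each cluster is given a radius equal to the distance of its farthest member, and an example that falls outside every cluster is flagged as "unknown" (a candidate for a novel concept or drift). The function names, the radius rule, and the first-k-points initialisation are all hypothetical choices made for the example.

```python
import math

def _dist(a, b):
    """Euclidean distance between two points given as tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def kmeans(points, k, iters=20):
    """Plain Lloyd's k-means. Centroids start at the first k points,
    an arbitrary choice made here only to keep the sketch deterministic."""
    centroids = [tuple(p) for p in points[:k]]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            j = min(range(k), key=lambda c: _dist(p, centroids[c]))
            groups[j].append(p)
        # Recompute each centroid as the mean of its group; keep the old
        # centroid if a group happens to be empty.
        centroids = [
            tuple(sum(col) / len(g) for col in zip(*g)) if g else centroids[j]
            for j, g in enumerate(groups)
        ]
    return centroids

def fit_normal_model(points, k):
    """Summarise the normal concept as a list of (centroid, radius) pairs."""
    centroids = kmeans(points, k)
    radii = [0.0] * k
    for p in points:
        j = min(range(k), key=lambda c: _dist(p, centroids[c]))
        radii[j] = max(radii[j], _dist(p, centroids[j]))
    return list(zip(centroids, radii))

def is_unknown(x, model):
    """True if x lies outside every cluster of the normal model."""
    return all(_dist(x, c) > r for c, r in model)

normal = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11)]
model = fit_normal_model(normal, k=2)
print(is_unknown((0.5, 0.5), model))  # inside the first cluster
print(is_unknown((5, 5), model))      # far from both clusters
```

In a streaming setting, examples flagged as unknown would typically be buffered and clustered in turn, so that only cohesive groups of them are treated as a new concept rather than as isolated outliers.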

2008

Cluster-based novel concept detection in data streams applied to intrusion detection in computer networks

Authors
Spinosa, EJ; de Carvalho, APDF; Gama, J;

Publication
APPLIED COMPUTING 2008, VOLS 1-3

Abstract
In this paper, a cluster-based novelty detection technique capable of dealing with a large amount of data is presented and evaluated in the context of intrusion detection. Starting with examples of a single class that describe the normal profile, the proposed technique detects novel concepts initially as cohesive clusters of examples and later as sets of clusters in an unsupervised incremental learning fashion. Experimental results with the KDD Cup 1999 data set show that the technique is capable of dealing with data streams, successfully learning novel concepts that are pure in terms of the real class structure.

2009

Adaptive Bayesian network classifiers

Authors
Castillo, G; Gama, J;

Publication
INTELLIGENT DATA ANALYSIS

Abstract
This paper is concerned with adaptive learning algorithms for Bayesian network classifiers in a prequential (on-line) learning scenario. In this scenario, new data is available over time. An efficient supervised learning algorithm must be able to improve its predictive accuracy by incorporating the incoming data, while optimizing the cost of updating. However, if the process is not strictly stationary, the target concept could change over time. Hence, the predictive model should be adapted quickly to these changes. The main contribution of this work is a proposal of a unified, adaptive prequential framework for supervised learning called AdPreqFr4SL, which attempts to handle the cost-performance trade-off and deal with concept drift. Starting with the simple Naive Bayes, we scale up the complexity by gradually increasing the maximum number of allowable attribute dependencies, and then by searching for new dependencies in the extended search space. Since updating the structure is a costly task, we use new data to primarily adapt the parameters. We adapt the structure only when it is actually necessary. The method for handling concept drift is based on the Shewhart P-Chart. We experimentally prove the advantages of using the AdPreqFr4SL in comparison with its non-adaptive versions.
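
As a rough illustration of how a Shewhart P-chart can signal concept drift, the sketch below monitors the error rate of a classifier over fixed-size batches. It is a minimal example under stated assumptions, not the AdPreqFr4SL procedure: the centre line is estimated from the first few batches, the usual three-sigma upper control limit p + 3·sqrt(p(1 − p)/n) is applied, and the first batch whose error rate exceeds that limit is reported. The function names and the choice of a fixed reference window are hypothetical.

```python
import math

def p_chart_limits(p_bar, n, k=3.0):
    """Lower and upper control limits of a P-chart for batches of size n."""
    sigma = math.sqrt(p_bar * (1.0 - p_bar) / n)
    lower = max(0.0, p_bar - k * sigma)
    upper = min(1.0, p_bar + k * sigma)
    return lower, upper

def detect_drift(batch_error_rates, batch_size, n_reference, k=3.0):
    """Return the index of the first batch whose error rate exceeds the
    upper control limit, or None if no batch does.

    The centre line is the mean error rate of the first n_reference
    batches (an assumption of this sketch)."""
    reference = batch_error_rates[:n_reference]
    p_bar = sum(reference) / len(reference)
    _, upper = p_chart_limits(p_bar, batch_size, k)
    for i in range(n_reference, len(batch_error_rates)):
        if batch_error_rates[i] > upper:
            return i
    return None

# Error rates per batch of 100 examples; the last batch degrades sharply.
rates = [0.10, 0.12, 0.08, 0.10, 0.11, 0.30]
print(detect_drift(rates, batch_size=100, n_reference=4))
```

Only the upper limit is checked here because a drop in error rate is not a problem for the learner; a drift signal would normally trigger the more expensive adaptation step, in line with the abstract's point that structure updates are costly.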

2009

Knowledge discovery from data streams Introduction

Authors
Gama, J; Ganguly, A; Omitaomu, O; Vatsavai, R; Gaber, M;

Publication
INTELLIGENT DATA ANALYSIS

Abstract
