Cookies Policy
The website need some cookies and similar means to function. If you permit us, we will use those means to collect data on your visits for aggregated statistics to improve our service. Find out More
Accept Reject
  • Menu
Publications

Publications by Carlos Manuel Soares

2014

Web mining for the integration of data mining with business intelligence in web-based decision support systems

Authors
Domingues, MA; Jorge, AM; Soares, C; Rezende, SO;

Publication
Integration of Data Mining in Business Intelligence Systems

Abstract
Web mining can be defined as the use of data mining techniques to automatically discover and extract information from web documents and services. A decision support system is a computer-based information sy Analysis stem that supports business or organizational decision-making activities. Data mining and business intelligence techniques can be integrated in order to develop more advanced decision support systems. In this chapter, the authors propose to use web mining as a process to develop advanced decision support systems in order to support the management activities of a website. They describe the Web mining process as a sequence of steps for the development of advanced decision support systems. By following such a sequence, the authors can develop advanced decision support systems, which integrate data mining with business intelligence, for websites. © 2015, IGI Global.

2014

A framework to decompose and develop metafeatures

Authors
Pinto, F; Soares, C; Mendes Moreira, J;

Publication
CEUR Workshop Proceedings

Abstract
This paper proposes a framework to decompose and develop metafeatures for Metalearning (MtL) problems. Several metafeatures (also known as data characteristics) are proposed in the literature for a wide range of problems. Since MtL applicability is very general but problem dependent, researchers focus on generating specific and yet informative metafeatures for each problem. This process is carried without any sort of conceptual framework. We believe that such framework would open new horizons on the development of metafeatures and also aid the process of understanding the metafeatures already proposed in the state-of-the-art. We propose a framework with the aim of fill that gap and we show its applicability in a scenario of algorithm recommendation for regression problems.

2016

Entropy-based discretization methods for ranking data

Authors
de Sa, CR; Soares, C; Knobbe, A;

Publication
INFORMATION SCIENCES

Abstract
Label Ranking (LR) problems are becoming increasingly important in Machine Learning. While there has been a significant amount of work on the development of learning algorithms for LR in recent years, there are not many pre-processing methods for LR Some methods, like Naive Bayes for LR and APRIORI-LR, cannot handle real-valued data directly. Conventional discretization methods used in classification are not suitable for LR problems, due to the different target variable. In this work, we make an extensive analysis of the existing methods using simple approaches. We also propose a new method called EDiRa (Entropy-based Discretization for Ranking) for the discretization of ranking data. We illustrate the advantages of the method using synthetic data and also on several benchmark datasets. The results clearly indicate that the discretization is performing as expected and also improves the results and efficiency of the learning algorithms.

2016

Exceptional Preferences Mining

Authors
de Sa, CR; Duivesteijn, W; Soares, C; Knobbe, A;

Publication
DISCOVERY SCIENCE, (DS 2016)

Abstract
Exceptional Preferences Mining (EPM) is a crossover between two subfields of datamining: local pattern mining and preference learning. EPM can be seen as a local pattern mining task that finds subsets of observations where the preference relations between subsets of the labels significantly deviate from the norm; a variant of Subgroup Discovery, with rankings as the (complex) target concept. We employ three quality measures that highlight subgroups featuring exceptional preferences, where the focus of what constitutes 'exceptional' varies with the quality measure: the first gauges exceptional overall ranking behavior, the second indicates whether a particular label stands out from the rest, and the third highlights subgroups featuring unusual pairwise label ranking behavior. As proof of concept, we explore five datasets. The results confirm that the new task EPM can deliver interesting knowledge. The results also illustrate how the visualization of the preferences in a Preference Matrix can aid in interpreting exceptional preference subgroups.

2014

Proceedings of the International Workshop on Meta-learning and Algorithm Selection co-located with 21st European Conference on Artificial Intelligence, MetaSel@ECAI 2014, Prague, Czech Republic, August 19, 2014

Authors
Vanschoren, J; Brazdil, P; Soares, C; Kotthoff, L;

Publication
MetaSel@ECAI

Abstract

2016

Selecting Collaborative Filtering Algorithms Using Metalearning

Authors
Cunha, T; Soares, C; Carvalho, ACPLFd;

Publication
Machine Learning and Knowledge Discovery in Databases - European Conference, ECML PKDD 2016, Riva del Garda, Italy, September 19-23, 2016, Proceedings, Part II

Abstract
Recommender Systems are an important tool in e-business, for both companies and customers. Several algorithms are available to developers, however, there is little guidance concerning which is the best algorithm for a specific recommendation problem. In this study, a metalearning approach is proposed to address this issue. It consists of relating the characteristics of problems (metafeatures) to the performance of recommendation algorithms. We propose a set of metafeatures based on the application of systematic procedure to develop metafeatures and by extending and generalizing the state of the art metafeatures for recommender systems. The approach is tested on a set of Matrix Factorization algorithms and a collection of real-world Collaborative Filtering datasets. The performance of these algorithms in these datasets is evaluated using several standard metrics. The algorithm selection problem is formulated as classification tasks, where the target attribute is the best Matrix Factorization algorithm, according to each metric. The results show that the approach is viable and that the metafeatures used contain information that is useful to predict the best algorithm for a dataset. © Springer International Publishing AG 2016.

  • 7
  • 38