Publications by CRACS

2009

Improving the efficiency of inductive logic programming systems

Authors
Fonseca, NA; Costa, VS; Rocha, R; Camacho, R; Silva, F;

Publication
SOFTWARE-PRACTICE & EXPERIENCE

Abstract
Inductive logic programming (ILP) is a sub-field of machine learning that provides an excellent framework for multi-relational data mining applications. The advantages of ILP have been successfully demonstrated in complex and relevant industrial and scientific problems. However, to produce valuable models, ILP systems often require long running times and large amounts of memory. In this paper we address fundamental issues that have a direct impact on the efficiency of ILP systems. Namely, we discuss how improvements in the indexing mechanisms of an underlying logic programming system benefit ILP performance. Furthermore, we propose novel data structures to reduce memory requirements and we suggest a new lazy evaluation technique to search the hypothesis space more efficiently. These proposals have been implemented in the April ILP system and evaluated using several well-known data sets. The results show significant improvements in running time without compromising the accuracy of the generated models. Indeed, the combined techniques achieve speedups of several orders of magnitude on some data sets. Moreover, memory requirements are reduced in nearly half of the data sets. Copyright (C) 2008 John Wiley & Sons, Ltd.
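
The abstract describes the lazy evaluation technique only at a high level. Purely as an illustration of the general idea, not of April's actual implementation, the Python sketch below shows one common form of laziness in ILP coverage testing: stop proving negative examples as soon as a clause is known to exceed its noise bound, instead of scanning the whole example set. The names lazy_negative_coverage, covers and max_allowed are our own.

    from typing import Callable, Iterable

    def lazy_negative_coverage(covers: Callable[[object], bool],
                               negatives: Iterable[object],
                               max_allowed: int) -> bool:
        # Return True if the clause stays within the allowed number of
        # covered negatives, stopping as soon as the bound is exceeded
        # rather than running every (potentially expensive) proof.
        covered = 0
        for example in negatives:
            if covers(example):
                covered += 1
                if covered > max_allowed:
                    return False  # already too general: stop early
        return True

    # Toy demo: the "clause" covers even numbers and at most 2 covered
    # negatives are tolerated, so only 5 of the 100 examples are tested.
    print(lazy_negative_coverage(lambda ex: ex % 2 == 0, range(100), 2))

The saving comes from skipping the remaining coverage proofs, which in a real ILP system are full theorem-proving calls, once a clause is already known to be too general.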

2009

Parallel ILP for distributed-memory architectures

Authors
Fonseca, NA; Srinivasan, A; Silva, F; Camacho, R;

Publication
MACHINE LEARNING

Abstract
The growth of machine-generated relational databases, both in the sciences and in industry, is rapidly outpacing our ability to extract useful information from them by manual means. This has brought into focus machine learning techniques like Inductive Logic Programming (ILP) that are able to extract human-comprehensible models for complex relational data. The price to pay is that ILP techniques are not efficient: they can be seen as performing a form of discrete optimisation, which is known to be computationally hard; and the complexity is usually some super-linear function of the number of examples. While little can be done to alter the theoretical bounds on the worst-case complexity of ILP systems, some practical gains may follow from the use of multiple processors. In this paper we survey the state of the art in parallel ILP. We implement several parallel algorithms and study their performance using some standard benchmarks. The principal findings of interest are these: (1) of the techniques investigated, one that simply constructs models in parallel on each processor using a subset of data and then combines the models into a single one, yields the best results; and (2) sequential (approximate) ILP algorithms based on randomized searches have lower execution times than (exact) parallel algorithms, without sacrificing the quality of the solutions found.
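
The best-performing technique is described only at the level of its strategy: learn on disjoint subsets, then combine. The Python sketch below is our own minimal illustration of that learn-then-combine pattern, with learn_theory reduced to a toy stand-in for a real ILP run; it is not the paper's implementation.

    from multiprocessing import Pool

    def learn_theory(examples):
        # Stand-in for a full ILP run over one partition of the data:
        # here we simply "induce" one clause per distinct example.
        return {f"clause_for({e})" for e in set(examples)}

    def parallel_learn(all_examples, n_workers=4):
        # Give each worker a disjoint subset of the examples.
        chunks = [all_examples[i::n_workers] for i in range(n_workers)]
        with Pool(n_workers) as pool:
            local_theories = pool.map(learn_theory, chunks)
        # Combine the local models into a single theory, here by
        # taking the union of the induced clauses.
        return set().union(*local_theories)

    if __name__ == "__main__":
        print(parallel_learn(["a", "b", "a", "c", "b", "d"]))

A real combiner would also have to resolve redundant or conflicting clauses; the union step above is the simplest possible choice.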

2009

Parallel calculation of multi-electrode array correlation networks

Authors
Ribeiro, P; Simonotto, J; Kaiser, M; Silva, F;

Publication
JOURNAL OF NEUROSCIENCE METHODS

Abstract
When calculating correlation networks from multi-electrode array (MEA) data, the computations involved are extensive. Unfortunately, as MEAs grow bigger, the time needed for the computation grows even faster: calculating pair-wise correlations for current 60-channel systems can take hours on commodity computers, whereas for future 1000-channel systems it would take almost 280 times as long, given that the number of pairs increases with the square of the number of channels. Even taking into account increases in processor speed, it may soon become infeasible to compute correlations on a single computer. Parallel computing is a way to sustain reasonable calculation times in the future. We provide a general tool for rapid computation of correlation networks which was tested on: (a) a single computer cluster with 16 cores, (b) the Newcastle Condor System, utilizing idle processors of university computers, and (c) the inter-cluster, with 192 cores. Our reusable tool provides a simple interface for neuroscientists, automating data partition and job submission, and also allowing coding in any programming language. It is also sufficiently flexible to be used in other high-performance computing environments.
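
The roughly 280-fold factor follows directly from the quadratic growth in the number of channel pairs; the short check below works it out (the timing claim itself is the paper's measurement, only the pair counting is reproduced here):

    from math import comb

    pairs_60 = comb(60, 2)        # 1,770 channel pairs
    pairs_1000 = comb(1000, 2)    # 499,500 channel pairs
    print(pairs_1000 / pairs_60)  # ~282, the roughly 280x factor above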

2009

Responding to questionnaires on the Web using XwQuest

Authors
Leal, JP;

Publication
Proceedings of the IADIS International Conference WWW/Internet 2009, ICWI 2009

Abstract
This paper reports on the design, implementation and evaluation of XwQuest, a tool for responding to questionnaires on the Web. The distinctive feature of this tool is an XML definition of the questionnaire that focuses on questions and admissible answers while avoiding presentation details. This questionnaire definition is processed on the browser side and converted to an Ajax application. Collected responses are periodically sent back to the server and can be retrieved by researchers and processed in a standard spreadsheet program. The paper details the questionnaire language of XwQuest and a generator that converts questionnaire definitions into Ajax applications. Two case studies where XwQuest was used with good results are also presented. © 2009 IADIS.
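
The abstract does not show XwQuest's XML schema, and the real tool performs the conversion in the browser. Purely as an illustration of separating questions and admissible answers from presentation, the Python sketch below renders a hypothetical questionnaire definition (our own invention, not XwQuest's format) as an HTML form:

    import xml.etree.ElementTree as ET

    # Hypothetical questionnaire definition in the spirit of XwQuest's
    # idea: only questions and admissible answers, no presentation.
    SAMPLE = """
    <questionnaire>
      <question id="q1" text="How often do you use the tool?">
        <answer>Daily</answer>
        <answer>Weekly</answer>
        <answer>Never</answer>
      </question>
    </questionnaire>
    """

    def to_html_form(xml_text: str) -> str:
        # Render each question as a radio-button group, leaving all
        # presentation decisions to the generator, not the author.
        root = ET.fromstring(xml_text)
        parts = ["<form>"]
        for q in root.findall("question"):
            parts.append(f"<fieldset><legend>{q.get('text')}</legend>")
            for a in q.findall("answer"):
                parts.append(
                    f'<label><input type="radio" name="{q.get("id")}" '
                    f'value="{a.text}"> {a.text}</label>'
                )
            parts.append("</fieldset>")
        parts.append("</form>")
        return "\n".join(parts)

    print(to_html_form(SAMPLE))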

2009

CrimsonHex: A Service Oriented Repository of Specialised Learning Objects

Authors
Leal, JP; Queiros, R;

Publication
ENTERPRISE INFORMATION SYSTEMS-BK

Abstract
The cornerstone of the interoperability of eLearning systems is the standard definition of learning objects. Nevertheless, for some domains this standard is insufficient to fully describe all the assets, especially when they are used as input for other eLearning services. On the other hand, a standard definition of learning objects is not enough to ensure interoperability among eLearning systems; they must also use a standard API to exchange learning objects. This paper presents the design and implementation of a service oriented repository of learning objects called crimsonHex. This repository is fully compliant with the existing interoperability standards and supports new definitions of learning objects for specialized domains. We illustrate this feature with the definition of programming problems as learning objects and its validation by the repository. This repository is also prepared to store usage data on learning objects to tailor the presentation order and adapt it to learner profiles.
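
The abstract names neither the repository's routes nor its exchange protocol, so the fragment below is only a hedged sketch of what a client of a service-oriented learning object repository might look like; the base URL and path are entirely hypothetical.

    import urllib.request

    # Hypothetical endpoint: crimsonHex's actual API is not given in
    # the abstract, so the URL and route below are illustrative only.
    BASE_URL = "https://repository.example.org"

    def fetch_learning_object(object_id: str) -> bytes:
        # Retrieve a packaged learning object (e.g. an IMS content
        # package) from the repository over plain HTTP.
        url = f"{BASE_URL}/learning-objects/{object_id}"
        with urllib.request.urlopen(url) as resp:
            return resp.read()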

2009

Defining programming problems as learning objects

Authors
Leal, JP; Queiros, R;

Publication
World Academy of Science, Engineering and Technology

Abstract
Standards for learning objects focus primarily on content presentation. They have already been extended to support automatic evaluation, but only for exercises with a predefined set of answers. The existing standards lack the metadata required by specialized evaluators to handle types of exercises with an indefinite set of solutions. To address this issue, existing learning object standards were extended to the particular requirements of a specialized domain. A definition of programming problems as learning objects, compatible both with Learning Management Systems and with systems performing automatic evaluation of programs, is presented in this paper. The proposed definition includes metadata that cannot be conveniently represented using existing standards, such as: the type of automatic evaluation; the requirements of the evaluation engine; and the roles of different assets - test cases, program solutions, etc. The EduJudge project and its main services are also presented as a case study on the use of the proposed definition of programming problems as learning objects.
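
The exact metadata fields are not listed in the abstract, so the sketch below is our own reading of the three kinds of metadata it names (evaluation type, engine requirements, asset roles), expressed as a Python data structure rather than the EduJudge schema:

    from dataclasses import dataclass, field

    # Illustrative only: the field names are our interpretation of the
    # metadata the paper says existing standards lack.
    @dataclass
    class ProgrammingProblemLO:
        title: str
        evaluation_type: str            # e.g. run submissions against tests
        engine_requirements: list[str]  # what the evaluator needs installed
        assets: dict[str, str] = field(default_factory=dict)  # role -> file

    problem = ProgrammingProblemLO(
        title="Sum of two numbers",
        evaluation_type="dynamic",
        engine_requirements=["gcc >= 4.0"],
        assets={
            "statement": "statement.html",
            "solution": "solution.c",      # reference program
            "test-case-1": "tests/t1.txt", # input/expected-output pair
        },
    )
    print(problem)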
