Publications

Publications by Inês Dutra

2011

STUDYING THE RELEVANCE OF BREAST IMAGING FEATURES

Authors
Ferreira, P; Dutra, I; Fonseca, NA; Woods, R; Burnside, E;

Publication
HEALTHINF 2011: PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON HEALTH INFORMATICS

Abstract
Breast screening is the regular examination of a woman's breasts to find breast cancer at an early stage. The sole exam approved for this purpose is mammography, which, despite the existence of more advanced technologies, is considered the cheapest and most efficient method to detect cancer at a preclinical stage. We investigate, using machine learning techniques, how attributes obtained from mammograms can relate to malignancy. In particular, this study focuses on how mass density can influence malignancy, using a data set of 348 patients containing, among other information, biopsy results. To this end, we applied different learning algorithms to the data set using the WEKA tools, and performed significance tests on the results. The conclusions are threefold: (1) automatic classification of a mammogram can reach results equal to or better than those annotated by specialists, which can help doctors quickly concentrate on specific mammograms for more thorough study; (2) mass density seems to be a good indicator of malignancy, as previous studies suggested; (3) we can obtain classifiers that predict mass density with a quality as good as that of a specialist blind to biopsy.


2009

UbiDis: a Flexible and General top-level Middleware to Manage Applications in Grids and Clusters

Authors
Fonseca, NA; Dutra, I;

Publication
IBERGRID: 3RD IBERIAN GRID INFRASTRUCTURE CONFERENCE PROCEEDINGS

Abstract
From an application point of view, Grid computing, with its substantial processing power and large data storage capacity, offers the possibility to process large quantities of data, to run computationally intensive operations, or both. For instance, in computational biology pipelines, one often has to process large quantities of data in individually computationally intensive operations. To process this data in the Grid, hundreds or even thousands of jobs need to be submitted and their results processed. Performing these tasks manually is infeasible. On the other hand, developing software to this end specifically for a single application is unproductive, because if the application or the Grid submission engine changes, the code needs to be rewritten. In this paper we present a middleware that facilitates the submission of jobs to grids (or clusters) and helps handle their results. The middleware, which we call UbiDis (Ubiquitous Distribution), copies all files necessary for running the program to the UI or front-end host (in a Grid or cluster), compiles programs on the UI or front-end (if necessary), generates and submits the jobs, and copies the outputs to the local machine. Furthermore, UbiDis transparently generates jobs for different job managers, allowing the user to easily and quickly change the location to which the jobs are submitted. Finally, we illustrate the usefulness of UbiDis using two applications.


2007

Grid applications in EELA

Authors
Abarca, R; Acero, A; Aparicio, G; Baeza, C; Barbera, R; Blanco, F; Blanquer, I; Carrillo, M; Luis Chaves, JL; Cofino, A; Cruz, J; Diniz, M; Domingues, G; Teresa Dova, MT; Dutra, I; Echeverria, F; Enriquez, L; Fernandez Lima, F; Fernandez Nodarse, F; Fernandez, M; Fernandez, V; Franca, F; Manuel Gutierrez, JM; Hernandez, A; Hernandez, V; Isea, R; Lima, P; Lopez, D; Mayo, R; Miguel, R; Montes, E; Ricardo Mora, HR; Moreveli Espinoza, M; Nellen, L; Pereira, G; Pezoa, R; Porto, A; Salinas, L; Silva, E; Tolla, C;

Publication
IBERGRID: 1ST IBERIAN GRID INFRASTRUCTURE CONFERENCE PROCEEDINGS

Abstract
Several international projects and collaborations have emerged in recent years due to the increasing demand for Grid resources. One important aspect of these initiatives deals with the gridification of computing-intensive scientific applications that are otherwise difficult to run efficiently. The EELA Project (E-Infrastructure shared between Europe and Latin America) is a collaboration of Latin American and European institutions which has developed a performance e-Infrastructure for e-Science applications in the fields of Biomedicine, High Energy Physics, e-Learning and Climate. Many groups have already ported their applications to the EELA Grid and are obtaining first results. This paper describes the first year of EELA and the progress achieved so far.


1998

VisAll: A universal tool to visualise the parallel execution of logic programs

Authors
Fonseca, N; Costa, VS; Dutra, ID;

Publication
LOGIC PROGRAMMING - PROCEEDINGS OF THE 1998 JOINT INTERNATIONAL CONFERENCE AND SYMPOSIUM ON LOGIC PROGRAMMING

Abstract
One of the most important advantages of logic programming systems is that they allow the transparent exploitation of parallelism. The different forms of parallelism available and the complex nature of logic programming applications present interesting problems to both the users and the developers of these systems. Graphical visualisation tools can make a particularly important contribution, as they are easier to understand than text-based tools, and allow both for a general overview of an execution and for focusing on its important details. Towards these goals, we propose VisAll, a new tool to visualise the parallel execution of logic programs. VisAll benefits from a modular design centered on a graph that represents a parallel execution. A main graphical shell commands the different modules and presents VisAll as a unified system. Several input components, or translators, support the well-known VisAndor and VACE trace formats, plus a new format designed for independent and-parallel plus or-parallel execution in the SEA. Several output components, or visualisers, allow for different visualisations of the same execution.


2005

Knowledge Discovery from Structured Mammography Reports Using Inductive Logic Programming

Authors
Burnside, Elizabeth S.; Davis, Jesse; Costa, Vitor Santos; Dutra, Ines de Castro; Kahn, Charles E., Jr.; Fine, Jason; Page, David;

Publication
AMIA 2005, American Medical Informatics Association Annual Symposium, Washington, DC, USA, October 22-26, 2005

Abstract
The development of large mammography databases provides an opportunity for knowledge discovery and data mining techniques to recognize patterns not previously appreciated. Using a database from a breast imaging practice containing patient risk factors, imaging findings, and biopsy results, we tested whether inductive logic programming (ILP) could discover interesting hypotheses that could subsequently be tested and validated. The ILP algorithm discovered two hypotheses from the data that were 1) judged as interesting by a subspecialty-trained mammographer and 2) validated by analysis of the data itself.


1997

Evaluating parallel logic programming systems on scalable multiprocessors

Authors
Costa, VS; Bianchini, R; de Castro Dutra, I;

Publication
International Symposium on Parallel Symbolic Computation, Proceedings, PASCO

Abstract
Parallel logic programming systems are sophisticated examples of symbolic computing systems. They address problems such as dynamic memory allocation, scheduling irregular execution patterns, and managing different types of implicit parallelism. Most parallel logic programming systems have been developed for bus-based shared-memory architectures. The complexity of parallel logic programming systems and the large amount of data they process raise the question of whether logic programming systems can still obtain good performance on scalable architectures, such as distributed shared-memory systems. In this work we use execution-driven simulation to investigate the access patterns and caching behaviour exhibited by a parallel logic programming system, Andorra-I. We show that the system obtains reasonable performance, but that it does not scale well. By studying the behaviour of the major data structures in Andorra-I in detail, we conclude that this result is largely a consequence of the scheduling and work manipulation implementation used in the system. We also show that Andorra-I's data structures exhibit widely varying memory access patterns and caching behaviour, which depend not only on the number of processors, but also on the amount and type of parallelism available in the application program. Some of these data structures clearly favour invalidate-based cache coherence protocols, while others favour update-based protocols. Since most of Andorra-I's data structures are common to other parallel logic programming systems, we believe that these systems can greatly benefit from flexible coherence schemes where either the compiler can specify the protocol to be used for each data structure or the protocol can adapt to varying memory access patterns.
