Publications

Publications by Pedro Manuel Ribeiro

2012

Motif Mining in Weighted Networks

Authors
Choobdar, S; Ribeiro, P; Silva, F;

Publication
12TH IEEE INTERNATIONAL CONFERENCE ON DATA MINING WORKSHOPS (ICDMW 2012)

Abstract
Unexpectedly frequent subgraphs, known as motifs, can help in characterizing the structure of complex networks. Most of the existing methods for finding motifs are designed for unweighted networks, where only the existence of a connection between nodes is considered, and not their strength or capacity. However, in many real world networks, edges contain more information than just simple node connectivity. In this paper, we propose a new method to incorporate edge weight information in motif mining. We think of a motif as a subgraph that contains unexpected information, and we define a new significance measurement to assess this subgraph exceptionality. The proposed metric embeds the weight distribution in subgraphs and it is based on weight entropy. We use the g-trie data structure to find instances of k-sized subgraphs and to calculate their significance scores. Following a statistical approach, the random entropy of subgraphs is then calculated, avoiding the time consuming step of random network generation. The discrimination power of the derived motif profile by the proposed method is assessed against the results of the traditional unweighted motifs through a graph classification problem. We use a set of labeled ego networks of co-authorship in the biology and mathematics fields. The new proposed method is shown to be feasible, achieving even slightly better accuracy. Since it does not require the generation of random networks, it is also computationally faster, and because we are able to use the weight information in computing the motif importance, we can avoid converting weighted networks into unweighted ones.
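The abstract does not give the exact definition of its entropy-based measure, so the following is only a minimal sketch of the general idea of "weight entropy": the edge weights of one subgraph occurrence are normalized into a probability distribution and its Shannon entropy is computed. The function name `weight_entropy` and the normalization choice are assumptions made for illustration, not the authors' metric.

```python
import math

def weight_entropy(weights):
    """Shannon entropy (in bits) of a subgraph's normalized edge weights.

    Illustrative assumption: weights are non-negative and are turned into
    a distribution by dividing by their sum. This is not necessarily the
    significance measure defined in the paper.
    """
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    probs = [w / total for w in weights if w > 0]
    return -sum(p * math.log2(p) for p in probs)

# A triangle with evenly spread weights versus one dominated by a single edge.
even = weight_entropy([1.0, 1.0, 1.0])
skewed = weight_entropy([8.0, 1.0, 1.0])
print(even > skewed)
```

Under this sketch, evenly distributed weights give the maximum entropy for a given number of edges, while a subgraph dominated by one heavy edge scores lower.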

2009

Parallel calculation of multi-electrode array correlation networks

Authors
Ribeiro, P; Simonotto, J; Kaiser, M; Silva, F;

Publication
JOURNAL OF NEUROSCIENCE METHODS

Abstract
When calculating correlation networks from multi-electrode array (MEA) data, one works with extensive computations. Unfortunately, as the MEAs grow bigger, the time needed for the computation grows even more: calculating pair-wise correlations for current 60 channel systems can take hours on normal commodity computers whereas for future 1000 channel systems it would take almost 280 times as long, given that the number of pairs increases with the square of the number of channels. Even taking into account the increase of speed in processors, soon it can be unfeasible to compute correlations in a single computer. Parallel computing is a way to sustain reasonable calculation times in the future. We provide a general tool for rapid computation of correlation networks which was tested for: (a) a single computer cluster with 16 cores, (b) the Newcastle Condor System utilizing idle processors of university computers and (c) the inter-cluster, with 192 cores. Our reusable tool provides a simple interface for neuroscientists, automating data partition and job submission, and also allowing coding in any programming language. It is also sufficiently flexible to be used in other high-performance computing environments.
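The scaling claim in this abstract can be checked with a short calculation. The sketch below counts unordered channel pairs for 60 and 1000 channels and compares the exact ratio with the simple squared-ratio approximation; the helper name `pair_count` is introduced here for illustration only.

```python
def pair_count(channels: int) -> int:
    """Number of unordered channel pairs: n * (n - 1) / 2."""
    return channels * (channels - 1) // 2

current_pairs = pair_count(60)      # 60-channel system
future_pairs = pair_count(1000)     # 1000-channel system

exact_ratio = future_pairs / current_pairs
squared_ratio = (1000 / 60) ** 2    # approximation: pairs grow with n^2

print(current_pairs, future_pairs)
print(round(exact_ratio, 1), round(squared_ratio, 1))
```

The squared-ratio approximation gives about 278, which matches the "almost 280 times" figure quoted in the abstract; the exact pair-count ratio is slightly higher, at about 282.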

2010

Efficient Subgraph Frequency Estimation with G-Tries

Authors
Ribeiro, P; Silva, F;

Publication
ALGORITHMS IN BIOINFORMATICS

Abstract
Many biological networks contain recurring overrepresented elements, called network motifs. Finding these substructures is a computationally hard task related to graph isomorphism. G-Tries are an efficient data structure, based on multiway trees, capable of efficiently identifying common substructures in a set of subgraphs. They are highly successful in constraining the search space when finding the occurrences of those subgraphs in a larger original graph. This leads to speedups of up to 100 times over previous methods that aim for exact and complete results. In this paper we present a new efficient sampling algorithm for subgraph frequency estimation based on g-tries. It is able to uniformly traverse a fraction of the search space, providing an accurate unbiased estimation of subgraph frequencies. Our results show that in the same amount of time our algorithm achieves better precision than previous methods, as it is able to sustain higher sampling speeds.

2011

Network Node Label Acquisition and Tracking

Authors
Choobdar, S; Silva, F; Ribeiro, P;

Publication
PROGRESS IN ARTIFICIAL INTELLIGENCE

Abstract
Complex networks are ubiquitous in the real world and represent a multitude of natural and artificial systems. Some of these networks are inherently dynamic and their structure changes over time, but only recently has the research community been trying to better characterize them. In this paper we propose a novel general methodology to characterize time evolving networks, analyzing the dynamics of their structure by labeling the nodes and tracking how these labels evolve. Node labeling is formulated as a clustering task that assigns a classification to each node according to its local properties. Association rule mining is then applied to sequences of nodes' labels to extract useful rules that best describe changes in the network. We evaluate our method using two different networks, a real-world network of the world annual trades and a synthetic scale-free network, in order to uncover evolution patterns. The results show that our approach is valid and gives insights into the dynamics of the network. As an example, the derived rules for the scale-free network capture the properties of preferential node attachment.

2012

Event Detection in Evolving Networks

Authors
Choobdar, S; Ribeiro, P; Silva, F;

Publication
2012 FOURTH INTERNATIONAL CONFERENCE ON COMPUTATIONAL ASPECTS OF SOCIAL NETWORKS (CASON)

Abstract
This paper describes a methodology for finding and describing significant events in time evolving complex networks. We first group the nodes of the network in clusters, according to their similarity in terms of a set of local properties such as degree and clustering coefficient. We then monitor the behavior of these groups over time, looking for significant changes in the size of the groups. These events are notable since they show that the position of a number of nodes in the network has changed. We describe this evolution by extracting the corresponding transition patterns. We examined our methodology on three different real network datasets. Our experiments show that the discovered rules are significant and can describe the occurring events.
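The monitoring step described in this abstract can be illustrated with a small sketch. It assumes the clustering has already been done and that each snapshot is summarized as a mapping from group label to group size; the function name `detect_size_events`, the relative-change test, and the `threshold` value are all hypothetical choices made for illustration, since the abstract does not state how significance is decided.

```python
def detect_size_events(sizes_over_time, threshold=0.5):
    """Flag groups whose size changes sharply between consecutive snapshots.

    sizes_over_time: list of dicts mapping group label -> node count.
    threshold: minimum relative change counted as an event (assumed value).
    Returns a list of (snapshot_index, group, size_before, size_after).
    """
    events = []
    for t in range(1, len(sizes_over_time)):
        prev, curr = sizes_over_time[t - 1], sizes_over_time[t]
        for group in sorted(set(prev) | set(curr)):
            before = prev.get(group, 0)
            after = curr.get(group, 0)
            # Use 1 as the base for groups that did not exist before,
            # which avoids division by zero.
            change = (after - before) / max(before, 1)
            if abs(change) >= threshold:
                events.append((t, group, before, after))
    return events

# Toy group sizes for three snapshots (made-up numbers, not from the paper).
snapshots = [
    {"hub": 10, "leaf": 90},
    {"hub": 11, "leaf": 89},
    {"hub": 20, "leaf": 80},
]
print(detect_size_events(snapshots))
```

In this toy input only the jump of the "hub" group between the second and third snapshots exceeds the assumed threshold.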

2012

Parallel discovery of network motifs

Authors
Ribeiro, P; Silva, F; Lopes, L;

Publication
JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING

Abstract
Many natural structures can be represented by complex networks. Discovering network motifs, which are overrepresented patterns of inter-connections, is a computationally hard task related to graph isomorphism. Sequential methods are hindered by an exponential execution time growth when we increase the size of motifs and networks. In this article we study the opportunities for parallelism in existing methods and propose new parallel strategies that adapt and extend one of the most efficient serial methods known from the Fanmod tool. We propose both a master-worker strategy and one with distributed control, in which we employ a randomized receiver-initiated methodology capable of providing dynamic load balancing during the whole computation process. Our strategies are capable of dealing both with exact and approximate network motif discovery. We implement and apply our algorithms to a set of representative networks and examine their scalability up to 128 processing cores. We obtain almost linear speedups, showcasing the efficiency of our proposed approach, and are able to reach motif sizes that were not previously achievable using conventional serial algorithms.
