2020
Authors
Guimaraes, N; Figueira, A; Torgo, L;
Publication
Communications in Computer and Information Science
Abstract
The emergence of online social networks provided users with an easy way to publish and disseminate content, reaching broader audiences than previous platforms (such as blogs or personal websites) allowed. However, malicious users started to take advantage of these features to disseminate unreliable content, such as false information, extremely biased opinions, or hate speech, through the network. Consequently, it becomes crucial to detect these users at an early stage to avoid the propagation of unreliable content in social networks' ecosystems. In this work, we introduce a methodology to extract a large corpus of unreliable posts using Twitter and two databases of unreliable websites (OpenSources and Media Bias Fact Check). In addition, we present an analysis of the content and users that publish and share several types of unreliable content. Finally, we develop supervised models to classify a Twitter account according to its reliability. The experiments conducted using two different data sets show performance above 94% using Decision Trees as the learning algorithm. These experiments, although with some limitations, provide encouraging results for future research on detecting unreliable accounts on social networks. © 2020, Springer Nature Switzerland AG.
2020
Authors
Guimaraes, N; Figueira, A; Torgo, L;
Publication
PROCEEDINGS OF THE 16TH INTERNATIONAL CONFERENCE ON WEB INFORMATION SYSTEMS AND TECHNOLOGIES (WEBIST)
Abstract
The growth of social media as an information medium without restrictive measures on the creation of new accounts led to the rise of malicious agents with the intent to diffuse unreliable information in the network, ultimately affecting the perception of users on important topics such as political and health issues. Although the problem is being tackled within the domain of bot detection, the impact of studies in this area is still limited because 1) not all accounts that spread unreliable content are bots, 2) human-operated accounts are also responsible for the diffusion of unreliable information, and 3) bot accounts are not always malicious (e.g. news aggregators). Also, most of these methods are based on supervised models that require annotated data and updates to maintain their performance through time. In this work, we build a framework and develop knowledge-based metrics to complement the current research in bot detection and characterize the impact and behavior of a Twitter account, independently of the way it is operated (human or bot). We proceed to analyze a sample of the accounts using the proposed metrics and evaluate the necessity of these metrics by comparing them with the scores from a bot detection system. The results show that the metrics can characterize different degrees of unreliable accounts, from unreliable bot accounts with a high number of followers to human-operated accounts that also spread unreliable content (but with less impact on the network). Furthermore, evaluating a sample of the accounts with a bot detection system showed that bots compose around 11% of the sample of unreliable accounts extracted and that the bot score is not correlated with the proposed metrics. In addition, the accounts that achieve the highest values in our metrics present different characteristics than the ones that achieve the highest bot score. This provides evidence of the usefulness of our metrics in the evaluation of unreliable accounts in social networks.
2020
Authors
Torres, A; Miranda, C;
Publication
EXPLORING SERVICE SCIENCE (IESS 2020)
Abstract
Service Design (SD) and Design Thinking (DT) evolved in the last decade and have become popular in the research field of service science. However, the application of SD and DT research outcomes in practice is still scarce. To help understand the differences between research and practice, we conducted 20 semi-structured interviews with professionals and trainees from four organizations that are involved in service innovation projects. The results reveal several similarities and complementarities, (dis)advantages, requests and obstacles, which hinder companies from implementing and using structured SD and DT approaches. The findings present some challenges for both researchers and practitioners on actions they could take to overcome barriers and foster the SD and DT practice within organizations.
2020
Authors
Pereira, FSF; Andrade, T; de Carvalho, ACPLF;
Publication
MACHINE LEARNING AND KNOWLEDGE DISCOVERY IN DATABASES, ECML PKDD 2019, PT II
Abstract
We present a solution submitted to the Social Media and Harassment Competition held in collaboration with the ECML PKDD 2019 Conference. The dataset used is a set of tweets, and the first task was the detection of harassment tweets. To deal with this problem, we proposed a solution based on a gradient tree-boosting algorithm. The second task was the categorization of harassment tweets according to the type of harassment, a multiclass classification problem. For this problem we proposed an LSTM network model. The solutions proposed for these tasks presented good predictive accuracy.
2020
Authors
Silva, PR;
Publication
PROCEEDINGS OF THE 35TH ANNUAL ACM SYMPOSIUM ON APPLIED COMPUTING (SAC'20)
Abstract
With the advances of the big data era in biology, deep learning has been incorporated in analysis pipelines to transform biological information into valuable knowledge. Deep learning has demonstrated its power in advancing the bioinformatics field, including sequence analysis, biomolecular property and function prediction, automatic medical diagnosis, and the analysis of cell imaging data. The ambition of this work is to create an approach that can fully explore the relationships across modalities and subjects through mining and fusing features from multi-modality data for cell state classification. The system should be able to classify cell state through multimodal deep learning techniques using heterogeneous data such as biological images, genomics and clinical annotations. Our pilot study addresses the data acquisition process and a framework capable of extracting biological parameters from cell images.
2020
Authors
Sousa, A; Ferreira, M; Oliveira, C; Ferreira, PG;
Publication
FRONTIERS IN GENETICS
Abstract
Cancer has a considerable gender-differential susceptibility, confirmed by several epidemiological studies. Gastric (GC) and thyroid cancer (TC) are examples of malignancies with a higher incidence in males and females, respectively. Beyond environmental predisposing factors, it is expected that gender-specific gene deregulation contributes to this differential incidence. We performed a detailed characterization of the transcriptomic differences between genders in normal and tumor tissues from stomach and thyroid using Genotype-Tissue Expression (GTEx) and The Cancer Genome Atlas (TCGA) data. We found hundreds of sex-biased genes (SBGs). Most of the SBGs shared by normal and tumor tissues belong to the sex chromosomes, while the normal-specific and tumor-specific SBGs tend to be found in the autosomes. Expression of several cancer-associated genes is also found to differ between sexes in both types of tissue. Thousands of differentially expressed genes (DEGs) between paired tumor-normal tissues were identified in GC and TC. For both cancers, in the most susceptible gender, the DEGs were mostly under-expressed in the tumor tissue, with an enrichment for tumor-suppressor genes (TSGs). Moreover, we found gene networks preferentially associated with males in GC and with females in TC, and correlated with cancer histological subtypes. Our results shed light on the molecular differences and commonalities between genders and provide novel insights into the differential risk underlying these cancers.