Publicacoes - INESC TEC

Publicações

Publicações por Nuno Ricardo Guimarães

2017

An architecture for a continuous and exploratory analysis on social media

Autores
Cunha, D; Guimarães, N; Figueira, A;

Publicação
Proceedings of the International Conferences on Computer Graphics, Visualization, Computer Vision and Image Processing 2017 and Big Data Analytics, Data Mining and Computational Intelligence 2017 - Part of the Multi Conference on Computer Science and Information Systems 2017

Abstract
Social networks as Facebook and Twitter gained a remarkable attention in the last decade. A huge amount of data is emerging and posted everyday by users that are becoming more interested in and relying on social network for information, news and opinions. Real time posting came to rise and turned easier to report news and events. However, due to its dimensions, in this work we focus on building a system architecture capable of detecting journalistic relevance of posts automatically on this 'haystack' full of data. More specifically, users will have the change to interact with a 'friendly user interface' which will provide several tools to analyze data. © 2017.

FecharLer Abstract

2018

Contributions to the Detection of Unreliable Twitter Accounts through Analysis of Content and Behaviour

Autores
Guimarães, N; Figueira, A; Torgo, L;

Publicação
Proceedings of the 10th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management, IC3K 2018, Volume 1: KDIR, Seville, Spain, September 18-20, 2018.

Abstract
Misinformation propagation on social media has been significantly growing, reaching a major exposition in the 2016 United States Presidential Election. Since then, the scientific community and major tech companies have been working on the problem to avoid the propagation of misinformation. For this matter, research has been focused on three major sub-fields: the identification of fake news through the analysis of unreliable posts, the propagation patterns of posts in social media, and the detection of bots and spammers. However, few works have tried to identify the characteristics of a post that shares unreliable content and the associated behaviour of its account. This work presents four main contributions for this problem. First, we provide a methodology to build a large knowledge database with tweets who disseminate misinformation links. Then, we answer research questions on the data with the goal of bridging these problems to similar problem explored in the literature. Next, we focus on accounts which are constantly propagating misinformation links. Finally, based on the analysis conducted, we develop a model to detect social media accounts that spread unreliable content. Using Decision Trees, we achieved 96% in the F1-score metric, which provides reliability on our approach. Copyright 2018 by SCITEPRESS – Science and Technology Publications, Lda. All rights reserved

FecharLer Abstract

2018

Current State of the Art to Detect Fake News in Social Media: Global Trendings and Next Challenges

Autores
Figueira, A; Guimarães, N; Torgo, L;

Publicação
Proceedings of the 14th International Conference on Web Information Systems and Technologies, WEBIST 2018, Seville, Spain, September 18-20, 2018.

Abstract
Nowadays, false news can be created and disseminated easily through the many social media platforms, resulting in a widespread real-world impact. Modeling and characterizing how false information proliferates on social platforms and why it succeeds in deceiving readers are critical to develop efficient algorithms and tools for their early detection. A recent surge of researching in this area has aimed to address the key issues using methods based on machine learning, deep learning, feature engineering, graph mining, image and video analysis, together with newly created data sets and web services to identify deceiving content. Majority of the research has been targeting fake reviews, biased messages, and against-facts information (false news and hoaxes). In this work, we present a survey on the state of the art concerning types of fake news and the solutions that are being proposed. We focus our survey on content analysis, network propagation, fact-checking and fake news analysis and emerging detection systems. We also discuss the rationale behind successfully deceiving readers. Finally, we highlight important challenges that these solutions bring. Copyright

FecharLer Abstract

2019

A Brief Overview on the Strategies to Fight Back the Spread of False Information

Autores
Figueira, A; Guirnaraes, N; Torgo, L;

Publicação
JOURNAL OF WEB ENGINEERING

Abstract
The proliferation of false information on social networks is one of the hardest challenges in today's society, with implications capable of changing users perception on what is a fact or rumor. Due to its complexity, there has been an overwhelming number of contributions from the research community like the analysis of specific events where rumors are spread, analysis of the propagation of false content on the network, or machine learning algorithms to distinguish what is a fact and what is "fake news". In this paper, we identify and summarize some of the most prevalent works on the different categories studied. Finally, we also discuss the methods applied to deceive users and what are the next main challenges of this area.

FecharLer Abstract

2019

A System to Automatically Predict Relevance in Social Media

Autores
Figueira, A; Guimaraes, N; Pinto, J;

Publicação
CENTERIS2019--INTERNATIONAL CONFERENCE ON ENTERPRISE INFORMATION SYSTEMS/PROJMAN2019--INTERNATIONAL CONFERENCE ON PROJECT MANAGEMENT/HCIST2019--INTERNATIONAL CONFERENCE ON HEALTH AND SOCIAL CARE INFORMATION SYSTEMS AND TECHNOLOGIES

Abstract
The rise of online social networks has reshaped the way information is published and spread. Users can now post in an effortless way and in any location, making this medium ideal for searching breaking news and journalistic relevant content. However, due to the overwhelming number of posts published every second, such content is hard to trace. Thus, it is important to develop methods able to detect and analyze whether a certain text contains journalistic relevant information. Furthermore, it is also important that this detection system can provide additional information towards a better comprehension of the prediction made. In this work, we overview our system, based on an ensemble classifier that is able to predict if a certain post is relevant from a journalistic perspective which outperforms the previous relevant systems in their original datasets. In addition, we describe REMINDS: a web platform built on top of our relevance system that is able to provide users with the visualization of the system's features as well as additional information on the text, ultimately leading to a better comprehension of the system's prediction capabilities. (C) 2019 The Authors. Published by Elsevier B.V. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0) Peer-review under responsibility of the scientific committee of the CENTERIS -International Conference on ENTERprise Information Systems / ProjMAN - International Conference on Project MANagement / HCist - International Conference on Health and Social Care Information Systems and Technologies.

FecharLer Abstract

2020

Identifying journalistically relevant social media texts using human and automatic methodologies

Autores
Guimaraes, N; Miranda, F; Figueira, A;

Publicação
INTERNATIONAL JOURNAL OF GRID AND UTILITY COMPUTING

Abstract
Social networks have provided the means for constant connectivity and fast information dissemination. In addition, real-time posting allows a new form of citizen journalism, where users can report events from a witness perspective. Therefore, information propagates through the network at a faster pace than traditional media reports it. However, relevant information is a small percentage of all the content shared. Our goal is to develop and evaluate models that can automatically detect journalistic relevance. To do it, we need solid and reliable ground truth data with a significantly large quantity of annotated posts, so that the models can learn to detect relevance over all the spectrum. In this article, we present and confront two different methodologies: an automatic and a human approach. Results on a test data set labelled by experts' show that the models trained with automatic methodology tend to perform better in contrast to the ones trained using human annotated data.

FecharLer Abstract