Publicacoes - INESC TEC

Publicações

Publicações por Mário João Antunes

2021

Exposing Manipulated Photos and Videos in Digital Forensics Analysis

Autores
Ferreira, S; Antunes, M; Correia, ME;

Publicação
JOURNAL OF IMAGING

Abstract
Tampered multimedia content is being increasingly used in a broad range of cybercrime activities. The spread of fake news, misinformation, digital kidnapping, and ransomware-related crimes are amongst the most recurrent crimes in which manipulated digital photos and videos are the perpetrating and disseminating medium. Criminal investigation has been challenged in applying machine learning techniques to automatically distinguish between fake and genuine seized photos and videos. Despite the pertinent need for manual validation, easy-to-use platforms for digital forensics are essential to automate and facilitate the detection of tampered content and to help criminal investigators with their work. This paper presents a machine learning Support Vector Machines (SVM) based method to distinguish between genuine and fake multimedia files, namely digital photos and videos, which may indicate the presence of deepfake content. The method was implemented in Python and integrated as new modules in the widely used digital forensics application Autopsy. The implemented approach extracts a set of simple features resulting from the application of a Discrete Fourier Transform (DFT) to digital photos and video frames. The model was evaluated with a large dataset of classified multimedia files containing both legitimate and fake photos and frames extracted from videos. Regarding deepfake detection in videos, the Celeb-DFv1 dataset was used, featuring 590 original videos collected from YouTube, and covering different subjects. The results obtained with the 5-fold cross-validation outperformed those SVM-based methods documented in the literature, by achieving an average F1-score of 99.53%, 79.55%, and 89.10%, respectively for photos, videos, and a mixture of both types of content. A benchmark with state-of-the-art methods was also done, by comparing the proposed SVM method with deep learning approaches, namely Convolutional Neural Networks (CNN). Despite CNN having outperformed the proposed DFT-SVM compound method, the competitiveness of the results attained by DFT-SVM and the substantially reduced processing time make it appropriate to be implemented and embedded into Autopsy modules, by predicting the level of fakeness calculated for each analyzed multimedia file.

FecharLer Abstract

2021

A Dataset of Photos and Videos for Digital Forensics Analysis Using Machine Learning Processing

Autores
Ferreira, S; Antunes, M; Correia, ME;

Publicação
DATA

Abstract
Deepfake and manipulated digital photos and videos are being increasingly used in a myriad of cybercrimes. Ransomware, the dissemination of fake news, and digital kidnapping-related crimes are the most recurrent, in which tampered multimedia content has been the primordial disseminating vehicle. Digital forensic analysis tools are being widely used by criminal investigations to automate the identification of digital evidence in seized electronic equipment. The number of files to be processed and the complexity of the crimes under analysis have highlighted the need to employ efficient digital forensics techniques grounded on state-of-the-art technologies. Machine Learning (ML) researchers have been challenged to apply techniques and methods to improve the automatic detection of manipulated multimedia content. However, the implementation of such methods have not yet been massively incorporated into digital forensic tools, mostly due to the lack of realistic and well-structured datasets of photos and videos. The diversity and richness of the datasets are crucial to benchmark the ML models and to evaluate their appropriateness to be applied in real-world digital forensics applications. An example is the development of third-party modules for the widely used Autopsy digital forensic application. This paper presents a dataset obtained by extracting a set of simple features from genuine and manipulated photos and videos, which are part of state-of-the-art existing datasets. The resulting dataset is balanced, and each entry comprises a label and a vector of numeric values corresponding to the features extracted through a Discrete Fourier Transform (DFT). The dataset is available in a GitHub repository, and the total amount of photos and video frames is 40,588 and 12,400, respectively. The dataset was validated and benchmarked with deep learning Convolutional Neural Networks (CNN) and Support Vector Machines (SVM) methods; however, a plethora of other existing ones can be applied. Generically, the results show a better F1-score for CNN when comparing with SVM, both for photos and videos processing. CNN achieved an F1-score of 0.9968 and 0.8415 for photos and videos, respectively. Regarding SVM, the results obtained with 5-fold cross-validation are 0.9953 and 0.7955, respectively, for photos and videos processing. A set of methods written in Python is available for the researchers, namely to preprocess and extract the features from the original photos and videos files and to build the training and testing sets. Additional methods are also available to convert the original PKL files into CSV and TXT, which gives more flexibility for the ML researchers to use the dataset on existing ML frameworks and tools.

FecharLer Abstract

2021

Forensic Analysis of Tampered Digital Photos

Autores
Ferreira, S; Antunes, M; Correia, ME;

Publicação
Progress in Pattern Recognition, Image Analysis, Computer Vision, and Applications - 25th Iberoamerican Congress, CIARP 2021, Porto, Portugal, May 10-13, 2021, Revised Selected Papers

Abstract
Deepfake in multimedia content is being increasingly used in a plethora of cybercrimes, namely those related to digital kidnap, and ransomware. Criminal investigation has been challenged in detecting manipulated multimedia material, by applying machine learning techniques to distinguish between fake and genuine photos and videos. This paper aims to present a Support Vector Machines (SVM) based method to detect tampered photos. The method was implemented in Python and integrated as a new module in the widely used digital forensics application Autopsy. The method processes a set of features resulting from the application of a Discrete Fourier Transform (DFT) in each photo. The experiments were made in a new and large dataset of classified photos containing both legitimate and manipulated photos, and composed of objects and faces. The results obtained were promising and reveal the appropriateness of using this method embedded in Autopsy, to help in criminal investigation activities and digital forensics.

FecharLer Abstract

2021

An Integrated Cybernetic Awareness Strategy to Assess Cybersecurity Attitudes and Behaviours in School Context

Autores
Antunes, M; Silva, C; Marques, F;

Publicação
APPLIED SCIENCES-BASEL

Abstract
Digital exposure to the Internet among the younger generations, notwithstanding their digital abilities, has increased and raised the alarm regarding the need to intensify the education on cybersecurity in schools. Understanding of the human factor and its influence on children, namely their attitudes and behaviors online, is pivotal to reinforce their awareness towards cyberattacks, and to promote their digital citizenship. This paper aims to present an integrated cybersecurity and cyberawareness strategy composed of three major steps: (1) Cybersecurity attitude and behavior assessment, (2) self-diagnosis, and (3) teaching/learning activities. The following contributions are made: Two questionnaires to assess risky attitudes and behaviors regarding cybersecurity; a self-diagnosis to measure students' skills on cybersecurity; a lesson plan addressing cyberawareness to be applied on Information and Communications Technology (ICT) and citizenship education curricular units. Cybersecurity risky attitudes and behaviors were evaluated in a junior high school population of 164 students attending the sixth and ninth grades. The assessment focused on two main subjects: To identify the attitudes and behaviors that raise the risk on cybersecurity among the participating students; to characterize the acquired students' cybersecurity and cyberawareness skills. Global and individual scores and the histograms for attitudes and behaviors are presented. The items in which we have observed significant differences between sixth and ninth grades are depicted and quantified by their corresponding p-values obtained through the Mann-Whitney non-parametric test. Regarding the results obtained on the assessment of attitudes and behaviors, although positive, we observed that the attitudes and behaviors in ninth grade students are globally inferior compared to those attained by sixth grade students. The deployed strategy for cyberawareness was applied in a school context; however, the same approach is suitable to be applied in other types of organizations, namely enterprises, healthcare institutions and public sector.

FecharLer Abstract

2020

Proceedings of the Tenth International Conference on Soft Computing and Pattern Recognition, SoCPaR 2018, Porto, Portugal, December 13-15, 2018

Autores
Madureira, AM; Abraham, A; Gandhi, N; Silva, C; Antunes, M;

Publicação
SoCPaR

Abstract

2021

A Customizable Web Platform to Manage Standards Compliance of Information Security and Cybersecurity Auditing

Autores
Antunes, M; Maximiano, M; Gomes, R;

Publicação
CENTERIS 2021 - International Conference on ENTERprise Information Systems / ProjMAN 2021 - International Conference on Project MANagement / HCist 2021 - International Conference on Health and Social Care Information Systems and Technologies 2021, Braga, Portugal

Abstract
Information security and cybersecurity are key subjects in modern enterprises' management, being ISO-27001:2013, NIST Cybersecurity Framework and ISO-27009 some of the most implemented international frameworks and standards. Their main goal is to globally reduce the risk, by leveraging enterprises' competitiveness in global markets and enhancing business processes and collaborators' cyber awareness. Auditing processes examine and assess a list of predefined controls. For each control, a set of corrective measures could be proposed, to increase its compliance with the standard being used. These processes are time-consuming, involve on-site intervention by specialized consulting teams on the intervened enterprises, and a set of status reports of all the interventions should be elaborated and delivered. The existing auditing information systems are not developed to meet Small and Medium-sized Enterprises (SME) requirements, as they are mostly proprietary and expensive, ground usually on off-the-shelf applications, and are not generic to be used by several standards with different checklists and auditing methodologies. In this paper, a generic and web-integrated cybersecurity auditing information system is described. Its architecture, design, and data model enable it to be used in a wide set of auditing processes, by loading a predefined controls checklist assessment and their corresponding mitigation tasks list. It was designed to meet both SMEs and large enterprises' requirements, and stores auditing and intervention-related data in a relational database. The information system was tested on an ISO-27001:2013 information security auditing project, which has integrated fifty SMEs. The results obtained during the project are promising and reveal the appropriateness of using this information system in further similar auditing processes.

FecharLer Abstract