2023
Authors
Mori, A; Paiva, ACR; Souza, SRS;
Publication
PROCEEDINGS OF THE 8TH BRAZILIAN SYMPOSIUM ON SYSTEMATIC AND AUTOMATED SOFTWARE TESTING, SAST 2023
Abstract
Regression testing is a software engineering maintenance activity that involves re-executing test cases on a modified software system to check whether code changes introduce new faults. However, it can be time-consuming and resource-intensive, especially for large systems. Regression testing selection techniques can help address this issue by selecting a subset of test cases to run. The change-based technique selects a subset of test cases based on the modified software classes, reducing the test suite size. Thereby, it will cover a smaller number of classes, decreasing the efficiency of the test suite to reveal design flaws. From this perspective, code smells are known to identify poor design and threaten the quality of software systems. In this study, we propose an approach to combine code change and smell to select regression tests and present two new techniques: code smell based and code change and smell. Additionally, we developed the Regression Testing Selection Tool (RTST) to automate the selection process. We empirically evaluated the approach in Defects4J projects by comparing the new techniques' effectiveness with the change-based as a baseline. The results show that the change-based technique achieves the highest reduction rate in the test suite size but with less class coverage. On the other hand, test cases selected using code smells and changed classes combined can potentially find more bugs. The code smell-based technique provides a comparable class coverage to the code change and smell approach. Our findings highlight the benefits of incorporating code smells in regression testing selection and suggest opportunities for improving the efficiency and effectiveness of regression testing.
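As a rough illustration of the three selection strategies named in the abstract, the sketch below selects tests from a test-to-class coverage map. It is not the RTST implementation: the coverage map, the class names, and the helper functions `select_tests` and `reduction_rate` are hypothetical and serve only to show how the change-based, smell-based, and combined selections might differ.

```python
from typing import Dict, Set


def select_tests(coverage: Dict[str, Set[str]], target_classes: Set[str]) -> Set[str]:
    """Return the tests that exercise at least one of the target classes."""
    return {test for test, classes in coverage.items() if classes & target_classes}


def reduction_rate(total: int, selected: int) -> float:
    """Fraction of the original suite that does not need to be re-run."""
    return 1 - selected / total


# Hypothetical data: which classes each test case covers.
coverage = {
    "test_cart": {"Cart", "Item"},
    "test_checkout": {"Checkout", "Cart"},
    "test_report": {"Report"},
    "test_login": {"Auth"},
}
changed = {"Item"}    # classes modified by the latest commit (assumed)
smelly = {"Report"}   # classes flagged by a code smell detector (assumed)

change_based = select_tests(coverage, changed)
smell_based = select_tests(coverage, smelly)
combined = select_tests(coverage, changed | smelly)

print(sorted(change_based), reduction_rate(len(coverage), len(change_based)))
print(sorted(combined), reduction_rate(len(coverage), len(combined)))
```

In this toy example the change-based selection gives the largest reduction, while the combined selection runs more tests and covers more classes, which mirrors the trade-off the abstract reports.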
2023
Authors
Grine, T; Lopes, CT;
Publication
MACHINE LEARNING AND PRINCIPLES AND PRACTICE OF KNOWLEDGE DISCOVERY IN DATABASES, ECML PKDD 2022, PT I
Abstract
In a world increasingly present online, people are leaving a digital footprint, with valuable information scattered on the Web, in an unstructured manner, beholden to the websites that keep it. While there are potential harms in being able to access this information readily, such as enabling corporate surveillance, there are also significant benefits when used, for example, in journalism or investigations into Human Trafficking. This paper presents an approach for retrieving domain-specific information present on the Web using Social Media platforms as a gateway to other content existing on any website. It begins by identifying relevant profiles, then collecting links shared in posts to webpages related to them, and lastly, extracting and indexing the information gathered. The tool developed based on this approach was tested for a case study in the domain of Human Trafficking, more specifically in sexual exploitation, showing promising results and potential to be applied in a real-world scenario.
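The last two steps of the pipeline described above (collecting links shared in posts, then indexing the gathered text) can be sketched as follows. This is a minimal sketch under stated assumptions, not the authors' tool: the post texts, page contents, and the functions `extract_links` and `build_index` are hypothetical, and fetching the pages is left out entirely.

```python
import re
from collections import defaultdict
from typing import Dict, List, Set

# Simple URL pattern; real posts may need a more robust extractor.
URL_PATTERN = re.compile(r"https?://[^\s<>\"')]+")


def extract_links(posts: List[str]) -> List[str]:
    """Collect distinct URLs from post texts, preserving first-seen order."""
    links: List[str] = []
    for post in posts:
        for url in URL_PATTERN.findall(post):
            url = url.rstrip(".,;:!?")  # drop trailing sentence punctuation
            if url not in links:
                links.append(url)
    return links


def build_index(pages: Dict[str, str]) -> Dict[str, Set[str]]:
    """Build a tiny inverted index mapping each lowercase token to page URLs."""
    index: Dict[str, Set[str]] = defaultdict(set)
    for url, text in pages.items():
        for token in re.findall(r"[a-z0-9]+", text.lower()):
            index[token].add(url)
    return index


# Hypothetical posts from profiles already judged relevant.
posts = [
    "See https://example.org/ad1, thanks",
    "Again https://example.org/ad1 and http://example.net/page.",
]
links = extract_links(posts)

# Hypothetical page text, standing in for downloaded content.
pages = {
    "https://example.org/ad1": "New offer in Porto",
    "http://example.net/page": "Contact in Porto today",
}
index = build_index(pages)
print(links)
print(sorted(index["porto"]))
```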
2023
Authors
Rodrigues J.; Lopes C.T.;
Publication
Open Information Science
Abstract
Research data management is essential for safeguarding and prospecting data generated in a scientific context. Specific issues arise regarding data in image format, as this data typology poses particular challenges and opportunities; however, not much attention has been given to data as images. We reviewed 109 articles from several research domains where images were used either as data or metadata to understand how researchers specifically deal with this data format, and what their habits and behaviors are. We used the Web of Science (WoS), considering its five main areas of research. We included in the initial corpus the most relevant articles by research domain, selecting the ten most cited articles in WoS, by year, between 2010 and 2021. The selected articles had to be in English and in open access. The results show that images have been used in scientific works numerous times, but, unfortunately, in few of them are they the central element of the study. Photography is the type of image most used in most domains. In terms of the instruments used, the Technology and Life Sciences and Biomedicine domains use the microscope more, while the Arts and Humanities and Physical Sciences domains use the camera more. We found that the images are mostly produced in the context of the project, rather than reused by third parties. As for their collection scenario, these are mostly produced/used in a laboratory context. The overwhelming majority of the images present in the articles are digital, and only a small part is analog. We verified that Arts and Humanities are more likely to perform qualitative types of analyses, while Life Sciences and Biomedicine overwhelmingly use quantitative analyses. As for the issues of sharing and depositing, Life Sciences and Biomedicine is the domain that stands out the most in the tasks of depositing and sharing images.
It was found that the licenses of a project are intrinsically related to the motivations for sharing results with third parties. Description, a fundamental step in the data management process, is neglected by a large number of researchers. The images are mostly not described or annotated and when this happens, researchers don't provide much detail about this.
2023
Authors
Oliveira, B; Lopes, CT;
Publication
Proceedings of the 2023 Conference on Human Information Interaction and Retrieval, CHIIR 2023, Austin, TX, USA, March 19-23, 2023
Abstract
Web search engines have marked everyone's life by transforming how one searches and accesses information. Search engines give special attention to the user interface, especially search engine result pages (SERP). The well-known "10 blue links" list has evolved into richer interfaces, often personalized to the search query, the user, and other aspects. More than 20 years later, the literature has not adequately portrayed this development. We present a study on the evolution of SERP interfaces during the last two decades using Google Search as a case study. We used the most searched queries by year to extract a sample of SERP from the Internet Archive. Using this dataset, we analyzed how SERP evolved in content, layout, design (e.g., color scheme, text styling, graphics), navigation, and file size. We have also analyzed the user interface design patterns associated with SERP elements. We found that SERP are becoming more diverse in terms of elements, aggregating content from different verticals and including more features that provide direct answers. This systematic analysis portrays evolution trends in search engine user interfaces and, more generally, web design. We expect this work will trigger other, more specific studies that can take advantage of our dataset.
2023
Authors
Oliveira, B; Lopes, CT;
Publication
Proceedings of the 2023 Conference on Human Information Interaction and Retrieval, CHIIR 2023, Austin, TX, USA, March 19-23, 2023
Abstract
Web Search Engine Results Pages (SERP) are among the best-known and most used web pages. These pages started as simple "10 blue links" pages, but the information in SERP currently goes way beyond these links. Several features have been included in these pages to complement organic and sponsored results and attempt to provide answers to the query instead of just pointing to websites that might deliver that information. In this work, we analyze the appearance and evolution of SERP features in the two leading web search engines, Google Search and Microsoft Bing, using a sample of SERP from the Internet Archive. We found that SERP are becoming more diverse in terms of elements, aggregating content from different verticals and including more features that provide direct answers.
2023
Authors
Rodrigues, J; Teixeira Lopes, C;
Publication
Journal of Library Metadata
Abstract
Indispensable in many contexts, images are fundamental in the tasks of representation and transmission of information. In the scientific context, images can be tools for researchers seeking to see their data properly managed. Research data management guides in this direction as it determines necessary phases in the life cycle of projects. The description phase is fundamental as it is an essential means for data context, safeguarding, and reuse. The description often occurs through metadata models composed of descriptors capable of attributing context. However, there is one common aspect: the values associated with these descriptors are always textual or numeric. Through studies and work developed over the last few years, we propose a new approach to description, where images can have a preponderant role in the description of data, assuming the role of metadata. We present several pieces of evidence, point out their challenges and determine the opportunities this new perspective can have in research. Images have specific characteristics that can be leveraged in improving data description. Historical evidence establishes that images have always been used and produced in research, yet their representational ability has never been harnessed to describe data and give more context to the scientific process. ©, Joana Rodrigues and Carla Teixeira Lopes. Published with license by Taylor & Francis Group, LLC.