Publications

Publications by HumanISE

2023

From ISAD(G) to Linked Data Archival Descriptions

Authors
Koch, I; Pires, C; Lopes, CT; Ribeiro, C; Nunes, S;

Publication
LINKING THEORY AND PRACTICE OF DIGITAL LIBRARIES, TPDL 2023

Abstract
Archives preserve materials that allow us to understand and interpret the past and think about the future. With the evolution of the information society, archives must take advantage of technological innovations and adapt to changes in the kind and volume of the information created. Semantic Web representations are appropriate for structuring archival data and linking them to external sources, allowing versatile access by multiple applications. ArchOnto is a new Linked Data Model based on CIDOC CRM to describe archival objects. ArchOnto combines specific aspects of archiving with the CIDOC CRM standard. In this work, we analyze the ArchOnto representation of a set of archival records from the Portuguese National Archives and compare it to their CIDOC CRM representation. As a result of ArchOnto's representation, we observe an increase in the number of classes used, from 20 in CIDOC CRM to 28 in ArchOnto, and in the number of properties, from 25 in CIDOC CRM to 28 in ArchOnto. This growth stems from the refinement of object types and their relationships, favouring the use of controlled vocabularies. ArchOnto provides higher readability for the information in archival records, keeping it in line with current standards.

2023

Moving from ISAD(G) to a CIDOC CRM-based Linked Data Model in the Portuguese Archives

Authors
Koch, I; Lopes, CT; Ribeiro, C;

Publication
ACM JOURNAL ON COMPUTING AND CULTURAL HERITAGE

Abstract
Archives are facing numerous challenges. On the one hand, archival assets are evolving to encompass digitized documents and increasing quantities of born-digital information in diverse formats. On the other hand, the audience is changing along with how it wishes to access archival material. Moreover, the interoperability requirements of cultural heritage repositories are growing. In this context, the Portuguese Archives started an ambitious program aiming to evolve its data model, migrate existing records, and build a new archival management system appropriate to both archival tasks and public access. The overall goal is to have a fine-grained and flexible description, more machine-actionable than the current one. This work describes ArchOnto, a linked open data model for archives, and rules for its automatic population from existing records. ArchOnto adopts a semantic web approach and encompasses the CIDOC Conceptual Reference Model and additional ontologies, envisioning interoperability with datasets curated by multiple communities of practice. Existing ISAD(G)-conforming descriptions are being migrated to the new model using the direct mappings provided here. We used a sample of 25 records associated with different description levels to validate the completeness and conformity of ArchOnto to existing data. This work is in progress and is original in several respects: (1) it is one of the first approaches to use CIDOC CRM in the context of archives, identifying problems and questions that emerged during the process and pinpointing possible solutions; (2) it addresses the balance in the model between the migration of existing records and the construction of new ones by archive professionals; and (3) it adopts an open world view on linking archival data to global information sources.
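The idea of migrating ISAD(G)-conforming descriptions through direct mappings can be sketched roughly as follows. This is a minimal illustration, not the mapping rules defined for ArchOnto: the dictionary field names, the `ex:` URIs, the sample record, and the function `isadg_to_triples` are all hypothetical. The CIDOC CRM terms used (P1 is identified by, P102 has title, P2 has type, E42 Identifier, E35 Title, E55 Type) are chosen here only as plausible targets.

```python
# Illustrative sketch only: the field names, the "ex:" URIs, and the choice of
# CIDOC CRM terms are assumptions, not the mapping rules defined for ArchOnto.

def isadg_to_triples(record_uri, record):
    """Turn a flat ISAD(G)-style record (a dict) into (subject, predicate, object) triples."""
    triples = []

    # Reference code -> an identifier node attached with P1.
    if "reference_code" in record:
        ident = f"{record_uri}/identifier"
        triples += [
            (record_uri, "crm:P1_is_identified_by", ident),
            (ident, "rdf:type", "crm:E42_Identifier"),
            (ident, "rdfs:label", record["reference_code"]),
        ]

    # Title -> a title node attached with P102.
    if "title" in record:
        title = f"{record_uri}/title"
        triples += [
            (record_uri, "crm:P102_has_title", title),
            (title, "rdf:type", "crm:E35_Title"),
            (title, "rdfs:label", record["title"]),
        ]

    # Level of description -> a term from a (hypothetical) controlled vocabulary.
    if "level" in record:
        level = f"ex:level/{record['level']}"
        triples += [
            (record_uri, "crm:P2_has_type", level),
            (level, "rdf:type", "crm:E55_Type"),
        ]

    return triples


# Hypothetical record; the values are invented for illustration.
record = {
    "reference_code": "EX/0001",
    "title": "Example parish register",
    "level": "file",
}
triples = isadg_to_triples("ex:record/1", record)
for triple in triples:
    print(triple)
```

Representing the description level as a typed node rather than a free-text string is one way to read the emphasis on controlled vocabularies; a real migration would need rules for every ISAD(G) element and for hierarchical relations between records.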

2023

Secure, Dynamic and Uncomplicated Licensing of Movies on a Blockchain Infrastructure

Authors
Santos, J; Amorim, I; Ulisses, A; Lopes, JC; Filipe, V;

Publication
2023 INTERNATIONAL CONFERENCE ON INFORMATION NETWORKING, ICOIN

Abstract
Nowadays, the consumption of media content has been growing rapidly and consistently, driven by easy access to Video on Demand platforms. In this context, licensing is needed to ensure that filmmakers receive rightful payment for their content and that their rights as content owners are respected. The traditional licensing process, which is heavily dependent on third parties (legal entities) to mediate the transaction, is very long, costly, and complex, which is a barrier to smaller independent filmmakers. The solution proposed in this work, to address this problem, is to create a business-to-business marketplace platform supported by a Blockchain licensing module. This module takes advantage of Blockchain technology to ensure the licensing requirements and to provide a secure, practical and straightforward way to license media in a decentralised paradigm. The result of this work was validated through a prototype, and a global assessment of the system's usability was performed using the System Usability Scale, where it received the best possible grade.
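For readers unfamiliar with the System Usability Scale, the standard scoring of a single questionnaire can be sketched as below. The function name `sus_score` and the sample answers are illustrative assumptions. The abstract does not report the individual responses or how scores were converted to a grade.

```python
def sus_score(responses):
    """Score one SUS questionnaire: ten answers, each on a 1-5 agreement scale.

    Odd-numbered items are positively worded and contribute (answer - 1).
    Even-numbered items are negatively worded and contribute (5 - answer).
    The sum (0-40) is multiplied by 2.5 to give a score between 0 and 100.
    """
    if len(responses) != 10:
        raise ValueError("SUS requires exactly 10 responses")
    total = 0
    for item, answer in enumerate(responses, start=1):
        if not 1 <= answer <= 5:
            raise ValueError("each response must be between 1 and 5")
        total += (answer - 1) if item % 2 == 1 else (5 - answer)
    return total * 2.5


# Invented answers: full agreement with positive items, full disagreement with negative ones.
best_case = sus_score([5, 1, 5, 1, 5, 1, 5, 1, 5, 1])
# Invented answers: a neutral "3" on every item.
neutral = sus_score([3] * 10)
print(best_case, neutral)
```

A study-level result is usually the mean of these per-respondent scores.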

2023

Using Digital Tools to Study the Health of Adults Born Preterm at a Large Scale: e-Cohort Pilot Study

Authors
Lorthe, E; Santos, C; Ornelas, JP; Doetsch, JN; Marques, SCS; Teixeira, R; Santos, AC; Rodrigues, C; Goncalves, G; Sousa, PF; Lopes, JC; Rocha, A; Barros, H;

Publication
JOURNAL OF MEDICAL INTERNET RESEARCH

Abstract
Background: Preterm birth is a global health concern. Its adverse consequences may persist throughout the life course, exerting a potentially heavy burden on families, health systems, and societies. In high-income countries, the first children who benefited from improved care are now adults entering middle age. However, there is a clear gap in the knowledge regarding the long-term outcomes of individuals born preterm. Objective: This study aimed to assess the feasibility of recruiting and following up an e-cohort of adults born preterm worldwide and provide estimations of participation, characteristics of participants, the acceptability of questions, and the quality of data collected. Methods: We implemented a prospective, open, observational, and international e-cohort pilot study (Health of Adult People Born Preterm-an e-Cohort Pilot Study [HAPP-e]). Inclusion criteria were being an adult (aged ≥18 years), born preterm (<37 weeks of gestation), having internet access and an email address, and understanding at least 1 of the available languages. A large, multifaceted, and multilingual communication strategy was established. Between December 2019 and June 2021, inclusion and repeated data collection were performed using a secured web platform. We provided descriptive statistics regarding participation in the e-cohort, namely, the number of persons who registered on the platform, signed the consent form, initiated and completed the baseline questionnaire, and initiated and completed the follow-up questionnaire. We also described the main characteristics of the HAPP-e participants and provided an assessment of the quality of the data and the acceptability of sensitive questions. Results: As of December 31, 2020, a total of 1004 persons had registered on the platform, leading to 527 accounts with a confirmed email and 333 signed consent forms. A total of 333 participants initiated the baseline questionnaire.
All participants were invited to follow-up, and 35.7% (119/333) consented to participate, of whom 97.5% (116/119) initiated the follow-up questionnaire. Completion rates were very high both at baseline (296/333, 88.9%) and at follow-up (112/116, 96.6%). This sample of adults born preterm in 34 countries covered a wide range of sociodemographic and health characteristics. The gestational age at birth ranged from 23+6 to 36+6 weeks (median 32, IQR 29-35 weeks). Only 2.1% (7/333) of the participants had previously participated in a cohort of individuals born preterm. Women (252/333, 75.7%) and highly educated participants (235/327, 71.9%) were also overrepresented. Good quality data were collected thanks to validation controls implemented on the web platform. The acceptability of potentially sensitive questions was excellent, as very few participants chose the "I prefer not to say" option when available. Conclusions: Although we identified room for improvement in specific procedures, this pilot study confirmed the great potential for recruiting a large and diverse sample of adults born preterm worldwide, thereby advancing research on adults born preterm.

2023

Applying Machine Learning to Estimate the Effort and Duration of Individual Tasks in Software Projects

Authors
Sousa, AO; Veloso, DT; Goncalves, HM; Faria, JP; Mendes Moreira, J; Graca, R; Gomes, D; Castro, RN; Henriques, PC;

Publication
IEEE ACCESS

Abstract
Software estimation is a vital yet challenging project management activity. Various methods, from empirical to algorithmic, have been developed to fit different development contexts, from plan-driven to agile. Recently, machine learning techniques have shown potential in this realm but are still underexplored, especially for individual task estimation. We investigate the use of machine learning techniques in predicting task effort and duration in software projects to assess their applicability and effectiveness in production environments, identify the best-performing algorithms, and pinpoint key input variables (features) for predictions. We conducted experiments with datasets of various sizes and structures exported from three project management tools used by partner companies. For each dataset, we trained regression models for predicting the effort and duration of individual tasks using eight machine learning algorithms. The models were validated using k-fold cross-validation and evaluated with several metrics. Ensemble algorithms like Random Forest, Extra Trees Regressor, and XGBoost consistently outperformed non-ensemble ones across the three datasets. However, the estimation accuracy and feature importance varied significantly across datasets, with a Mean Magnitude of Relative Error (MMRE) ranging from 0.11 to 9.45 across the datasets and target variables. Nevertheless, even in the worst-performing dataset, effort estimates aggregated to the project level showed good accuracy, with MMRE = 0.23. Machine learning algorithms, especially ensemble ones, seem to be a viable option for estimating the effort and duration of individual tasks in software projects. However, the quality of the estimates and the relevant features may depend largely on the characteristics of the available datasets and underlying projects. 
Nevertheless, even when the accuracy of individual estimates is poor, the aggregated estimates at the project level may still show good accuracy due to error compensation.
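The MMRE metric and the error-compensation effect described in this abstract can be illustrated with a small sketch. The task efforts below are invented for illustration and do not come from the paper's datasets. The project-level figure is computed here as the relative error of the summed estimates, which is one simple way to aggregate and may differ from the procedure used in the paper.

```python
from statistics import mean


def mmre(actual, predicted):
    """Mean Magnitude of Relative Error: mean of |actual - predicted| / actual."""
    return mean(abs(a - p) / a for a, p in zip(actual, predicted))


# Hypothetical task efforts in hours (illustrative only).
actual = [4.0, 10.0, 2.0, 8.0]
predicted = [6.0, 7.0, 3.0, 9.0]

# Task level: every individual miss counts, whatever its direction.
task_level = mmre(actual, predicted)

# Project level: over- and under-estimates partly cancel once efforts are summed.
project_level = abs(sum(actual) - sum(predicted)) / sum(actual)

print(f"task-level MMRE:        {task_level:.3f}")
print(f"project-level rel. err: {project_level:.3f}")
```

In this toy data the individual estimates are off by 12.5% to 50%, yet the summed estimate misses the total by only one hour out of 24.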

2023

Case Studies of Development of Verified Programs with Dafny for Accessibility Assessment

Authors
Faria, JP; Abreu, R;

Publication
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

Abstract
Formal verification techniques aim at formally proving the correctness of a computer program with respect to a formal specification, but the expertise and effort required for applying formal specification and verification techniques and scalability issues have limited their practical application. In recent years, the tremendous progress with SAT and SMT solvers enabled the construction of a new generation of tools that promise to make formal verification more accessible for software engineers, by automating most if not all of the verification process. The Dafny system is a prominent example of that trend. However, little evidence exists yet about its accessibility. To help fill this gap, we conducted a set of 10 case studies of developing verified implementations in Dafny of some real-world algorithms and data structures, to determine its accessibility for software engineers. We found that, on average, the amount of code written for specification and verification purposes is of the same order of magnitude as the traditional code written for implementation and testing purposes (ratio of 1.14) – an “overhead” that certainly pays off for high-integrity software. The performance of the Dafny verifier was impressive, with 2.4 proof obligations generated per line of code written, and 24 ms spent per proof obligation generated and verified, on average. However, we also found that the manual work needed in writing auxiliary verification code may be significant and difficult to predict and master. Hence, further automation and systematization of verification tasks are possible directions for future advances in the field. © 2023, IFIP International Federation for Information Processing.
