Publications

Publications by Inês Koch

2019

Knowledge Graph Implementation of Archival Descriptions Through CIDOC-CRM

Authors
Koch, I; Freitas, N; Ribeiro, C; Lopes, CT; da Silva, JR;

Publication
DIGITAL LIBRARIES FOR OPEN KNOWLEDGE, TPDL 2019

Abstract
Archives have well-established description standards, namely the ISAD(G) and ISAAR(CPF) with a hierarchical structure adapted to the nature of archival assets. However, as archives connect to a growing diversity of data, they aim to make their representations more apt to the so-called linked data cloud. The corresponding move from hierarchical, ISAD-conforming descriptions to graph counterparts requires state-of-the-art technologies, data models and vocabularies. Our approach addresses this problem from two perspectives. The first concerns the data model and description vocabularies, as we adopt and build upon the CIDOC-CRM standard. The second is the choice of technologies to support a knowledge graph, including a graph database and an Object Graph Mapping library. The case study is the Portuguese National Archives, Torre do Tombo, and the overall goal is to build a CIDOC-CRM-compliant system for document description and retrieval, to be used by professionals and the public. The early stages described here include the design of the core data model for archival records represented as the ArchOnto ontology and its embodiment in the ArchGraph knowledge graph. The goal of a semantic archival information system will be pursued in the migration of existing records to the richer representation and the development of applications supported on the graph.

2020

ArchOnto, a CIDOC-CRM-Based Linked Data Model for the Portuguese Archives

Authors
Koch, I; Ribeiro, C; Lopes, CT;

Publication
Digital Libraries for Open Knowledge - 24th International Conference on Theory and Practice of Digital Libraries, TPDL 2020, Lyon, France, August 25-27, 2020, Proceedings

Abstract
Archives are faced with great challenges due to the vast amounts of data they have to curate. New data models are required, and work is underway. The International Council on Archives is creating the RiC-CM (Records in Context), and there is a long line of work in museums with the CIDOC-CRM (CIDOC Conceptual Reference Model). Both models are based on ontologies to represent cultural heritage data and link them to other information. The Portuguese National Archives hold a collection with over 3.5 million metadata records, described with the ISAD(G) standard. The archives are designing a new linked data model and a technological platform with applications for archive contributors, archivists, and the public. The current work extends CIDOC-CRM into ArchOnto, an ontology-based model for archives. The model defines the relevant archival entities and properties and will be used to migrate existing records. ArchOnto accommodates the existing ISAD(G) information and takes into account its implementation with current technologies. The model is evaluated with records from representative fonds. After the test on these samples, the model is ready to be populated with the semi-automatic transformation of the ISAD records. The evaluation of the model and the population strategies will proceed with experiments involving professional and lay users. © 2020, Springer Nature Switzerland AG.

2020

Knowledge Discovery from ISAD, Digital Archive Data, into ArchOnto, a CIDOC-CRM based Linked Model

Authors
Melo, D; Rodrigues, IP; Koch, I;

Publication
Proceedings of the 12th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management, IC3K 2020, Volume 2: KEOD, Budapest, Hungary, November 2-4, 2020.

Abstract
This paper presents an automatic semantic migration prototype based on Knowledge Discovery from Digital Archive Data for ontology population in the domain of Archives metadata, ISAD(G). Natural Language Processing (NLP) techniques are used for language processing and Semantic Web techniques for querying and updating the Ontology ArchOnto, a CIDOC-CRM (Conceptual Reference Model) extension. This work is done in the context of project EPISA (Entity and Property Inference for Semantic Archives) where the Portuguese National Archives, Torre do Tombo (ANTT) is one of the partners. The data model and description vocabularies we adopted are built upon the CIDOC-CRM standard, an ontology, developed for museums by the International Committee for Documentation (CIDOC) of the International Council of Museums (ICOM). A detailed example of a baptism document metadata migration is presented to highlight the challenges on the natural language interpretation and the ontology representation.

2022

Integration of models for linked data in cultural heritage and contributions to the FAIR principles

Authors
Koch, I;

Publication
2022 ACM/IEEE JOINT CONFERENCE ON DIGITAL LIBRARIES (JCDL)

Abstract
Incorporating linked data-based models into the process of describing cultural objects is increasingly important for cultural heritage. Communities such as libraries, archives, and museums have developed and adopted models specific to their contexts. Without a trivial solution, choosing models to support more general applications is challenging. This Ph.D. aims to analyze existing solutions and practices in these domains and propose validated solutions for the discovery, access, interoperability, and reuse of cultural objects, following the FAIR principles. Transversal to the base models used, this research intends to adopt solutions that balance the simplicity of the models with the satisfaction of the requirements.

2023

From ISAD(G) to Linked Data Archival Descriptions

Authors
Koch, I; Pires, C; Lopes, CT; Ribeiro, C; Nunes, S;

Publication
LINKING THEORY AND PRACTICE OF DIGITAL LIBRARIES, TPDL 2023

Abstract
Archives preserve materials that allow us to understand and interpret the past and think about the future. With the evolution of the information society, archives must take advantage of technological innovations and adapt to changes in the kind and volume of the information created. Semantic Web representations are appropriate for structuring archival data and linking them to external sources, allowing versatile access by multiple applications. ArchOnto is a new Linked Data Model based on CIDOC CRM to describe archival objects. ArchOnto combines specific aspects of archiving with the CIDOC CRM standard. In this work, we analyze the ArchOnto representation of a set of archival records from the Portuguese National Archives and compare it to their CIDOC CRM representation. As a result of ArchOnto's representation, we observe an increase in the number of classes used, from 20 in CIDOC CRM to 28 in ArchOnto, and in the number of properties, from 25 in CIDOC CRM to 28 in ArchOnto. This growth stems from the refinement of object types and their relationships, favouring the use of controlled vocabularies. ArchOnto provides higher readability for the information in archival records, keeping it in line with current standards.

2023

Moving from ISAD(G) to a CIDOC CRM-based Linked Data Model in the Portuguese Archives

Authors
Koch, I; Lopes, CT; Ribeiro, C;

Publication
ACM JOURNAL ON COMPUTING AND CULTURAL HERITAGE

Abstract
Archives are facing numerous challenges. On the one hand, archival assets are evolving to encompass digitized documents and increasing quantities of born-digital information in diverse formats. On the other hand, the audience is changing along with how it wishes to access archival material. Moreover, the interoperability requirements of cultural heritage repositories are growing. In this context, the Portuguese Archives started an ambitious program aiming to evolve its data model, migrate existing records, and build a new archival management system appropriate to both archival tasks and public access. The overall goal is to have a fine-grained and flexible description, more machine-actionable than the current one. This work describes ArchOnto, a linked open data model for archives, and rules for its automatic population from existing records. ArchOnto adopts a semantic web approach and encompasses the CIDOC Conceptual Reference Model and additional ontologies, envisioning interoperability with datasets curated by multiple communities of practice. Existing ISAD(G)-conforming descriptions are being migrated to the new model using the direct mappings provided here. We used a sample of 25 records associated with different description levels to validate the completeness and conformity of ArchOnto to existing data. This work is in progress and is original in several respects: (1) it is one of the first approaches to use CIDOC CRM in the context of archives, identifying problems and questions that emerged during the process and pinpointing possible solutions; (2) it addresses the balance in the model between the migration of existing records and the construction of new ones by archive professionals; and (3) it adopts an open world view on linking archival data to global information sources.
