2023
Authors
Jatowt, A; Sato, M; Draxl, S; Duan, YJ; Campos, R; Yoshikawa, M;
Publication
INTERNATIONAL JOURNAL ON DIGITAL LIBRARIES
Abstract
Our civilization creates enormous volumes of digital data, a substantial fraction of which is preserved and made publicly available for present and future usage. Additionally, historical born-analog records are progressively being digitized and incorporated into digital document repositories. While professionals often have a clear idea of what they are looking for in document archives, average users are likely to have no precise search needs when accessing available archives (e.g., through their online interfaces). Thus, if the results are to be relevant and appealing to average people, they should include engaging and recognizable material. However, state-of-the-art document archival retrieval systems essentially use the same approaches as search engines for synchronic document collections. In this article, we develop unique ranking criteria for assessing the usefulness of archived contents based on their estimated relationship with current times, which we call contemporary relevance. Contemporary relevance may be utilized to enhance access to archival document collections, increasing the likelihood that users will discover interesting or valuable material. We next present an effective strategy for estimating the contemporary relevance of news articles using a learning-to-rank approach based on a variety of features, and we then successfully test it on the New York Times news collection. Incorporating the contemporary relevance computation into archival retrieval systems should enable a new search style in which search results relate to the context of the searchers' times and thereby have the potential to engage archive users. As a proof of concept, we develop and demonstrate a working prototype of a simplified ranking model that operates on top of the Portuguese Web Archive portal (arquivo.pt).
2023
Authors
Campos, R; Jorge, AM; Jatowt, A; Bhatia, S; Litvak, M;
Publication
CEUR Workshop Proceedings
Abstract
[No abstract available]
2023
Authors
Mansouri, B; Campos, R; Jatowt, A;
Publication
COMPANION OF THE WORLD WIDE WEB CONFERENCE, WWW 2023
Abstract
Timeline summarization (TLS) is a challenging research task that requires researchers to distill extensive and intricate temporal data into a concise and easily comprehensible representation. This paper proposes a novel approach to timeline summarization using Abstract Meaning Representations (AMRs), a graphical representation of the text where the nodes are semantic concepts and the edges denote relationships between concepts. With AMR, sentences with different wordings but similar semantics have similar representations. To make use of this feature for timeline summarization, a two-step sentence selection method that leverages features extracted from both AMRs and the text is proposed. First, AMRs are generated for each sentence. Sentences are then filtered by removing those with no named entities and keeping those with the highest number of named entities. In the next step, sentences to appear in the timeline are selected based on two scores: the Inverse Document Frequency (IDF) of AMR nodes, combined with the score obtained by applying a keyword extraction method to the text. Our experimental results on the TLS-Covid19 test collection demonstrate the potential of the proposed approach.
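The second selection step described above can be illustrated with a minimal sketch. It assumes that AMR parsing has already been done and that each sentence is given as a set of node labels; the abstract does not say how the two scores are combined, so the weighted sum, the `alpha` parameter, the use of the mean node IDF, and the function names `idf_scores` and `rank_sentences` are all illustrative assumptions rather than the authors' method.

```python
import math
from collections import Counter


def idf_scores(node_sets):
    """Map each node label to log(N / df), where df counts sentences containing it."""
    n_sentences = len(node_sets)
    df = Counter(node for nodes in node_sets for node in set(nodes))
    return {node: math.log(n_sentences / count) for node, count in df.items()}


def rank_sentences(node_sets, keyword_scores, alpha=0.5):
    """Return sentence indices ordered by a combined score, best first.

    node_sets      -- one collection of AMR node labels per sentence (assumed input)
    keyword_scores -- one precomputed keyword-extraction score per sentence
    alpha          -- hypothetical weight of the IDF component (an assumption)
    """
    idf = idf_scores(node_sets)
    combined = []
    for i, nodes in enumerate(node_sets):
        # Mean IDF of the sentence's nodes; empty sentences get 0.
        mean_idf = sum(idf[node] for node in nodes) / len(nodes) if nodes else 0.0
        combined.append(alpha * mean_idf + (1 - alpha) * keyword_scores[i])
    return sorted(range(len(node_sets)), key=lambda i: combined[i], reverse=True)
```

With this weighting, a sentence whose nodes are rare across the collection outranks one that shares common nodes with other sentences, provided their keyword scores are equal.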
2023
Authors
Oliveira, J; Carvalho, M; Nogueira, D; Coimbra, M;
Publication
INTERNATIONAL TRANSACTIONS IN OPERATIONAL RESEARCH
Abstract
Physiological signals are often corrupted by noisy sources. Usually, artificial intelligence algorithms analyze the whole signal, regardless of its varying quality. Instead, experienced cardiologists search for a high-quality signal segment, where more accurate conclusions can be drawn. We propose a methodology that simultaneously selects the optimal processing region of a physiological signal and determines its decoding into a state sequence of physiologically meaningful events. Our approach comprises two phases. First, a neural network is trained to estimate the state probability distribution of a signal sample. Second, the neural network output is used within an integer program, which models the problem of finding a time window by maximizing a likelihood function defined by the user. Our method was tested and validated on two types of signals, the phonocardiogram and the electrocardiogram. In phonocardiogram and electrocardiogram segmentation tasks, the system's sensitivity increased on average from 95.1% to 97.5% and from 78.9% to 83.8%, respectively, when compared to standard approaches found in the literature.
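The idea of choosing a processing window from per-sample state probabilities can be sketched in a much simpler form than the integer program the abstract describes. The example below is a stand-in, not the authors' formulation: it assumes a fixed window length and uses the summed log of each sample's highest state probability as a hypothetical likelihood. The function name `best_window` and both of those choices are assumptions made for illustration.

```python
import math


def best_window(state_probs, length):
    """Return (start, score) of the fixed-length window with the highest score.

    state_probs -- per-sample state probability distributions, e.g. network outputs
    length      -- window length in samples (fixed here; an assumption)
    The score is the sum of log(max state probability) over the window.
    """
    if not 0 < length <= len(state_probs):
        raise ValueError("length must be between 1 and the number of samples")
    logs = [math.log(max(p)) for p in state_probs]
    score = sum(logs[:length])
    best_start, best_score = 0, score
    # Slide the window one sample at a time, updating the score incrementally.
    for start in range(1, len(logs) - length + 1):
        score += logs[start + length - 1] - logs[start - 1]
        if score > best_score:
            best_start, best_score = start, score
    return best_start, best_score
```

A window over samples where the network is confident (one state probability close to 1) scores near zero, while a window over ambiguous samples scores strongly negative, so the search favors the cleaner segment.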
2023
Authors
Leal, F; Veloso, B; Malheiro, B; Burguillo, JC;
Publication
EXPERT SYSTEMS
Abstract
Crowdsourced data streams are popular and extremely valuable in several domains, particularly in tourism. Tourism crowdsourcing platforms rely on past tourist and business inputs to provide tailored recommendations to current users in real time. The continuous, open, dynamic and non-curated nature of the crowd-originated data demands specific stream mining techniques to support online profiling, recommendation, change detection and adaptation, explanation and evaluation. The sought techniques must not only continuously improve and adapt profiles and models, but also be transparent, overcome biases, prioritize preferences and master huge data volumes, all in real time. This article surveys the state of the art of adaptive and explainable stream recommendation, extends the taxonomy of explainable recommendations from the offline to the stream-based scenario, and identifies future research opportunities.
2023
Authors
Kuk, M; Bobek, S; Veloso, B; Rajaoarisoa, LH; Nalepa, GJ;
Publication
Computational Science - ICCS 2023 - 23rd International Conference, Prague, Czech Republic, July 3-5, 2023, Proceedings, Part V
Abstract
In an industrial setting, predicting the remaining useful lifetime of equipment and systems is crucial for ensuring efficient operation, reducing downtime, and prolonging the life of costly assets. There are state-of-the-art machine learning methods supporting this task. However, in this paper, we argue that both efficiency and understandability can be improved by the use of explainable AI methods that analyze the importance of features used by the machine learning model. We analyze the feature importance before a failure occurs to identify events in which an increase in importance can be observed and, based on that, indicate the attributes with the most influence on the failure. We demonstrate how the analysis of SHAP values near the occurrence of failures can help identify the specific features that led to the failure. This in turn can help in identifying the root cause of the problem and developing strategies to prevent future failures. Additionally, it can be used to identify areas where maintenance or replacement is needed to prevent failure and prolong the useful life of a system.