2023
Authors
de Jesus, G;
Publication
ADVANCES IN INFORMATION RETRIEVAL, ECIR 2023, PT III
Abstract
Tetun is one of Timor-Leste's official languages alongside Portuguese. It is a low-resource language with over 932,000 speakers that started developing when Timor-Leste restored its independence in 2002. Newspapers mainly use Tetun, and more than ten national online news websites publish news in Tetun every day. However, because no information retrieval solutions exist for Tetun, finding Tetun information on the internet and digital platforms is challenging. This work aims to investigate and develop solutions that enable the application of information retrieval techniques to build search solutions for Tetun, using Tetun INL and focusing on the ad-hoc text retrieval task. As a result, we expect to deliver effective search solutions for Tetun and to contribute to innovation in information retrieval for low-resource languages, including by making Tetun datasets available to future researchers.