Presentation

Telecommunications and Multimedia

At CTM, our vision is to promote a lively and sustainable world where networked intelligence enables ubiquitous interaction with sensory-rich content. Our mission is to develop advanced systems and technologies that enable high-capacity, efficient, and secure communications, media knowledge extraction, and immersive ubiquitous multimedia applications.

We work in four main areas of research: Optical and Electronic Technologies, Wireless Networks, Multimedia and Communications Technologies, and VCMI (Visual Computing and Machine Intelligence).

Latest News

INESC TEC with five FCT exploratory projects approved in four R&D areas

Telecommunications and Multimedia, Applied Photonics, High-assurance Software and Advanced Computing Systems – these are the four domains that INESC TEC researchers will explore within the scope of the five projects that were approved through the Call for Exploratory Projects promoted by the Foundation for Science and Technology (FCT).

2nd October 2024

Artificial Intelligence

The first European project led by INESC TEC in the health field has kicked off

It is called AI4Lungs and aims to develop computational tools and models based on Artificial Intelligence to optimise the diagnosis and treatment of lung diseases. Through a holistic and multimodal approach, the researchers will create a personalised healthcare solution for respiratory diseases. At the end of February, representatives of the project's 18 partner entities, from 10 countries, met at INESC TEC to mark the start of AI4Lungs.

1st April 2024

Communications

Europe discusses collaboration opportunities in high-frequency wireless communications

Smart propagation environments, improvements in signal processing for the sixth generation of mobile communications, and 6G-centred network and location developments were some of the topics discussed at an event organised by the European projects TERRAMETA (coordinated by INESC TEC), 6G-SHINE and TIMES, in collaboration with RESTART-IN – an Italian PRR.

6th March 2024

Artificial Intelligence

INESC TEC researchers work on the first prototype that applies AI to colorectal diagnosis developed in Portugal

The work behind the first prototype that uses Artificial Intelligence (AI) for colorectal diagnosis was fully developed by Portuguese researchers from INESC TEC and the IMP Diagnostics Molecular & Anatomic Pathology laboratory; the work was featured in the renowned international scientific journal npj Precision Oncology (https://www.nature.com/articles/s41698-024-00539-4).

5th March 2024

INESC TEC researchers led discussion on wireless communications and computer vision at GLOBECOM

After almost one year, the CONVERGE project (coordinated by INESC TEC) has already shown relevant outcomes at GLOBECOM (Malaysia), one of the main conferences of the IEEE Communications Society, namely through the organisation of a panel. “Convergence of wireless communications and computer vision: a new paradigm created by the CONVERGE project” sought to discuss the new opportunities and potential challenges associated with the use of tools that combine radio with computer vision.

23rd January 2024

Featured Projects

PFAI4_5eD

Advanced Training Programme Industry 4 - 5th edition

2024-2024

Team

Laboratories

Laboratory of Sound and Music Computing

Optical and Electronic Technologies Research Laboratory

Publications

CTM Publications

2025

A survey on cell nuclei instance segmentation and classification: Leveraging context and attention

Authors
Nunes, JD; Montezuma, D; Oliveira, D; Pereira, T; Cardoso, JS;

Publication
MEDICAL IMAGE ANALYSIS

Abstract
Nuclear-derived morphological features and biomarkers provide relevant insights regarding the tumour microenvironment, while also allowing diagnosis and prognosis in specific cancer types. However, manually annotating nuclei from the gigapixel Haematoxylin and Eosin (H&E)-stained Whole Slide Images (WSIs) is a laborious and costly task, meaning automated algorithms for cell nuclei instance segmentation and classification could alleviate the workload of pathologists and clinical researchers and at the same time facilitate the automatic extraction of clinically interpretable features for artificial intelligence (AI) tools. But due to high intra- and inter-class variability of nuclei morphological and chromatic features, as well as H&E stains' susceptibility to artefacts, state-of-the-art algorithms cannot correctly detect and classify instances with the necessary performance. In this work, we hypothesize context and attention inductive biases in artificial neural networks (ANNs) could increase the performance and generalization of algorithms for cell nuclei instance segmentation and classification. To understand the advantages, use-cases, and limitations of context and attention-based mechanisms in instance segmentation and classification, we start by reviewing works in computer vision and medical imaging. We then conduct a thorough survey on context and attention methods for cell nuclei instance segmentation and classification from H&E-stained microscopy imaging, while providing a comprehensive discussion of the challenges being tackled with context and attention. Besides, we illustrate some limitations of current approaches and present ideas for future research. As a case study, we extend both a general (Mask-RCNN) and a customized (HoVer-Net) instance segmentation and classification methods with context- and attention-based mechanisms and perform a comparative analysis on a multicentre dataset for colon nuclei identification and counting. Although pathologists rely on context at multiple levels while paying attention to specific Regions of Interest (RoIs) when analysing and annotating WSIs, our findings suggest translating that domain knowledge into algorithm design is no trivial task, but to fully exploit these mechanisms in ANNs, the scientific understanding of these methods should first be addressed.

2025

CNN explanation methods for ordinal regression tasks

Authors
Gómez, JB; Cruz, RPM; Cardoso, JS; Gutiérrez, PA; Martínez, CH;

Publication
Neurocomputing

Abstract

2025

Causal representation learning through higher-level information extraction

Authors
Silva, F; Oliveira, HP; Pereira, T;

Publication
ACM COMPUTING SURVEYS

Abstract
The large gap between the generalization level of state-of-the-art machine learning and human learning systems calls for the development of artificial intelligence (AI) models that are truly inspired by human cognition. In tasks related to image analysis, searching for pixel-level regularities has reached a power of information extraction still far from what humans capture with image-based observations. This leads to poor generalization when even small shifts occur at the level of the observations. We explore a perspective on this problem that is directed to learning the generative process with causality-related foundations, using models capable of combining symbolic manipulation, probabilistic reasoning, and pattern recognition abilities. We briefly review and explore connections of research from machine learning, cognitive science, and related fields of human behavior to support our perspective for the direction to more robust and human-like artificial learning systems.

2025

Evaluation of Lyrics Extraction from Folk Music Sheets Using Vision Language Models (VLMs)

Authors
Sales Mendes, A; Lozano Murciego, Á; Silva, LA; Jiménez Bravo, M; Navarro Cáceres, M; Bernardes, G;

Publication
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

Abstract
Monodic folk music has traditionally been preserved in physical documents. It constitutes a vast archive that needs to be digitized to facilitate comprehensive analysis using AI techniques. A critical component of music score digitization is the transcription of lyrics, an extensively researched process in Optical Character Recognition (OCR) and document layout analysis. These fields typically require the development of specific models that operate in several stages: first, to detect the bounding boxes of specific texts, then to identify the language, and finally, to recognize the characters. Recent advances in vision language models (VLMs) have introduced multimodal capabilities, such as processing images and text, which are competitive with traditional OCR methods. This paper proposes an end-to-end system for extracting lyrics from images of handwritten musical scores. We aim to evaluate the performance of two state-of-the-art VLMs to determine whether they can eliminate the need to develop specialized text recognition and OCR models for this task. The results of the study, obtained from a dataset in a real-world application environment, are presented along with promising new research directions in the field. This progress contributes to preserving cultural heritage and opens up new possibilities for global analysis and research in folk music. © The Author(s), under exclusive license to Springer Nature Switzerland AG 2025.

2025

Model compression techniques in biometrics applications: A survey

Authors
Caldeira, E; Neto, PC; Huber, M; Damer, N; Sequeira, AF;

Publication
INFORMATION FUSION

Abstract
The development of deep learning algorithms has extensively empowered humanity's task automatization capacity. However, the huge improvement in the performance of these models is highly correlated with their increasing level of complexity, limiting their usefulness in human-oriented applications, which are usually deployed in resource-constrained devices. This led to the development of compression techniques that drastically reduce the computational and memory costs of deep learning models without significant performance degradation. These compressed models are especially essential when implementing multi-model fusion solutions where multiple models are required to operate simultaneously. This paper aims to systematize the current literature on this topic by presenting a comprehensive survey of model compression techniques in biometrics applications, namely quantization, knowledge distillation and pruning. We conduct a critical analysis of the comparative value of these techniques, focusing on their advantages and disadvantages and presenting suggestions for future work directions that can potentially improve the current methods. Additionally, we discuss and analyze the link between model bias and model compression, highlighting the need to direct compression research toward model fairness in future works.

Facts & Figures

15 Academic Staff

2020

82 Researchers

2016

11 Proceedings in indexed conferences

2020
