2021
Authors
da Costa, TS; Andrade, MT; Viana, P;
Publication
PROCEEDINGS OF THE 2021 INTERNATIONAL WORKSHOP ON IMMERSIVE MIXED AND VIRTUAL ENVIRONMENT SYSTEMS (MMVE '21)
Abstract
Multi-view has the potential to offer immersive viewing experiences to users, as an alternative to 360-degree and Virtual Reality (VR) applications. In multi-view, a limited number of camera views are sent to the client and missing views are synthesised locally. Given the substantial complexity associated with view synthesis, considerable attention has been given to optimising the trade-off between bandwidth gains and computing resources, targeting smooth navigation and viewing quality. A still relatively unexplored field is the optimisation of the way navigation interactivity is achieved, i.e. how the user indicates the selection of new viewpoints to the system. In this article, we introduce SmoothMV, a multi-view system that uses a non-intrusive head tracking approach to enhance navigation and the Quality of Experience (QoE) of the viewer. It relies on a novel Hot&Cold matrix concept to translate head positioning data into viewing angle selections. Streaming of selected views is done using MPEG-DASH, where a proposed extension to the standard descriptors enables consistent and flexible view identification.
2020
Authors
Costa, TS; Andrade, MT; Viana, P;
Publication
Intelligent Systems Design and Applications - 20th International Conference on Intelligent Systems Design and Applications (ISDA 2020) held December 12-15, 2020
Abstract
2022
Authors
Viana, P; Andrade, MT; Carvalho, P; Vilaca, L; Teixeira, IN; Costa, T; Jonker, P;
Publication
JOURNAL OF IMAGING
Abstract
Applying machine learning (ML), and especially deep learning, to understand visual content is becoming common practice in many application areas. However, little attention has been given to its use within the multimedia creative domain. It is true that ML is already popular for content creation, but the progress achieved so far essentially addresses textual content or the identification and selection of specific types of content. A wealth of possibilities is yet to be explored by bringing ML into the multimedia creative process, allowing the knowledge it infers to automatically influence how new multimedia content is created. The work presented in this article contributes towards this goal in three distinct ways: firstly, it proposes a methodology to re-train popular neural network models to identify new thematic concepts in static visual content and attach meaningful annotations to the detected regions of interest; secondly, it presents varied visual digital effects and corresponding tools that can be called upon automatically to apply such effects to a previously analyzed photo; thirdly, it defines a complete automated creative workflow, from the acquisition of a photograph and corresponding contextual data, through the ML region-based annotation, to the automatic application of digital effects and generation of a semantically aware multimedia story driven by the previously derived situational and visual contextual data. Additionally, it presents a variant of this automated workflow that offers the user the possibility of manipulating the automatic annotations in an assisted manner. The final aim is to transform a static digital photo into a short video clip, taking into account the information acquired. By producing an intelligent content- and context-aware video, the final result contrasts strongly with current standard approaches, which create random movements.
2009
Authors
Costa, T; Sampaio, A; Alves, G;
Publication
2009 INTERNATIONAL CONFERENCE ON INFORMATION MANAGEMENT, INNOVATION MANAGEMENT AND INDUSTRIAL ENGINEERING, VOL 4, PROCEEDINGS
Abstract
A system of systems involves several secondary systems working together. Its creation draws on the knowledge of several distinct disciplines and teams, each with its own background and methods, which makes communication between them difficult. SysML, a language derived from UML, enables that communication, without background interference, through a rich notation for systems design. This paper analyzes its use through the experience gained in the design of a chemical system with SysML.
2023
Authors
da Costa, TS; Andrade, MT; Viana, P; Silva, NC;
Publication
PROCEEDINGS OF THE 14TH ACM MULTIMEDIA SYSTEMS CONFERENCE, MMSYS 2023
Abstract
Immersive video applications impose impractical bandwidth requirements on best-effort networks. With Multi-View (MV) streaming, these can be minimized by resorting to view prediction techniques. SmoothMV is a multi-view system that uses a non-intrusive head tracking mechanism to detect the viewer's interest and select appropriate views. Coupling it with Neural Networks (NNs) that anticipate the viewer's interest is likely to reduce view-switching latency. The objective of this paper is twofold: 1) to present a solution for the acquisition of gaze data from users viewing MV content; 2) to describe a dataset, collected with a large-scale testbed, that can be used to train NNs to predict the user's viewing interest. Tracking data from head movements was obtained from 45 participants using an Intel RealSense F200 camera, with 7 video playlists, each viewed a minimum of 17 times. This dataset is publicly available to the research community and constitutes an important contribution to reducing the current scarcity of such data. Tools to obtain saliency/heat maps and generate complementary plots are also provided as an open-source software package.
2023
Authors
Costa, TS; Viana, P; Andrade, MT;
Publication
IEEE ACCESS
Abstract
Quality of Experience (QoE) in multi-view streaming systems is known to be severely affected by the latency associated with view-switching procedures. Anticipating the navigation intentions of the viewer in the multi-view scene could provide the means to greatly reduce such latency. The research work presented in this article builds on this premise by proposing a new predictive view-selection mechanism. A VGG16-inspired Convolutional Neural Network (CNN) is used to identify the viewer's focus of attention and determine which views would be best suited for presentation in the near term, i.e., to match the viewer's near-term viewing intentions. This way, those views can be locally buffered before they are actually needed. To this end, two datasets were used to evaluate the prediction performance and the impact on latency, in particular when compared to the solution implemented in the previous version of our multi-view streaming system. The results obtained translate into a generalized improvement in perceived QoE. A significant reduction in latency during view-switching procedures was achieved. Moreover, the results demonstrated that the user's visual interest was predicted with a high level of accuracy. An experimental platform was also established on which future predictive models can be integrated and compared with previously implemented models.