2021
Authors
Carneiro, D; Oliveira, F; Novais, P;
Publication
Ambient Intelligence - Software and Applications - 12th International Symposium on Ambient Intelligence, ISAmI 2021, Salamanca, Spain, 6-8 October, 2021.
Abstract
Machine Learning problems are growing significantly in complexity, due to an increase in the volume of data, to new forms of data, or to the change of data over time. This poses new challenges that are both technical and scientific. In this paper we propose a Distributed Learning System that runs on top of a Hadoop cluster, leveraging its native functionalities. It is guided by the principle of data locality. Data are distributed across the cluster, so models are also distributed and trained in parallel. Models are thus seen as Ensembles of base models, and predictions are made by combining the predictions of the base models. Moreover, models are replicated and distributed across the cluster, so that multiple nodes can answer requests. This results in a system that is both resilient and highly available. © 2022, The Author(s), under exclusive license to Springer Nature Switzerland AG.
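The idea of treating a model as an Ensemble of base models, each trained on one block of distributed data, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the names `train_base_model` and `ensemble_predict` are hypothetical, the base model is a trivial mean predictor, and averaging is only one possible way to combine predictions. It is not the system described in the paper and does not use Hadoop.

```python
from statistics import mean

def train_base_model(block):
    """Fit a trivial base model on one data block (hypothetical stand-in).

    Each block is a list of (features, target) pairs; the "model" simply
    predicts the mean target seen in its own block.
    """
    block_mean = mean(target for _, target in block)
    return lambda features: block_mean

def ensemble_predict(models, features):
    """Combine base-model predictions by simple averaging (one possible rule)."""
    return mean(model(features) for model in models)

# Two data blocks, standing in for data held on two different nodes.
blocks = [
    [([0.0], 1.0), ([1.0], 3.0)],
    [([2.0], 5.0), ([3.0], 7.0)],
]

# One base model per block; in a real cluster these would train in parallel.
models = [train_base_model(block) for block in blocks]
prediction = ensemble_predict(models, [1.5])
print(prediction)
```

Because every base model depends only on its own block, training can happen where the data lives, which is the data-locality principle the abstract refers to.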
2021
Authors
Monteiro, JP; Ramos, D; Carneiro, D; Duarte, F; Fernandes, JM; Novais, P;
Publication
INTERNATIONAL JOURNAL OF INTELLIGENT SYSTEMS
Abstract
In recent years, organizations and companies in general have found the true potential value of collecting and using data to support decision-making. As a consequence, data are being collected at an unprecedented rate. This poses several challenges, including, for example, the storage and processing of these data. Machine Learning (ML) is no exception, in the sense that algorithms must now deal with novel challenges, such as learning from streaming data or dealing with concept drift. ML engineers also have a harder task when it comes to selecting the most appropriate model, given the wealth of algorithms and possible configurations that exist nowadays. At the same time, training time becomes a stronger restriction as the computational complexity of training the model increases. In this paper we propose a framework for dealing with these challenges, based on meta-learning. Specifically, we tackle two well-defined problems: automatic algorithm selection and continuous algorithm updates that do not require the retraining of the whole algorithm to adapt to new data. Results show that the proposed framework can help ameliorate the identified issues.
2021
Authors
Rocha, R; Carneiro, D; Novais, P;
Publication
NEUROCOMPUTING
Abstract
Traditional explicit authentication mechanisms, in which the device remains unlocked after the introduction of some kind of password, are slowly being complemented with so-called implicit or continuous authentication mechanisms. In the latter, the user is constantly monitored in one or more ways, in search of signs of unauthorized access, which may happen if a third party has access to the phone after it has been unlocked. There are several forms of continuous authentication, some of which are based on Machine Learning. These are generally black-box models that provide a decision but not an explanation. In this paper we propose an approach for continuous authentication based on behavioral biometrics and machine learning, which includes domain-dependent aspects that allow the user to interpret the actions and decisions of the system. It is non-intrusive, does not require any additional hardware, and can be used continuously to monitor user identity.
2021
Authors
Carneiro, D; Veloso, P; Guimarães, M; Baptista, J; Sousa, M;
Publication
Proceedings of the 4th International Workshop on eXplainable and Responsible AI and Law, co-located with the 18th International Conference on Artificial Intelligence and Law (ICAIL 2021), Virtual Event, São Paulo, Brazil, June 21, 2021.
Abstract
2021
Authors
Guimaraes, M; Carneiro, D;
Publication
PROCEEDINGS OF 2021 16TH IBERIAN CONFERENCE ON INFORMATION SYSTEMS AND TECHNOLOGIES (CISTI'2021)
Abstract
Machine Learning is one of the most trending topics nowadays. The reason is, of course, that it is increasingly present in our everyday lives, even if we do not notice it. What goes even more unnoticed is the fact that every Machine Learning model needs computational power. And of course, it also needs data. But how much data is necessary to build the best Machine Learning model possible, and how often do we need to retrain a model so that it does not become obsolete as data change? Answering these kinds of questions can reduce unnecessary costs for a company. In this paper we propose a novel approach to predict the performance of a model given some characteristics of the data, called meta-features. The goal is to train a new model only when some error metric (e.g., RMSE) is expected to decrease substantially compared with a previously trained model. This approach is best applied in data streaming or Big Data scenarios, as well as in Interactive Machine Learning scenarios. We validate it on a real Fraud Detection case, and this scenario is also briefly described.
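The retraining rule mentioned in the abstract, training a new model only when an error metric is expected to decrease substantially, can be sketched as a simple threshold check. This is an illustrative assumption, not the paper's method: the function name `should_retrain`, the relative-gain criterion, and the default threshold of 10% are all hypothetical, and the predicted RMSE is assumed to come from some meta-model fed with meta-features of the new data.

```python
def should_retrain(current_rmse: float,
                   predicted_rmse: float,
                   min_relative_gain: float = 0.10) -> bool:
    """Decide whether retraining is worthwhile (hypothetical rule).

    current_rmse      -- RMSE of the model currently in use
    predicted_rmse    -- RMSE a meta-model predicts for a retrained model
    min_relative_gain -- smallest relative RMSE reduction worth the cost
    """
    if current_rmse <= 0:
        # Nothing to improve, or an invalid measurement: keep the model.
        return False
    relative_gain = (current_rmse - predicted_rmse) / current_rmse
    return relative_gain >= min_relative_gain

# Expected drop from 0.50 to 0.40 is a 20% gain, above the 10% threshold.
retrain_large_gain = should_retrain(0.50, 0.40)
# Expected drop from 0.50 to 0.48 is a 4% gain, below the threshold.
retrain_small_gain = should_retrain(0.50, 0.48)
print(retrain_large_gain, retrain_small_gain)
```

The threshold makes the cost trade-off explicit: a higher value means fewer retraining runs at the price of tolerating a somewhat staler model.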
2021
Authors
Carneiro, D; Pereira, J; Silva, ECE;
Publication
NEURAL COMPUTING & APPLICATIONS
Abstract
Grape reception is a key process in wine production. Harvest days are extremely challenging for managing the reception of the grapes, as the winery needs to deal with the non-uniform arrival of the grapes while guaranteeing suppliers' satisfaction and wine quality. The best management of the resources of the suppliers (i.e., grapes and trucks) and of the winery (i.e., grain-tanks and pressing machines) must be ensured. In this paper, the underlying optimization problem for grape reception is solved by developing a genetic algorithm (GA) tailored to this specific challenge. The results of this algorithm are compared with a FIFO policy for a typical scenario that occurs on the harvest days of a real winery. Additionally, different scenarios are simulated to assess the validity and quality of the solutions found. The results show that, using modest computational resources, it is possible to achieve better solutions with the proposed GA. This allows the algorithm to be used in real time, even when plant conditions change significantly (e.g., when a new truck arrives or when a machine fails). Furthermore, the waiting times of trucks and grapes obtained with the developed GA are significantly shorter than those observed with a FIFO approach.