Details
Name
André Manuel Sequeira
Position
Research Assistant
Since
01 April 2021
Nationality
Portugal
Centre
Laboratório de Software Confiável
Contacts
+351 253 604 440
andre.m.sequeira@inesctec.pt
2024
Authors
Coelho, R.; Sequeira, A.; Santos, L. P.
Publication
Quantum Machine Intelligence
Abstract
Reinforcement learning (RL) consists of designing agents that make intelligent decisions without human supervision. When used alongside function approximators such as Neural Networks (NNs), RL is capable of solving extremely complex problems. Deep Q-Learning, an RL algorithm that uses Deep NNs, has been shown to achieve superhuman performance in game-related tasks. Nonetheless, it is also possible to use Variational Quantum Circuits (VQCs) as function approximators in RL algorithms. This work empirically studies the performance and trainability of such VQC-based Deep Q-Learning models in classic control benchmark environments. More specifically, we investigate how data re-uploading affects both of these metrics. We show that the magnitude and the variance of the model's gradients remain substantial throughout training even as the number of qubits increases. In fact, both increase considerably in the early stages of training, when the agent needs to learn the most, and decrease later in training, when the agent should have done most of its learning and begun converging to a policy. Thus, even if the probability of being initialized in a barren plateau increases exponentially with system size for hardware-efficient ansätze, these results indicate that VQC-based Deep Q-Learning models may still be able to find large gradients throughout training, allowing for learning.
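The data re-uploading scheme studied in the abstract interleaves input-encoding gates with variational layers, so the agent's state is fed into the circuit repeatedly rather than only once. The following is a minimal sketch of such a VQC acting as a Q-function approximator, assuming PennyLane; the qubit count, layer count, gate choices, and two-action readout are illustrative assumptions (e.g. a CartPole-like observation with four features), not the authors' exact architecture.

```python
# A minimal sketch of a data re-uploading VQC Q-function, assuming PennyLane.
# Qubit count, layer count, gate choices and the 2-action readout are
# illustrative assumptions, not the paper's exact architecture.
import pennylane as qml
from pennylane import numpy as np

n_qubits = 4   # e.g. one qubit per CartPole state feature (assumption)
n_layers = 3   # number of re-uploading blocks (assumption)
n_actions = 2  # one measured observable per action (assumption)

dev = qml.device("default.qubit", wires=n_qubits)

@qml.qnode(dev)
def q_values(weights, state):
    for layer in range(n_layers):
        # Data re-uploading: re-encode the input state in *every* layer,
        # instead of encoding it once at the start of the circuit.
        for w in range(n_qubits):
            qml.RX(state[w], wires=w)
        # Hardware-efficient variational block: single-qubit rotations
        # followed by a ladder of entangling CNOTs.
        for w in range(n_qubits):
            qml.RY(weights[layer, w, 0], wires=w)
            qml.RZ(weights[layer, w, 1], wires=w)
        for w in range(n_qubits - 1):
            qml.CNOT(wires=[w, w + 1])
    # One expectation value per action; the greedy action is the argmax.
    return [qml.expval(qml.PauliZ(w)) for w in range(n_actions)]

weights = np.random.uniform(0, 2 * np.pi,
                            size=(n_layers, n_qubits, 2), requires_grad=True)
state = np.array([0.1, -0.2, 0.05, 0.3])  # toy observation (assumption)
print(q_values(weights, state))            # approximate Q(s, a) per action
```

In a Deep Q-Learning loop these expectation values would play the role of the Q-network's outputs, with gradients taken through the circuit parameters; re-encoding the state in every layer is what distinguishes this from a single-encoding VQC.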