Dissertation Defense (28/02/2019): Ramon Ferreira Silva

Student: Ramon Ferreira Silva

Title: Refinement of response models to binary questions

Advisors: Eduardo Bezerra da Silva (advisor), Joel André Ferreira dos Santos (co-advisor)

Committee: Eduardo Bezerra da Silva (Cefet/RJ) (Chair), Joel André Ferreira dos Santos (CEFET/RJ), Kele Teixeira Belloze (Cefet/RJ), Ronaldo Ribeiro Goldschmidt (Name-RJ)

Date/Time: February 28, 9:00 a.m.

Room: Auditorium V


Visual question answering (VQA) is a task that unites the fields of computer vision and natural language processing (NLP). Taking as inputs an image I and a natural-language question Q about I, a VQA model should be able to produce a coherent answer R (also in natural language) to Q. A particular type of visual question is the binary question (i.e., a question whose answer belongs to the set {yes, no}). Currently, deep neural networks are the state-of-the-art technique for training VQA models. Despite their success, applying neural networks to the VQA task requires a very large amount of data to produce models with adequate accuracy. The datasets currently used for training VQA models are the result of laborious manual labeling (i.e., done by humans). This context makes it relevant to study approaches that make better use of these datasets during training. This dissertation proposes to investigate approaches to improve the accuracy of VQA models on binary questions. In particular, we present approaches based on active learning and data augmentation techniques to make better use of the existing dataset during the training phase of a VQA model.