Dissertation defense (December 11, 2020): Daniel Ferreira de Oliveira

Student: Daniel Ferreira de Oliveira

Title: RioGraphX: a Portal to support research in Spectral Graph Theory

Advisors: Leonardo Silva de Lima (advisor), Eduardo Bezerra da Silva (co-advisor)

Committee: Leonardo Silva de Lima (chair), Eduardo Bezerra da Silva (CEFET/RJ), Rafaelli de Carvalho Coutinho (CEFET/RJ), Virgínia Maria Rodrigues (UFRGS), Claudia Marcela Justel (IME)

Day/Time: December 11, 2020 / 16h

Room: https://teams.microsoft.com/l/team/19%3a3daf2ce8441f43b29ec83255c159ef85%40thread.tacv2/conversations?groupId=336cfc27-4004-429a-8da5-b135499e7cf9&tenantId=c37b37a3-e9e2-42f9-bc67-4b9b738e1df0

Abstract: Spectral Graph Theory (SGT) is a branch of discrete mathematics that studies the properties of a graph using the eigenvalues and eigenvectors of a matrix associated with that graph. The theory has attracted growing interest from researchers since the 1980s because of its applications in areas such as Chemistry, Mathematics, Engineering and Computer Science. With the exponential growth in the volume of available data, processing information in parallel and distributed execution environments is crucial for productivity and performance. To provide a web tool that spares users from needing their own processing resources, we propose RioGraphX, a scientific portal built on Apache Spark that aims to find all graphs that optimize a mathematical function involving graph invariants, possibly subject to constraints. A seven-step workflow was designed so that as many tasks as possible run in Spark's distributed and parallel computing environment. Because Spark provides APIs for Scala, Java and Python, two implementations were developed in this study: one in Java and the other in Python, owing to its abundance of support libraries. Two tests were then performed: one for validation and one for performance. Speedup and efficiency were computed to compare task execution in the parallel and distributed environment against a single-processor environment. The results showed that the Java implementation performed better, and the evaluation of these metrics demonstrates the importance of dynamically allocating Spark resources according to the size of the database. The portal's execution times were satisfactory given the volume of data processed.
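For readers unfamiliar with the two performance metrics named in the abstract, the following is a minimal sketch of the standard textbook definitions: speedup as serial time divided by parallel time, and efficiency as speedup divided by the number of workers. The function names and the timings are hypothetical and chosen only for illustration; they are not taken from the dissertation, which may define or measure these quantities differently.

```python
def speedup(serial_seconds: float, parallel_seconds: float) -> float:
    """Speedup S = T_serial / T_parallel (standard definition)."""
    if serial_seconds <= 0 or parallel_seconds <= 0:
        raise ValueError("execution times must be positive")
    return serial_seconds / parallel_seconds


def efficiency(serial_seconds: float, parallel_seconds: float, workers: int) -> float:
    """Efficiency E = S / p, where p is the number of parallel workers."""
    if workers < 1:
        raise ValueError("workers must be at least 1")
    return speedup(serial_seconds, parallel_seconds) / workers


# Hypothetical timings, invented for illustration only
# (not results reported in the dissertation).
serial = 120.0    # seconds on a single processor
parallel = 40.0   # seconds on the parallel/distributed environment
workers = 4       # assumed number of workers

s = speedup(serial, parallel)
e = efficiency(serial, parallel, workers)
print(f"speedup={s:.2f}, efficiency={e:.2f}")  # prints: speedup=3.00, efficiency=0.75
```

An efficiency below 1 means that adding workers did not reduce the run time proportionally, which is one common reason to match the allocated resources to the size of the input.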